How Linear Regression Works - A Simple Explanation - by Ravishek Singh - Sep, 2024 - Medium
How Linear Regression Works - A Simple Explanation - by Ravishek Singh - Sep, 2024 - Medium
Open in app
Search
Source: Facebook
Introduction
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 1/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
Applications:
Trend Analysis: Identifies trends and patterns in data. This is useful for
understanding market trends, economic indicators, or social phenomena.
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 2/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
Approaches:
2. Multiple Linear Regression: Handles multiple independent variables. The model fits
a hyperplane to the data points.
4. Weighted Least Squares: Assigns different weights to data points based on their
reliability or importance.
Key Concepts
1. Dependent and Independent Variables: In linear regression, the dependent
variable is what you want to predict, while the independent variables are the
factors you think influence that prediction.
2. The Line of Best Fit: Linear regression finds the line that minimizes the distance
between the observed data points and the predicted values on the line. This line
is known as the line of best fit.
3. Equation of the Line: The relationship between the dependent and independent
variables can be expressed as:
Where:
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 3/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
c is the y-intercept
4. Residuals: The difference between the observed values and the values predicted
by the model. Linear regression aims to minimize these residuals.
The Problem
Imagine you run a small business selling handmade candles online. You want to
predict your monthly sales based on how much you spend on advertising. This is
where linear regression can help.
The Data
Suppose you have the following data from the past five months:
In this table:
The Goal
You want to find the relationship between your advertising spend and sales.
Specifically, you want to predict future sales based on different levels of advertising
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 4/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
spend.
Where:
m is the slope of the line (how much sales increase for each dollar spent on advertising).
For every additional dollar spent on advertising, your sales increase by $7.
Even if you spent $0 on advertising, you’d still expect to make $1,500 in sales
(perhaps from word-of-mouth or returning customers).
Making Predictions
Now, you can use this equation to predict future sales. For example, if you plan to
spend $900 on advertising next month, you can predict your sales like this:
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 5/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
The line of best fit would pass through or near these points, showing the trend that
as advertising spending increases, so do sales.
Simple linear regression involves just one independent variable. For example,
predicting house prices based on square footage alone. The model tries to fit a
straight line that best represents the relationship between the two variables.
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 6/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
When there are two or more independent variables, the model is referred to as
multiple linear regression. For instance, predicting house prices based on square
footage, number of bedrooms, and age of the property. The model will try to fit a
plane or hyperplane in multi-dimensional space that best fits the data.
1. Linearity: The relationship between the independent and dependent variables should
be linear.
Linear regression serves as the foundation for more complex machine learning
algorithms. Understanding how linear regression works gives you a strong base to
explore other techniques like polynomial regression, logistic regression, and neural
networks. Many machine learning models are extensions or variations of linear
regression.
Real-World Applications
1. Predicting House Prices: Linear regression is widely used to estimate real estate
values based on various factors such as location, size, and condition of the
property.
2. Sales Forecasting: Businesses use linear regression to predict future sales based
on historical data and market trends.
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 7/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
Evaluation Metrics
To assess the accuracy of your linear regression model, you can use metrics such as:
Mean Absolute Error (MAE): Mean Absolute Error (MAE) is a measure of how
close the predictions of a linear regression model are to the actual outcomes. It
calculates the average of the absolute differences between predicted values and
actual values. MAE is particularly useful because it gives a clear indication of the
average error in the predictions, making it easier to understand how much the
predictions deviate from the actual values.
Mean Squared Error (MSE): Mean Squared Error (MSE) is another way to
measure the accuracy of a linear regression model. It calculates the average of
the squared differences between the actual and predicted values. The squaring
of the differences penalizes larger errors more than smaller ones, making MSE
sensitive to outliers.
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 8/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
These metrics help you understand how well your model is performing and whether
it’s making accurate predictions.
Data Preparation: Once you have the data, you need to prepare it for the model. This
involves:
Cleaning the Data: Removing or imputing missing values, correcting errors, and
eliminating outliers.
Splitting the Data: Dividing the data into training and testing sets. Typically, 70–
80% of the data is used for training, and the remaining 20–30% is used for
testing.
Here:
Understanding Model Coefficients: After fitting the model, you can examine the
coefficients and intercept to understand the relationship between the independent
and dependent variables.
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
Coefficients: Represent the change in the target variable for a one-unit change
in the respective feature, holding all other features constant.
Intercept: The expected value of the target variable when all feature values are
zero.
4. Making Predictions
Once the model is trained, you can use it to make predictions on new data. This is
done using the predict method:
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 10/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
y_pred = model.predict(X_test)
Here, X_test is the test data, and y_pred contains the predicted values.
R-squared (R²)
using Scikit-learn:
print("MAE:", mae)
print("MSE:", mse)
print("R²:", r2)
MAE provides the average absolute difference between actual and predicted
values.
MSE gives more weight to larger errors, making it useful when large errors are
particularly costly.
R² indicates how well the independent variables explain the variability of the
dependent variable.
6. Tuning and Improving the Model
If the initial model performance is not satisfactory, you can try:
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 11/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
1. Overfitting: When your model is too complex, it may fit the training data too
closely, leading to poor generalization to new data.
Conclusion
Linear regression is more than just a simple algorithm; it’s a stepping stone to
understanding the broader world of machine learning. By mastering linear
regression, you gain a solid foundation that will support your journey through more
advanced topics in data science and machine learning. Whether you’re predicting
prices, forecasting trends, or assessing risks, linear regression remains a
fundamental tool in your data science toolkit
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 12/17
9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium
Edit profile
Ravishek Singh
Aug 22
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 13/17