
How Linear Regression Works - A Simple Explanation - by Ravishek Singh - Sep, 2024 - Medium

Uploaded by

Ravishek Singh

9/2/24, 11:01 PM How Linear Regression Works: A Simple Explanation | by Ravishek Singh | Sep, 2024 | Medium


How Linear Regression Works: A Simple Explanation
Ravishek Singh
10 min read · 1 day ago


Source: Facebook

Linear regression, a core machine learning algorithm, discovers relationships between
variables and predicts outcomes. As a supervised learning technique, it learns from labeled
data by fitting a straight line to the points. This line enables predictions for new data.
Linear regression is crucial for understanding patterns and building predictive models in
diverse machine-learning applications.

Introduction
https://medium.com/@ravishekstar/how-linear-regression-works-a-simple-explanation-d0db3ed67cdd 1/17

Machine learning is revolutionizing how we interact with data, enabling
computers to learn from past experiences and make informed decisions. Among
the various machine learning algorithms, linear regression is one of the simplest
yet most powerful tools. This article will walk you through the fundamentals of
linear regression, explaining why it is considered the building block of machine
learning.

What is Linear Regression?

Linear regression is a supervised learning algorithm used to predict a continuous
target variable based on one or more predictor variables. It assumes a linear
relationship between the input variables (independent variables) and the output
variable (dependent variable). In simple terms, linear regression tries to find the
best-fitting straight line through the data points, allowing us to make predictions
based on that line.

Applications and Approaches of Linear Regression

Linear regression, a statistical technique that models the relationship between a
dependent variable and one or more independent variables, has found widespread
application across various fields. Here are some common applications and
approaches:

Applications:

Predictive Modeling: Used to forecast future values of a variable based on historical
data. For instance, predicting sales figures, stock prices, or weather patterns.

Trend Analysis: Identifies trends and patterns in data. This is useful for
understanding market trends, economic indicators, or social phenomena.

Relationship Analysis: Examines the relationship between variables. For example,
analyzing the correlation between education level and income or between advertising
expenditure and sales.

Quality Control: Monitors product quality by identifying deviations from expected
standards.

Risk Assessment: Evaluates risk factors in various domains, such as finance,
insurance, and healthcare.

Approaches:

1. Simple Linear Regression: Involves a single independent variable and a dependent
variable. The model fits a straight line to the data points.

2. Multiple Linear Regression: Handles multiple independent variables. The model fits
a hyperplane to the data points.

3. Polynomial Regression: Models non-linear relationships by fitting a polynomial
curve to the data. The model is still linear in its coefficients.

4. Weighted Least Squares: Assigns different weights to data points based on their
reliability or importance.

5. Robust Regression: Reduces the influence of outliers and heavy-tailed errors on the fit.

6. Generalized Linear Models (GLMs): Extend linear regression to accommodate non-normal
response variables (e.g., count data, binary data).

Key Concepts
1. Dependent and Independent Variables: In linear regression, the dependent
variable is what you want to predict, while the independent variables are the
factors you think influence that prediction.

2. The Line of Best Fit: Linear regression finds the line that minimizes the sum of the
squared vertical distances between the observed data points and the predicted values
on the line. This line is known as the line of best fit.

3. Equation of the Line: With a single independent variable, the relationship between
the dependent and independent variables can be expressed as:

y = mx + c

Where:

y is the dependent variable

x is the independent variable

m is the slope of the line

c is the y-intercept

4. Residuals: The differences between the observed values and the values predicted
by the model. Linear regression chooses the line that minimizes the sum of the squared residuals.
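
The concepts above (slope, intercept, and residuals) can be sketched in a few lines of plain Python. This is a minimal illustration of ordinary least squares for one independent variable, not the article's own code; the helper name fit_line and the five data points are hypothetical and chosen only for demonstration.

```python
# Minimal sketch: fit y = m*x + c by ordinary least squares in plain Python.
# The data points are hypothetical and exist only for illustration.

def fit_line(xs, ys):
    """Return (m, c) for the line that minimizes the sum of squared residuals."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    # The least-squares line always passes through (x_mean, y_mean).
    c = y_mean - m * x_mean
    return m, c

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

m, c = fit_line(xs, ys)
# Residual = observed value minus the value predicted by the line.
residuals = [y - (m * x + c) for x, y in zip(xs, ys)]

print("slope:", m)
print("intercept:", c)
print("residuals:", residuals)
```

For these made-up points the slope comes out at about 1.97 and the intercept at about 0.09. The residuals sum to (numerically) zero, which is a general property of a least-squares line that includes an intercept.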

Understanding Linear Regression with an Example


Linear regression is a powerful yet simple technique in machine learning used to
predict a continuous outcome based on one or more input variables. Let’s break it
down with an example.

The Problem
Imagine you run a small business selling handmade candles online. You want to
predict your monthly sales based on how much you spend on advertising. This is
where linear regression can help.

The Data
Suppose you have the following data from the past five months:

In this table:

Advertising Spend is the independent variable (input).

Sales is the dependent variable (output).

The Goal
You want to find the relationship between your advertising spend and sales.
Specifically, you want to predict future sales based on different levels of advertising
spend.

The Line of Best Fit


Linear regression helps us draw a straight line through the data points that best
represent the relationship between advertising spend and sales. This line is called
the line of best fit.

The Linear Regression Equation


The relationship between advertising spend (x) and sales (y) can be described by the
equation of a line:

y = mx + c

Where:

y is the predicted sales.

x is the advertising spend.

m is the slope of the line (how much sales increase for each dollar spent on advertising).

c is the y-intercept (the value of sales when advertising spend is zero).

Calculating the Line of Best Fit


Let’s say, after performing linear regression on the data, you find that the equation
of the line is:

Sales = 7 × (Advertising Spend) + 1,500


This means:

For every additional dollar spent on advertising, your predicted sales increase by $7.

Even if you spent $0 on advertising, you’d still expect to make $1,500 in sales
(perhaps from word-of-mouth or returning customers).

Making Predictions
Now, you can use this equation to predict future sales. For example, if you plan to
spend $900 on advertising next month, you can predict your sales like this:

Sales = 7 × 900 + 1,500 = 6,300 + 1,500 = 7,800


So, with $900 in advertising, you can expect to make around $7,800 in sales.
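
The same arithmetic can be written as a small function. This sketch only evaluates the line quoted in the example (slope 7, intercept 1,500); the names SLOPE, INTERCEPT, and predict_sales are hypothetical helpers, not part of any library.

```python
# Sketch: evaluate the fitted line from the candle example,
# Sales = 7 * (Advertising Spend) + 1500.
# SLOPE and INTERCEPT are the values quoted in the text; the helper name is made up.

SLOPE = 7.0          # extra predicted sales per advertising dollar
INTERCEPT = 1500.0   # predicted sales when advertising spend is zero

def predict_sales(ad_spend):
    """Return predicted monthly sales for a given advertising spend."""
    return SLOPE * ad_spend + INTERCEPT

for spend in (0, 500, 900):
    print(f"spend ${spend}: predicted sales ${predict_sales(spend):,.0f}")
```

Predictions far outside the range of advertising spend seen in the original data should be treated with caution, because the straight-line relationship may not hold there.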

Visualizing the Data


Here’s how the data might look on a graph:

The line of best fit would pass through or near these points, showing the trend that
as advertising spending increases, so do sales.

Types of Linear Regression


Simple Linear Regression

Simple linear regression involves just one independent variable. For example,
predicting house prices based on square footage alone. The model tries to fit a
straight line that best represents the relationship between the two variables.

Multiple Linear Regression

When there are two or more independent variables, the model is referred to as
multiple linear regression. For instance, predicting house prices based on square
footage, number of bedrooms, and age of the property. The model will try to fit a
plane or hyperplane in multi-dimensional space that best fits the data.
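
As a rough illustration of fitting a hyperplane, the sketch below solves a multiple linear regression with NumPy's least-squares solver. The house data is hypothetical and generated without noise from coefficients chosen for this example, so the fit should recover them; real data would only recover estimates.

```python
import numpy as np

# Hypothetical, noise-free data generated from an assumed relationship:
# price = 150*sqft + 10000*bedrooms - 500*age + 20000
features = np.array([
    [1000, 2, 30],
    [1500, 3, 20],
    [1800, 3, 5],
    [2400, 4, 10],
    [3000, 5, 2],
    [1200, 2, 15],
], dtype=float)  # columns: square footage, bedrooms, age in years
true_coefs = np.array([150.0, 10000.0, -500.0])
true_intercept = 20000.0
prices = features @ true_coefs + true_intercept

# Prepend a column of ones so the intercept is estimated with the coefficients.
design = np.column_stack([np.ones(len(features)), features])

# Least-squares solution of design @ solution ≈ prices.
solution, *_ = np.linalg.lstsq(design, prices, rcond=None)
intercept, coefs = solution[0], solution[1:]

print("intercept:", intercept)
print("coefficients (sqft, bedrooms, age):", coefs)
```

Each coefficient is the change in predicted price for a one-unit change in that feature with the other features held fixed.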

Assumptions of Linear Regression

For linear regression to work effectively, certain assumptions must be met:

1. Linearity: The relationship between the independent and dependent variables should
be linear.

2. Independence: The observations should be independent of each other.

3. Homoscedasticity: The residuals (errors) should have constant variance.

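4. Normality: The residuals should be approximately normally distributed. This matters
mainly for confidence intervals and hypothesis tests on the coefficients.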
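
A few of these assumptions can be checked informally from the residuals. The sketch below uses synthetic data and deliberately crude checks (a spread comparison and a skewness estimate) that are assumptions of this example; they are no substitute for residual plots or formal statistical tests.

```python
import numpy as np

# Synthetic data: a linear trend plus constant-variance normal noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 3.0 * x + 2.0 + rng.normal(0.0, 1.0, size=x.size)

# np.polyfit returns coefficients from the highest degree down: [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)
fitted = slope * x + intercept
residuals = y - fitted

# Crude homoscedasticity check: residual spread for low vs. high fitted values.
order = np.argsort(fitted)
half = x.size // 2
spread_low = residuals[order[:half]].std()
spread_high = residuals[order[half:]].std()
print("spread ratio (high/low):", spread_high / spread_low)

# Crude normality check: skewness near 0 suggests roughly symmetric residuals.
standardized = (residuals - residuals.mean()) / residuals.std()
print("residual skewness:", np.mean(standardized ** 3))
```

A spread ratio far from 1, or a skewness far from 0, would be a hint to look more closely at the corresponding assumption.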

The Importance of Linear Regression in Machine Learning

Linear regression serves as the foundation for more complex machine learning
algorithms. Understanding how linear regression works gives you a strong base to
explore other techniques like polynomial regression, logistic regression, and neural
networks. Many machine learning models are extensions or variations of linear
regression.

Real-World Applications

1. Predicting House Prices: Linear regression is widely used to estimate real estate
values based on various factors such as location, size, and condition of the
property.

2. Sales Forecasting: Businesses use linear regression to predict future sales based
on historical data and market trends.

3. Risk Management: Financial institutions use linear regression to assess risks by
predicting potential losses and gains.

Evaluation Metrics
To assess the accuracy of your linear regression model, you can use metrics such as:

Mean Absolute Error (MAE): The average of the absolute differences between the
predicted and actual values. MAE is expressed in the same units as the target
variable, so it gives a direct indication of how far predictions deviate from
the actual values on average.

Mean Squared Error (MSE): The average of the squared differences between the
actual and predicted values. Squaring penalizes larger errors more than smaller
ones, which makes MSE sensitive to outliers.

R-squared (R²): Also known as the coefficient of determination, R² measures the
proportion of the variance in the dependent variable that is predictable from
the independent variables. A value of 1 means the model explains all of the
variation; a value of 0 means it does no better than always predicting the mean.

These metrics help you understand how well your model is performing and whether
it’s making accurate predictions.
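
To make the three definitions concrete, here is a small hand-rolled sketch in plain Python. The function names mae, mse, and r_squared and the four actual/predicted values are hypothetical; in practice a library implementation would normally be used.

```python
# Sketch: compute MAE, MSE, and R² directly from their definitions.
# The actual/predicted values are made up for illustration.

def mae(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.0, 8.0, 9.5]

print("MAE:", mae(actual, predicted))
print("MSE:", mse(actual, predicted))
print("R²:", r_squared(actual, predicted))
```

For these values the errors are 0.5, 0, -1, and -0.5, giving an MAE of 0.5, an MSE of 0.375, and an R² of 0.925.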

How to Implement Linear Regression


Training a Linear Regression Model
Training a linear regression model involves a few key steps, from preparing the data
to evaluating the model’s performance. Below is a step-by-step guide to training a
linear regression model.
1. Data Collection and Preparation
Data Collection: The first step is to gather the data relevant to the problem you want
to solve. This data should include the dependent variable (the target you want to
predict) and one or more independent variables (the features or predictors).

Data Preparation: Once you have the data, you need to prepare it for the model. This
involves:

Cleaning the Data: Removing or imputing missing values, correcting errors, and
handling outliers (for example, removing values caused by data-entry mistakes).

Splitting the Data: Dividing the data into training and testing sets. Typically, 70–
80% of the data is used for training, and the remaining 20–30% is used for
testing.

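Feature Scaling (if needed): Normalizing or standardizing the features,
especially if the features have different units or scales. Scaling does not change the
predictions of plain least-squares regression, but it makes coefficients easier to compare
and matters for regularized or gradient-based variants.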
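
The splitting and scaling steps might look like the following with Scikit-learn. This is a sketch on synthetic data: the 80/20 split, the random seeds, and the two made-up features are assumptions of this example. The scaler is fitted on the training set only, so no information from the test set leaks into training.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data: two features on very different scales (hypothetical values).
rng = np.random.default_rng(42)
X = rng.normal(loc=[50.0, 0.5], scale=[10.0, 0.1], size=(100, 2))
y = 2.0 * X[:, 0] + 30.0 * X[:, 1] + rng.normal(0.0, 1.0, size=100)

# Hold out 20% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Learn the mean and standard deviation from the training data only,
# then apply the same transformation to the test data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("train shape:", X_train_scaled.shape, "test shape:", X_test_scaled.shape)
print("train feature means after scaling:", X_train_scaled.mean(axis=0))
```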

2. Selecting the Linear Regression Algorithm


Linear regression can be implemented using various tools and libraries. In Python,
the most common libraries are:

Scikit-learn: Provides a simple interface for training linear regression models.

Statsmodels: Offers more detailed statistical summaries of the model.


3. Training the Model


Fitting the Model: To train the model, you fit it to the training data. This process
involves finding the best-fitting line (or hyperplane in the case of multiple linear
regression) that minimizes the error between the actual and predicted values.

In Scikit-learn, this is done as follows:

from sklearn.linear_model import LinearRegression

# Create the model
model = LinearRegression()

# Fit the model to the training data
model.fit(X_train, y_train)

Here:

X_train is the matrix of feature variables (independent variables).

y_train is the vector of the target variable (dependent variable).

Understanding Model Coefficients: After fitting the model, you can examine the
coefficients and intercept to understand the relationship between the independent
and dependent variables.

print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)

Coefficients: Represent the change in the target variable for a one-unit change
in the respective feature, holding all other features constant.

Intercept: The expected value of the target variable when all feature values are
zero.
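
One way to build intuition for the coefficients and intercept is to fit the model on data where the true values are known. The sketch below assumes noise-free synthetic data generated as y = 3·x1 − 2·x2 + 5 (an assumption of this example), so the fitted values should match those numbers closely.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free training data with known coefficients (3, -2) and intercept 5.
rng = np.random.default_rng(1)
X_train = rng.uniform(0.0, 10.0, size=(50, 2))
y_train = 3.0 * X_train[:, 0] - 2.0 * X_train[:, 1] + 5.0

model = LinearRegression()
model.fit(X_train, y_train)

# coef_ holds one value per feature column; intercept_ is a single number.
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
```

With real, noisy data the estimates will differ from the underlying relationship, and their reliability depends on the sample size and on how correlated the features are.
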
4. Making Predictions
Once the model is trained, you can use it to make predictions on new data. This is
done using the predict method:

y_pred = model.predict(X_test)

Here, X_test is the test data, and y_pred contains the predicted values.
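
A common stumbling block is the input shape: predict expects a two-dimensional array with one row per sample and one column per feature, even for a single new value. The sketch below uses hypothetical advertising data generated from the candle example's line (slope 7, intercept 1,500).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data lying exactly on Sales = 7 * spend + 1500.
X_train = np.array([[500.0], [600.0], [700.0], [800.0]])  # shape (4, 1)
y_train = 7.0 * X_train[:, 0] + 1500.0

model = LinearRegression().fit(X_train, y_train)

# A single new sample still needs shape (1, n_features), hence the double brackets.
X_new = np.array([[900.0]])
y_pred = model.predict(X_new)

print("input shape:", X_new.shape)
print("predicted sales:", y_pred[0])
```

The result is a one-dimensional array with one prediction per input row.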

5. Evaluating the Model


After making predictions, you need to evaluate how well the model performs.
Common evaluation metrics for linear regression include:

Mean Absolute Error (MAE)

Mean Squared Error (MSE)

R-squared (R²)

Using Scikit-learn:

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Calculate MAE, MSE, and R²
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("MAE:", mae)
print("MSE:", mse)
print("R²:", r2)

MAE provides the average absolute difference between actual and predicted
values.

MSE gives more weight to larger errors, making it useful when large errors are
particularly costly.

R² indicates how well the independent variables explain the variability of the
dependent variable.
6. Tuning and Improving the Model
If the initial model performance is not satisfactory, you can try:

Adding more features: If relevant data is available.

Removing irrelevant or noisy features: To reduce overfitting.

Polynomial Regression: If the relationship between variables is not linear.
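
The polynomial option can be sketched with NumPy alone. Polynomial regression is still a linear regression, just on extra columns such as x². The quadratic data below is hypothetical and noise-free, and the comparison with a straight-line fit is only meant to show the difference in error.

```python
import numpy as np

# Hypothetical noise-free data from a curved relationship: y = 2x² - x + 1.
x = np.linspace(-3.0, 3.0, 13)
y = 2.0 * x**2 - x + 1.0

# Design matrix with columns [1, x, x²]: the model stays linear in its coefficients.
design = np.column_stack([np.ones_like(x), x, x**2])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)

# Straight-line fit on the same data, for comparison.
line = np.polyfit(x, y, deg=1)

# Sum of squared errors for each model.
line_sse = np.sum((y - np.polyval(line, x)) ** 2)
quad_sse = np.sum((y - design @ coefs) ** 2)

print("quadratic coefficients [c, x, x²]:", coefs)
print("SSE straight line:", line_sse)
print("SSE quadratic:", quad_sse)
```

Higher-degree polynomials fit the training data ever more closely, so the degree should be chosen by checking performance on held-out data rather than on training error alone.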

7. Finalizing the Model


Once you’re satisfied with the model’s performance, you can use it to make
predictions on new, unseen data or integrate it into a larger system for automated
decision-making.

Common Challenges and Pitfalls

While linear regression is straightforward, it’s not without challenges:

1. Overfitting: When your model is too complex, it may fit the training data too
closely, leading to poor generalization to new data.

2. Multicollinearity: When independent variables are highly correlated, the
coefficient estimates become unstable and hard to interpret, even if the overall
predictions remain reasonable.

3. Outliers: Extreme values can have a disproportionate impact on the model,
leading to skewed predictions.
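
The effect of a single outlier is easy to demonstrate. In this sketch the data is hypothetical: ten points lying exactly on y = 2x + 1, with one value then replaced by an extreme number to show how far the fitted slope moves.

```python
import numpy as np

# Clean, hypothetical data lying exactly on y = 2x + 1.
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0
slope_clean, intercept_clean = np.polyfit(x, y, deg=1)

# Replace the last observation (originally 19) with an extreme value.
y_outlier = y.copy()
y_outlier[-1] = 100.0
slope_out, intercept_out = np.polyfit(x, y_outlier, deg=1)

print("slope without outlier:", slope_clean)
print("slope with one outlier:", slope_out)
```

Because least squares penalizes squared errors, one extreme point at the edge of the data can pull the whole line toward it. Inspecting residuals or trying a robust regression method are common ways to respond.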

Conclusion

Linear regression is more than just a simple algorithm; it’s a stepping stone to
understanding the broader world of machine learning. By mastering linear
regression, you gain a solid foundation that will support your journey through more
advanced topics in data science and machine learning. Whether you’re predicting
prices, forecasting trends, or assessing risks, linear regression remains a
fundamental tool in your data science toolkit.
