
Linear Regression

What is Linear Regression?


• Linear regression is an algorithm that models a linear relationship
between an independent variable and a dependent variable in order to
predict future outcomes.
• It is a statistical method used in data science and machine learning
for predictive analysis.
Simple Linear Regression
• Linear regression models the linear relationship between the
independent (predictor) variable on the X-axis and the
dependent (output) variable on the Y-axis.
• If there is a single input variable X (independent variable),
the model is called simple linear regression.
Simple Linear Regression
• The graph presents the linear relationship between the output (y)
variable and the predictor (X) variable. The blue line is referred to
as the best fit straight line: based on the given data points, we
attempt to plot the line that fits them best.
• This algorithm explains the linear relationship between the
dependent (output) variable y and the independent (predictor)
variable X using a straight line Y = B0 + B1·X.
Simple Linear Regression
• The slope indicates the steepness of the line and the intercept
indicates where it crosses the Y-axis. Together, the slope and the
intercept define the linear relationship between the two variables
and can be used to estimate an average rate of change. The greater
the magnitude of the slope, the steeper the line and the greater
the rate of change.
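As a minimal illustration (the data values and variable names here are made up for the sketch, not from the slides), the slope B1 and intercept B0 of the best fit line can be computed directly from data with the standard least-squares formulas:

```python
import numpy as np

# Illustrative data: e.g. years of experience (X) vs. salary in $1000s (y)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([30.0, 35.0, 45.0, 47.0, 55.0])

# Least-squares estimates for the line Y = B0 + B1*X
B1 = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
B0 = y.mean() - B1 * X.mean()

print(f"slope B1 = {B1:.3f}, intercept B0 = {B0:.3f}")
```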
Simple Linear Regression: Example

So how do we know which of these lines is the best fit line? That is the
problem we solve next. For this, we will first look at the cost function.
Simple Linear Regression
But how does linear regression find the best fit line?
• The goal of the linear regression algorithm is to find the best
values for B0 and B1 that give the best fit line. The best fit line is
the line with the least error, meaning the difference between
predicted values and actual values is as small as possible.
• In regression, the difference between the observed value of the
dependent variable (yi) and the predicted value (ypredicted) is
called the residual:
• εi = yi – ypredicted
• where ypredicted = B0 + B1·Xi
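Continuing the sketch above, the residuals can be computed directly:

```python
# Residuals: actual minus predicted, one per observation
y_pred = B0 + B1 * X        # ypredicted for each Xi
residuals = y - y_pred      # eps_i = yi - ypredicted
print(residuals)
```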
Cost Function for Linear Regression
• The cost function helps work out the optimal values for
B0 and B1, which give the best fit line for the data points.
• In linear regression, the Mean Squared Error (MSE) cost
function is generally used, which is the average of the squared
errors between ypredicted and yi:
• MSE = (1/n) Σ (yi – ypredicted)²

Using the MSE function, we update the values of B0 and B1 so that
the MSE value settles at its minimum. These parameters can be
determined using the gradient descent method such that the value
of the cost function is minimized.
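A minimal sketch of the MSE cost as code (reusing numpy as np from the earlier snippet; the function name is illustrative):

```python
def mse_cost(B0, B1, X, y):
    """Mean Squared Error between predictions B0 + B1*X and targets y."""
    y_pred = B0 + B1 * X
    return np.mean((y - y_pred) ** 2)
```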
Gradient Descent
[Figure: the MSE cost surface plotted over the parameters B0 and B1.]
Gradient Descent
Convex and Non-Convex cost functions
• For linear regression with the MSE cost, the cost surface is convex,
so gradient descent can reach the single global minimum; on a
non-convex cost function it may instead get stuck in a local minimum.
Gradient Descent
Then there are two things we need to decide:
1. Which direction to go (direction of the update)
2. How big a step to take (amount of the update)
Gradient Descent

•positive derivative -> reduce


•negative derivative -> increase
•high absolute derivative -> large step
•low absolute derivative -> small step
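These rules all fall out of the standard update θ := θ − α · dJ/dθ, sketched below (the learning rate value is illustrative):

```python
learning_rate = 0.01  # alpha: scales how big a step we take

def update(theta, gradient):
    # Subtracting the gradient means a positive derivative decreases theta,
    # a negative derivative increases it, and the magnitude of the
    # derivative controls the size of the step.
    return theta - learning_rate * gradient
```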
Gradient Descent Algorithm
• Correct: a simultaneous update computes the gradients for both
coefficients from the current values before changing either one.
Incorrect: updating one coefficient first and then using its new
value when computing the gradient for the other.
Check Gradient Descent
[Figure: plot used to check gradient descent convergence.]
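A minimal sketch of gradient descent with simultaneous updates for B0 and B1 under the MSE cost (the learning rate and epoch count are illustrative choices, not from the slides):

```python
def gradient_descent(X, y, lr=0.01, epochs=1000):
    B0, B1 = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        y_pred = B0 + B1 * X
        # Gradients of MSE = mean((y - y_pred)^2) with respect to B0 and B1
        dB0 = -2.0 / n * np.sum(y - y_pred)
        dB1 = -2.0 / n * np.sum((y - y_pred) * X)
        # Simultaneous update: both gradients are computed from the current
        # coefficients before either coefficient is changed
        B0, B1 = B0 - lr * dB0, B1 - lr * dB1
    return B0, B1
```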
Types of Gradient Descent
Batch Gradient Descent:
• Let there be ‘n’ observations in a dataset. Using all ‘n’ observations to update
the coefficient values B0 and B1 in each step is called batch gradient descent. It
requires the entire dataset to be available in memory to the algorithm.
Stochastic Gradient Descent (SGD):
• SGD, in contrast, updates the values of B0 and B1 for each individual observation
in the dataset. These frequent updates give a good rate of improvement, but they
make the updates noisier and more computationally expensive per epoch than batch
gradient descent.
Mini-Batch Gradient Descent:
• Mini-batch gradient descent is a combination of SGD and batch gradient
descent. It splits the dataset into small batches, and the coefficients are
updated at the end of each batch (see the sketch below).
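The loop shown earlier is batch gradient descent (every update uses all n observations). A sketch of how the other two variants differ in how often they update per pass over the data (batch size and learning rate are illustrative):

```python
def sgd_epoch(X, y, B0, B1, lr=0.01):
    # Stochastic: one update per individual observation
    for xi, yi in zip(X, y):
        err = yi - (B0 + B1 * xi)
        B0 += lr * 2 * err
        B1 += lr * 2 * err * xi
    return B0, B1

def minibatch_epoch(X, y, B0, B1, lr=0.01, batch_size=2):
    # Mini-batch: one update per small batch of observations
    for i in range(0, len(X), batch_size):
        xb, yb = X[i:i + batch_size], y[i:i + batch_size]
        err = yb - (B0 + B1 * xb)
        B0 += lr * 2 * np.mean(err)
        B1 += lr * 2 * np.mean(err * xb)
    return B0, B1
```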
Gradient Descent
To summarize, the steps are:
1. Initialize θ (the parameters)
2. Compute the cost function / loss function
3. Tweak θ
4. Repeat steps 2 and 3 until you reach convergence.
Preparing Data For Linear Regression
• Linear Assumption: Linear regression assumes that the relationship between
your input and output is linear. It does not support anything else. This may
be obvious, but it is good to remember when you have a lot of attributes. You
may need to transform data to make the relationship linear (e.g. a log
transform for an exponential relationship).
• Remove Noise: Linear regression assumes that your input and output
variables are not noisy. Consider using data cleaning operations that let you
better expose and clarify the signal in your data. This is most important for
the output variable, and you want to remove outliers in the output variable (y)
if possible.
• Remove Collinearity: Linear regression will over-fit your data when you have
highly correlated input variables. Consider calculating pairwise correlations
for your input data and removing the most correlated inputs.
Preparing Data For Linear Regression
• Gaussian Distributions: Linear regression will make more
reliable predictions if your input and output variables have a
Gaussian distribution. You may get some benefit from using
transforms (e.g. log or Box-Cox) on your variables to make their
distribution more Gaussian looking.
• Rescale Inputs: Linear regression will often make more reliable
predictions if you rescale input variables using standardization
or normalization.
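A sketch of these preparation steps with pandas and numpy (the column names and values are invented for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "income": [30e3, 45e3, 60e3, 1.2e6],   # heavily skewed input
    "age":    [25, 32, 41, 38],
    "price":  [100e3, 150e3, 210e3, 900e3],
})

# Linear assumption / more Gaussian-looking input: log-transform the skewed variable
df["log_income"] = np.log(df["income"])

# Remove collinearity: inspect pairwise correlations between inputs
print(df[["log_income", "age"]].corr())

# Rescale inputs: standardization (zero mean, unit variance)
df["age_std"] = (df["age"] - df["age"].mean()) / df["age"].std()
```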
Key benefits of linear regression
• Easy implementation
The linear regression model is computationally simple to
implement, as it does not demand much engineering overhead,
either before the model launch or during its maintenance.
• Interpretability
Unlike deep learning models such as neural networks, linear
regression is relatively straightforward to interpret. As a result,
this algorithm stands ahead of black-box models, which fall short
in justifying which input variable causes the output variable to change.
Key benefits of linear regression
• Scalability
Linear regression is not computationally heavy and therefore fits well in
cases where scaling is essential. For example, the model can scale well
with increased data volume (big data).
• Optimal for online settings
The ease of computation of this algorithm allows it to be used in
online settings. The model can be trained and retrained with each new
example to generate predictions in real time, unlike neural networks or
support vector machines, which are computationally heavy and require plenty
of computing resources and substantial waiting time to retrain on a new
dataset. All these factors make such compute-intensive models expensive
and unsuitable for real-time applications.
Multiple Linear Regression
• Multiple Linear Regression (MLR) means that we have several input
features, such as f1, f2, f3, f4, and an output feature f5. Taking
the same house-price example as discussed above, suppose:
• f1 is the size of the house,
• f2 is the number of bedrooms in the house,
• f3 is the locality of the house,
• f4 is the condition of the house, and
• f5 is our output feature, which is the price of the house.

• y = B0 + B1·x1 + B2·x2 + B3·x3 + B4·x4
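A minimal sketch of fitting this MLR model with scikit-learn (the feature values are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Columns: size (f1), bedrooms (f2), locality score (f3), condition score (f4)
X = np.array([
    [1400, 3, 7, 8],
    [1600, 3, 8, 7],
    [1700, 4, 6, 9],
    [1875, 4, 9, 8],
    [1100, 2, 5, 6],
])
y = np.array([245e3, 312e3, 279e3, 308e3, 189e3])  # price (f5)

model = LinearRegression().fit(X, y)
print("intercept B0:", model.intercept_)
print("coefficients B1..B4:", model.coef_)
```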
• Jobs we lose due to ML: https://www.youtube.com/watch?v=gWmRkYsLzB4&list=PLobzMSC-raKifQd9vHHPkMam_jrQEyzCX&index=7
• How AI can enhance our memory, work and social lives: https://youtu.be/DJMhz7JlPvA
