
LINEAR REGRESSION

WHAT IS LINEAR REGRESSION?
Linear regression is one of the easiest and most popular Machine Learning algorithms. Linear regression makes predictions for continuous/real or numeric variables.
Uses of regression analysis include:
1. Predicting share price
2. Analyzing the impact of price changes
LINEAR REGRESSION
Linear regression shows the linear relationship, which means it finds how the value of the dependent variable (y) changes according to the value of the independent variable (x).
The linear regression model provides a sloped straight line representing the relationship between the variables.
LINEAR REGRESSION
Mathematically, we can represent a linear regression as:
y = a0 + a1.x + ε
Here,
y = Dependent variable (target variable)
x = Independent variable (predictor variable)
a0 = Intercept of the line (gives an additional degree of freedom)
a1 = Linear regression coefficient (scale factor applied to each input value)
ε = Random error
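As a quick illustration, a minimal NumPy sketch of estimating a0 and a1 by ordinary least squares; the x and y arrays here are made-up example data, not values from the slides:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # independent variable (made-up data)
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])   # dependent variable (made-up data)

# a1 = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2), a0 = y_mean - a1*x_mean
x_mean, y_mean = x.mean(), y.mean()
a1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
a0 = y_mean - a1 * x_mean
print(f"y = {a0:.3f} + {a1:.3f}*x")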
TYPES OF LINEAR REGRESSION
Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable, then such a Linear Regression algorithm is called Simple Linear Regression.
Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical dependent variable, then such a Linear Regression algorithm is called Multiple Linear Regression.
LINEAR REGRESSION LINE
Positive Linear Relationship: If the dependent variable increases on the Y-axis as the independent variable increases on the X-axis, the relationship is termed a positive linear relationship.
Negative Linear Relationship: If the dependent variable decreases on the Y-axis as the independent variable increases on the X-axis, the relationship is called a negative linear relationship.
COST FUNCTION
Goal: find the best-fit line, meaning the error between the predicted values and the actual values should be minimized. The best-fit line will have the least error.
Different values for the weights or line coefficients (a0, a1) give different regression lines, so we need to calculate the best values for a0 and a1 to find the best-fit line; to calculate this we use a cost function.
For Linear Regression, we use the Mean Squared Error (MSE) cost function, which is the average of the squared errors between the predicted and actual values. MSE can be calculated as:
MSE = (1/N) Σ (yi - (a1.xi + a0))²
Where,
N = Total number of observations
yi = Actual value
(a1.xi + a0) = Predicted value
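As a small sketch, the same cost written as a Python function, following the slide's a0/a1 notation:

import numpy as np

def mse(a0, a1, x, y):
    # Mean squared error between actual values y and predictions a1*x + a0
    residuals = y - (a1 * x + a0)
    return float(np.mean(residuals ** 2))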
GRADIENT DESCENT
Gradient descent is a method of updating a0 and a1 to reduce the cost function (MSE). The idea is that we start with some initial values for a0 and a1 and then change these values iteratively to reduce the cost.
In the gradient descent algorithm, the size of the steps you take is the learning rate. This decides how fast the algorithm converges to the minima.
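A minimal sketch of the update loop for this idea; the learning rate alpha, the starting values, and the iteration count are illustrative choices, not values from the slides:

import numpy as np

def gradient_descent(x, y, alpha=0.01, iterations=1000):
    a0, a1 = 0.0, 0.0                          # start from some initial values
    n = len(x)
    for _ in range(iterations):
        y_pred = a1 * x + a0
        grad_a0 = (-2.0 / n) * np.sum(y - y_pred)        # dMSE/da0
        grad_a1 = (-2.0 / n) * np.sum((y - y_pred) * x)  # dMSE/da1
        a0 -= alpha * grad_a0                  # step size controlled by alpha
        a1 -= alpha * grad_a1
    return a0, a1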
GOODNESS OF FIT
The goodness of fit determines how well the line of regression fits the set of observations.
R-squared method:
R-squared is a statistical measure that determines the goodness of fit.
It measures the strength of the relationship between the dependent and independent variables on a scale of 0-100%.
A high value of R-squared indicates a small difference between the predicted and actual values, and hence represents a good model.
It is also called the coefficient of determination, or the coefficient of multiple determination for multiple regression.
Formula:
R² = Σ(yp − ȳ)² / Σ(y − ȳ)²
where yp is the predicted value, y is the actual value, and ȳ is the mean of the actual values.
EXAMPLE: MEAN SQUARE ERROR
FIND THE BEST FIT LINE
GOODNESS OF FIT - R²
R² = Σ(yp − ȳ)² / Σ(y − ȳ)² = 1.6 / 5.2 ≈ 0.3
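A short sketch of this computation in code, using the explained-variance form of R² shown above (y_pred are the predicted values, and the mean is taken over the actual values):

import numpy as np

def r_squared(y, y_pred):
    # R² = Σ(y_pred - y_mean)² / Σ(y - y_mean)², the explained-variance form used above
    y_mean = np.mean(y)
    return float(np.sum((y_pred - y_mean) ** 2) / np.sum((y - y_mean) ** 2))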
MULTIVARIATE LINEAR REGRESSION
More than one independent variable is used to predict the value of a numerical dependent variable.
MULTIVARIATE LINEAR REGRESSION
Y = f(x, z)
y = m1.x + m2.z + c
y is the dependent variable, i.e. the variable that needs to be estimated and predicted.
x is the first independent variable, i.e. a controllable variable. It is the first input.
m1 is the slope of x. It determines the angle of the line with respect to x.
z is the second independent variable, i.e. a controllable variable. It is the second input.
m2 is the slope of z. It determines the angle of the line with respect to z.
c is the intercept: a constant that determines the value of y when x and z are 0.
MULTIVARIATE LINEAR REGRESSION
A model with two input variables can be expressed as:
y = β0 + β1.x1 + β2.x2
In the machine learning world, there can be many dimensions. A model with three input variables can be expressed as:
y = β0 + β1.x1 + β2.x2 + β3.x3
A generalized equation for the multivariate regression model is:
y = β0 + β1.x1 + β2.x2 + ... + βn.xn
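A minimal sketch of fitting such a model with NumPy's least-squares solver; the feature matrix X and targets y here are made-up illustrative values:

import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])                    # two input variables x1, x2 (made up)
y = np.array([6.0, 5.0, 13.0, 12.0])

X_design = np.column_stack([np.ones(len(X)), X])   # prepend a column of 1s for β0
betas, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print("β0, β1, β2 =", betas)                  # coefficients of y = β0 + β1.x1 + β2.x2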
MULTIVARIATE LINEAR REGRESSION
When we have multiple features and we want to train a model that can predict the price given those features, we can use multivariate linear regression. The model has to learn the parameters (theta 0 to theta n) on the training dataset, such that if we want to predict the price of a house that has not been sold yet, it can give a prediction close to what the house will actually sell for.
COST FUNCTION AND GRADIENT DESCENT FOR MULTIVARIATE LINEAR REGRESSION
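A vectorized sketch of how this can be implemented, assuming the usual MSE cost over the parameters θ0 to θn mentioned earlier; alpha and the iteration count are illustrative choices:

import numpy as np

def multivariate_gradient_descent(X, y, alpha=0.01, iterations=1000):
    X = np.column_stack([np.ones(len(X)), X])  # bias column of 1s for θ0
    theta = np.zeros(X.shape[1])               # θ0 ... θn, initialised to zero
    m = len(y)
    for _ in range(iterations):
        error = X @ theta - y                  # prediction error for all rows
        theta -= alpha * (2.0 / m) * (X.T @ error)  # simultaneous update of all θj
    return theta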
PRACTICAL IDEAS FOR MAKING GRADIENT DESCENT WORK WELL
Use feature scaling to help gradient descent converge faster: get every feature into roughly the -1 to +1 range. It doesn't have to be exactly within -1 to +1, but it should be close to that range.
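A minimal standardization sketch, one common way to do this scaling (the slides do not prescribe a specific method):

import numpy as np

def standardize(X):
    # Rescale each feature to zero mean and unit variance so all features
    # fall in a comparable range before running gradient descent.
    mean, std = X.mean(axis=0), X.std(axis=0)
    return (X - mean) / std, mean, std         # keep mean/std to scale new inputs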
MODEL INTERPRETATION
y = -85090 + 102.85*x1 + 43.79*x2 + 1.52*x3 - 37.91*x4 + 908.12*x5 + 364.33*x6
x1: With all other predictors held constant, if x1 is increased by one unit, the average price increases by $102.85.
x2: With all other predictors held constant, if x2 is increased by one unit, the average price increases by $43.79.
x3: With all other predictors held constant, if x3 is increased by one unit, the average price increases by $1.52.
x4: With all other predictors held constant, if x4 is increased by one unit, the average price decreases by $37.91 (length has a negative coefficient).
x5: With all other predictors held constant, if x5 is increased by one unit, the average price increases by $908.12.
x6: With all other predictors held constant, if x6 is increased by one unit, the average price increases by $364.33.
REGULARIZATION
PROBLEM OF OVERFITTING
Regularisation is a technique used to reduce errors by fitting the function appropriately on the given training set and avoiding overfitting.
REGULARISATION IN ML
L1 regularisation
L2 regularisation
A regression model which uses the L1 regularisation technique is called LASSO (Least Absolute Shrinkage and Selection Operator) regression.
A regression model that uses the L2 regularisation technique is called Ridge regression.
REGULARISATION
Lasso regression adds the "absolute value of magnitude" of the coefficients as a penalty term to the loss function (L).
Ridge regression adds the "squared magnitude" of the coefficients as a penalty term to the loss function (L).
During regularisation the output function (y_hat) does not change; the change is only in the loss function.
LOSS FUNCTION
The loss function after regularisation:
Lasso (L1): L = Σ(yi - ŷi)² + λ Σ|wj|
Ridge (L2): L = Σ(yi - ŷi)² + λ Σwj²
lambda (λ) is a hyperparameter known as the regularisation constant, and it is greater than zero.
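For reference, a short scikit-learn sketch of both penalties on synthetic data; note that scikit-learn names the regularisation constant alpha rather than lambda:

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.random((100, 5))                       # synthetic feature matrix
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 1.0]) + rng.normal(0, 0.1, 100)

lasso = Lasso(alpha=0.1).fit(X, y)             # L1 penalty: λ·Σ|wj|
ridge = Ridge(alpha=0.1).fit(X, y)             # L2 penalty: λ·Σwj²
print("Lasso coefficients:", lasso.coef_)      # L1 tends to zero out some weights
print("Ridge coefficients:", ridge.coef_)      # L2 shrinks weights without zeroing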
PROBLEM
A clinical trial gave the following data for BMI and cholesterol level for 10 patients. Predict the likely value of the cholesterol level for someone who has a BMI of 27.

BMI (x)    Cholesterol (y)
17         140
21         189
24         210
28         240
14         130
16         100
19         135
22         166
15         130
18         170
SOLUTION:

x     y     x - x̄   y - ȳ   (x - x̄)(y - ȳ)   (x - x̄)²
17    140   -2.4    -21     50.4             5.76
21    189    1.6     28     44.8             2.56
24    210    4.6     49     225.4            21.16
28    240    8.6     79     679.4            73.96
14    130   -5.4    -31     167.4            29.16
16    100   -3.4    -61     207.4            11.56
19    135   -0.4    -26     10.4             0.16
22    166    2.6      5     13.0             6.76
15    130   -4.4    -31     136.4            19.36
18    170   -1.4      9     -12.6            1.96

x̄ = 19.4, ȳ = 161, Σ(x - x̄)(y - ȳ) = 1522, Σ(x - x̄)² = 172.4

The regression line equation is y = a0 + a1.x, where
a1 = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² = 1522 / 172.4 ≈ 8.83
a0 = ȳ - a1.x̄ = 161 - 8.83 × 19.4 ≈ -10.3
Hence the regression line is y = -10.3 + 8.83x.
Hence someone having a BMI of 27 would likely have a cholesterol level of
y = -10.3 + 8.83 × 27 ≈ 228
Cholesterol ≈ 228
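As a cross-check, the same calculation in a short NumPy sketch reproduces the slope, intercept, and prediction:

import numpy as np

bmi = np.array([17, 21, 24, 28, 14, 16, 19, 22, 15, 18], dtype=float)
chol = np.array([140, 189, 210, 240, 130, 100, 135, 166, 130, 170], dtype=float)

a1 = np.sum((bmi - bmi.mean()) * (chol - chol.mean())) / np.sum((bmi - bmi.mean()) ** 2)
a0 = chol.mean() - a1 * bmi.mean()
print(f"a1 = {a1:.2f}, a0 = {a0:.2f}")              # a1 ≈ 8.83, a0 ≈ -10.27
print(f"Prediction at BMI 27: {a0 + a1 * 27:.0f}")  # ≈ 228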
