LINEAR REGRESSION
• In simple linear regression, we predict scores on
one variable from the scores on a second
variable.
• The variable we are predicting is called
the criterion variable and is referred to as Y. The
variable we are basing our predictions on is called
the predictor variable and is referred to as X.
• When there is only one predictor variable, the
prediction method is called simple regression.
LINEAR REGRESSION
• Linear regression consists of finding the best-fitting
straight line through the points. The best-fitting line
is called a regression line.
• The regression line consists of the predicted score on Y
for each possible value of X.
• The vertical lines from the points to the regression
line represent the errors of prediction.
• The error of prediction for a point is the value of
the point minus the predicted value (the value on
the line).
LINEAR REGRESSION
• What is meant by "best-fitting line"?
• By far, the most commonly-used criterion for
the best-fitting line is the line that minimizes
the sum of the squared errors of prediction.
Statistics for Computing the Regression Line
• Y' = bX + a (regression line)
• Slope: b = r * (Sy / Sx)
• Intercept: a = My - b * Mx
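As a sketch, these statistics can be computed directly from data. The dataset below is a small illustrative example (not taken from the slides themselves):

```python
import statistics

# Hypothetical example data: X = predictor variable, Y = criterion variable.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [1.0, 2.0, 1.3, 3.75, 2.25]

mx, my = statistics.mean(X), statistics.mean(Y)   # Mx, My
sx, sy = statistics.stdev(X), statistics.stdev(Y) # Sx, Sy

# Pearson correlation r, computed from deviation scores.
r = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / ((len(X) - 1) * sx * sy)

b = r * sy / sx   # slope:     b = r * (Sy / Sx)
a = my - b * mx   # intercept: a = My - b * Mx

def predict(x):
    """Predicted score Y' = bX + a."""
    return b * x + a
```

For these data the fitted line is Y' = 0.425X + 0.785, so `predict(3.0)` returns the mean of Y (the regression line always passes through the point (Mx, My)).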
Sum Of Squares
• One useful aspect of regression is that it can
divide the variation in Y into two parts:
• the variation of the predicted scores and the
variation of the errors of prediction.
• The variation of Y is called the sum of squares
Y and is defined as the sum of the squared
deviations of Y from the mean of Y.
Sum Of Squares
• It is sometimes convenient to use formulas
that use deviation scores rather than raw
scores. Deviation scores are simply deviations
from the mean.
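The formula that presumably accompanied this slide is not in the text. In the standard deviation-score notation (a reconstruction, with lower-case y denoting a deviation score), the sum of squares Y is:

```latex
SSY \;=\; \sum (Y - M_Y)^2 \;=\; \sum y^2, \qquad \text{where } y = Y - M_Y .
```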
Sum Of Squares
• SSY can be partitioned into two parts: the sum of squares
predicted (SSY') and the sum of squares error (SSE).
• SSY = SSY' + SSE
4.597 = 1.806 + 2.791
• SSY is the total variation, SSY' is the variation
explained, and SSE is the variation unexplained.
Therefore, the proportion of variation explained can be
computed as:
• Proportion explained = SSY'/SSY
• Similarly, the proportion not explained is:
• Proportion not explained = SSE/SSY
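The partition can be verified numerically. The small dataset below is chosen so that its sums of squares match the numbers quoted above (4.597 = 1.806 + 2.791):

```python
# Verify the partition SSY = SSY' + SSE for a simple regression.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [1.0, 2.0, 1.3, 3.75, 2.25]
n = len(X)

mx = sum(X) / n
my = sum(Y) / n

sxy = sum((x - mx) * (y - my) for x, y in zip(X, Y))
sxx = sum((x - mx) ** 2 for x in X)
b = sxy / sxx        # slope (equivalent to r * Sy / Sx)
a = my - b * mx      # intercept

Y_pred = [b * x + a for x in X]

ssy  = sum((y - my) ** 2 for y in Y)                    # total variation
ssyp = sum((yp - my) ** 2 for yp in Y_pred)             # variation of predicted scores
sse  = sum((y - yp) ** 2 for y, yp in zip(Y, Y_pred))   # errors of prediction

print(round(ssy, 3), round(ssyp, 3), round(sse, 3))     # 4.597 1.806 2.791
assert abs(ssy - (ssyp + sse)) < 1e-9                   # SSY = SSY' + SSE
print(round(ssyp / ssy, 3))                             # proportion explained
```

Note that the proportion explained, SSY'/SSY, equals r squared, the square of the correlation between X and Y.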
Standard Error of the Estimate
• The standard error of the estimate is a
measure of the accuracy of predictions.
• The standard error of the estimate is closely
related to the sum of squares error (SSE) and is
defined below:
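The defining formula itself appears to have been an image that did not survive extraction. The usual definition (a reconstruction; N is the number of pairs of scores) is:

```latex
\sigma_{est} \;=\; \sqrt{\frac{\sum (Y - Y')^2}{N}} \;=\; \sqrt{\frac{SSE}{N}} .
```

Some texts divide by N - 2 rather than N, to account for the two parameters (slope and intercept) estimated from the same data.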
What is A Simple Linear Regression &
Multiple Linear Regression?
• In simple linear regression, just one independent variable X is
used to predict the value of the criterion variable Y. In
multiple linear regression, more than one independent
variable is used to predict Y.
• Of course, in both cases, there is just one variable Y. The only
difference is in the number of independent variables.
• For example, if we predict the rent of an apartment based on
just the square footage, it is a simple linear regression. On the
other hand, if we predict rent based on a number of factors:
square footage, the location of the property, and the age of the
building, then it becomes an example of multiple linear
regression.
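A minimal sketch of both cases, using hypothetical rent data (the numbers below are illustrative only, not from the slides):

```python
import numpy as np

# Hypothetical rent data: square footage, distance to centre (km), age (years).
sqft = np.array([500., 750., 900., 1100., 1300.])
dist = np.array([10., 8., 6., 4., 2.])
age  = np.array([30., 25., 15., 10., 5.])
rent = np.array([800., 1100., 1400., 1750., 2100.])  # criterion variable Y

# Simple regression: a single predictor (square footage) plus an intercept.
X_simple = np.column_stack([np.ones_like(sqft), sqft])
coef_s, *_ = np.linalg.lstsq(X_simple, rent, rcond=None)

# Multiple regression: three predictors, still just one criterion variable Y.
X_multi = np.column_stack([np.ones_like(sqft), sqft, dist, age])
coef_m, *_ = np.linalg.lstsq(X_multi, rent, rcond=None)

print("simple fit (intercept, slope):", coef_s)
print("multiple fit (intercept + 3 slopes):", coef_m)
```

The only structural difference between the two fits is the number of columns in the design matrix; the criterion Y is the same in both.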
Linear Basis Function Models
• The simplest linear model for regression is
one that involves a linear combination of the
input variables.
y(x,w) = w0 + w1x1 + . . . + wDxD
• This is often simply known as linear
regression. The key property of this model is
that it is a linear function of the parameters
w0, . . . , wD.
Linear Basis Function Models
• It is also, however, a linear function of the
input variables xi, and this imposes significant
limitations on the model.
• We therefore extend the class of models by
considering linear combinations of fixed
nonlinear functions of the input variables, of
the form
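The formula referred to here appears to be missing (it was likely an image). The standard form of a linear basis function model, reconstructed to be consistent with the M parameters and bias w0 described on the next slide, is:

```latex
y(\mathbf{x}, \mathbf{w}) \;=\; w_0 + \sum_{j=1}^{M-1} w_j \,\phi_j(\mathbf{x}) .
```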
Linear Basis Function Models
• φj(x) are the basis functions; the total number of
parameters in this model is M.
• w0 is the bias parameter.
• We will apply some form of fixed pre-
processing, or feature extraction, to the
original data variables. If the original variables
comprise the vector x, then the features can
be expressed in terms of the basis functions
{φj(x)}.
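As a sketch, here is a linear basis function model with polynomial basis functions φj(x) = x^j, a common choice (the slides leave φ unspecified):

```python
import numpy as np

def design_matrix(x, M):
    """N x M design matrix with columns phi_0(x)=1, phi_1(x)=x, ..., phi_{M-1}(x)."""
    return np.column_stack([x ** j for j in range(M)])

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)  # noisy targets

Phi = design_matrix(x, M=4)                  # M parameters, including the bias w0
w, *_ = np.linalg.lstsq(Phi, t, rcond=None)  # least-squares weights

y = Phi @ w  # model output: y(x, w) = sum_j w_j * phi_j(x)
```

Although the basis functions are nonlinear in the input x, the model is still linear in the parameters w, so the least-squares fit remains a linear algebra problem.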
Example
• Let's take the example of housing prices.
Suppose you collect the sizes of different
houses in your locality and their respective
prices. The hypothesis can then be represented,
in the earlier notation, as the line Y' = bX + a,
where X is the house size and Y' is the
predicted price.
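A minimal sketch of this example, using hypothetical sizes and prices (illustrative numbers only):

```python
import numpy as np

# Hypothetical house sizes (sq. ft.) and prices collected in a locality.
size  = np.array([1000., 1500., 2000., 2500., 3000.])
price = np.array([200_000., 290_000., 410_000., 500_000., 590_000.])

# Fit the hypothesis Y' = bX + a by least squares.
# np.polyfit returns coefficients highest degree first: [slope, intercept].
b, a = np.polyfit(size, price, deg=1)

def predicted_price(sqft):
    """Predicted price for a house of the given size."""
    return b * sqft + a

print(predicted_price(1800.))  # prediction for an 1800 sq. ft. house
```

Once fitted, the hypothesis predicts a price for any house size, including sizes not present in the collected data.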