
SST

Regression is used to explain variation in a dependent variable based on an independent variable. R² measures how well the regression line explains this variation. It is calculated by comparing the total variation from the mean (SST) to the variation explained by the regression line (SSR) and the residual variation not explained (SSE). Specifically, R² is derived from the equation SST = SSR + SSE, where SST is the total sum of squares, SSR is the regression sum of squares, and SSE is the error sum of squares.

Regression

Explaining Variation
I. Explaining Variation: R²
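A. Breaking Down the Distances
[Figure: scatter plot of money spent on health care (Y, axis from 4 to 8) against income (X, axis from 10 to 70), with the fitted line $\hat{Y} = a + bX$ and the mean $\bar{Y}$ marked on the Y axis.]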
• How well does the fitted line explain the variation in the dependent variable, money spent?
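• Total Variation
[Figure: scatter plot of $ spent on health care (Y) against income (X), with the regression line $\hat{Y} = a + bX$ and the mean $\bar{Y}$ (marked at 5.9 on the Y axis). For a point (x, y) the figure labels three distances:]
$Y - \hat{Y}$ = deviation unexplained by regression
$\hat{Y} - \bar{Y}$ = deviation explained by regression
$Y - \bar{Y}$ = total deviation around $\bar{Y}$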
• Total Deviation
$$Y - \bar{Y} = (\hat{Y} - \bar{Y}) + (Y - \hat{Y})$$
Total Deviation = Explained Deviation + Unexplained Deviation

• The total distance from any point to $\bar{Y}$ is the sum of the distance from $Y$ to the regression line plus the distance from the regression line to $\bar{Y}$.
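The breakdown above can be checked point by point. The following is a minimal sketch, assuming made-up income and spending values (the data behind the slide's figure are not recoverable) and using NumPy's `polyfit` as a stand-in for whatever fitting routine the course uses.

```python
import numpy as np

# Hypothetical data, invented for illustration (not taken from the slides).
income = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0])
spent = np.array([4.0, 5.0, 5.0, 6.0, 7.0, 6.0, 8.0])

# Least-squares fit of Y = a + bX; polyfit returns the slope first.
b, a = np.polyfit(income, spent, deg=1)
y_hat = a + b * income      # points on the regression line
y_bar = spent.mean()        # mean of the observed Y's

total = spent - y_bar        # Y - Ybar
explained = y_hat - y_bar    # Yhat - Ybar
unexplained = spent - y_hat  # Y - Yhat

# Total deviation = explained deviation + unexplained deviation, per point.
decomposition_holds = np.allclose(total, explained + unexplained)
print(decomposition_holds)
```

Note that this point-by-point identity is pure algebra: it would hold for any line, not only the least squares line.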
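B. Sums of Squares
• We can square both sides of this equation and sum across all the Y's to get:
$$\begin{aligned}
\sum (Y - \bar{Y})^2 &= \sum \left[(\hat{Y} - \bar{Y}) + (Y - \hat{Y})\right]^2 \\
&= \sum (\hat{Y} - \bar{Y})^2 + 2\sum (\hat{Y} - \bar{Y})(Y - \hat{Y}) + \sum (Y - \hat{Y})^2 \\
&= \sum (\hat{Y} - \bar{Y})^2 + \sum (Y - \hat{Y})^2,
\end{aligned}$$
where the cross-product term drops out because it equals zero for a least squares line fitted with an intercept.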
1. Total Sum of Squares (SST)
• The term on the left-hand side of this equation is the sum of the squared distances from all points to $\bar{Y}$. We call this the total variation in the Y's, or the Total Sum of Squares (SST).
2. Regression Sum of Squares (SSR)
• The first term on the right-hand side is the sum of the squared distances from the regression line to $\bar{Y}$. We call it the Regression Sum of Squares, or SSR.
3. Error Sum of Squares (SSE)
• Finally, the last term is the sum of the squared distances from the points to the regression line. Remember, this is the quantity that least squares minimizes. We call it the Error Sum of Squares, or SSE.
• We can rewrite the previous equation as:
SST = SSR + SSE.
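The three sums of squares can be computed directly from their definitions. The sketch below assumes made-up data and a hypothetical helper named `sums_of_squares`. It also forms R² as SSR / SST, the usual definition that follows from this identity, which is an addition here since the slides above stop at the identity itself.

```python
import numpy as np

def sums_of_squares(x, y):
    """Fit Y = a + bX by least squares and return (SST, SSR, SSE)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b, a = np.polyfit(x, y, deg=1)       # slope first, then intercept
    y_hat = a + b * x
    y_bar = y.mean()
    sst = np.sum((y - y_bar) ** 2)       # total variation around the mean
    ssr = np.sum((y_hat - y_bar) ** 2)   # variation explained by the line
    sse = np.sum((y - y_hat) ** 2)       # variation left unexplained
    return sst, ssr, sse

# Hypothetical data, invented for illustration (not taken from the slides).
income = [10, 20, 30, 40, 50, 60, 70]
spent = [4, 5, 5, 6, 7, 6, 8]

sst, ssr, sse = sums_of_squares(income, spent)
r_squared = ssr / sst  # share of total variation explained by the regression

print(f"SST={sst:.3f}  SSR={ssr:.3f}  SSE={sse:.3f}  R2={r_squared:.3f}")
```

With one independent variable, this R² should also equal the squared correlation between X and Y, which gives a quick cross-check on the arithmetic.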
