Lecture Note 2: Simple Linear Regression Model
Changsu Ko
UCLA
Econ 103
The conditional mean E[y | x] is not the same as the sample mean, which is simply the
arithmetic average of numerical values
E[y | x] = µ_{y|x} = β1 + β2 x
β2 = ∆E[y | x]/∆x = dE[y | x]/dx

where ∆ denotes “change in” and dE[y | x]/dx denotes the
derivative of the expected value of y with respect to x
e = y − E[y | x] = y − β1 − β2 x
Rearranging gives
y = β1 + β2 x + e
E[e | x] = E[y | x] − (β1 + β2 x) = β1 + β2 x − β1 − β2 x = 0
That is, the mean value of the error term, given x, is zero
Assumption SR1
The value of y, for each value of x, is:
y = β1 + β2 x + e
Assumption SR2
The expected value of the random error e is:
E (e) = 0
Assumption SR3
The variance of the random error e is:

var(e) = σ² = var(y)
The random variables y and e have the same variance because they
differ only by a constant
Assumption SR4
The covariance between any pair of random errors, ei and ej, is:

cov(ei, ej) = cov(yi, yj) = 0
The stronger version of this assumption is that the random errors e’s
are statistically independent, in which case the values of the
dependent variable y are also statistically independent
Assumption SR5
The variable x is not random, and must take at least two different
values
ŷi = b1 + b2 xi
The least squares line has the smaller sum of squared residuals: if SSE = Σ_{i=1}^{N} êi² is the
sum of squared residuals from the least squares line and SSE* = Σ_{i=1}^{N} êi*² is the sum of
squared residuals from any other line, then SSE ≤ SSE*
We have the following two first order conditions (try to get them by
yourself!)

Σ_{i=1}^{N} (yi − b1 − b2 xi) = Σ_{i=1}^{N} êi = 0

Σ_{i=1}^{N} (yi − b1 − b2 xi) xi = Σ_{i=1}^{N} êi xi = 0

Solving these two conditions gives the least squares estimators

b2 = Σ_{i=1}^{N} (xi − x̄)(yi − ȳ) / Σ_{i=1}^{N} (xi − x̄)²

b1 = ȳ − b2 x̄

where x̄ = (1/N) Σ_{i=1}^{N} xi and ȳ = (1/N) Σ_{i=1}^{N} yi
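The computation can be illustrated with a short Python sketch on simulated data (the true parameter values, sample size, and seed below are hypothetical choices, not part of the lecture):

# Minimal sketch: computing the least squares estimates b1 and b2
# from the formulas above, using simulated (hypothetical) data.
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2 = 1.0, 2.0            # hypothetical true parameters
x = rng.uniform(1, 10, size=50)    # x takes more than two values (SR5)
e = rng.normal(0, 1, size=50)      # random errors with E(e) = 0 (SR2)
y = beta1 + beta2 * x + e          # data generated according to SR1

x_bar, y_bar = x.mean(), y.mean()
b2 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b1 = y_bar - b2 * x_bar
print(b1, b2)                      # estimates should be near 1.0 and 2.0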
η_yx = (percent change in y) / (percent change in x) = (∆y/y) / (∆x/x) = (∆y/∆x) · (x/y)

η_yx = (∆E(y)/E(y)) / (∆x/x) = (∆E(y)/∆x) · (x/E(y)) = β2 · x/E(y)
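As a purely hypothetical numerical illustration: if β2 = 10, x = 20, and E(y) = 250, then η_yx = 10 × 20/250 = 0.8, so at that point a 1% increase in x is associated with roughly a 0.8% increase in E(y).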
The least squares estimators are random variables. So, what are their
expected values, variances, covariances, and probability distributions?
The least squares estimator b2 can be written as a weighted sum of the yi's,

b2 = Σ_{i=1}^{N} wi yi,   where   wi = (xi − x̄) / Σ_{j=1}^{N} (xj − x̄)²

or alternatively as

b2 = β2 + Σ_{i=1}^{N} wi ei
We can find the expected value of b2 using the fact that the expected
value of a sum is the sum of the expected values:

E(b2) = E[β2 + Σ_{i=1}^{N} wi ei]
      = E[β2 + w1 e1 + · · · + wN eN]
      = E[β2] + E[w1 e1] + · · · + E[wN eN]
      = E[β2] + Σ_{i=1}^{N} E[wi ei]
      = β2 + Σ_{i=1}^{N} wi E[ei]     (the wi are not random under SR5)
      = β2                            (since E[ei] = 0 under SR2)
Unbiasedness does not say that an estimate from any one sample is
close to the true parameter value
But, we can say that the least squares estimation procedure (or the
least squares estimator) is unbiased
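A minimal Monte Carlo sketch of this idea in Python (the true parameters, sample size, and number of replications are hypothetical):

# Repeat the sampling many times and average the b2 estimates;
# the average should be close to beta2, illustrating E(b2) = beta2.
import numpy as np

rng = np.random.default_rng(1)
beta1, beta2, N, R = 1.0, 2.0, 40, 5000
x = rng.uniform(1, 10, size=N)     # x held fixed across samples (SR5)

estimates = []
for _ in range(R):
    y = beta1 + beta2 * x + rng.normal(0, 1, size=N)
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    estimates.append(b2)

print(np.mean(estimates))          # close to 2.0; any one estimate may still miss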
Theorem
Under the assumptions SR1-SR5 of the linear regression model, the
estimators b1 and b2 have the smallest variance of all linear and unbiased
estimators of β1 and β2. That is, b1 and b2 are the Best Linear Unbiased
Estimators (BLUE) for β1 and β2, respectively.
The theorem does not say that b1 and b2 are the best of all possible
estimators
The estimators b1 and b2 are best within their class because they
have the minimum (i.e., smallest possible) variance
This explains why we are studying these estimators and why they are
so widely used in research, not only in economics but in all social and
physical sciences as well
The Probability Distribution of the least squares Estimator
b1 ∼ N(β1, σ² Σ_{i=1}^{N} xi² / (N Σ_{i=1}^{N} (xi − x̄)²))

b2 ∼ N(β2, σ² / Σ_{i=1}^{N} (xi − x̄)²)
Theorem
If assumptions SR1-SR5 hold, and if the sample size N is sufficiently large,
then the least squares estimators have a distribution that approximates the
normal distributions in the previous slide
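A Python sketch that compares the spread of b2 across repeated simulated samples with the variance formula σ²/Σ(xi − x̄)² from above (the setup is hypothetical):

# Simulate the sampling distribution of b2 and compare its variance
# with the theoretical variance sigma^2 / sum((x_i - x_bar)^2).
import numpy as np

rng = np.random.default_rng(6)
beta1, beta2, sigma, N, R = 1.0, 2.0, 1.0, 40, 10000
x = rng.uniform(1, 10, size=N)

b2_draws = []
for _ in range(R):
    y = beta1 + beta2 * x + rng.normal(0, sigma, size=N)
    b2_draws.append(np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2))

print(np.var(b2_draws))                           # simulated variance of b2
print(sigma ** 2 / np.sum((x - x.mean()) ** 2))   # theoretical variance of b2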
êi = yi − ŷi = yi − b1 − b2 xi

so that a natural estimator of the error variance is

σ̃² = (1/N) Σ_{i=1}^{N} êi²

Replacing N by N − 2 (the number of estimated regression parameters) gives the estimator

σ̂² = (1/(N − 2)) Σ_{i=1}^{N} êi²

so that:

E[σ̂²] = σ²
To obtain estimates for Var(b1), Var(b2), and Cov(b1, b2), just replace
the unknown error variance σ² by σ̂² to obtain:

V̂ar(b1) = σ̂² Σ_{i=1}^{N} xi² / (N Σ_{i=1}^{N} (xi − x̄)²)

V̂ar(b2) = σ̂² / Σ_{i=1}^{N} (xi − x̄)²

Ĉov(b1, b2) = σ̂² (−x̄) / Σ_{i=1}^{N} (xi − x̄)²
The square roots of the estimated variances are the “standard errors”
of b1 and b2 , respectively
se(b1) = √V̂ar(b1)

se(b2) = √V̂ar(b2)
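A Python sketch of these computations on simulated data (the data-generating values are hypothetical):

# Estimate sigma^2 and the standard errors of b1 and b2 from the formulas above.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=40)
y = 1.0 + 2.0 * x + rng.normal(0, 1, size=40)
N = len(y)

x_bar = x.mean()
b2 = np.sum((x - x_bar) * (y - y.mean())) / np.sum((x - x_bar) ** 2)
b1 = y.mean() - b2 * x_bar
e_hat = y - b1 - b2 * x

sigma2_hat = np.sum(e_hat ** 2) / (N - 2)            # unbiased estimator of sigma^2
Sxx = np.sum((x - x_bar) ** 2)
var_b1 = sigma2_hat * np.sum(x ** 2) / (N * Sxx)
var_b2 = sigma2_hat / Sxx
cov_b1_b2 = -sigma2_hat * x_bar / Sxx
se_b1, se_b2 = np.sqrt(var_b1), np.sqrt(var_b2)      # standard errors of b1 and b2
print(se_b1, se_b2, cov_b1_b2)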
For the food expenditure data, the estimated covariance matrix is:
           b1          b2
b1    1884.44    −85.9032
b2   −85.9032     4.38175
where

Var(b2) = σ² / Σ_{i=1}^{N} (xi − x̄)²
The larger Var(b2) is, the more variation in the least squares estimates b2
we see from sample to sample. If Var(b2) is large, then the estimates
might change a great deal from sample to sample
They can also be indicator variables that take only the values zero
and one
This means that the simple linear regression model can be used to
account for nonlinear relationships between variables
PRICE = β1 + β2 SQFT + e
In each case we will find that the slope of the relationship between
PRICE and SQFT is not constant, but changes from point to point
PRICE = α1 + α2 SQFT² + e

d(P̂RICE)/d(SQFT) = 2α̂2 SQFT

If α̂2 > 0, then larger houses will have a larger slope, and a larger
estimated price per additional square foot
For 1,080 houses sold in Baton Rouge, LA, during mid-2005, the
estimated quadratic equation is:
η̂ = ŝlope × (SQFT / PRICE) = (2α̂2 SQFT) × (SQFT / PRICE)
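A Python sketch of the quadratic specification on simulated data (the parameter values below are hypothetical choices, not the Baton Rouge estimates):

# Fit PRICE = a1 + a2*SQFT^2 + e by least squares, then evaluate the slope
# and elasticity at a chosen house size.
import numpy as np

rng = np.random.default_rng(3)
sqft = rng.uniform(1000, 4000, size=200)
price = 60000 + 0.015 * sqft ** 2 + rng.normal(0, 20000, size=200)

z = sqft ** 2                        # the regressor is SQFT squared
a2 = np.sum((z - z.mean()) * (price - price.mean())) / np.sum((z - z.mean()) ** 2)
a1 = price.mean() - a2 * z.mean()

s = 2000                             # evaluate at a 2,000 square foot house
slope = 2 * a2 * s                   # d(PRICE)/d(SQFT) = 2*a2*SQFT
price_at_s = a1 + a2 * s ** 2
elasticity = slope * s / price_at_s  # eta = slope * SQFT / PRICE
print(slope, elasticity)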
In a log-linear model with slope coefficient b, the slope of y with respect to x is

d(y)/d(x) = b · y

and the semi-elasticity (the percentage change in y per unit change in x) is

η̂* = 100 × (dŷ/y)/dx = 100 b
Consider again the model for the price of a house as a function of the
square footage, but now written in semi-log form:
ln(PRICE ) = γ1 + γ2 SQFT + e
Using the Baton Rouge data, the fitted log-linear model is:
d(P̂RICE)/d(SQFT) = γ̂2 · PRICE = 0.00041 × PRICE
For a house with 2,000 square feet, the estimated elasticity is 0.823.
That is, a 1% increase in house size is estimated to increase selling
price by 0.823%
For a house with 4,000 square feet, the estimated elasticity is 1.645
Using the “semi-elasticity”, we can say that, for a one-square-foot
increase in size, we estimate a price increase of 0.04%
Or, perhaps more usefully, we estimate that a 100-square-foot
increase will increase price by approximately 4%
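A Python sketch of the log-linear specification on simulated data (hypothetical parameters, not the Baton Rouge estimates):

# Fit ln(PRICE) = g1 + g2*SQFT + e by least squares, then compute the
# elasticity at 2,000 square feet and the semi-elasticity.
import numpy as np

rng = np.random.default_rng(4)
sqft = rng.uniform(1000, 4000, size=200)
ln_price = 10.8 + 0.0004 * sqft + rng.normal(0, 0.2, size=200)

g2 = np.sum((sqft - sqft.mean()) * (ln_price - ln_price.mean())) / np.sum((sqft - sqft.mean()) ** 2)
g1 = ln_price.mean() - g2 * sqft.mean()

elasticity_2000 = g2 * 2000          # eta = g2 * SQFT, evaluated at SQFT = 2000
semi_elasticity = 100 * g2           # percent change in PRICE per extra square foot
print(elasticity_2000, semi_elasticity)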
For example,

UTOWN = 1 if house is in University Town
UTOWN = 0 if house is in Golden Oaks
PRICE = β1 + β2 UTOWN + e
PRICE = b1 + b2 UTOWN
      = 215.733 + 61.51 × UTOWN
      = 277.242 if UTOWN = 1
      = 215.733 if UTOWN = 0
b1 = sample mean of PRICE in Golden Oaks
b2 = sample mean of PRICE in University Town − sample mean of PRICE in Golden Oaks
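A Python sketch illustrating this equivalence on simulated data (the group prices and sample size are hypothetical):

# With an indicator regressor, b1 equals the sample mean of PRICE in the base
# group and b2 equals the difference in group means.
import numpy as np

rng = np.random.default_rng(5)
utown = rng.integers(0, 2, size=300)                     # 1 = University Town, 0 = Golden Oaks
price = 215 + 61 * utown + rng.normal(0, 20, size=300)   # hypothetical prices (in $1,000s)

b2 = np.sum((utown - utown.mean()) * (price - price.mean())) / np.sum((utown - utown.mean()) ** 2)
b1 = price.mean() - b2 * utown.mean()

print(b1, price[utown == 0].mean())                      # b1 equals the Golden Oaks mean
print(b2, price[utown == 1].mean() - price[utown == 0].mean())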