MLR Note

Multiple Linear Regression
Estimation
• the model
  Y = β0 + β1X1 + ... + βKXK + u
  with β0: intercept and βk: parameter associated with Xk
• advantages of MLR
  ↳ explicitly hold fixed other factors that would otherwise end up in u
key assumption
• the key assumption for the general multiple regression model is:
  E(u | X1, X2, ..., XK) = 0 > zero conditional mean assumption
  ↳ all factors in the unobserved error term u are uncorrelated with the explanatory variables (see the derivation sketch below)
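A short derivation of why the zero conditional mean assumption rules out correlation between u and each regressor (law of iterated expectations; this standard step is not spelled out in the notes):

```latex
E(u \mid X_1,\dots,X_K) = 0
\;\Rightarrow\; E(u) = E\big[\,E(u \mid X_1,\dots,X_K)\,\big] = 0,
\qquad
E(X_j u) = E\big[\,X_j \, E(u \mid X_1,\dots,X_K)\,\big] = 0
\;\Rightarrow\; \mathrm{Cov}(X_j, u) = E(X_j u) - E(X_j)\,E(u) = 0 .
```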
Estimation
• sample regression function (SRF)
  ŷ = β̂0 + β̂1x1 + ... + β̂KxK
  ↳ OLS estimates are chosen to minimize the sum of squared residuals (see the sketch below):
  min over β̂0, β̂1, ..., β̂K of  Σᵢ (yi − β̂0 − β̂1xi1 − ... − β̂KxiK)²
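A minimal numerical sketch of this minimization, assuming numpy and invented data (the variable names and coefficient values are illustrative, not from the notes):

```python
import numpy as np

# illustrative data: n observations, K = 2 explanatory variables (assumed, not from the notes)
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 2.0 * x2 + u            # "true" model y = b0 + b1*x1 + b2*x2 + u

# design matrix with a column of ones for the intercept
X = np.column_stack([np.ones(n), x1, x2])

# OLS: the betas that minimize the sum of squared residuals
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # should be close to [1.0, 0.5, -2.0]
```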
Interpretation
• general case:
  ŷ = β̂0 + β̂1x1 + β̂2x2 + ... + β̂KxK
• β̂0 represents the average value of ŷ when x1 = 0, x2 = 0, ..., xK = 0 > all explanatory variables = 0
• β̂k for k > 0 have partial effect interpretations
  Δŷ = β̂1Δx1 + β̂2Δx2 + ... + β̂KΔxK
  ↳ if we hold x2, ..., xK fixed (Δx2 = ... = ΔxK = 0), then Δŷ = β̂1Δx1 > ceteris paribus interpretation
• example: wage equation
  wâge = β̂0 + β̂1educ + β̂2exper
  ↳ under this assumption, the estimator is quite powerful
  ↳ can we change more than one independent variable simultaneously? Yes, but then we would not be able to identify the effect that comes from a specific variable (see the sketch below)
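A hedged illustration of the partial-effect (ceteris paribus) reading, using simulated data; the variable names educ and exper follow the note's example, but all numbers are invented for the sketch:

```python
import numpy as np

# simulated wage data (all coefficient values are illustrative assumptions)
rng = np.random.default_rng(1)
n = 1000
educ = rng.integers(8, 20, size=n).astype(float)
exper = rng.integers(0, 30, size=n).astype(float)
wage = 2.0 + 0.6 * educ + 0.1 * exper + rng.normal(size=n)

X = np.column_stack([np.ones(n), educ, exper])
b0, b1, b2 = np.linalg.lstsq(X, wage, rcond=None)[0]

# partial effect: one more year of educ, holding exper fixed (ceteris paribus)
print("predicted wage change for +1 year of educ:", b1)
```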
Goodness of fit
• decomposition of total variation: SST = SSE + SSR
  ↳ SST = Σᵢ (yi − ȳ)²
  ↳ SSE = Σᵢ (ŷi − ȳ)²
  ↳ SSR = Σᵢ ûi², with ûi = yi − ŷi
  ↳ goodness of fit doesn't reflect on a causal estimate of the model
• R-squared
  R² = SSE/SST = 1 − SSR/SST
  ↳ a high R-squared does not mean there is a causal interpretation
• adjusted R-squared
  adjusted R² = 1 − [SSR/(n − k − 1)] / [SST/(n − 1)]
  ↳ adjusted R-squared imposes a penalty for adding new regressors
  ↳ it increases if and only if the t statistic of a newly added regressor is greater than one in absolute value
  ↳ it can fall when the newly added regressor has no explanatory power at all (see the sketch below)
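A small sketch that checks SST = SSE + SSR and computes R² and adjusted R² on simulated data (the data and dimensions are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3                                   # n observations, k regressors (illustrative)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, -1.0, 0.0]) + rng.normal(size=n)

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta_hat
u_hat = y - y_hat

sst = np.sum((y - y.mean()) ** 2)               # total variation
sse = np.sum((y_hat - y.mean()) ** 2)           # explained variation
ssr = np.sum(u_hat ** 2)                        # residual variation
print(np.isclose(sst, sse + ssr))               # True: SST = SSE + SSR

r2 = 1 - ssr / sst
adj_r2 = 1 - (ssr / (n - k - 1)) / (sst / (n - 1))   # penalizes extra regressors
print(r2, adj_r2)
```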
Gauss-Markov assumptions
• assumption MLR.1: linear in parameters
  Y = β0 + β1X1 + ... + βKXK + u
• assumption MLR.2: random sampling
• assumption MLR.3: no perfect collinearity
  • example
    ↳ non-linear relationship: log(inc²) equals 2·log(inc), a perfect linear function of log(inc) > VIOLATION (see the sketch after the assumptions list)
• assumption MLR.4: zero conditional mean, E(u | X1, ..., XK) = 0 (the key assumption above)
• assumption MLR.5: homoskedasticity
  Var(u | x1, ..., xK) = σ²
• example
  bwght = β̂0 + β̂1cigs, with a relevant variable left out of the model
  ↳ the omitted variable may be assumed correlated with cigs (Cov(x1, x2) ≠ 0) and β2 ≠ 0
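Returning to the no-perfect-collinearity example above: a brief sketch showing that including both log(inc) and log(inc²) makes the design matrix rank-deficient (income values are invented for illustration):

```python
import numpy as np

inc = np.array([20.0, 35.0, 50.0, 80.0, 120.0])   # illustrative income values
x1 = np.log(inc)
x2 = np.log(inc ** 2)                              # = 2 * log(inc), an exact linear function of x1

X = np.column_stack([np.ones_like(inc), x1, x2])
print(np.linalg.matrix_rank(X))   # 2, not 3: perfect collinearity, OLS cannot separate the effects
```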
• scenarios
  ↳ β̂1 is biased when a relevant variable is omitted (see the simulation sketch below)
• example
  regress the omitted variable on the included one: x2 = δ0 + δ1x1 + v
  ↳ estimating the short model gives β̃1 = β̂1 + β̂2δ̂1, so
    E(β̃1) = E(β̂1 + β̂2δ̂1) = E(β̂1) + E(β̂2δ̂1) = β1 + β2δ1
    > omitted variable bias, because we exclude x2 from the regression model
  ↳ the bias can be positive or negative > determined by the values of β2 and δ1
    > β2δ1 > 0 > overestimate; vice versa > underestimate
  ↳ β2: relationship between the variable that is omitted and the outcome variable (relationship between x2 and y) > effect of x2 on y
• Note: if there are more than 1 omitted variables > focus on the most important one
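A hedged simulation of the bias formula E(β̃1) = β1 + β2δ1; all parameter values below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 1000, 500
beta1, beta2, delta1 = 1.0, 2.0, 0.5   # true effects and x1-x2 link (illustrative)

short_estimates = []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = delta1 * x1 + rng.normal(size=n)          # x2 = delta0 + delta1*x1 + v (delta0 = 0 here)
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    X_short = np.column_stack([np.ones(n), x1])    # omit x2 from the regression
    short_estimates.append(np.linalg.lstsq(X_short, y, rcond=None)[0][1])

print(np.mean(short_estimates))   # ≈ beta1 + beta2*delta1 = 2.0, not 1.0: positive (upward) bias
```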
Sampling variance under homoskedasticity
• if assumptions 1-5 are fulfilled:
  Var(β̂j) = σ² / [SSTj · (1 − Rj²)]
  with σ²: variance of the error term
  ↳ Rj²: R-squared from a regression of explanatory variable Xj on all other independent variables (including a constant)
  ↳ SSTj = Σᵢ (xij − x̄j)²: total sample variation in Xj
• components of OLS variances
  • the error variance σ²
    ↳ a high error variance increases the sampling variance > due to more noise in the equation
    ↳ a large error variance necessarily makes estimates imprecise
    ↳ the error variance does not decrease with the sample size
  • the total sample variation in the explanatory variable: SSTj
    ↳ increasing the sample size increases SSTj > more precise estimates (smaller sampling variance)
  • the linear relationships among the independent variables: Rj²
    ↳ the higher Rj², the higher the sampling variance of β̂j > Xj can be (almost) linearly explained by the other independent variables (see the sketch below)
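A numerical sketch of the variance formula and its Rj² component: it computes Var(β̂1) = σ² / [SST1 · (1 − R1²)] on simulated data and shows how stronger collinearity between x1 and x2 inflates it (all numbers are illustrative assumptions, not from the notes):

```python
import numpy as np

def var_beta1(rho, n=500, sigma2=1.0, seed=6):
    """Var(beta_hat_1) = sigma^2 / (SST_1 * (1 - R_1^2)).
    x2 = rho*x1 + noise, so a larger rho means a larger R_1^2 (more collinearity)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = rho * x1 + rng.normal(scale=np.sqrt(1 - rho ** 2), size=n)

    # R_1^2: regress x1 on the other regressors (here just x2, plus a constant)
    Z = np.column_stack([np.ones(n), x2])
    x1_hat = Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
    r1_sq = 1 - np.sum((x1 - x1_hat) ** 2) / np.sum((x1 - x1.mean()) ** 2)

    sst1 = np.sum((x1 - x1.mean()) ** 2)          # total sample variation in x1
    return sigma2 / (sst1 * (1 - r1_sq))

print(var_beta1(rho=0.1))    # weak collinearity: small sampling variance
print(var_beta1(rho=0.95))   # strong collinearity: much larger sampling variance
```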
key takeaway
• fulfills MLR.1-4: unbiased estimates β̂j
• fulfills MLR.1-5: best linear unbiased estimator (BLUE, Gauss-Markov theorem)