Metrics Topic 6 Part 1: Multiple Regression
Review the causal analysis framework in topic 4
• 𝑋𝑖 : individual i’s received treatment (such as years of education).
• 𝑌𝑖 : individual i’s outcome measure (such as earnings).
• 𝑌𝑥𝑖 : potential outcome if individual i receives treatment 𝑥.
• Assume that the potential outcome is linear in 𝑥, where 𝑥 takes one of the values 𝑥1 , 𝑥2 , … , 𝑥𝑀 :
𝑌𝑥𝑖 = 𝛽0 + 𝛽1 𝑥 + 𝑒𝑖 , 𝐸 (𝑒𝑖 ) = 0.
• The causal effect of the treatment on the outcome is defined as 𝛽1 = Δ𝑌𝑥𝑖 /Δ𝑥:
one more year of education changes earnings by 𝛽1 units.
• Note that when 𝑥 = 𝑋𝑖 , 𝑌𝑥𝑖 = 𝑌𝑖 : 𝑌𝑖 = 𝛽0 + 𝛽1 𝑋𝑖 + 𝑒𝑖
• In general, 𝐸 (𝑒𝑖 |𝑋𝑖 ) ≠ 0.
➢ Think of 𝑒𝑖 as the innate ability of person i, which determines
person i’s potential outcomes.
➢ On average, those with more education tend to have higher ability:
𝐸 (𝑒𝑖 |𝑋𝑖 = 16) > 𝐸 (𝑒𝑖 |𝑋𝑖 = 9).
➢ The causal effect of raising education from 9 to 16 years is 7𝛽1 . But the measured difference in means is
𝐸 (𝑌𝑖 |𝑋𝑖 = 16) − 𝐸 (𝑌𝑖 |𝑋𝑖 = 9) = 7𝛽1 + 𝐸 (𝑒𝑖 |𝑋𝑖 = 16) − 𝐸 (𝑒𝑖 |𝑋𝑖 = 9) > 7𝛽1
Omitted variable bias, the theory
𝑌 = 𝛽0 + 𝛽1 𝑋 + 𝑢
• The error u includes all omitted variables, other than X, that
influence Y. (u might also reflect heterogeneous causal effects.)
• There are always omitted variables.
• If there are omitted variables that are correlated with X, then LSA#1
is violated and the OLS estimator converges in probability to the
causal parameter plus a bias. This bias is called the “omitted variable
bias.”
• Recall that 𝑐𝑜𝑣(𝑌, 𝑋) = 𝑐𝑜𝑣(𝛽0 + 𝛽1 𝑋 + 𝑢, 𝑋) = 𝛽1 𝑣𝑎𝑟(𝑋) + 𝑐𝑜𝑣(𝑢, 𝑋)
• Then
𝛽̂1 = 𝑠𝑌𝑋 /𝑠𝑋² →ᵖ 𝑐𝑜𝑣(𝑌, 𝑋)/𝑣𝑎𝑟(𝑋) = [𝛽1 𝑣𝑎𝑟(𝑋) + 𝑐𝑜𝑣(𝑢, 𝑋)]/𝑣𝑎𝑟(𝑋) = 𝛽1 + 𝑐𝑜𝑣(𝑢, 𝑋)/𝑣𝑎𝑟(𝑋)
• Equivalently,
𝛽̂1 − 𝛽1 →ᵖ 𝑐𝑜𝑣(𝑢, 𝑋)/𝑣𝑎𝑟(𝑋) ≡ “omitted variable bias”
• There is a downward bias if cov(u,X)<0, and upward bias if cov(u,X)>0.
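To make the bias formula concrete, here is a small simulation sketch in Python (not from the slides; the data-generating process and numbers are invented for illustration) checking that the OLS slope from the short regression is close to 𝛽1 + cov(u, X)/var(X) in a large sample.

import numpy as np

# Hypothetical DGP for illustrating omitted variable bias.
rng = np.random.default_rng(0)
n = 100_000
beta0, beta1 = 1.0, 2.0

w = rng.normal(size=n)              # omitted factor
X = 0.5 * w + rng.normal(size=n)    # X is positively correlated with w
u = w + rng.normal(size=n)          # the error u contains the omitted factor
Y = beta0 + beta1 * X + u

# OLS slope from regressing Y on X only (the "short" regression).
beta1_hat = np.cov(Y, X)[0, 1] / np.var(X, ddof=1)
bias = np.cov(u, X)[0, 1] / np.var(X, ddof=1)

print("beta1_hat              :", round(beta1_hat, 3))
print("beta1 + cov(u,X)/var(X):", round(beta1 + bias, 3))   # nearly identical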
The TestScore-STR example
𝑇𝑒𝑠𝑡𝑆𝑐𝑜𝑟𝑒 = 𝛽0 + 𝛽1 𝑆𝑇𝑅 + 𝑢
• PctEL has a negative effect on TestScore, and thus enters u with a
negative sign
𝑢 = 𝛾𝑃𝑐𝑡𝐸𝐿 + 𝑣, 𝛾<0
• PctEL is positively correlated with STR
𝑐𝑜𝑣(𝑃𝑐𝑡𝐸𝐿, 𝑆𝑇𝑅) > 0
So
cov(u, STR) < 0
• Then there is a downward bias.
• The OLS estimator satisfies 𝛽̂1 − 𝛽1 < 0 in large samples.
Omitted variables that satisfy LSA#1
𝐼𝑛𝑓𝑒𝑐𝑡𝑖𝑜𝑛𝑖 = 𝛽0 + 𝛽1 𝑉𝑎𝑐𝑐𝑖𝑛𝑒𝑖 + 𝑢𝑖
Three ways to overcome omitted variable bias
1. Run a randomized controlled experiment in which treatment (STR) is
randomly assigned: then PctEL is still a determinant of TestScore, but
PctEL is uncorrelated with STR. (This solution is not feasible.)
Difference in means: holding constant omitted factors
• Among districts with comparable PctEL, the effect of class size is smaller than
the overall “test score gap” of 7.4.
The conditional independence assumption and control variables
• Conditional Independence Assumption: the treatment Xi is
independent of the potential outcomes Yxi conditional on Zi .
• Given that 𝑌𝑥𝑖 = 𝛽0 + 𝛽1 𝑥 + 𝑒𝑖 , the above assumption implies Xi is
independent of ei conditional on Zi . This implies
𝐸 (𝑒𝑖 |Xi , 𝑍𝑖 ) = 𝐸(𝑒𝑖 |𝑍𝑖 )
• We can always decompose a random variable as follows
ei = 𝐸 (𝑒𝑖 |Xi , 𝑍𝑖 ) + (𝒆𝒊 − 𝑬(𝒆𝒊 |𝐗 𝐢 , 𝒁𝒊 )) = 𝐸 (𝑒𝑖 |Xi , 𝑍𝑖 ) + 𝒖𝒊
where ui = 𝑒𝑖 − 𝐸 (𝑒𝑖 |Xi , 𝑍𝑖 ) and E(ui |𝑋𝑖 , 𝑍𝑖 ) = 0 by definition.
• Assume 𝐸 (𝑒𝑖 |𝑍𝑖 ) = 𝛾0 + 𝛾1 𝑍𝑖 , then
𝑬(𝒆𝒊 |𝐗 𝐢 , 𝒁𝒊 ) = 𝐸 (𝑒𝑖 |𝑍𝑖 ) = 𝜸𝟎 + 𝜸𝟏 𝒁𝒊 .
• Then ei = 𝛾0 + 𝛾1 𝑍𝑖 + 𝑢𝑖 , 𝑤𝑖𝑡ℎ 𝐸 (𝑢𝑖 |𝑋𝑖 , 𝑍𝑖 ) = 0.
• The original linear causal model becomes a linear regression model
with an additional regressor:
𝑌𝑖 = 𝛽0 + 𝛽1 𝑋𝑖 + 𝑒𝑖 = (𝛽0 + 𝛾0 ) + 𝛽1 𝑋𝑖 + 𝛾1 𝑍𝑖 + 𝑢𝑖 , 𝐸 (𝑢𝑖 |𝑋𝑖 , 𝑍𝑖 ) = 0
• In the new model, Zi is called the “control variable”.
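A minimal simulation sketch (not part of the slides; the DGP and coefficient values are assumptions chosen for illustration): when X and 𝑒𝑖 are independent conditional on Z, adding Z as a control lets OLS recover 𝛽1, while the regression without Z does not.

import numpy as np

# Hypothetical DGP: X and e are independent conditional on Z.
rng = np.random.default_rng(1)
n = 200_000
beta0, beta1, gamma1 = 1.0, 2.0, 3.0

Z = rng.normal(size=n)
X = Z + rng.normal(size=n)            # X depends on Z (plus independent noise)
e = gamma1 * Z + rng.normal(size=n)   # E(e | X, Z) = E(e | Z) = gamma1 * Z
Y = beta0 + beta1 * X + e

# Short regression: Y on X only (suffers from omitted variable bias).
b_short, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), Y, rcond=None)
# Long regression: Y on X and the control variable Z.
b_long, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X, Z]), Y, rcond=None)

print("slope on X without control Z:", round(b_short[1], 3))   # biased (about 3.5)
print("slope on X with control Z   :", round(b_long[1], 3))    # close to beta1 = 2
print("coefficient on Z            :", round(b_long[2], 3))    # close to gamma1 = 3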
Example of Conditional Independence: Project STAR
• Project STAR (Student-Teacher Achievement Ratio)
• 11,600 kindergartners in 1985-86. The study ran for 4 years, until the
original cohort was in 3rd grade.
– Cost $12 million.
• Upon entering the school system, a student was randomly assigned
to one of three groups within the school:
– regular class (22 – 25 students, no aide)
– regular class + aide (with a full-time aide)
– small class (13 – 17 students)
• Y = Stanford Achievement Test scores
• Teachers were also randomly assigned within a school.
The Data Structure of STAR
• A random sample, or an i.i.d. sample. Use the data structure of STAR
as an example:
{(𝑌𝑖 , 𝑋𝑖 , 𝑆𝑖 )}, 𝑖 = 1, … , 300
• The observed variables are:
– 𝑌𝑖 : student 𝑖’s test score.
– 𝑋𝑖 : a dummy indicating whether student 𝑖 was assigned to a
small class.
– 𝑆𝑖 : a factor that indicates which school student 𝑖 belongs to.
Assume that there are three schools: 𝑆𝑖 = 1,2,3.
– We may think of 𝑆𝑖 in terms of 3 dummies: 𝑆1𝑖 , 𝑆2𝑖 , 𝑆3𝑖 .
• The causal model
𝑌𝑖 = 𝑐 + 𝛽𝑋𝑖 + 𝑢𝑖
where 𝑢𝑖 is the causal error, which includes other determinants of scores
and also possibly reflects heterogeneous treatment effects.
The Implied Linear Regression
• The treatment (class size) is randomly assigned within each school but
not across schools:
– 𝑋𝑖 is independent of the potential scores (𝑌1𝑖 , 𝑌0𝑖 ), conditional
on (𝑆1𝑖 , 𝑆2𝑖 )
– Conditional independence implies conditional mean
independence between 𝑋𝑖 and the causal error 𝑢𝑖 :
𝐸 (𝑢𝑖 |𝑋𝑖 , 𝑆1𝑖 , 𝑆2𝑖 ) = 𝐸 (𝑢𝑖 |𝑆1𝑖 , 𝑆2𝑖 )
– Together with the constant, the two dummies span all three school
effects. This is the so-called “saturated model”: the conditional mean
must be a linear function of the dummies:
𝑢𝑖 = 𝑎 + 𝛾1 𝑆1𝑖 + 𝛾2 𝑆2𝑖 + 𝑒𝑖 , 𝐸 (𝑒𝑖 |𝑋𝑖 , 𝑆1𝑖 , 𝑆2𝑖 ) = 0.
• This implies a linear regression model with school fixed effects as
controls:
𝑌𝑖 = 𝛽0 + 𝛽𝑋𝑖 + 𝛾1 𝑆1𝑖 + 𝛾2 𝑆2𝑖 + 𝑒𝑖 , 𝐸 (𝑒𝑖 |𝑋𝑖 , 𝑆1𝑖 , 𝑆2𝑖 ) = 0
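As a hedged sketch (the sample size, assignment probabilities, and coefficient values below are invented; this is not the actual STAR data), the regression with school dummies as controls can be estimated as follows.

import numpy as np

# Hypothetical STAR-like data: treatment is randomized within each of 3 schools,
# but the treatment probability differs across schools.
rng = np.random.default_rng(2)
n = 300
school = rng.integers(1, 4, size=n)                 # S_i in {1, 2, 3}
p_small = np.array([0.2, 0.5, 0.8])[school - 1]     # assignment probability by school
X = (rng.random(n) < p_small).astype(float)         # small-class dummy

S1 = (school == 1).astype(float)                    # two dummies (plus the constant)
S2 = (school == 2).astype(float)                    # span all three school effects
beta, gamma1, gamma2 = 5.0, 10.0, -4.0
Y = 600 + beta * X + gamma1 * S1 + gamma2 * S2 + rng.normal(scale=15, size=n)

# OLS of Y on a constant, X, and the school dummies.
design = np.column_stack([np.ones(n), X, S1, S2])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
print("estimated treatment effect:", round(coef[1], 2))   # close to beta = 5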
The Population Multiple Regression Model
Consider the case of two regressors:
𝑌𝑖 = 𝛽0 + 𝛽1 𝑋1𝑖 + 𝛽2 𝑋2𝑖 + 𝑢𝑖 , i = 1,…,n
Interpretation of coefficients in multiple regression
𝑌𝑖 = 𝛽0 + 𝛽1 𝑋1𝑖 + 𝛽2 𝑋2𝑖 + 𝑢𝑖 , i = 1,…,n, E(𝑢𝑖 |𝑋1𝑖 , 𝑋2𝑖 ) = 0
Consider changing 𝑋1 by Δ𝑋1 while holding 𝑋2 constant:
𝛽1 = Δ𝑌/Δ𝑋1 , holding 𝑋2 constant
The OLS Estimator in Multiple Regression
Regression of TestScore against STR:
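Below is a hedged sketch of computing such an OLS fit with several regressors via the normal equations 𝛽̂ = (X′X)⁻¹X′Y; the file name "caschool.csv" and the column names testscr, str, and el_pct are assumptions about how the California data are stored, not something the slides specify.

import numpy as np
import pandas as pd

# Hedged sketch: OLS in multiple regression via the normal equations.
# The file name and column names below are assumed, not given in the slides.
df = pd.read_csv("caschool.csv")
y = df["testscr"].to_numpy()
X = np.column_stack([np.ones(len(df)),          # constant
                     df["str"].to_numpy(),      # student-teacher ratio
                     df["el_pct"].to_numpy()])  # percent English learners

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)    # (X'X)^{-1} X'y
print("intercept, STR coefficient, PctEL coefficient:", np.round(beta_hat, 3))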
What’s special about multiple regression?
• R² and R̄² (adjusted R²)
• One more least square assumption: LSA #4 (no perfect
multicollinearity)
• F-test (Wald test): testing the joint significance of more than one
regression coefficient; testing restrictions on regression coefficients.
(The F-statistic and the t-statistic are linked when H0 contains only one restriction.)
Motivation: imperfect multicollinearity…
Remark: other than the above three, the theory of multiple regression
is the same as that of simple regression with a single regressor.
Measures of Fit for Multiple Regression
Actual = predicted + residual: 𝑌𝑖 = 𝑌̂𝑖 + 𝑢̂𝑖
SER = sqrt( ∑ᵢ₌₁ⁿ 𝑢̂𝑖² / (n − k − 1) ),  RMSE = sqrt( ∑ᵢ₌₁ⁿ 𝑢̂𝑖² / n )
• R̄² = “adjusted R²” = R² with a degrees-of-freedom adjustment
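A short sketch (simulated data; the sample size and coefficients are made up) showing how SER and RMSE are computed from the OLS residuals using the formulas above.

import numpy as np

# Illustrative computation of SER and RMSE from OLS residuals (simulated data).
rng = np.random.default_rng(3)
n, k = 420, 2                                   # k regressors plus a constant
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([650.0, -2.0, -0.6]) + rng.normal(scale=19, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

SER = np.sqrt(resid @ resid / (n - k - 1))      # degrees-of-freedom adjusted
RMSE = np.sqrt(resid @ resid / n)               # no adjustment
print(f"SER = {SER:.2f}, RMSE = {RMSE:.2f}")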
R²
R² never decreases (and typically increases) when additional regressors are added,
because adding regressors makes the SSR weakly smaller:
∑ᵢ₌₁ⁿ (𝑌𝑖 − 𝑏̂0 − 𝑏̂1 𝑋1𝑖 − 𝑏̂2 𝑋2𝑖 )² ≤ ∑ᵢ₌₁ⁿ (𝑌𝑖 − 𝛽̂0 − 𝛽̂1 𝑋1𝑖 )²
By definition of OLS,
∑ᵢ₌₁ⁿ (𝑌𝑖 − 𝑏̂0 − 𝑏̂1 𝑋1𝑖 − 𝑏̂2 𝑋2𝑖 )² ≡ min over {𝑏0 , 𝑏1 , 𝑏2 } of ∑ᵢ₌₁ⁿ (𝑌𝑖 − 𝑏0 − 𝑏1 𝑋1𝑖 − 𝑏2 𝑋2𝑖 )²
≤ ∑ᵢ₌₁ⁿ (𝑌𝑖 − 𝛽̂0 − 𝛽̂1 𝑋1𝑖 − 0 ⋅ 𝑋2𝑖 )²
R² = 1 − SSR/TSS therefore becomes (weakly) larger when adding more regressors. (TSS is the
same for all regression models.)
R̄²
R̄² (“adjusted R²”) makes some adjustment by “penalizing” the
regression with more regressors – R̄² does not necessarily increase
when adding additional regressors.
Adjusted R²:  R̄² = 1 − [(n − 1)/(n − k − 1)] ⋅ SSR/TSS
• When k ≥ 1, for the same regression, R̄² < R².
• For a regression with only the intercept, R̄² = R² = 0 (in this case k = 0).
• If n is large, R̄² ≈ R².
An example of R² and R̄²
(3) 𝑇𝑒𝑠𝑡𝑆𝑐𝑜𝑟𝑒̂ = 654.2,
R² = 0, R̄² = 0, SER = 19.05, n = 420 (a regression with only the intercept)
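A simulation sketch (the data and the irrelevant regressor x2 are invented) comparing R² and R̄² when an irrelevant regressor is added: R² rises slightly by construction, while R̄² need not.

import numpy as np

# Comparing R^2 and adjusted R^2 when an irrelevant regressor is added.
rng = np.random.default_rng(4)
n = 420
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                  # irrelevant regressor
y = 650 - 2.0 * x1 + rng.normal(scale=19, size=n)

def r2_and_adjusted(y, X):
    """Return (R^2, adjusted R^2) for OLS of y on X (X includes the constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = resid @ resid
    tss = ((y - y.mean()) ** 2).sum()
    n_obs, k = X.shape[0], X.shape[1] - 1
    return 1 - ssr / tss, 1 - (n_obs - 1) / (n_obs - k - 1) * ssr / tss

print("y on x1     :", r2_and_adjusted(y, np.column_stack([np.ones(n), x1])))
print("y on x1, x2 :", r2_and_adjusted(y, np.column_stack([np.ones(n), x1, x2])))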
The Least Squares Assumptions for Multiple Regression
Perfect multicollinearity
An example of perfect multicollinearity
• Generate a new variable
STR2=2*STR+1
• Regress TestScore on STR, STR2, and PctEL
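A hedged sketch (STR values are simulated, not the actual California data) of why this regression cannot be run as-is: with STR2 = 2·STR + 1, the regressor matrix including the constant has deficient rank, so X′X is singular and the OLS normal equations have no unique solution; regression software typically drops one of the collinear regressors, as the output later in these notes indicates.

import numpy as np

# STR2 = 2*STR + 1 is an exact linear combination of STR and the constant,
# so the design matrix [1, STR, STR2] is rank deficient. (STR is simulated.)
rng = np.random.default_rng(5)
STR = rng.uniform(14, 26, size=420)
STR2 = 2 * STR + 1

X = np.column_stack([np.ones(STR.size), STR, STR2])
print("number of columns      :", X.shape[1])
print("rank of X              :", np.linalg.matrix_rank(X))   # 2 < 3
print("condition number of X'X:", np.linalg.cond(X.T @ X))    # astronomically large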
The Sampling Distribution of the OLS Estimator
Under the four Least Squares Assumptions,
• The sampling distribution of 𝛽̂1 has mean 𝛽1 :
𝐸(𝛽̂1 ) = 𝛽1
• var(𝛽̂1 ) is inversely proportional to n.
• For large n:
o 𝛽̂1 is consistent: 𝛽̂1 →ᵖ 𝛽1 (law of large numbers)
o (𝛽̂1 − 𝛽1 )/𝑆𝐸(𝛽̂1 ) is approximately N(0,1) (CLT)
o These statements hold for all 𝛽̂𝑗 , 𝑗 = 0,1, … , 𝑘
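A Monte Carlo sketch (the data-generating process is invented) illustrating the first two properties: across repeated samples the mean of 𝛽̂1 is close to 𝛽1, and its variance shrinks roughly in proportion to 1/n.

import numpy as np

# Monte Carlo check: E(beta1_hat) is near beta1 and var(beta1_hat) shrinks like 1/n.
rng = np.random.default_rng(6)
beta0, beta1, reps = 1.0, 2.0, 2000

def beta1_hat(n):
    X = rng.normal(size=n)
    Y = beta0 + beta1 * X + rng.normal(size=n)
    return np.cov(Y, X)[0, 1] / np.var(X, ddof=1)

for n in (50, 500, 5000):
    draws = np.array([beta1_hat(n) for _ in range(reps)])
    print(f"n = {n:5d}   mean = {draws.mean():.3f}   variance = {draws.var():.5f}")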
The dummy variable trap
• The dummy variable trap is a special case of perfect multicollinearity.
• Consider a set of dummy variables, which are mutually exclusive and
exhaustive: there are multiple categories and every observation falls
in one and only one category.
Consider the four class-standing dummies for a college student:
Freshman + Sophomore + Junior + Senior = 1
• If your regression includes all these dummy variables and a constant,
you will have perfect multicollinearity – this is called the dummy
variable trap. Solutions:
1. Omit one of the groups (e.g. Seniors), or
2. Omit the intercept
An example of dummy variable trap
𝑀𝑖 = {1 if 𝑖 is a man; 0 else},   𝑊𝑖 = {1 if 𝑖 is a woman; 0 else}
• 𝑀𝑖 and 𝑊𝑖 are mutually exclusive: an individual cannot be both a man
and a woman.
• 𝑀𝑖 and 𝑊𝑖 are exhaustive: an individual must be either a man or a
woman
• In sum: 𝑀𝑖 + 𝑊𝑖 = 1 for all i.
• Consider the regression model
𝑊𝑎𝑔𝑒𝑖 = 𝛽0 + 𝛽1 𝑀𝑖 + 𝛽2 𝑊𝑖 + 𝑢𝑖
𝐸 (𝑊𝑎𝑔𝑒𝑖 |𝑖 𝑖𝑠 𝑎 𝑚𝑎𝑛) = 𝛽0 + 𝛽1
𝐸 (𝑊𝑎𝑔𝑒𝑖 |𝑖 𝑖𝑠 𝑎 𝑤𝑜𝑚𝑎𝑛) = 𝛽0 + 𝛽2
• 𝛽0 , 𝛽1 , 𝛽2 cannot be separately identified: only 𝛽0 + 𝛽1 and 𝛽0 + 𝛽2 are pinned down by the two conditional means.
Two correct models:
(1) 𝑊𝑎𝑔𝑒𝑖 = 𝛽0 + 𝛽1 𝑀𝑖 + 𝑢𝑖
(2) 𝑊𝑎𝑔𝑒𝑖 = 𝑏1 𝑀𝑖 + 𝑏2 𝑊𝑖 + 𝑢𝑖
Matching the conditional means implied by the two models gives
𝛽0 = 𝑏2 , 𝛽1 = 𝑏1 − 𝑏2 .
The OLS estimators should follow the same relations
𝛽̂0 = 𝑏̂2 , 𝛽̂1 = 𝑏̂1 − 𝑏̂2 .
The corresponding standard errors are also the same:
𝑆𝐸(𝛽̂0 ) = 𝑆𝐸(𝑏̂2 ), 𝑆𝐸(𝛽̂1 ) = 𝑆𝐸(𝑏̂1 − 𝑏̂2 ).
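A quick simulation sketch (wages are simulated; the coefficient values are made up) confirming the relations 𝛽̂0 = 𝑏̂2 and 𝛽̂1 = 𝑏̂1 − 𝑏̂2 numerically.

import numpy as np

# Two parameterizations around the dummy variable trap (simulated wages).
rng = np.random.default_rng(7)
n = 5_000
M = (rng.random(n) < 0.5).astype(float)   # man dummy
W = 1.0 - M                               # woman dummy, so M + W = 1
wage = 15.0 + 2.4 * M + rng.normal(scale=8, size=n)

# Model (1): constant + M (omit W).
m1, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), M]), wage, rcond=None)
# Model (2): M + W, no constant.
m2, *_ = np.linalg.lstsq(np.column_stack([M, W]), wage, rcond=None)

print("beta0_hat =", round(m1[0], 4), "  b2_hat =", round(m2[1], 4))
print("beta1_hat =", round(m1[1], 4), "  b1_hat - b2_hat =", round(m2[0] - m2[1], 4))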
Model 1: OLS, using observations 1-7986
Dependent variable: ahe
Heteroskedasticity-robust standard errors, variant HC1
Coefficient Std. Error t-ratio p-value
const 15.3586 0.133946 114.7 <0.0001 ***
male 2.41405 0.190958 12.64 <0.0001 ***
Imperfect multicollinearity
Implications of imperfect multicollinearity
Theoretically, imperfect multicollinearity is not a problem.
Practically, imperfect multicollinearity implies that one or more of the
regression coefficients will be imprecisely estimated.
• 𝛽1 is the effect of X1 holding X2 constant; but if X1 and X2 are highly
correlated, there is very little variation in X1 once X2 is held constant
– so the data don’t contain much information about what happens
when X1 changes but X2 doesn’t.
• If so, the standard error of the OLS estimator of the coefficient on X1
will be large.
An example of imperfect multicollinearity
• Generate a new variable
STR2=2*STR+1
• Generate another new variable
STR3=STR2+rnorm()
Here rnorm() denotes an i.i.d. N(0,1) random draw.
• Perfect multicollinearity between STR and STR2
• Imperfect multicollinearity between STR and STR3
Corr(STR,STR3)=0.966
Model: OLS, using observations 1-420
Dependent variable: testscr
Omitted due to exact collinearity: str2
Stronger correlation implies more imprecise estimates
• Generate
STR4=STR2+0.001*rnorm()
Corr(STR,STR4)=0.9999
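A simulation sketch (the DGP is invented; compare the STR3 and STR4 constructions above) showing that the closer the added regressor is to exact collinearity with STR, the more the OLS coefficient on STR varies across repeated samples, i.e. the larger its standard error.

import numpy as np

# The closer the added regressor is to exact collinearity with STR, the more
# the OLS coefficient on STR varies across repeated samples (invented DGP).
rng = np.random.default_rng(8)
n, reps = 420, 1000

def sd_of_str_coefficient(noise_sd):
    slopes = []
    for _ in range(reps):
        STR = rng.uniform(14, 26, size=n)
        STRX = 2 * STR + 1 + noise_sd * rng.normal(size=n)   # near-collinear regressor
        y = 700 - 2.0 * STR + rng.normal(scale=15, size=n)
        X = np.column_stack([np.ones(n), STR, STRX])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        slopes.append(beta[1])
    return np.std(slopes)

print("spread of STR coefficient, noise sd = 1.0  :", round(sd_of_str_coefficient(1.0), 3))
print("spread of STR coefficient, noise sd = 0.001:", round(sd_of_str_coefficient(0.001), 3))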
What to do if there is imperfect multicollinearity
• Suppose X is the variable of interest, such as STR.
• Assume another variable W in the regression function is highly
correlated with X.
Summary