Week_8_Multicollinearity
Multiple Regression:
Multicollinearity &
Testing Multiple Linear Restrictions
Collinearity Assumption
• There are no exact linear relationships among
the independent variables
Yi = β0 + β1X1i + β2X2i + β3X3i + ui
e.g., violated if X3i = X1i + 2X2i (an exact linear relationship)
Var(bj) = σ²bj = σ²u / [ Σi (Xji – X̄j)² (1 – Rj²) ]
where Rj² is the proportion of the total variation in Xj
that can be explained by the other independent variables
in the model
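The VIF rule of thumb used under "Detecting Multicollinearity" below drops straight out of this variance expression; a short restatement in standard notation (the derivation itself is not on the original slide):

```latex
% Variance of the OLS estimator b_j, with SST_j = \sum_i (X_{ji} - \bar{X}_j)^2:
\[
  \operatorname{Var}(b_j)
  = \frac{\sigma_u^2}{\mathrm{SST}_j\,(1 - R_j^2)}
  = \frac{\sigma_u^2}{\mathrm{SST}_j}\cdot \mathrm{VIF}_j,
  \qquad
  \mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\]
% Example: R_j^2 = 0.9 gives VIF_j = 10, the usual rule-of-thumb threshold for
% "high" collinearity cited later in these slides.
```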
Consequences
• High correlation among explanatory variables
does not necessarily lead to poor estimates:
– Will still get good estimates if the number of
observations and the variation in the explanatory
variables are high and the variance of the
disturbance term is small
• However, with high collinearity the estimates will be very
sensitive to the model specification and to outliers in the data
Detecting Multicollinearity
• A classic sign of the presence of a high degree
of collinearity is when you have a high R2 but
none of the variables show significant effects
– But it is possible to have high multicollinearity even when
R2 is low
Detecting Multicollinearity
• “High” pairwise correlations among the
independent variables
– High pairwise correlation is a sufficient, but not a
necessary, condition for near-perfect collinearity
• Might be more complex linear relationship among the Xs
• Regress each independent variable against the
other independent variables to see if there are
strong linear dependencies
• Calculate the Variance Inflation Factor (VIF)
– VIF>10 indicative of high collinearity
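Both mechanical checks above (auxiliary regressions and VIFs) can be illustrated with a small statsmodels sketch; the data below are simulated purely for illustration and are not the slides' example:

```python
# Illustrative sketch (simulated data): auxiliary regressions and VIFs with statsmodels.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 2 * x2 + rng.normal(scale=0.1, size=n)   # nearly collinear with x1 and x2
X = sm.add_constant(np.column_stack([x1, x2, x3])) # columns: const, x1, x2, x3

# Auxiliary regression: x3 on the other regressors; a high R^2 signals near-collinearity
aux = sm.OLS(X[:, 3], X[:, :3]).fit()
print("auxiliary R^2 for x3:", aux.rsquared)

# VIF_j = 1 / (1 - R_j^2); values above roughly 10 are taken as high collinearity
for j, name in enumerate(["x1", "x2", "x3"], start=1):
    print(name, variance_inflation_factor(X, j))
```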
Dealing with Multicollinearity
• Worry about it when it affects your results
• Deal with the other factors that affect the
variance of your estimators
– Increase the sample size, increase the sample
variation of Xj, etc.
• It is not a good idea to simply drop one of the variables
– Omitting a relevant variable will bias the remaining estimates
• Test for robustness
Dealing with Multicollinearity
• Bottom line: Multicollinearity is bad, but you will
almost always have some of it
– The standard errors are still correct, just larger
– All else being equal, for estimating βj it is better to
have less correlation between Xj and the other
independent variables
Testing Multiple Linear Restrictions
• The t statistic associated with any OLS coefficient can be used to
test whether the corresponding unknown parameter in the
population is equal to any given constant (usually zero)
Yi = β0 + β1X1i + β2X2i + β3X3i + ui
– This is a test of a hypothesis involving a single linear
restriction
HA0: β1 = 0 HB0: β2 = 0
HA1: β1 ≠ 0 HB1: β2 ≠ 0
• But we might be interested in testing the joint
explanatory power of a set of independent variables
H0: β1 = 0 and β2 = 0
H1: β1 ≠ 0 or β2 ≠ 0
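As a small illustration of the difference, the sketch below runs the two single-restriction t tests and the joint F test with statsmodels on simulated data (the names y, x1, x2, x3 are illustrative only, not from the slides):

```python
# Illustrative sketch: single t tests vs. a joint F test in statsmodels (simulated data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({"x1": rng.normal(size=n),
                   "x2": rng.normal(size=n),
                   "x3": rng.normal(size=n)})
df["y"] = 1.0 + 0.2 * df.x1 + 0.1 * df.x2 + 0.5 * df.x3 + rng.normal(size=n)

res = smf.ols("y ~ x1 + x2 + x3", data=df).fit()
print(res.t_test("x1 = 0"))           # single linear restriction: H0: beta1 = 0
print(res.t_test("x2 = 0"))           # single linear restriction: H0: beta2 = 0
print(res.f_test("x1 = 0, x2 = 0"))   # joint restriction: H0: beta1 = 0 and beta2 = 0
```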
Testing Multiple Linear Restrictions
• Testing multiple hypotheses about the
underlying parameters
– Determining whether or not the joint marginal
contribution of a group of variables is significant
– Whether a set of independent variables has no
partial effect on a dependent variable
• The F-test
– Different role in multiple regression analysis
Joint Hypothesis Test
• The null hypothesis is that a set of variables has
no partial effect on a dependent variable
• For example, the model:
Yi = β0 + β1X1i + β2X2i + β3X3i + β4X4i + β5X5i + ui
(this is called the unrestricted model)
Fq, n-k = [(RSSr - RSSur) / q] / [RSSur / (n - k)]
– Where RSSr is the residual sum of squares from the restricted model
– RSSur is the residual sum of squares from the unrestricted model
– q is the number of restrictions
– (n – k) is the denominator degrees of freedom (the degrees of
freedom in the unrestricted model)
Joint Hypothesis Test
Yi = β0 + β1X1i + β2X2i + β3X3i + β4X4i + β5X5i + ui
(unrestricted model)
Yi = β0 + β1X1i + β2X2i + ui (restricted model)
F3, n-6 = [(RSSr - RSSur) / 3] / [RSSur / (n - 6)]
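A minimal sketch of this calculation on simulated data, with a restricted model that imposes β3 = β4 = β5 = 0 (the names x1–x5 are illustrative only):

```python
# Minimal sketch: the F(3, n-6) calculation above, done "by hand" on simulated data.
import numpy as np
import pandas as pd
import scipy.stats as st
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 400
df = pd.DataFrame(rng.normal(size=(n, 5)), columns=["x1", "x2", "x3", "x4", "x5"])
df["y"] = 1 + 0.5 * df.x1 + 0.3 * df.x2 + rng.normal(size=n)

unrestricted = smf.ols("y ~ x1 + x2 + x3 + x4 + x5", data=df).fit()   # k = 6 parameters
restricted = smf.ols("y ~ x1 + x2", data=df).fit()                    # imposes b3 = b4 = b5 = 0

q = 3                                            # number of restrictions
df_denom = n - 6                                 # equals unrestricted.df_resid
F = ((restricted.ssr - unrestricted.ssr) / q) / (unrestricted.ssr / df_denom)
print(F, st.f.sf(F, q, df_denom))                # same F as unrestricted.compare_f_test(restricted)
```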
Joint Hypothesis Test
• General model:
Y = β0 + β1X1 + β2X2 + … + βk-1Xk-1 + u
• Suppose we have q exclusion restrictions to
test (null is that q of the variables have zero
coefficients):
H0: βk-q = 0, … βk-1 = 0
• Restricted model:
Y = β0 + β1X1 + … + βk-q-1Xk-q-1 + u
Joint Hypothesis Test
• Reject H0 when F statistic is “sufficiently” large
– depends on the significance level
– F statistic > Fcrit
– Fcrit depends on 10%, 5%, 1% critical values
– If H0 is rejected, we say that Xk-q, … , Xk-1 are jointly
statistically significant at the appropriate
significance level
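One way to look up Fcrit is scipy's F distribution; the degrees of freedom below are illustrative:

```python
# Sketch: looking up F critical values with scipy (degrees of freedom are illustrative).
from scipy.stats import f

q, df_denom = 3, 347                             # q restrictions, n - k denominator df
for alpha in (0.10, 0.05, 0.01):
    print(alpha, f.ppf(1 - alpha, q, df_denom))  # reject H0 at level alpha if F > this value
```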
Example: Testing Multiple Linear
Restrictions
log(salary)i = β0 + β1yearsi + β2gamesyri + β3bavgi +
β4hrunsyri + β5rbisyri + ui , where
• salary = total salary
• years = total years in the major league
• gamesyr = average games played per year
• bavg = career batting average
• hrunsyr = homeruns per year
• rbisyr = runs batted in per year
Example
      Source |       SS       df       MS            Number of obs =     353
-------------+------------------------------         F(  5,   347) =  117.06
       Model |  308.989208     5  61.7978416         Prob > F      =  0.0000
    Residual |  183.186327   347  .527914487         R-squared     =  0.6278
-------------+------------------------------         Adj R-squared =  0.6224
       Total |  492.175535   352  1.39822595         Root MSE      =  .72658
(pairwise correlations among the explanatory variables)

             |    years    games    hruns     rbis     bavg
-------------+---------------------------------------------
       years |   1.0000
       games |   0.9413   1.0000
       hruns |   0.6744   0.7711   1.0000
        rbis |   0.8223   0.9243   0.9320   1.0000
        bavg |   0.1973   0.2674   0.1990   0.2787   1.0000
Example
log(salary)i = β0 + β1yearsi + β2gamesyri + β3bavgi +
β4hrunsyri + β5rbisyri + ui
– H0: β3 = 0, β4 = 0, and β5 = 0
– H1: H0 is not true
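If the underlying data were loaded as a pandas DataFrame (here called df_mlb, a hypothetical name, with the columns listed earlier), the joint test could be run directly with statsmodels:

```python
# Hypothetical sketch: assumes a DataFrame named df_mlb with columns salary, years,
# gamesyr, bavg, hrunsyr, rbisyr (the slides' data are not bundled here).
import numpy as np
import statsmodels.formula.api as smf

res = smf.ols("np.log(salary) ~ years + gamesyr + bavg + hrunsyr + rbisyr",
              data=df_mlb).fit()
print(res.f_test("bavg = 0, hrunsyr = 0, rbisyr = 0"))   # H0: beta3 = beta4 = beta5 = 0
```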
Example
(output from the restricted model, which drops bavg, hrunsyr and rbisyr)

      Source |       SS       df       MS            Number of obs =     353
-------------+------------------------------         F(  2,   350) =  259.32
       Model |  293.864058     2  146.932029         Prob > F      =  0.0000
    Residual |  198.311477   350  .566604221         R-squared     =  0.5971
-------------+------------------------------         Adj R-squared =  0.5948
       Total |  492.175535   352  1.39822595         Root MSE      =  .75273
F3, 347 = [(198.311477 - 183.186327) / 3] / [183.186327 / 347] ≈ 9.55
Fstat > Fcrit(3, 347) = 3.78 at the 1% significance level, so we
reject the null hypothesis that bavg, hrunsyr and rbisyr
have no effect on salary
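As a quick check, the same F statistic can be recomputed from the residual sums of squares reported in the two Stata outputs above:

```python
# Quick check: recompute the example's F statistic from the reported RSS values.
from scipy.stats import f

rss_r, rss_ur = 198.311477, 183.186327    # restricted / unrestricted RSS (Stata output above)
q, df_denom = 3, 347                      # 3 exclusion restrictions; n - k = 353 - 6
F = ((rss_r - rss_ur) / q) / (rss_ur / df_denom)
print(F, f.sf(F, q, df_denom))            # F is roughly 9.55, well above the 1% critical value
```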
F-statistic
• Can also use the F-statistic to test the “overall
significance of the regression”
– The null would be H0: β1 = β2 = … = βk-1 = 0 (the model
has no explanatory power)
– H1: at least one coefficient is different from zero
• In this case, you will have k-1 restrictions
F(k-1, n-k) = [ESS / (k-1)] / [RSS / (n-k)]
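Because ESS/TSS = R², this statistic can also be written in terms of R² alone; plugging in the unrestricted output from the baseball example (R² = 0.6278, k - 1 = 5, n - k = 347) reproduces the reported F (a standard identity, not shown on the original slide):

```latex
\[
  F_{k-1,\,n-k}
  = \frac{\mathrm{ESS}/(k-1)}{\mathrm{RSS}/(n-k)}
  = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}
  \approx \frac{0.6278/5}{0.3722/347}
  \approx 117.06
\]
```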