
Econometrics

Lecture 7: OLS Assumptions


check and remedy
Dr. Noura Anwar Abdel-Fattah
Associate Professor of Statistics
This lecture will cover the following topics:
1. What is meant by "controlling for the effect of other variables" or "adjusted for"? (Illustrated by example.)

2. How do outliers in the outcome variable and in the predictor space (X-space) affect the model?

3. Effects of an additional predictor.

4. Assumptions underlying the OLS method of estimation.

5. Quick check of the validity of the first two OLS assumptions.

What do we mean by
controlling the effect of
a predictor in a
regression model?
Interpretation of estimated coefficients in the multiple
regression model

β1 is the effect of X1 on Y, controlling for the effects of X2, …, Xp.


What do we mean by controlling or adjusting
for the effect of some variable?

Let's answer with a simple example in which we have two predictors.


What do we mean by controlling or adjusting
for the effect of some variable?

Fitted model:

Ŷ = 15.3276 + 0.7803 X1 − 0.0502 X2


What do we mean by controlling or adjusting
for the effect of some variable?

This interpretation is easy to understand once we recognize that the multiple regression equation can be obtained through a series of simple regression equations. For example, the coefficient of X2 can be obtained as follows:

1) Fit the simple regression model that relates Y to X1, then save the residuals e_{Y.X1}.

Ŷ = 14.37 + 0.75 X1
What do we mean by controlling or adjusting
for the effect of some variable?

2) Fit the simple regression model that relates X2 (considered as a response) to X1, then save the residuals e_{X2.X1}.

X̂2 = 18.96 + 0.513 X1

3) Fit a simple regression model where the response variable is e_{Y.X1} and the predictor is e_{X2.X1}.

ê_{Y.X1} = 0 − 0.0502 e_{X2.X1}
What do we mean by controlling or adjusting
for the effect of some variable?

In the first step, the residuals represent the variation in Y that remains after partialling out the effect of X1.

In the second step, the residuals represent the variation in X2 that remains after partialling out the effect of X1.

The third step gives the effect of X2 on Y after partialling out the effect of X1 from both of them. Its slope, −0.0502, equals the coefficient of X2 in the fitted multiple regression model; a code sketch of the three steps follows.
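The sketch below, in Python with statsmodels, shows how the three partialling-out steps reproduce the multiple-regression coefficient of X2; the simulated data and variable names are hypothetical, not the lecture's data set.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data with two correlated predictors (not the lecture's data).
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(50, 10, n)
x2 = 19 + 0.5 * x1 + rng.normal(0, 5, n)
y = 15 + 0.78 * x1 - 0.05 * x2 + rng.normal(0, 2, n)

# Full multiple regression of Y on X1 and X2.
full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

# Step 1: regress Y on X1 and keep the residuals e_{Y.X1}.
e_y_x1 = sm.OLS(y, sm.add_constant(x1)).fit().resid
# Step 2: regress X2 on X1 and keep the residuals e_{X2.X1}.
e_x2_x1 = sm.OLS(x2, sm.add_constant(x1)).fit().resid
# Step 3: regress e_{Y.X1} on e_{X2.X1}; the slope is the coefficient of X2.
partial = sm.OLS(e_y_x1, sm.add_constant(e_x2_x1)).fit()

print(full.params[2], partial.params[1])  # the two slopes agree
```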
Effects of an additional predictor.
Data distribution
problems.
Outliers in the response
variable
Outliers in the predictors
Measures of Influence
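The slides show these diagnostics graphically; as a minimal sketch with hypothetical data, the usual measures of influence (leverage, studentized residuals, Cook's distance) can be computed with statsmodels as follows.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data (not the lecture's data set).
rng = np.random.default_rng(1)
x = rng.normal(size=(30, 2))
y = 1 + x @ np.array([0.8, -0.05]) + rng.normal(scale=0.5, size=30)
y[0] += 5        # an outlier in the response variable
x[1, 0] += 8     # a high-leverage point in the X-space

fit = sm.OLS(y, sm.add_constant(x)).fit()
infl = fit.get_influence()

print(infl.hat_matrix_diag)              # leverage values
print(infl.resid_studentized_external)   # studentized residuals
print(infl.cooks_distance[0])            # Cook's distance per observation
```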
Check of the validity of the first two OLS assumptions
Assumptions underlying the
OLS method of estimation.
1) Assumption about the form of the model:
• Linearity assumption.

2) Assumptions about the errors:
• The errors have a normal distribution with mean 0 and a constant variance σ². The constant-variance assumption is sometimes called the "homoscedasticity" assumption.
• The errors should also be iid (independent and identically distributed). The independence part is also called the "no autocorrelation" assumption.
Assumptions underlying the
OLS method of estimation.
3) Assumptions about the predictors:
• No measurement error.
• The predictors are assumed to be free of multicollinearity, i.e. no predictor is a near-linear combination of the others.
Quick check of the validity of
the first two OLS assumptions.
First: Linearity assumption

In a simple regression model, we can check this assumption by plotting Y against X.

In a multiple regression model, we draw a scatter plot of the unstandardized predicted values ŷ, which summarize all of the X's, against the unstandardized residuals.
Quick check of the validity of
the first two OLS assumptions.
Apply this on your data set.

If the pattern of the scatter plot is symmetric around the main diagonal, the assumption holds.
If not:
◦ Transform Y.
◦ Run curve estimation of Y against each X to detect the source of the nonlinearity.
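The lecture produces this plot in SPSS; as a rough sketch of the same check in Python (with matplotlib and statsmodels, on hypothetical data), the plot of predicted values against residuals can be made as follows.

```python
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm

# Hypothetical data; in practice use your own fitted model's results object.
rng = np.random.default_rng(2)
X = sm.add_constant(rng.normal(size=(100, 2)))
y = X @ np.array([15.0, 0.78, -0.05]) + rng.normal(scale=2.0, size=100)
fit = sm.OLS(y, X).fit()

# Unstandardized predicted values against unstandardized residuals.
plt.scatter(fit.fittedvalues, fit.resid)
plt.axhline(0, linestyle="--")
plt.xlabel("Unstandardized predicted values")
plt.ylabel("Unstandardized residuals")
plt.title("Linearity check")
plt.show()
```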
Checking
Autocorrelation and
Multicollinearity
assumptions:
Autocorrelation of errors.

You can draw an index plot, where the residuals are plotted against the observation number.

If the residuals are independent, the points should be scattered randomly within a horizontal band around zero.

Otherwise, the errors are correlated.
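A minimal sketch of such an index plot, assuming a fitted statsmodels OLS results object named `fit` (a hypothetical name) and matplotlib:

```python
import matplotlib.pyplot as plt

# Index plot: residuals against observation number.
plt.plot(range(1, len(fit.resid) + 1), fit.resid, marker="o")
plt.axhline(0, linestyle="--")
plt.xlabel("Observation number")
plt.ylabel("Residual")
plt.title("Index plot of residuals")
plt.show()
```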


Autocorrelation of errors.

You can also run the Durbin-Watson test.
The output shows a calculated value of DW and two reference values, DWlow and DWupp.

If DW < DWlow, there is positive autocorrelation.
If DW > 4 − DWlow, there is negative autocorrelation.
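The lecture runs this test in SPSS; a minimal sketch in Python uses statsmodels, assuming a fitted OLS results object named `fit` (a hypothetical name):

```python
from statsmodels.stats.stattools import durbin_watson

# DW near 2 suggests no autocorrelation; compare the value against the
# tabulated DWlow/DWupp bounds for your sample size and number of predictors.
dw = durbin_watson(fit.resid)
print(dw)
```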
Autocorrelation of errors.

Positive autocorrelation: consecutive residuals tend to have the same sign.
Negative autocorrelation: consecutive residuals tend to have alternating signs.
Autocorrelation of errors.

Remedy: use lagged (quasi-differenced) variables.

y*_t = y_t − ρ y_{t−1}
x*_t = x_t − ρ x_{t−1}

We may have a priori information about the value of ρ. Otherwise, estimate it as:

ρ = 1 − DW/2

Then run the regression using the new variables and check the autocorrelation again.
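A minimal sketch of this remedy in Python with numpy and statsmodels (the data and variable names are hypothetical; the lecture itself works in SPSS):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Hypothetical time-ordered data with AR(1) errors.
rng = np.random.default_rng(3)
n = 80
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = 2 + 0.5 * x + e

fit = sm.OLS(y, sm.add_constant(x)).fit()
rho = 1 - durbin_watson(fit.resid) / 2     # rho = 1 - DW/2

# Quasi-differenced variables (the first observation is dropped).
y_star = y[1:] - rho * y[:-1]
x_star = x[1:] - rho * x[:-1]

fit_star = sm.OLS(y_star, sm.add_constant(x_star)).fit()
print(durbin_watson(fit_star.resid))       # re-check the autocorrelation
```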
Multicollinearity of predictors.

Multicollinearity occurs when the predictors are highly linearly correlated.

We can check this by running the collinearity diagnostics in SPSS.

High multicollinearity yields insignificant coefficients for the individual predictors despite a very high R², and unstable regression coefficients. This pattern can itself serve as an indicator of collinearity.
Multicollinearity of predictors.

A VIF > 10 is an indication of collinearity.

Another criterion is the condition index, which should not exceed 30.
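A minimal sketch of computing VIFs in Python with statsmodels (the lecture uses SPSS; the data here are hypothetical):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical design matrix with two highly correlated predictors.
rng = np.random.default_rng(4)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=100)
X = sm.add_constant(np.column_stack([x1, x2]))

# VIF for each predictor (the constant column at index 0 is skipped).
for i in range(1, X.shape[1]):
    print(f"VIF for predictor {i}: {variance_inflation_factor(X, i):.2f}")
```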
Multicollinearity of predictors.
Remedy:
◦ Stepwise regression
◦ Principal component regression
◦ Ridge regression
◦ Factor analysis
Assignment # 4:
Diagnose autocorrelation and multicollinearity in your example data.
