2SLS Notes


Instrumental Variable (IV) Regression:

Two-Stage Least Squares (2SLS)

Dr. Elya Nabila Abdul Bahri

1
Organization

• Introduction
• Endogeneity problem
• 2SLS
• Hands on using Stata
• Interpretation

2
Regression with Instrumental Variables

What are instrumental variables (IV) methods?


• IV methods are most widely known as a solution to endogenous
regressors: explanatory variables that are correlated with the
regression error term. They provide a way to obtain consistent
parameter estimates nonetheless.

3
Regression with Instrumental Variables (cont'd)

IV estimators are usually motivated by endogenous regressors, but the
violation of the zero conditional mean assumption that they address
also arises from two other common causes: measurement error in
regressors (errors-in-variables) and omitted-variable bias.

4
Endogeneity: An Example
Standard regression: y = xb + u
no association between x and u; OLS consistent
[Diagram: x → y and u → y; no link between x and u]
Endogeneity: y = xb + u
correlation between x and u; OLS inconsistent
[Diagram: x → y and u → y; u is also linked to x]
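
The contrast between the two cases can be checked with a small simulation. The sketch below is illustrative only: the data-generating process, the parameter `rho` and the function names are assumptions, not part of the original slides. It draws y = xb + u with b = 1 and lets x load on u with weight `rho`.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divides by n; ratios of covariances are unaffected)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    return cov(x, y) / cov(x, x)

def simulate(n, rho, b=1.0, seed=0):
    """Draw (x, y) from y = x*b + u, where x = e + rho*u (assumed design)."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        xi = rng.gauss(0, 1) + rho * u  # rho != 0 makes x correlated with u
        x.append(xi)
        y.append(b * xi + u)
    return x, y

slope_exog = ols_slope(*simulate(20000, rho=0.0))   # x unrelated to u
slope_endog = ols_slope(*simulate(20000, rho=0.8))  # x correlated with u
print(round(slope_exog, 2), round(slope_endog, 2))
```

In this design the OLS slope converges to b + Cov(x, u)/Var(x) = 1 + 0.8/1.64 when rho = 0.8, so the bias does not shrink as the sample grows.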

5
Endogeneity

• The correlation between x and u (or the failure of the zero
conditional mean assumption E[u|x] = 0) can be caused by
any of several factors.
• Endogeneity arises naturally in a simultaneous equations model,
for example a supply-demand system in economics, in which price
and quantity are jointly determined in the market for a good or
service.

6
Instrumental Variables Regression
The solution provided by IV methods may be viewed as:

Instrumental variables regression: y = xb + u
z uncorrelated with u, correlated with x
[Diagram: z → x → y and u → y; u is linked to x but not to z]

The additional variable z is termed an instrument for x. In general, we
have many variables in x, and more than one x may be correlated with u.
In that case, we need at least as many instruments in z as there are
x variables correlated with u.
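
For a single regressor and a single instrument, the IV slope is Cov(z, y) / Cov(z, x). The sketch below compares it with OLS on simulated data. The data-generating process and all names are illustrative assumptions, not taken from the slides.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divides by n; ratios of covariances are unaffected)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def iv_slope(z, x, y):
    """Simple IV estimator for one regressor x and one instrument z."""
    return cov(z, y) / cov(z, x)

rng = random.Random(1)
b_true = 1.0
z, x, y = [], [], []
for _ in range(20000):
    zi = rng.gauss(0, 1)                      # instrument: independent of u
    u = rng.gauss(0, 1)                       # structural error
    xi = 0.8 * zi + 0.8 * u + rng.gauss(0, 1)  # x depends on both z and u
    z.append(zi)
    x.append(xi)
    y.append(b_true * xi + u)

b_ols = cov(x, y) / cov(x, x)  # inconsistent here, because x is correlated with u
b_iv = iv_slope(z, x, y)       # uses only the variation in x that is driven by z
print(round(b_ols, 2), round(b_iv, 2))
```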
7
Choice of instruments

To deal with the problem of endogeneity in a supply-demand system,
a candidate instrument z must affect x but have no direct impact on y.
An example for an agricultural commodity might be
temperature or rainfall: clearly exogenous to the market,
but likely to be important in the production process.

8
Instrumental Variables Models

Not all of the available variation in X is used


Only that portion of X which is explained by Z is used to explain Y

9
Instrumental Variables Terminology

Three different models to be familiar with:

An interesting equality:
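
The equations for this slide did not survive extraction. A common textbook formulation of the three models and of the equality that links them is sketched below. The symbols α, β, γ and the error terms e, u, v are assumed notation and may differ from the original slide.

```latex
\begin{align*}
\text{First stage:}      \quad & X = \alpha_0 + \alpha_1 Z + e \\
\text{Structural model:} \quad & Y = \beta_0 + \beta_1 X + u \\
\text{Reduced form:}     \quad & Y = \gamma_0 + \gamma_1 Z + v
\end{align*}
% Substituting the first stage into the structural model gives
% \gamma_1 = \alpha_1 \beta_1, so the structural slope is the ratio
\[
\beta_1 = \frac{\gamma_1}{\alpha_1}
\]
```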

10
Different Types of Instrumental Variables Estimators
Wald estimator for a binary instrument:

• Difference in response / Difference in treatment

Instrumental variables (IV) estimator:

• Shows that b_IV can be recovered from two samples

Two-stage least squares (2SLS) estimator:

• X̂ represents the “fitted” value from the first-stage model
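
With one binary instrument and no control variables, the three estimators give the same number. The sketch below checks this on simulated data. The design and the variable names are illustrative assumptions, not from the slides.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divides by n; ratios of covariances are unaffected)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

rng = random.Random(2)
z, x, y = [], [], []
for _ in range(5000):
    zi = 1 if rng.random() < 0.5 else 0        # binary instrument
    u = rng.gauss(0, 1)
    xi = 0.5 * zi + 0.7 * u + rng.gauss(0, 1)  # endogenous regressor
    z.append(zi)
    x.append(xi)
    y.append(2.0 * xi + u)                     # true slope is 2 (assumed)

# Wald: difference in response / difference in treatment, by instrument group
y1 = mean([yi for yi, zi in zip(y, z) if zi == 1])
y0 = mean([yi for yi, zi in zip(y, z) if zi == 0])
x1 = mean([xi for xi, zi in zip(x, z) if zi == 1])
x0 = mean([xi for xi, zi in zip(x, z) if zi == 0])
wald = (y1 - y0) / (x1 - x0)

# IV: ratio of covariances with the instrument
iv = cov(z, y) / cov(z, x)

# 2SLS: regress x on z, then regress y on the fitted values
a1 = cov(z, x) / cov(z, z)
a0 = mean(x) - a1 * mean(z)
x_hat = [a0 + a1 * zi for zi in z]
tsls = cov(x_hat, y) / cov(x_hat, x_hat)

print(wald, iv, tsls)
```

The three numbers agree up to floating-point error because, for a binary z, Cov(z, y) is proportional to the difference in group means of y, and the fitted values are a linear function of z.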


11
2SLS estimator
Standard IV estimator:

Set of instruments:

Two-stage least squares (2SLS) estimator:

It can be computed from a single matrix equation.
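
The formulas on this slide were lost in extraction. The standard matrix expressions are sketched below, assuming y is the n × 1 outcome, X the n × k regressor matrix, Z the n × l instrument matrix with l ≥ k, and P_Z the projection onto the columns of Z. This notation is an assumption and may differ from the original slide.

```latex
\begin{align*}
\hat{\beta}_{IV}   &= (Z'X)^{-1} Z'y
                      && \text{(just-identified case, } l = k\text{)} \\
\hat{X}            &= Z (Z'Z)^{-1} Z'X = P_Z X
                      && \text{(first-stage fitted values)} \\
\hat{\beta}_{2SLS} &= (\hat{X}'\hat{X})^{-1} \hat{X}'y
                    = (X' P_Z X)^{-1} X' P_Z y
                      && \text{(single matrix equation)}
\end{align*}
```

The last equality uses the fact that P_Z is symmetric and idempotent, and when l = k the 2SLS expression reduces to the standard IV estimator.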

12
Different Types of Instrumental Variables Estimators

• Single binary instrument and no control variables.

• Single instrument (binary or continuous), with or without
control variables.

• Multiple instruments (binary or continuous), with or
without control variables.
13
Two-Stage Least Squares

Step 1: First-stage least squares

Obtain fitted values (X̂) from the first-stage model

Step 2: Second-stage least squares

Substitute the fitted X̂ in place of the original X

14
Including Control Variables in a 2SLS Model

Control variables (W's) should be entered into the model at
both stages
First stage:

Second stage:

Control variables are considered “instruments”; they are
just not “excluded instruments”
They serve as their own instruments
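
A minimal sketch of the two stages with one control variable follows, assuming a first stage X = a0 + a1 Z + a2 W + e and a second stage Y = b0 + b1 X̂ + b2 W + u. The simulated data, the coefficient values and the helper functions `solve`, `ols` and `fitted` are illustrative assumptions, not part of the slides.

```python
import random

def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef

def ols(columns, y):
    """OLS of y on the given columns plus an intercept (intercept comes first)."""
    cols = [[1.0] * len(y)] + columns
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

def fitted(columns, coef):
    """Fitted values for coefficients returned by ols()."""
    return [coef[0] + sum(c * col[i] for c, col in zip(coef[1:], columns))
            for i in range(len(columns[0]))]

rng = random.Random(3)
z, w, x, y = [], [], [], []
for _ in range(5000):
    zi, wi, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    xi = 0.8 * zi + 0.5 * wi + 0.8 * u + rng.gauss(0, 1)  # endogenous X
    z.append(zi)
    w.append(wi)
    x.append(xi)
    y.append(1.0 + 2.0 * xi - 1.0 * wi + u)  # assumed true b1 = 2, b2 = -1

first = ols([z, w], x)          # stage 1: X on excluded instrument Z and control W
x_hat = fitted([z, w], first)
second = ols([x_hat, w], y)     # stage 2: Y on fitted X and the same control W
b_ols = ols([x, w], y)          # naive OLS for comparison
print([round(c, 2) for c in second], round(b_ols[1], 2))
```

Running the second stage by hand reproduces the 2SLS coefficients, but the standard errors from that second regression are not valid, because they are based on the wrong residuals. Dedicated commands compute them correctly.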

15
Technical Conditions Required for Model Identification

Order condition = At least as many excluded instruments (IVs) as
endogenous X's
Just-identified model: number of IVs = number of endogenous X's
Over-identified model: number of IVs > number of endogenous X's

Rank condition = At least one IV must be significant in the
first-stage model
Rank = number of linearly independent columns in a matrix
E(X|Z, W) cannot be perfectly correlated with E(X|W)
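
The order condition is a simple count, which the sketch below makes explicit. The function name and its labels are hypothetical, chosen only for illustration.

```python
def identification_status(n_excluded_instruments, n_endogenous):
    """Classify a model by the order condition (a necessary condition only)."""
    if n_excluded_instruments < n_endogenous:
        return "under-identified"
    if n_excluded_instruments == n_endogenous:
        return "just-identified"
    return "over-identified"

# Two excluded instruments (pcompete, income) for one endogenous regressor (price)
status = identification_status(2, 1)
print(status)
```

Passing the count does not guarantee identification: the rank condition must also hold, that is, the instruments must actually be related to the endogenous regressors.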

16
Instrumental Variables and Randomized Experiments

Two different measures of treatment (X):

Treatment assigned = Exogenous
• Intention-to-treat (ITT) analysis
• Reduced-form model:
• Often leads to underestimation of treatment effect

Treatment delivered = Endogenous
• Individuals who do not comply probably differ in ways that can
undermine the study
• Self-selection: bias and inconsistency
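
The sketch below simulates a randomized experiment with non-compliance to illustrate both points. The design is an assumption made for illustration: a constant treatment effect of 2, and a hypothetical `motivation` variable that drives both compliance and the outcome.

```python
import random

def mean(values):
    return sum(values) / len(values)

def group_mean(values, groups, level):
    """Mean of values for observations whose group equals level."""
    return mean([v for v, g in zip(values, groups) if g == level])

rng = random.Random(4)
true_effect = 2.0
assigned, delivered, outcome = [], [], []
for _ in range(20000):
    z = 1 if rng.random() < 0.5 else 0              # randomized assignment
    motivation = rng.gauss(0, 1)                    # unobserved by the analyst
    d = 1 if (z == 1 and motivation > -0.25) else 0  # only motivated units comply
    assigned.append(z)
    delivered.append(d)
    outcome.append(true_effect * d + motivation + rng.gauss(0, 1))

# ITT: compare outcomes by assignment (reduced form)
itt = group_mean(outcome, assigned, 1) - group_mean(outcome, assigned, 0)
# Difference in treatment actually delivered, by assignment
uptake = group_mean(delivered, assigned, 1) - group_mean(delivered, assigned, 0)
# Wald/IV: assignment used as an instrument for treatment delivered
wald = itt / uptake
# Naive comparison by treatment delivered (contaminated by self-selection)
as_treated = group_mean(outcome, delivered, 1) - group_mean(outcome, delivered, 0)
print(round(itt, 2), round(wald, 2), round(as_treated, 2))
```

In this design the ITT estimate understates the effect because some assigned units are never treated, the as-treated comparison overstates it because compliers are more motivated, and the Wald ratio lands near the assumed effect.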
17
An Example: 2SLS in Cross-Sectional Data

Demand and Supply


Command in Stata: ivregress

18
Results for 2SLS (Cross-section)

19
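Post-estimation: Tests of Over-identifying Restrictions

H0: The over-identifying restrictions are valid (all instruments are
uncorrelated with the error term)
H1: At least one instrument is not valid

Failing to reject H0 means there is no evidence against the validity
of the instruments.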
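
One common version of this test is the Sargan statistic: n times the R-squared from a regression of the 2SLS residuals on all instruments, compared with a chi-squared distribution whose degrees of freedom equal the number of over-identifying restrictions. The sketch below computes it for one endogenous regressor and two instruments, so one degree of freedom. The simulated designs and helper functions are illustrative assumptions, and the p-value shortcut is valid only for one degree of freedom.

```python
import math
import random

def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef

def ols(columns, y):
    """OLS of y on the given columns plus an intercept (intercept comes first)."""
    cols = [[1.0] * len(y)] + columns
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

def fitted(columns, coef):
    return [coef[0] + sum(c * col[i] for c, col in zip(coef[1:], columns))
            for i in range(len(columns[0]))]

def sargan(z_cols, x, y):
    """Sargan statistic and chi-squared(1) p-value for two instruments, one x."""
    n = len(y)
    x_hat = fitted(z_cols, ols(z_cols, x))
    b0, b1 = ols([x_hat], y)
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]  # uses actual x, not x_hat
    fit = fitted(z_cols, ols(z_cols, resid))
    mean_r = sum(resid) / n
    r2 = (sum((f - mean_r) ** 2 for f in fit)
          / sum((r - mean_r) ** 2 for r in resid))
    stat = n * r2
    return stat, math.erfc(math.sqrt(stat / 2))  # upper tail of chi-squared(1)

def draw(n, direct_effect, seed):
    """direct_effect != 0 lets z2 enter y directly, which makes z2 invalid."""
    rng = random.Random(seed)
    z1, z2, x, y = [], [], [], []
    for _ in range(n):
        a, b, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        xi = 0.7 * a + 0.7 * b + 0.8 * u + rng.gauss(0, 1)
        z1.append(a)
        z2.append(b)
        x.append(xi)
        y.append(1.0 * xi + direct_effect * b + u)
    return [z1, z2], x, y

stat_valid, p_valid = sargan(*draw(5000, 0.0, seed=6))
stat_invalid, p_invalid = sargan(*draw(5000, 0.5, seed=6))
print(round(stat_valid, 2), round(stat_invalid, 2))
```

When both instruments are valid the statistic behaves like a chi-squared(1) draw, while a direct effect of z2 on y makes it very large and leads to rejection of H0.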

20
Post-estimation: First-Stage Regression

The excluded instruments
should be jointly significant
in the first-stage regression.

21
Post-estimation: Tests of Endogeneity

Durbin-Hausman-Wu tests

H0: Variables are exogenous
H1: Variables are endogenous

Failing to reject H0 means there is no evidence that the variables
are endogenous, so they can be treated as exogenous.
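
One standard way to carry out this test is the regression-based (control function) version, sketched below. The notation is assumed: z collects all exogenous variables, including the excluded instruments, and v̂ is the first-stage residual.

```latex
\begin{align*}
\text{First stage:}         \quad & x = z\pi + v, \qquad \hat{v} = x - z\hat{\pi} \\
\text{Augmented regression:}\quad & y = x b + \hat{v}\,\rho + \varepsilon \\
\text{Test:}                \quad & H_0\colon \rho = 0 \quad \text{vs.} \quad H_1\colon \rho \neq 0
\end{align*}
```

A significant coefficient on the first-stage residual indicates that x is correlated with the structural error, that is, that x is endogenous.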

22
2SLS in Cross-Section using ivreg2 in Stata

Command in Stata: ivreg2

ivreg2 Squantity praw (price = pcompete income), endog(price) first

23
2SLS from ivreg2 Results: First-Stage

24
Post-estimation: First-Stage
• Sanderson-Windmeijer multivariate F test of excluded
instruments
• Under-identification test: Anderson canonical correlation
LM statistic
• Weak identification test: Cragg-Donald Wald F statistic
• Weak-instruments-robust inference: Anderson-Rubin
Wald test

25
Sanderson-Windmeijer Multivariate F-test of Excluded
Instruments
• The Sanderson-Windmeijer (SW) first-stage chi-squared and F statistics
are tests of under-identification and weak identification, respectively, of
individual endogenous regressors.

• SW chi-squared:
• H0: The individual endogenous regressor is under-identified.
• H1: The individual endogenous regressor is identified.

• SW F statistic:
• H0: The individual endogenous regressor is weakly identified.
• H1: The individual endogenous regressor is strongly identified.
26
Anderson Canonical Correlation LM Statistic

• The Anderson canonical correlation LM statistic is a test of
under-identification, based on the rank of the matrix of
reduced-form coefficients (K1 = number of endogenous regressors).

• H0: Matrix of reduced-form coefficients has rank = K1-1
(under-identified)
• H1: Matrix has rank = K1 (identified)

27
First-Stage Post-estimation (1)

28
Weak identification test: Cragg-Donald Wald F-statistic
H0: Equation is weakly identified
H1: Equation is strongly identified

If the Cragg-Donald (1993) statistic (a multivariate version of the
Wald F statistic) is larger than the Stock-Yogo (2005) critical
value, the null hypothesis that the equation is weakly identified
is rejected.
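
With a single endogenous regressor, the Cragg-Donald statistic coincides with the ordinary first-stage F statistic for the excluded instruments. The sketch below computes that F statistic for one instrument and no controls, using simulated data. The design, the `strength` parameter and the function names are illustrative assumptions.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divides by n; ratios of covariances are unaffected)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def first_stage_f(z, x):
    """F statistic for the instrument in a regression of x on z and a constant."""
    n = len(x)
    r2 = cov(z, x) ** 2 / (cov(z, z) * cov(x, x))
    return (n - 2) * r2 / (1.0 - r2)

def draw(n, strength, seed):
    """First stage x = strength*z + noise (assumed design)."""
    rng = random.Random(seed)
    z, x = [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        z.append(zi)
        x.append(strength * zi + rng.gauss(0, 1))
    return z, x

f_strong = first_stage_f(*draw(2000, 0.5, seed=5))   # relevant instrument
f_weak = first_stage_f(*draw(2000, 0.02, seed=5))    # nearly irrelevant instrument
print(round(f_strong, 1), round(f_weak, 1))
```

A widely used informal benchmark treats a first-stage F below about 10 as a sign of weak instruments. The Stock-Yogo critical values are the more formal comparison.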

29
Stock and Yogo (2005) Critical Values
Stock and Yogo (2005) provide critical values that depend
on:
• The number of endogenous regressors
• The number of instruments
• The maximum bias (or size distortion) to be tolerated
• The estimation procedure (e.g. 2SLS, LIML, …)

ivreg2 in Stata reports these critical values.

30
Weak-instruments-robust inference:
Anderson-Rubin Wald test

The Anderson-Rubin Wald test is a test of the joint significance
of the endogenous regressors B1 in the main equation

H0: B1 = 0 and the orthogonality conditions are valid
H1: B1 ≠ 0 or the orthogonality conditions are not valid

31
First-Stage Post-estimation (2)

32
2SLS from ivreg2 Results

33
Post-estimation: Two-Stage Least Squares
• Under-identification test: Anderson canonical correlation
LM statistic
• Weak identification test: Cragg-Donald Wald F statistic
• Over-identification test of all instruments: Sargan
statistic
• Endogeneity test: Wu-Hausman F test

34
Sargan Test
• The Sargan test is a test of the over-identifying restrictions for all
instruments

• H0: The over-identifying restrictions are valid (all instruments are valid)
• H1: At least one instrument is not valid

35
Endogeneity Test: Wu-Hausman Test
• The Wu-Hausman test is a test of endogeneity

• H0: The regressors being tested are exogenous
• H1: The regressors being tested are endogenous

36
Post-estimation: Two-Stage Least Squares

37
An Example for Two-Stage Least Squares: Panel Data
• 64 countries
• Years 2009-2013
• GDP equation

• xtivreg2 lgdppc lfdi lcpi avschl (lfcapital=lpop lim lex), fe endog(lfcapital) cluster(code year)

38
Result for First-Stage Least Squares: Panel Data

39
Post-Estimation (1) First-Stage Least Squares: Panel Data

40
Post-Estimation (2) First-Stage Least Squares: Panel Data

41
Result for Two-Stage Least Squares: Panel Data

42
Post-Estimation Two-Stage Least Squares: Panel Data

43
Q&A

44
