Instrumental-variables-slides-2021
Class Notes
Manuel Arellano
Revised: March 1, 2021
Introduction
So far we have studied regression models; that is, models for a conditional
expectation or a linear approximation to it.
Now we wish to study relations between random variables that are not regressions.
An example is the relationship between yt and yt−1 in an ARMA(1,1) model.
A linear regression is a linear relationship between observable and unobservable
variables with the property that the regressors are orthogonal to the unobservable term.
For example, given two variables (yi, xi), the regression of y on x is
yi = α + βxi + ui (1)
where β = Cov(yi, xi)/Var(xi), and therefore Cov(xi, ui) = 0.
Similarly, the regression of x on y is:
xi = γ + δyi + εi
where δ = Cov(yi, xi)/Var(yi), and Cov(yi, εi) = 0. Solving the latter for yi we also have
yi = α† + β†xi + ui† (2)
with α† = −γ/δ, β† = 1/δ, ui† = −εi/δ.
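A small simulated sketch of this point (the data-generating process and all numbers here are hypothetical) shows that the regression of y on x and the inverse of the regression of x on y are both valid statistical relationships, yet give different slopes:

```python
import numpy as np

# Hypothetical DGP: y = 1 + 2x + u with u ~ N(0,1), so Var(y) = 5.
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

C = np.cov(x, y)                 # 2x2 covariance matrix of (x, y)
beta = C[0, 1] / C[0, 0]         # slope of y on x: Cov(x, y)/Var(x), ~2.0
delta = C[0, 1] / C[1, 1]        # slope of x on y: Cov(x, y)/Var(y), ~0.4
beta_dag = 1.0 / delta           # implied slope beta† = 1/delta, ~2.5

print(beta, beta_dag)            # two different "slopes" for the same data
```

Neither slope is wrong as a description of the data; which one is of interest depends on the question being asked.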
Introduction (continued)
Both (1) and (2) are statistical relationships between y and x. If we are interested in
some economic relation between y and x, how should we choose?
If the goal is to describe means, we would opt for (1) if interested in the mean of y
for given values of x, and for (2) if interested in the mean of x for given values of y.
However, in the ARMA(1,1) model both the left-hand side and the right-hand side
variables are correlated with the error term.
To answer a question of this kind we need a prior idea about the nature of the
unobservables in the relationship.
Measurement error
Measurement error in an exact relationship
Suppose that two variables satisfy the exact relation
yi* = α + βxi
and we observe yi* with an additive error vi independent of xi:
yi = yi* + vi
so that
yi = α + βxi + vi.
In this case β coincides with the slope coefficient in the regression of yi on xi:
β = Cov(xi, yi)/Var(xi).
Measurement error in an exact relationship (continued)
Now suppose that we observe yi without error, yi = α + βxi*, but xi* is measured
with an error εi independent of (yi, xi*):
xi = xi* + εi.
The relation between the observed variables is
yi = α + βxi + ζi (3)
where ζi = −βεi.
In this case the error is independent of yi but is correlated with xi .
Thus, β coincides with the inverse slope coefficient in the regression of xi on yi:
β = Var(yi)/Cov(xi, yi). (4)
In general, inverse regression makes sense if one suspects that the error term in the
relationship between y and x is essentially driven by measurement error in x.
As will become clear later, (4) can be interpreted as an instrumental-variable
parameter in the sense that yi is used as an instrument for xi in (3).
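A simulated sketch of the contrast between the forward and inverse regressions (the DGP and all numbers are hypothetical):

```python
import numpy as np

# Exact relation y = 1 + 2x*, but x* is observed with noise: x = x* + eps.
rng = np.random.default_rng(1)
n = 200_000
xstar = rng.normal(size=n)
y = 1.0 + 2.0 * xstar                  # exact relationship, beta = 2
x = xstar + rng.normal(size=n)         # observed regressor

C = np.cov(x, y)
b_forward = C[0, 1] / C[0, 0]          # regression of y on x: attenuated (~1.0 here)
b_inverse = C[1, 1] / C[0, 1]          # inverse regression, formula (4): ~2.0
print(b_forward, b_inverse)
```

With all the noise in x, the inverse regression recovers β while the forward regression is biased toward zero.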
Next, we consider measurement error in regression models.
Regression model with measurement error
yi* = α + βxi* + ui
where ui is independent of xi*.
We distinguish two cases: one in which there is measurement error in yi and another
in which there is measurement error in xi .
Measurement error in yi
Suppose that yi = yi* + vi while xi = xi* is observed without error. Then
yi = α + βxi + (ui + vi),
so that
β = Cov(xi, yi*)/Var(xi) = Cov(xi, yi)/Var(xi).
The only difference with the original regression is that the variance of the error term
is larger due to the measurement error, which means that the R² of the observed
regression will be smaller:
R² = R*²/(1 + σv²/(β²Var(xi) + σu²))
where R*² denotes the R² of the latent regression of yi* on xi.
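A numerical check of this R² formula on simulated data (the DGP and all numbers are hypothetical):

```python
import numpy as np

# y* = 1 + 2x + u with sigma_u^2 = 1; observed y = y* + v with sigma_v^2 = 4.
rng = np.random.default_rng(2)
n = 500_000
x = rng.normal(size=n)
beta = 2.0
ystar = 1.0 + beta * x + rng.normal(size=n)
y = ystar + rng.normal(scale=2.0, size=n)

def r2(dep, reg):
    C = np.cov(reg, dep)
    return C[0, 1] ** 2 / (C[0, 0] * C[1, 1])

# shrinkage factor 1 + sigma_v^2/(beta^2 Var(x) + sigma_u^2)
factor = 1.0 + 4.0 / (beta**2 * np.var(x) + 1.0)
print(r2(y, x), r2(ystar, x) / factor)     # the two numbers should agree
```

The slope estimate is unaffected; only the fit deteriorates.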
Measurement error in xi
Now suppose instead that yi = yi* is observed without error but xi* is measured
with error: xi = xi* + εi, with εi independent of (xi*, ui). Let λ = σε²/Var(xi*) be
the noise-to-signal ratio. Then the slope of the regression of yi on xi is
Cov(xi, yi)/Var(xi) = Cov(xi*, yi)/(Var(xi*) + σε²) = β/(1 + λ). (5)
Thus, OLS estimates will be biased for β with a bias that depends on the noise-to-signal
ratio λ.
For example, if λ = 1 the regression coefficient will be half the size of the effect of
interest.
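The λ = 1 case can be simulated directly (the DGP and all numbers are hypothetical):

```python
import numpy as np

# Var(eps) = Var(x*) = 1, so lambda = 1 and the OLS slope is beta/2.
rng = np.random.default_rng(3)
n = 500_000
xstar = rng.normal(size=n)
y = 1.0 + 2.0 * xstar + rng.normal(size=n)   # true beta = 2
x = xstar + rng.normal(size=n)               # noise-to-signal ratio lambda = 1

C = np.cov(x, y)
slope = C[0, 1] / C[0, 0]
print(slope)                                 # ~1.0 = beta/(1 + lambda)
```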
Identification using λ
Writing the model in terms of observables,
yi = xi'β + (ui − εi'β)
xi = xi* + εi, E(εiεi') = Ω,
a vector-valued generalization of (5) takes the form
β = [E(xixi') − Ω]^(-1) E(xiyi),
so that β is identified as long as the measurement-error covariance matrix Ω is known.
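A simulated sketch of this correction (the DGP, the choice of Ω, and all numbers are hypothetical):

```python
import numpy as np

# Two noisy regressors with known measurement-error covariance Omega.
rng = np.random.default_rng(4)
n, k = 500_000, 2
beta = np.array([1.0, -0.5])
xstar = rng.normal(size=(n, k))
Omega = np.diag([0.5, 0.25])                      # assumed known
eps = rng.normal(size=(n, k)) @ np.sqrt(Omega)    # errors with covariance Omega
x = xstar + eps
y = xstar @ beta + rng.normal(size=n)

Exx = x.T @ x / n                                 # estimates E(x x') = E(x* x*') + Omega
Exy = x.T @ y / n
b_naive = np.linalg.solve(Exx, Exy)               # attenuated toward zero
b_corrected = np.linalg.solve(Exx - Omega, Exy)   # ~ (1.0, -0.5)
print(b_naive, b_corrected)
```

Subtracting Ω from the second-moment matrix undoes the attenuation.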
Instrumental-variable model
Identification
The set-up is as follows. We observe {yi, xi, zi}, i = 1, ..., n, with dim(xi) = k and
dim(zi) = r such that
yi = xi'β + ui, E(ziui) = 0.
Typically there will be overlap between variables contained in xi and zi , for example a
constant term (“control” variables).
Variables in xi that are absent from zi are endogenous explanatory variables.
Variables in zi that are absent from xi are external instruments.
The assumption E(ziui) = 0 implies that β solves the system of r equations
E[zi(yi − xi'β)] = 0
or
E(zixi')β = E(ziyi). (6)
If r < k, system (6) has a multiplicity of solutions, so that β is not point identified.
If r ≥ k and rank E(zixi') = k, then β is identified.
In estimation we will distinguish between the just-identified case (r = k) and the
over-identified case (r > k).
If r = k and the rank condition holds we have
β = [E(zixi')]^(-1) E(ziyi). (7)
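A just-identified simulated example of (7) in sample form (the DGP and all numbers are hypothetical):

```python
import numpy as np

# r = k = 2: a constant plus one external instrument z for an endogenous x.
rng = np.random.default_rng(5)
n = 200_000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + 0.5 * u + rng.normal(size=n)   # endogenous: Cov(x, u) != 0
y = 1.0 + 2.0 * x + u

Z = np.column_stack([np.ones(n), z])         # zi = (1, z_oi)'
X = np.column_stack([np.ones(n), x])         # xi = (1, x_oi)'
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)     # sample analogue of (7)
b_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(b_iv, b_ols)                           # IV ~ (1, 2); OLS slope biased upward
```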
Identification (continued)
In the simple case where xi = (1, xoi)', zi = (1, zoi)' and β = (β1, β2)' we get
β2 = Cov(zoi, yi)/Cov(zoi, xoi)
and
β1 = E(yi) − β2E(xoi).
In general, the OLS parameters will differ from the parameters in the IV model.
A stronger assumption than E(ziui) = 0 is mean independence of the error from the
instruments: E(ui | zi) = 0.
Examples
Demand equation
In this example the units are markets across space or over time, yi is quantity, the
endogenous explanatory variable is price and the external instrument is a supply
shifter, such as weather variation in the case of an agricultural product.
Training program
Here the units are workers, the endogenous explanatory variable is an indicator of
participation in a training program and yi is some subsequent labor market outcome,
such as wages or employment status.
In this example we would expect the coefficient in the instrumental-variable line to be
positive, whereas the coefficient in the OLS line could be negative.
Examples (continued)
Measurement error
Consider the measurement error regression model:
yi = β1 + β2xi* + vi
where we observe two measurements of xi* with independent errors:
x1i = xi* + ε1i
x2i = xi* + ε2i.
All the unobservables xi*, vi, ε1i, ε2i are mutually independent.
In this example we could have xi = (1, x1i)', zi = (1, x2i)' and ui = vi − β2ε1i; or
alternatively xi = (1, x2i)', zi = (1, x1i)' and ui = vi − β2ε2i.
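A simulation of the two-measurement design (the DGP and all numbers are hypothetical), using the second measurement as an instrument for the first:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300_000
xstar = rng.normal(size=n)
y = 1.0 + 2.0 * xstar + rng.normal(size=n)   # beta1 = 1, beta2 = 2
x1 = xstar + rng.normal(size=n)              # first noisy measurement
x2 = xstar + rng.normal(size=n)              # second, with independent error

X = np.column_stack([np.ones(n), x1])
Z = np.column_stack([np.ones(n), x2])        # x2 instruments x1
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
b_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(b_iv, b_ols)                           # IV ~ (1, 2); OLS slope attenuated (~1)
```

The second measurement is correlated with x1i through xi* but independent of the error ui, which is exactly what the instrument needs.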
Estimation
Simple IV estimator
The sample counterpart of (7) is the simple IV estimator
β̂IV = (∑_{i=1}^n zixi')^(-1) ∑_{i=1}^n ziyi.
Also, letting H = E(zixi'),
√n (β̂IV − β) →d N(0, H^(-1)WH'^(-1))
if n^(-1/2) ∑_{i=1}^n ziui →d N(0, W).
When {yi, xi, zi}, i = 1, ..., n, is a random sample, W = E(ui²zizi').
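A sketch of the sandwich variance H^(-1)WH'^(-1) computed from a simulated sample (the DGP and all numbers are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + 0.5 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

Z = np.column_stack([np.ones(n), z])
X = np.column_stack([np.ones(n), x])
b = np.linalg.solve(Z.T @ X, Z.T @ y)           # simple IV estimate
uhat = y - X @ b
H = Z.T @ X / n                                 # sample analogue of E(z x')
W = (Z * uhat[:, None] ** 2).T @ Z / n          # sample analogue of E(u^2 z z')
Hinv = np.linalg.inv(H)
V = Hinv @ W @ Hinv.T                           # H^-1 W H'^-1
se = np.sqrt(np.diag(V) / n)                    # standard errors of b
print(b, se)
```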
Overidentified IV
Asymptotic normality
In the over-identified case (r > k), an IV estimator can be based on a k×r matrix G
that combines the r moments into k:
β̂G = (G ∑_{i=1}^n zixi')^(-1) G ∑_{i=1}^n ziyi.
Under standard regularity conditions,
√n (β̂G − β) →d N(0, VG)
with
VG = [GE(zixi')]^(-1) GE(ui²zizi')G' [E(xizi')G']^(-1). (11)
Optimality
For G = E(xizi')[E(ui²zizi')]^(-1) the matrix VG equals
V0 = {E(xizi')[E(ui²zizi')]^(-1)E(zixi')}^(-1).
Two-stage least squares
Letting Π̂ = (∑_{i=1}^n xizi')(∑_{i=1}^n zizi')^(-1) be the sample counterpart of Π, the two-stage
least squares estimator is
β̂2SLS = (∑_{i=1}^n Π̂zixi')^(-1) ∑_{i=1}^n Π̂ziyi (13)
or in short
β̂2SLS = (∑_{i=1}^n x̂ixi')^(-1) ∑_{i=1}^n x̂iyi (14)
where x̂i = Π̂zi is the vector of fitted values in the (“first-stage”) regressions of the
xi variables on zi:
xi = Πzi + vi. (15)
If a variable in xi is also contained in zi, its fitted value will coincide with the variable
itself and the corresponding element of vi will be equal to zero.
Sometimes it is convenient to use matrix notation as follows:
Π̂ = (X'Z)(Z'Z)^(-1)
so that
β̂2SLS = [X'Z(Z'Z)^(-1)Z'X]^(-1) X'Z(Z'Z)^(-1)Z'y
and letting X̂ = Z(Z'Z)^(-1)(Z'X) also
β̂2SLS = (X̂'X)^(-1) X̂'y.
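A simulated over-identified example checking that formula (14) and the OLS-on-fitted-values form give the same answer (the DGP and all numbers are hypothetical):

```python
import numpy as np

# r = 3 > k = 2: a constant plus two external instruments.
rng = np.random.default_rng(8)
n = 100_000
zo = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x = zo @ np.array([0.7, 0.4]) + 0.5 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), zo])
Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)        # first-stage fitted values
b_14 = np.linalg.solve(Xhat.T @ X, Xhat.T @ y)      # formula (14)
b_fit = np.linalg.solve(Xhat.T @ Xhat, Xhat.T @ y)  # OLS of y on X-hat
print(b_14, b_fit)                                  # identical up to rounding
```

The two coincide because X̂'X̂ = X̂'X when X̂ is a linear projection of X on Z.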
Two-stage least squares (continued)
β̂2SLS is also the OLS regression of y on X̂:
β̂2SLS = (X̂'X̂)^(-1) X̂'y.
This interpretation of the 2SLS estimator is the one that originated its name.
Consistency of β̂2SLS relies on n → ∞ for fixed r.
Robust standard errors
Although its optimality requires homoskedasticity, 2SLS (like OLS) remains a popular
estimator under more general conditions.
Robust standard errors (continued)
where σ̂² = n^(-1) ∑_{i=1}^n ûi².
Note that if the residual variance is calculated from the fitted-value residuals
y − X̂β̂2SLS instead of û = y − Xβ̂2SLS, we would get an inconsistent estimate of σ² and
therefore also of VΠ in (17).
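A simulated illustration of this caveat (the DGP and all numbers are hypothetical): the structural residuals y − Xβ̂ estimate σ² consistently, the fitted-value residuals do not.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 200_000
zo = rng.normal(size=(n, 2))
u = rng.normal(size=n)                             # sigma^2 = 1
x = zo @ np.array([0.7, 0.4]) + 0.5 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), zo])
Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
b = np.linalg.solve(Xhat.T @ X, Xhat.T @ y)        # 2SLS
s2_ok = np.mean((y - X @ b) ** 2)       # ~1.0: consistent for sigma^2
s2_bad = np.mean((y - Xhat @ b) ** 2)   # much larger: picks up first-stage noise
print(s2_ok, s2_bad)
```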
Testing overidentifying restrictions
In the over-identified case an IV estimator only sets to zero k linear combinations of
the r moment conditions:
GE(zixi')β = GE(ziyi).
Thus, there remain r − k linearly independent combinations that are not set to zero
in estimation but should be close to zero under correct specification.
Testing overidentifying restrictions (continued)
SR = ũ'ZW̃^(-1)Z'ũ →d χ²(r−k) (20)
where W̃ = ∑_{i=1}^n ûi²zizi' and ũ = y − Xβ̂G†n with G†n = (X'Z)W̃^(-1).
n
Contrary to β̂2SLS, the IV estimator β̂G†n given by
β̂G†n = [X'ZW̃^(-1)Z'X]^(-1) X'ZW̃^(-1)Z'y (21)
attains the asymptotic variance bound V0 under heteroskedasticity.
This improved IV estimator was studied by Halbert White in 1982 under the name
two-stage instrumental variables (2SIV) estimator.
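A simulated computation of the statistic (20) with valid instruments (the DGP and all numbers are hypothetical); with r = 3 and k = 2 there is one overidentifying restriction, so SR should behave like a χ²(1) draw (95% critical value 3.84):

```python
import numpy as np

rng = np.random.default_rng(10)
n = 100_000
zo = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x = zo @ np.array([0.7, 0.4]) + 0.5 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u                              # both instruments valid

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), zo])              # r = 3, k = 2
Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
b1 = np.linalg.solve(Xhat.T @ X, Xhat.T @ y)       # first step: 2SLS residuals
uhat = y - X @ b1
W = (Z * uhat[:, None] ** 2).T @ Z                 # W-tilde = sum u_i^2 z_i z_i'
XZ = X.T @ Z
bG = np.linalg.solve(XZ @ np.linalg.solve(W, XZ.T),
                     XZ @ np.linalg.solve(W, Z.T @ y))   # estimator (21)
m = Z.T @ (y - X @ bG)                             # remaining moments Z'u-tilde
S_R = m @ np.linalg.solve(W, m)                    # statistic (20), ~ chi2(1)
print(S_R)
```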