Culture, Institutions, and Development

Econometric Methods
A Quick Review

Francesco Cinnirella

Francesco Cinnirella Culture, Institutions, and Development Academic year 2022/23 1 / 49


Literature

Stock, James H. and Mark W. Watson (2011), Introduction to Econometrics, 3rd Edition, Pearson (Chapter 13)

Angrist, J. and J.S. Pischke (2009), Mostly Harmless Econometrics: An Empiricist’s Companion, Princeton University Press

Causality vs. correlation

Correlation does not imply causation!

By exploiting “natural experiments” and modern econometric techniques (e.g. IV, Diff-in-Diff), the literature considered in this course attempts to establish causal effects of institutions, geography, and culture on economic development.

For example, in the case of institutions: natural experiments are “unusual historical events where, while other fundamental causes of economic growth are held constant, institutions change because of potentially-exogenous reasons” (Acemoglu, 2009).

Tractability of variables: a lot of effort has been put into collecting and compiling new variables based on original or secondary historical sources.

Importance of explaining the mechanisms: the new research paradigm is not really whether history matters, but rather why and how it matters.

Instrumental variables


The OLS model

The population multiple regression model:

Yi = β0 + β1 X1i + β2 X2i + . . . + βk Xki + ui , i = 1, . . . , n (1)

The Least Squares Assumptions:


1 The conditional distribution of ui given X1i , X2i , . . . , Xki has a mean of zero.
E(ui |X1i , X2i , . . . , Xki ) = 0
2 (Yi , X1i , X2i , . . . , Xki ), i = 1, 2, . . . , n are i.i.d. random variables.
3 Large outliers are unlikely → X1i , X2i , . . . , Xki and Yi have nonzero finite fourth
moments.
4 There is no perfect multicollinearity.


Introduction

There are three important threats to internal validity:

Omitted variable bias: the omitted variable is correlated with X but unobservable, and therefore cannot be included in the regression;

Simultaneity bias (X affects Y, Y affects X);

Bias due to measurement error (“errors-in-variables”: X is wrongly measured).

A regression with instrumental variables Z can eliminate the bias caused by E(u|X) ≠ 0.


The instrumental variable (IV) estimator is very often used in applied research.

Advantage: with the IV estimator and a valid instrument, we can obtain consistent estimates whatever the source of correlation between the error term and the explanatory variables.
Disadvantage: an appropriate instrument needs to be found.

Intuition

The OLS estimator exploits the total variation of the dependent and independent
variables.

The variation of one endogenous explanatory variable can be decomposed into two parts:
one part which is correlated with the error term (endogenous variation);
one part which is uncorrelated with the error term (exogenous variation).

The IV estimator is based on this decomposition.

Only the exogenous variation in the explanatory variables is exploited to estimate the parameters of interest.

Validity

Let’s focus on the bivariate regression case.

We want to estimate the following model:

Y = β0 + β1 X + u (2)

Unfortunately X and u happen to be correlated.

Then we search for an instrument (Z).


For a variable Z to be an appropriate instrument for X, it must fulfill two assumptions:

IV1 - Relevance: Cov(X, Z) ≠ 0
IV2 - Exogeneity: Cov(Z, u) = 0

The latter property is also called the exclusion restriction.
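
As a short worked step (a sketch that uses only model (2) and the two assumptions above), taking the covariance of both sides of (2) with Z shows why IV1 and IV2 together pin down β1:

$$Cov(Z, Y) = Cov(Z, \beta_0 + \beta_1 X + u) = \beta_1\, Cov(Z, X) + Cov(Z, u)$$

Under IV2 the last term is zero, and under IV1 we may divide by Cov(Z, X):

$$\beta_1 = \frac{Cov(Z, Y)}{Cov(Z, X)}$$

Replacing the population covariances with sample covariances gives the IV estimator for the bivariate case. If Cov(Z, X) is close to zero, the ratio becomes unstable, which is the weak-instrument problem.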


Two-Stage Least Squares

If the instrument Z satisfies the conditions of instrument relevance and exogeneity, then the coefficient β1 can be estimated using an IV estimator called Two Stage Least Squares (2SLS or TSLS).

2SLS is calculated in two stages:

1 The first stage decomposes X into two components: a problematic component that
may be correlated with the regression error, and another problem-free component
that is uncorrelated with the error.

2 The second stage uses the problem-free component to estimate β1 .


The first stage begins with a population regression linking X and Z (reduced form
equation):

Xi = π0 + π1 Zi + vi (3)

where π0 is the intercept, π1 is the slope, and vi is the error term.

The idea behind 2SLS is to use the problem-free component of Xi, π0 + π1 Zi, and to disregard vi.

The first stage of 2SLS uses the predicted value from the OLS regression,
X̂i = π̂0 + π̂1 Zi .

The second stage of 2SLS is easy: regress Yi on X̂i using OLS. The resulting
estimators from the second stage regression are the 2SLS estimators.

Yi = β0 + β1 X̂i + ui (4)
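
The two stages can be sketched on simulated data. This is a minimal illustration, not part of the original slides: the data-generating process, the true slope of 0.5, the variable names, and the helper `ols` are all assumptions made for the example. An unobserved `ability` term enters both X and u, so OLS is biased, while Z affects Y only through X.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on X (X must already contain a constant column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical data-generating process (assumption for illustration only).
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                    # instrument: relevant and exogenous
ability = rng.normal(size=n)              # unobserved confounder
x = 1.0 + 0.8 * z + ability + rng.normal(size=n)
u = ability + rng.normal(size=n)          # error term, correlated with x
y = 2.0 + 0.5 * x + u                     # true beta1 = 0.5

const = np.ones(n)

# Plain OLS of y on x: inconsistent because Cov(x, u) != 0.
beta_ols = ols(y, np.column_stack([const, x]))

# First stage: regress x on z and keep the fitted values x_hat.
pi_hat = ols(x, np.column_stack([const, z]))
x_hat = pi_hat[0] + pi_hat[1] * z

# Second stage: regress y on x_hat.
beta_2sls = ols(y, np.column_stack([const, x_hat]))

print(f"OLS slope:  {beta_ols[1]:.3f}")
print(f"2SLS slope: {beta_2sls[1]:.3f}")
```

Running the second stage by hand reproduces the 2SLS point estimates, but the standard errors reported by an ordinary second-stage OLS routine would not be the correct 2SLS standard errors, so in practice a dedicated IV routine is preferable.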

Example

Suppose we have the following (structural) wage equation:

ln(wage) = β0 + β1 educ + β2 exp + β3 exp² + u (5)

where
ln(wage): natural logarithm of wage,
educ: years of education,
exp: years of working experience.

It is plausible to assume that u is uncorrelated both with exp and with exp².

educ is correlated with u, since u contains unobserved ability.


We also believe (for the moment, at least) that mother’s and father’s education (meduc and feduc) are uncorrelated with u.

Therefore we can use them as instruments for educ.

Then the reduced form for educ is:

educ = γ0 + γ1 exp + γ2 exp² + γ3 meduc + γ4 feduc + v (6)

Remember: IV1 implies that γ3 ≠ 0 or γ4 ≠ 0 (or both).


First, we estimate (5) by OLS (we also want to compare OLS and 2SLS results):

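$$\widehat{\ln(\text{wage})} = \underset{(0.199)}{-0.522} + \underset{(0.014)}{0.107}\,\text{educ} + \underset{(0.013)}{0.042}\,\text{exp} - \underset{(0.0004)}{0.001}\,\text{exp}^2 \tag{7}$$

Now we estimate (5) by 2SLS:

$$\widehat{\ln(\text{wage})} = \underset{(0.400)}{0.048} + \underset{(0.031)}{0.061}\,\text{educ} + \underset{(0.013)}{0.044}\,\text{exp} - \underset{(0.0004)}{0.001}\,\text{exp}^2 \tag{8}$$

(Standard errors in parentheses.)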


Therefore:

$$\hat{\beta}_1^{OLS} = \underset{(0.014)}{0.107}, \quad \text{versus} \quad \hat{\beta}_1^{2SLS} = \underset{(0.031)}{0.061}$$

Note: The IV standard error is relatively large.
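
To make the note concrete, here is a small worked calculation using only the estimates and standard errors reported above; the 1.96 critical value assumes the usual large-sample normal approximation.

$$t^{OLS} = \frac{0.107}{0.014} \approx 7.64, \qquad t^{2SLS} = \frac{0.061}{0.031} \approx 1.97$$

$$\text{95\% CI (OLS)}: \; 0.107 \pm 1.96 \times 0.014 \approx [0.080,\ 0.134]$$

$$\text{95\% CI (2SLS)}: \; 0.061 \pm 1.96 \times 0.031 \approx [0.000,\ 0.122]$$

The 2SLS standard error is roughly 2.2 times the OLS one, so the 2SLS interval is much wider and its lower bound is close to zero.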


Testing the IV Assumptions


Relevance of Instruments (IV1)

The correlation between the instrument(s) and the endogenous variable can be tested
statistically.

In the bivariate model (2) with one instrument Z, we estimate:

X = γ0 + γ1 Z + v (9)

and then test the hypothesis that γ1 = 0 (t-test).

If the estimate of γ1 is statistically significant, assumption IV1 is supported by the data.

If, together with X, the model contains further exogenous explanatory variables, they all have to be included in (9) as well.


If we have two instrumental variables (Z1 and Z2 ) for X, instead of (9), we estimate
the regression
X = γ0 + γ1 Z1 + γ2 Z2 + v (10)
and then test the hypothesis γ1 = γ2 = 0 with an F -test.

Here too, we need to include the additional exogenous explanatory variables.
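
The joint test can be sketched by comparing the first-stage regression with and without the instruments. The example below is an illustration on simulated data: the coefficients, the single exogenous control `w`, and the helper `ssr` are assumptions, and the statistic is the homoskedasticity-only F-statistic.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared residuals from an OLS regression of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return float(resid @ resid)

# Hypothetical first stage: x depends on two instruments and one control.
rng = np.random.default_rng(1)
n = 2_000
w = rng.normal(size=n)       # additional exogenous explanatory variable
z1 = rng.normal(size=n)      # instrument 1
z2 = rng.normal(size=n)      # instrument 2
x = 0.5 * z1 + 0.3 * z2 + 0.4 * w + rng.normal(size=n)

const = np.ones(n)
X_unrestricted = np.column_stack([const, w, z1, z2])   # with instruments
X_restricted = np.column_stack([const, w])             # gamma1 = gamma2 = 0 imposed

q = 2                                    # number of restrictions tested
df = n - X_unrestricted.shape[1]         # residual degrees of freedom
ssr_u = ssr(x, X_unrestricted)
ssr_r = ssr(x, X_restricted)

# Homoskedasticity-only F-statistic for H0: gamma1 = gamma2 = 0.
f_stat = ((ssr_r - ssr_u) / q) / (ssr_u / df)
print(f"First-stage F-statistic: {f_stat:.1f}")
```

The control `w` appears in both the restricted and the unrestricted regression, so the statistic measures only what the instruments add.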


Exogeneity of Instruments (IV2)

In a model with one endogenous explanatory variable and one instrument, assumption
IV2 cannot be statistically tested.

The reason is that the test would involve the correlation between the instrumental variable and an unobservable error.

IV2 is therefore a non-testable assumption.


The plausibility of an assumption which cannot be tested must be assessed in each individual case using common sense (or economic theory).

Whether an IV estimation is convincing depends on whether it is plausible that the instrument is not correlated with the error term.

If several instruments are available for one single endogenous explanatory variable, we can test the overidentifying restrictions (but even in this case a non-testable core of assumption IV2 remains).


Test for Overidentifying Restrictions

If the number of instruments is larger than the number of endogenous explanatory variables, we can use a standard test to determine whether the instruments are exogenous.

To test “H0: all instruments are exogenous”, we proceed as follows:

Step 1 Estimate the model by 2SLS and obtain the 2SLS residuals, û;
Step 2 obtain the R² from the regression of û on all exogenous variables (instruments and exogenous explanatory variables);
Step 3 under H0, nR² ∼ χ²_q asymptotically, where q is the number of overidentifying restrictions (number of instruments minus number of endogenous explanatory variables).

Then we reject H0 if nR² is larger than the critical value of the χ²_q distribution at the significance level of our choice.
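
The three steps can be sketched as follows. The simulated data, the helper names `ols`, `two_sls` and `overid_stat`, and the way exogeneity is deliberately violated are assumptions for illustration. With two instruments and one endogenous regressor, q = 1, and the 5% critical value of the χ² distribution with one degree of freedom is 3.841.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on X (X must already contain a constant column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_sls(y, x, Z):
    """2SLS intercept and slope for one endogenous x and instrument matrix Z."""
    first = np.column_stack([np.ones(len(y)), Z])
    x_hat = first @ ols(x, first)
    return ols(y, np.column_stack([np.ones(len(y)), x_hat]))

def overid_stat(y, x, Z):
    """n * R^2 from regressing the 2SLS residuals on all exogenous variables."""
    b = two_sls(y, x, Z)
    u_hat = y - b[0] - b[1] * x          # Step 1: residuals use the actual x
    exog = np.column_stack([np.ones(len(y)), Z])
    fitted = exog @ ols(u_hat, exog)     # Step 2: regress u_hat on instruments
    r2 = 1.0 - np.sum((u_hat - fitted) ** 2) / np.sum((u_hat - u_hat.mean()) ** 2)
    return len(y) * r2                   # Step 3: compare with chi2(q)

rng = np.random.default_rng(2)
n = 5_000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
ability = rng.normal(size=n)
x = z1 + z2 + ability + rng.normal(size=n)
e = rng.normal(size=n)
Z = np.column_stack([z1, z2])

y_valid = 1.0 + 0.5 * x + ability + e                # both instruments exogenous
y_invalid = 1.0 + 0.5 * x + ability + 0.5 * z2 + e   # z2 enters the error term

CRIT_5PCT_1DF = 3.841
j_valid = overid_stat(y_valid, x, Z)
j_invalid = overid_stat(y_invalid, x, Z)
print(f"nR^2, valid instruments: {j_valid:.2f}")
print(f"nR^2, z2 invalid:        {j_invalid:.2f} (critical value {CRIT_5PCT_1DF})")
```

Note that the residuals are computed with the actual X, not the first-stage fitted values. The test detects that the two instruments imply different values of β1; it cannot detect the case where all instruments are invalid in the same way, which is the non-testable core mentioned earlier.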


Violation of the IV Assumptions

Assumption IV1 states that instruments and endogenous variable must be correlated.
A correlation equal to 0 is improbable in practice.

But a serious problem can arise if the correlation between the instrument and the
endogenous variable is small.

In this case the IV or 2SLS estimators cannot provide reliable results. In the literature,
this is called the problem of weak instruments.


If the relationship between the instrument and the explanatory variable is weak, but assumption IV2 is satisfied, then:
the bias of the IV estimator can be very large, even in large samples;
the IV estimator is not normally distributed, even in large samples.

In the case of one endogenous explanatory variable and several instruments, the following holds as a “rule of thumb”: in the first-stage regression, if the F-statistic for the joint significance of the instrument(s) is larger than 10, the instruments are not considered weak.

If assumption IV2 is violated, the IV estimator is inconsistent.


How Can (Good) Instruments Be Found?

In principle, there are two possibilities to find an instrument.

One possibility is to use economic theory. A theoretical model can suggest which variables are correlated with the endogenous explanatory variable, but uncorrelated with the error term.

The other possibility is to use experiments and, in particular, natural experiments as a source of instruments.

At present, this last option is very popular.

Experiments and Quasi-Experiments

Natural experiments

Experiments and “natural” experiments are very popular in empirical research.

We will first deal with the concept of a true (“ideal”) randomised experiment.

The causal effect of a variable (e.g., a policy intervention) on other variables can be
reliably estimated using a true randomised experiment.


Experiments
Terminology: experiments and quasi-experiments

Definition An experiment is designed and implemented consciously by human researchers. An experiment entails conscious use of a treatment and control group with random assignment (e.g. clinical trials of a drug).

Definition A quasi-experiment or natural experiment has a source of randomization that is “as if” randomly assigned, but this variation was not part of a conscious randomized treatment and control design.

Definition Program evaluation is the field of statistics/economics aimed at evaluating the effect of a program or policy, for example, an ad campaign to cut smoking.


True Randomised Experiments

An ideal randomized controlled experiment randomly assigns subjects to treatment and control groups.

A true randomised experiment is the best way to identify causal effects.

For example, we want to estimate the effect of a training programme for the unemployed.

For this purpose, we randomly choose a group of unemployed individuals for the programme and compare their outcomes after the programme with those of a group of unemployed individuals who did not take part in the programme.


Treatment group: group of individuals chosen to participate in the programme.

Control group: group of individuals not chosen to participate in the programme.

After the experiment we can evaluate the effect of the programme simply by comparing the treatment and control groups.


Estimate with Experimental Data

How do we best estimate a causal effect with data from an experiment?

In principle, there are three ways:

(the differences estimator)

the differences-in-differences estimator

the IV estimator


The Differences-in-Differences Estimator

Often panel data contain information on experiments.

For example, we observe the outcome of a group of people both before and after an experiment.

The (simple) differences estimator compares average outcomes of the treatment and control groups after the experiment.

The differences-in-differences estimator (DiD or Diff-in-diff) compares instead the change in outcome in the treatment group with the change in outcome in the control group.

The differences-in-differences estimator is defined as:

β̂ DiD = [Ȳ treat,after − Ȳ treat,before ] − [Ȳ control,after − Ȳ control,before ] (11)


The sample is broken down into four groups: the control group before and after the
treatment, and the treatment group before and after the treatment.

We use the following notation:

Ȳ control,before : average outcome of the control group before the experiment

Ȳ control,after : average outcome of the control group after the experiment

Ȳ treat,before : average outcome of the treatment group before the experiment

Ȳ treat,after : average outcome of the treatment group after the experiment
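
With these four averages, the estimator in (11) is simple arithmetic. The numbers below are made up purely for illustration, and `did_from_means` is a hypothetical helper name.

```python
def did_from_means(treat_before, treat_after, control_before, control_after):
    """Differences-in-differences estimate from the four group averages, as in (11)."""
    change_treat = treat_after - treat_before
    change_control = control_after - control_before
    return change_treat - change_control

# Hypothetical average outcomes (illustrative numbers only).
did = did_from_means(treat_before=10.0, treat_after=16.0,
                     control_before=9.0, control_after=12.0)

# Simple differences estimator: compares the groups after the experiment only.
simple_diff = 16.0 - 12.0

print(did)          # (16 - 10) - (12 - 9)
print(simple_diff)
```

In this made-up example the treatment group already had a higher average before the experiment, so the simple differences estimator (4.0) attributes part of that pre-existing gap to the treatment, while the DiD estimate (3.0) nets it out.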


The differences-in-differences estimator can be obtained from the following regression:

Y = β0 + β1 after + β2 treat + β3 (treat × after) + u (12)

where:
treat is a dummy variable taking value 1 if the person belongs to the treatment group (0 otherwise).
after is a dummy variable taking value 1 if the observation takes place after the treatment (0 otherwise).


We can show that in model (12) the following holds:

β̂3 = β̂ DiD (13)

β̂ DiD is also called the average treatment effect (ATE), because it measures the effect of the “treatment” on the average outcome Y.

Identifying assumption: the treatment units would have followed the same trend as the control units in the absence of treatment.
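
The equality in (13) can be checked numerically. Model (12) has one parameter per group-period cell (β0 for control/before, β0 + β1 for control/after, β0 + β2 for treatment/before, β0 + β1 + β2 + β3 for treatment/after), so OLS fits the four cell means exactly and β̂3 equals the difference of the two changes. The sketch below uses simulated data; the coefficient values and the helper `cell_mean` are assumptions for illustration.

```python
import numpy as np

# Hypothetical repeated cross-section: each observation has a group and a period.
rng = np.random.default_rng(3)
n = 4_000
treat = rng.integers(0, 2, size=n)   # 1 = treatment group
after = rng.integers(0, 2, size=n)   # 1 = observed after the treatment
y = 1.0 + 0.5 * after + 0.8 * treat + 2.0 * treat * after + rng.normal(size=n)

# Regression (12): constant, after, treat, treat x after.
X = np.column_stack([np.ones(n), after, treat, treat * after])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

def cell_mean(t, a):
    """Average outcome for group t (treat) in period a (after)."""
    return y[(treat == t) & (after == a)].mean()

# Equation (11): difference of the two before/after changes.
did_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

print(f"beta3 from regression (12): {beta[3]:.6f}")
print(f"DiD from group means (11):  {did_means:.6f}")
```

The two numbers agree up to floating-point error. The regression form is convenient because additional regressors can be added to it.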


Figure 13.1 The Differences-in-Differences Estimator, Stock and Watson, p. 482.


The IV Estimator

If an experiment was based on a randomisation which, however, was not fully implemented, we can use an IV estimator.

In an experiment with partial compliance, the assigned treatment level can serve as an instrumental variable for the actual treatment level.

Relevance: as long as the protocol is partially followed, the actual treatment level is partially determined by the assigned treatment level (instrument is relevant).

Exogeneity: if the assigned treatment is determined randomly (random assignment), then the assigned treatment level is exogenous.

Thus in an experiment with partial compliance and randomly assigned treatment, the original random assignment is a valid instrumental variable.
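
A simulated example may help. With a binary instrument, the IV estimator reduces to the ratio of two differences in means (often called the Wald estimator): the effect of assignment on the outcome divided by the effect of assignment on actual treatment. Everything in the sketch below is an assumption for illustration: the compliance rule (only assigned individuals with above-average unobserved ability take up the treatment), the constant treatment effect of 2.0, and the variable names.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
assigned = rng.integers(0, 2, size=n)   # random assignment (the instrument)
ability = rng.normal(size=n)            # unobserved; drives both take-up and outcome

# Partial compliance: treatment is received only if assigned AND ability > 0.
treated = assigned * (ability > 0)

# Outcome with a constant treatment effect of 2.0.
y = 1.0 + 2.0 * treated + ability + rng.normal(size=n)

# Naive comparison by actual treatment status: biased by self-selection.
naive = y[treated == 1].mean() - y[treated == 0].mean()

# IV (Wald) estimator: effect of assignment on y over effect of assignment on take-up.
effect_on_y = y[assigned == 1].mean() - y[assigned == 0].mean()
effect_on_take_up = treated[assigned == 1].mean() - treated[assigned == 0].mean()
wald = effect_on_y / effect_on_take_up

print(f"Naive difference by actual treatment: {naive:.3f}")
print(f"IV (Wald) estimate:                   {wald:.3f}")
```

The naive comparison overstates the effect because those who take up the treatment have higher ability on average, whereas the assignment-based ratio recovers a value close to 2.0 in this constant-effect setting.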

Natural Experiments

Natural Experiments

For many research issues, it is too expensive or even impossible to conduct a controlled randomised experiment.

In this case, a possible alternative is represented by “quasi-” or “natural” experiments.

Natural experiments occur when some exogenous event (usually a policy change) changes the environment where subjects operate.

The logic is to try to imitate a controlled randomised experiment via a random variation in one variable.

How well this works depends on the details of each application.


The econometric methods used are similar to those outlined for true experiments.

The DiD estimator is usually preferred: quasi-experiments do not usually have true randomization. Therefore it is important to include observable pre-treatment characteristics.

In order to control for systematic differences between the control and treatment groups, we need at least two years of data.

Often panel data are not available; in that case we need to use repeated cross-sections.


Conclusion

If a randomized controlled experiment is available or can be performed, it can provide compelling evidence on the causal effect under study.

The insights of experimental methods can be applied to quasi-experiments, in which special circumstances make it seem “as if” randomization occurred.

In quasi-experiments, the causal effect can be estimated using a DiD estimator, possibly augmented with additional regressors.

When the “as if” randomization only partly influences the treatment, then instrumental variables regression can be used instead.

Regression Discontinuity Design

Regression Discontinuity Design (RDD)

Use a discontinuity in the level of treatment related to some observable to get a consistent estimate, by comparing those just eligible for the treatment to those just ineligible.

Example: Attendance or non-attendance in a program or intervention is determined by whether a subject falls above or below a certain cut-off value of a specified assignment variable.

The comparison of units that are in a sufficiently small neighborhood below and above the threshold therefore comes close to an experimental setting with random assignment to treatment and control groups.

Any jump or discontinuity in outcomes that can be observed at the threshold can then be interpreted as the causal effect of the intervention.

In addition, the fact that the assignment to the treatment and control groups follows a non-linear pattern (the discontinuity at exactly the cut-off value) allows us to control for any smooth function of the variable determining eligibility.

Assumption for causal effect: there are no other discontinuities around the cut-off.


Sharp regression discontinuity

Sharp RD is used when treatment status is a deterministic and discontinuous function of an assignment variable, xi.

Suppose for example that

$$D_i = \begin{cases} 1, & \text{if } x_i \ge x_0 \\ 0, & \text{if } x_i < x_0 \end{cases} \tag{14}$$

where x0 is a known threshold or cutoff.

This assignment mechanism is a deterministic function of xi because once we know xi we know Di.

It’s a discontinuous function because no matter how close xi gets to x0, treatment is unchanged until xi = x0.


The next figure illustrates a hypothetical RD scenario where those with xi ≥ 0.5
are treated.

In Panel A, the trend relationship between Yi and Xi is linear

In Panel B, it’s nonlinear

In both cases, there is a discontinuity in the relation between E[Y0i |xi ] and xi
around the point x0 .

Figure: Sharp RD, Outcome plotted against X. Panel A: Linear E[Y0i|Xi]. Panel B: Nonlinear E[Y0i|Xi]. Panel C: Nonlinearity mistaken for discontinuity.

A simple model formalizes the RD idea.

Suppose that in addition to the assignment mechanism above, potential outcomes can
be described by a linear, constant-effects model.

E[Y0i |Xi ] = α + βXi (15)


Y1i = Y0i + ρ

This leads to the regression,

Yi = α + βXi + ρDi + ηi (16)

where ρ is the causal effect of interest.
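
Regression (16) can be sketched on simulated data. The cutoff of 0.5 follows the hypothetical scenario in the figure; the parameter values (α = 0.2, β = 0.8, ρ = 0.4), the noise level, and the variable names are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2_000
x0 = 0.5                                  # known cutoff
x = rng.uniform(0, 1, size=n)             # assignment variable
d = (x >= x0).astype(float)               # sharp assignment rule, as in (14)

# Linear, constant-effects model (15) with a jump of rho = 0.4 at the cutoff.
y = 0.2 + 0.8 * x + 0.4 * d + rng.normal(scale=0.1, size=n)

# Regression (16): Y on a constant, X and D.
X = np.column_stack([np.ones(n), x, d])
alpha_hat, beta_hat, rho_hat = np.linalg.lstsq(X, y, rcond=None)[0]

print(f"Estimated jump at the cutoff (rho): {rho_hat:.3f}")
```

Although D is a deterministic function of X, ρ is still identified because D is a step function of X while the trend is linear, so the two regressors are not perfectly collinear.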


The key difference between this regression and others we’ve used to estimate
treatment effects is that Di , the regressor of interest, is not only correlated with
Xi , it is a deterministic function of Xi .

RD captures causal effects by distinguishing the nonlinear and discontinuous function, 1(xi ≥ x0), from the smooth function xi.

But what if the trend relation, E[Y0i|xi], is nonlinear?

Suppose that E[Y0i|xi] = f(Xi) for some reasonably smooth function, f(Xi) (as in Panel B).


Now we can construct RD estimates by fitting

Yi = f(Xi) + ρDi + ηi (17)

where again, Di = 1(Xi ≥ x0) is discontinuous in Xi at x0.

As long as f(Xi) is continuous in a neighborhood of x0, it should be possible to estimate a model like (17), even with a flexible functional form for f(Xi).

For example, modeling f(Xi) with a pth order polynomial, RD estimates can be constructed from the regression

$$Y_i = \alpha + \beta_1 X_i + \beta_2 X_i^2 + \dots + \beta_p X_i^p + \rho D_i + \eta_i \tag{18}$$

The validity of RD estimates based on (18) turns on whether polynomial models provide an adequate description of E[Y0i|Xi].

If not, then what looks like a jump due to treatment might simply be an unaccounted-for nonlinearity in the counterfactual conditional mean function.
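
The danger of misspecifying f(Xi) can be illustrated with a simulation in which there is no treatment effect at all. In the sketch below the outcome follows a smooth cubic trend with no jump (an assumed data-generating process; `rd_jump` is a hypothetical helper). Fitting (18) with p = 1 produces a spurious discontinuity, while p = 3, which matches the true trend here, does not.

```python
import numpy as np

def rd_jump(y, x, d, order):
    """Estimate rho in regression (18) using a polynomial of the given order in x."""
    powers = [x ** k for k in range(1, order + 1)]
    X = np.column_stack([np.ones(len(y)), *powers, d])
    return np.linalg.lstsq(X, y, rcond=None)[0][-1]   # last coefficient is rho

rng = np.random.default_rng(6)
n = 5_000
x0 = 0.5
x = rng.uniform(0, 1, size=n)
d = (x >= x0).astype(float)

# Smooth nonlinear trend and NO discontinuity at the cutoff (true rho = 0).
y = 0.5 + 8.0 * (x - x0) ** 3 + rng.normal(scale=0.1, size=n)

rho_linear = rd_jump(y, x, d, order=1)   # misspecified: linear trend
rho_cubic = rd_jump(y, x, d, order=3)    # correctly specified in this example

print(f"Estimated jump, linear trend: {rho_linear:.3f}")
print(f"Estimated jump, cubic trend:  {rho_cubic:.3f}")
```

In real applications the true form of f is unknown, so a higher-order polynomial is not guaranteed to remove the problem; the example only shows that the estimated jump can be an artefact of the functional form.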

Panel C shows how a sharp turn in E[Y0i|Xi] might be mistaken for a jump from one regression line to another.

Figure, Panel C: Nonlinearity mistaken for discontinuity (Outcome plotted against X).

To reduce the likelihood of such mistakes, we can look only at data in a neighborhood around the discontinuity, say the interval [x0 − ∆, x0 + ∆] for some small number ∆. Then we have

$$E[Y_i \mid x_0 - \Delta < X_i < x_0] \approx E[Y_{0i} \mid X_i = x_0]$$

$$E[Y_i \mid x_0 \le X_i < x_0 + \Delta] \approx E[Y_{1i} \mid X_i = x_0] \tag{19}$$

so that

$$\lim_{\Delta \to 0} \Big( E[Y_i \mid x_0 \le X_i < x_0 + \Delta] - E[Y_i \mid x_0 - \Delta < X_i < x_0] \Big) = E[Y_{1i} - Y_{0i} \mid X_i = x_0] \tag{20}$$


In other words, comparisons of average outcomes in a small enough neighborhood to the left and right of x0 should provide an estimate of the treatment effect that does not depend on the correct specification of a model for E[Y0i|Xi].

Moreover, the validity of this nonparametric estimation strategy does not turn on the constant effects assumption, Y1i − Y0i = ρ.

The estimand in (20) is the average causal effect, E[Y1i − Y0i|Xi = x0].

The nonparametric approach to RD requires good estimates of the mean of Yi in small neighborhoods to the right and left of x0.

The problem is that working in a small neighborhood of the cutoff means that you
don’t have many data points.
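
The trade-off can be seen in a small simulation. The sketch below compares average outcomes just to the right and just to the left of the cutoff for a narrow and a wide window. The data-generating process (a cubic trend plus a true jump of 0.4), the window widths, and the helper `local_mean_diff` are assumptions for illustration.

```python
import numpy as np

def local_mean_diff(y, x, x0, delta):
    """Difference in mean outcomes within delta to the right and left of x0."""
    right = (x >= x0) & (x < x0 + delta)
    left = (x > x0 - delta) & (x < x0)
    n_used = int(left.sum() + right.sum())
    return y[right].mean() - y[left].mean(), n_used

rng = np.random.default_rng(7)
n = 5_000
x0 = 0.5
x = rng.uniform(0, 1, size=n)
d = (x >= x0).astype(float)

# Nonlinear smooth trend plus a true jump of 0.4 at the cutoff.
y = 0.5 + 8.0 * (x - x0) ** 3 + 0.4 * d + rng.normal(scale=0.1, size=n)

est_narrow, n_narrow = local_mean_diff(y, x, x0, delta=0.05)
est_wide, n_wide = local_mean_diff(y, x, x0, delta=0.5)

print(f"Narrow window: estimate {est_narrow:.3f} using {n_narrow} observations")
print(f"Wide window:   estimate {est_wide:.3f} using {n_wide} observations")
```

The narrow window gives an estimate close to the true jump but uses only a small share of the sample, whereas the wide window uses all the data and picks up the trend in addition to the jump.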
