Consider the multiple linear regression model for individual $i = 1, ..., N$ who is observed at several time periods $t = 1, ..., T$

\[ y_{it} = \alpha + x_{it}'\beta + z_i'\gamma + c_i + u_{it} \]

where $y_{it}$ is the dependent variable, $x_{it}'$ is a $K$-dimensional row vector of time-varying explanatory variables and $z_i'$ is an $M$-dimensional row vector of time-invariant explanatory variables excluding the constant, $\alpha$ is the intercept, $\beta$ is a $K$-dimensional column vector of parameters, $\gamma$ is an $M$-dimensional column vector of parameters, $c_i$ is an individual-specific effect and $u_{it}$ is an idiosyncratic error term.

The model is linear in parameters $\alpha$, $\beta$, $\gamma$, effect $c_i$ and error $u_{it}$.

PL2: Independence

\[ \{X_i, z_i, y_i\}_{i=1}^{N} \quad \text{i.i.d. (independent and identically distributed)} \]

The observations are independent across individuals but not necessarily across time. This is guaranteed by random sampling of individuals.

PL3: Strict Exogeneity

\[ E[u_{it} | X_i, z_i, c_i] = 0 \quad \text{(mean independent)} \]
The idiosyncratic error term $u_{it}$ is assumed uncorrelated with the explanatory variables of all past, current and future time periods of the same individual. This is a strong assumption which e.g. rules out lagged dependent variables. PL3 also assumes that the idiosyncratic error is uncorrelated with the individual specific effect.

PL4: Error Variance

a) $V[u_i | X_i, z_i, c_i] = \sigma_u^2 I$, $\sigma_u^2 > 0$ and finite (homoscedastic and no serial correlation)

b) $V[u_{it} | X_i, z_i, c_i] = \sigma_{u,it}^2 > 0$, finite and $Cov[u_{it}, u_{is} | X_i, z_i, c_i] = 0 \;\; \forall s \neq t$ (no serial correlation)

c) $V[u_i | X_i, z_i, c_i] = \Omega_{u,i}(X_i, z_i)$ is p.d. and finite

The remaining assumptions are divided into two sets of assumptions: the random effects model and the fixed effects model.

2.1 The Random Effects Model

In the random effects model, the individual-specific effect is a random variable that is uncorrelated with the explanatory variables.

RE1: Unrelated effects

\[ E[c_i | X_i, z_i] = 0 \]

RE1 assumes that the individual-specific effect is a random variable that is uncorrelated with the explanatory variables of all past, current and future time periods of the same individual.

RE2: Effect Variance

a) $V[c_i | X_i, z_i] = \sigma_c^2 < \infty$ (homoscedastic)

b) $V[c_i | X_i, z_i] = \sigma_{c,i}^2(X_i, z_i) < \infty$ (heteroscedastic)

RE2a assumes constant variance of the individual specific effect.

RE3: Identifiability

a) $rank(W) = K + M + 1 < NT$ and $E[W_i' W_i] = Q_{WW}$ is p.d. and finite. The typical element $w_{it}' = [1 \; x_{it}' \; z_i']$.

b) $rank(W) = K + M + 1 < NT$ and $E[W_i' \Omega_{v,i}^{-1} W_i] = Q_{WOW}$ is p.d. and finite.

where $W = [\iota_{NT} \; X \; Z]$, $W_i = [\iota_T \; X_i \; Z_i]$, $\iota_{NT}$ a $NT \times 1$ vector of ones, $\iota_T$ a $T \times 1$ vector of ones and $\Omega_{v,i}$ is defined below. RE3 assumes that the regressors including a constant are not perfectly collinear, that all regressors (but the constant) have non-zero variance and not too many extreme values.

The random effects model can be written as

\[ y_{it} = \alpha + x_{it}'\beta + z_i'\gamma + v_{it} \]

where $v_{it} = c_i + u_{it}$. Assuming PL2, PL4 and RE1 in the special versions PL4a and RE2a leads to

\[
\Omega_v = V[v | X, Z] =
\begin{pmatrix}
\Omega_{v,1} & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & & & \vdots \\
0 & & \Omega_{v,i} & & 0 \\
\vdots & & & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & \Omega_{v,N}
\end{pmatrix}_{NT \times NT}
\]

with typical element

\[
\Omega_{v,i} = V[v_i | X_i, z_i] =
\begin{pmatrix}
\sigma_v^2 & \sigma_c^2 & \cdots & \sigma_c^2 \\
\sigma_c^2 & \sigma_v^2 & \cdots & \sigma_c^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_c^2 & \sigma_c^2 & \cdots & \sigma_v^2
\end{pmatrix}_{T \times T}
\]

where $\sigma_v^2 = \sigma_c^2 + \sigma_u^2$. This special case under PL4a and RE2a is therefore called the equicorrelated random effects model.
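As a small numerical illustration of the equicorrelated structure, the following Python sketch builds a single block $\Omega_{v,i}$ from assumed values of $\sigma_c^2$ and $\sigma_u^2$. The function name `equicorrelated_block` and the numbers are hypothetical and chosen only for illustration; they are not part of the handout.

```python
def equicorrelated_block(sigma_c2, sigma_u2, T):
    """Return the T x T block Omega_{v,i} as a list of rows.

    Diagonal elements are sigma_v2 = sigma_c2 + sigma_u2,
    off-diagonal elements are sigma_c2 (PL4a and RE2a).
    """
    sigma_v2 = sigma_c2 + sigma_u2
    return [[sigma_v2 if s == t else sigma_c2 for s in range(T)]
            for t in range(T)]


# Hypothetical variance components, for illustration only.
omega = equicorrelated_block(sigma_c2=2.0, sigma_u2=1.0, T=3)
for row in omega:
    print(row)

# Implied correlation between v_it and v_is for s != t;
# it is the same for every pair of periods.
rho = omega[0][1] / omega[0][0]
print(rho)
```

Every pair of periods for the same individual has the same correlation $\sigma_c^2 / (\sigma_c^2 + \sigma_u^2)$, which is what the term "equicorrelated" refers to.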
5 Short Guides to Microeconometrics Panel Data: Fixed and Random Effects 6
2.2 The Fixed Effects Model

In the fixed effects model, the individual-specific effect is a random variable that is allowed to be correlated with the explanatory variables.

FE1: Related effects

–

FE1 explicitly states the absence of the unrelatedness assumption in RE1.

FE2: Effect Variance

–

FE2 explicitly states the absence of the assumption in RE2.

FE3: Identifiability

$rank(\ddot{X}) = K < NT$ and $E(\ddot{X}_i' \ddot{X}_i)$ is p.d. and finite

where the typical element $\ddot{x}_{it} = x_{it} - \bar{x}_i$ and $\bar{x}_i = 1/T \sum_t x_{it}$.

FE3 assumes that the time-varying explanatory variables are not perfectly collinear, that they have non-zero within-variance (i.e. variation over time for a given individual) and not too many extreme values. Hence, $x_{it}$ cannot include a constant or any time-invariant variables. Note that only the parameters $\beta$ but neither $\alpha$ nor $\gamma$ are identifiable in the fixed effects model.

3 Estimation with Pooled OLS

The pooled OLS estimator ignores the panel structure of the data and simply estimates $\alpha$, $\beta$ and $\gamma$ as

\[
\begin{pmatrix} \hat{\alpha}_{POLS} \\ \hat{\beta}_{POLS} \\ \hat{\gamma}_{POLS} \end{pmatrix}
= (W'W)^{-1} W'y
\]

where $W = [\iota_{NT} \; X \; Z]$ and $\iota_{NT}$ is a $NT \times 1$ vector of ones.

Random effects model: The pooled OLS estimator of $\alpha$, $\beta$ and $\gamma$ is unbiased under PL1, PL2, PL3, RE1, and RE3a in small samples. Additionally assuming PL4 and normally distributed idiosyncratic and individual-specific errors, it is normally distributed in small samples. It is consistent and approximately normally distributed under PL1, PL2, PL3, PL4, RE1, and RE3a in samples with a large number of individuals ($N \to \infty$). However, the pooled OLS estimator is not efficient. More importantly, the usual standard errors of the pooled OLS estimator are incorrect and tests (t-, F-, z-, Wald-) based on them are not valid. Correct standard errors can be estimated with the so-called cluster-robust covariance estimator treating each individual as a cluster. The cluster-robust covariance estimator is consistent when the number of clusters $N \to \infty$. In practice we should have at least 50 clusters (see the handout on "Clustering in the Linear Model").

Fixed effects model: The pooled OLS estimators of $\alpha$, $\beta$ and $\gamma$ are biased and inconsistent, because the variable $c_i$ is omitted and potentially correlated with the other regressors.

4 Random Effects Estimation

The random effects estimator is the feasible generalized least squares (GLS) estimator

\[
\begin{pmatrix} \hat{\alpha}_{RE} \\ \hat{\beta}_{RE} \\ \hat{\gamma}_{RE} \end{pmatrix}
= \left( W' \hat{\Omega}_v^{-1} W \right)^{-1} W' \hat{\Omega}_v^{-1} y
\]

where $W = [\iota_{NT} \; X \; Z]$ and $\iota_{NT}$ is a $NT \times 1$ vector of ones.

The error covariance matrix $\Omega_v$ is assumed block-diagonal with equicorrelated diagonal elements $\Omega_{v,i}$ as in section 2.1 which depend on the two unknown parameters $\sigma_v^2$ and $\sigma_c^2$ only. There are many different ways to estimate these two parameters. For example,

\[
\hat{\sigma}_v^2 = \frac{1}{NT} \sum_{t=1}^{T} \sum_{i=1}^{N} \hat{v}_{it}^2 , \qquad
\hat{\sigma}_c^2 = \hat{\sigma}_v^2 - \hat{\sigma}_u^2
\]
where

\[
\hat{\sigma}_u^2 = \frac{1}{NT - N} \sum_{t=1}^{T} \sum_{i=1}^{N} \left( \hat{v}_{it} - \bar{\hat{v}}_i \right)^2
\]

[...]

Allowing for arbitrary conditional variances and for serial correlation in $\Omega_{v,i}$ (PL4c and RE2b), the asymptotic variance can be consistently estimated with the so-called cluster-robust covariance estimator treating each individual as a cluster (see the handout on "Clustering in the Linear Model"). In both cases, the usual tests (z-, Wald-) for large samples can be performed.

In practice, we can rarely be sure about equicorrelated errors and should therefore always use cluster-robust standard errors for the RE estimator.

Fixed effects model: Under the assumptions of the fixed effects model (FE1, i.e. RE1 violated), the random effects estimators of $\alpha$, $\beta$ and $\gamma$ are biased and inconsistent, because the variable $c_i$ is omitted and potentially correlated with the other regressors.

[...]

The fixed effects estimator of $\beta$ is unbiased under PL1, PL2, PL3, and FE3 in small samples. Additionally assuming PL4 and normally distributed idiosyncratic errors, it is normally distributed in small samples. Assuming homoscedastic errors with no serial correlation (PL4a), the variance $V[\hat{\beta}_{FE} | X]$ can be unbiasedly estimated as

\[
\hat{V}\left[ \hat{\beta}_{FE} | X \right] = \hat{\sigma}_u^2 \left( \ddot{X}' \ddot{X} \right)^{-1}
\]

where $\hat{\sigma}_u^2 = \hat{\ddot{u}}' \hat{\ddot{u}} / (NT - N - K)$ and $\hat{\ddot{u}}_{it} = \ddot{y}_{it} - \ddot{x}_{it}' \hat{\beta}_{FE}$. Note the non-usual degrees of freedom correction. The usual z- and F-tests can be performed.

The FE estimator is consistent and asymptotically normally distributed under PL1 - PL4 and FE3 when the number of individuals $N \to \infty$ even
if $T$ is fixed. It can therefore be approximated in samples with many individual observations $N$ as

\[
\hat{\beta}_{FE} \overset{A}{\sim} N\left( \beta, Avar\left[ \hat{\beta}_{FE} \right] \right)
\]

Assuming homoscedastic errors with no serial correlation (PL4a), the asymptotic variance can be consistently estimated as

\[
\widehat{Avar}\left[ \hat{\beta}_{FE} \right] = \hat{\sigma}_u^2 \left( \ddot{X}' \ddot{X} \right)^{-1}
\]

where $\hat{\sigma}_u^2 = \hat{\ddot{u}}' \hat{\ddot{u}} / (NT - N)$.

Allowing for heteroscedasticity and serial correlation of unknown form (PL4c), the asymptotic variance $Avar[\hat{\beta}_k]$ can be consistently estimated with the so-called cluster-robust covariance estimator treating each individual as a cluster (see the handout on "Clustering in the Linear Model"). In both cases, the usual tests (z-, Wald-) for large samples can be performed.

In practice, the idiosyncratic errors are often serially correlated (violating PL4a) when $T > 2$. Bertrand, Duflo and Mullainathan (2004) show that the usual standard errors of the fixed effects estimator are drastically understated in the presence of serial correlation. It is therefore advisable to always use cluster-robust standard errors for the fixed effects estimator.

6 Random Effects vs. Fixed Effects Estimation

The random effects model can be consistently estimated by either the RE estimator or the FE estimator. We would prefer the RE estimator if we can be sure that the individual-specific effect really is an unrelated effect (RE1). This is usually tested by a (Durbin-Wu-)Hausman test. However, the Hausman test is only valid under homoscedasticity and cannot include time fixed effects.

The unrelatedness assumption (RE1) is better tested by running an auxiliary regression (Wooldridge 2010, p. 332, eq. 10.88, Mundlak, 1978):

\[ y_{it} = \alpha + x_{it}'\beta + z_i'\gamma + \bar{x}_i'\lambda + \delta_t + u_{it} \]

where $\bar{x}_i = 1/T \sum_t x_{it}$ are the time averages of all time-varying regressors. Include time fixed effects $\delta_t$ if they are included in the RE and FE estimation. A joint Wald-test (or F-test) on $H_0: \lambda = 0$ tests RE1. Use cluster-robust standard errors to allow for heteroscedasticity and serial correlation.

Note: Assumption RE1 is an extremely strong assumption and the FE estimator is almost always much more convincing than the RE estimator. Not rejecting RE1 does not mean accepting it. Interest in the effect of a time-invariant variable is not a sufficient reason to use the RE estimator.

7 Least Squares Dummy Variables Estimator (LSDV)

The least squares dummy variables (LSDV) estimator is pooled OLS including a set of $N - 1$ dummy variables which identify the individuals and hence an additional $N - 1$ parameters. Note that one of the individual dummies is dropped because we include a constant. Time-invariant explanatory variables, $z_i$, are dropped because they are perfectly collinear with the individual dummy variables.

The LSDV estimator of $\beta$ is numerically identical with the FE estimator and therefore consistent under the same assumptions. The LSDV estimators of the additional parameters for the individual-specific dummy variables, however, are inconsistent as the number of parameters goes to infinity as $N \to \infty$. This so-called incidental parameters problem generally biases all parameters in non-linear fixed effects models like the probit model.

8 First Difference Estimator

Subtracting the lagged value $y_{i,t-1}$ from the initial model

\[ y_{it} = \alpha + x_{it}'\beta + z_i'\gamma + c_i + u_{it} \]

yields the first-difference model

\[ \dot{y}_{it} = \dot{x}_{it}'\beta + \dot{u}_{it} \]
where $\dot{y}_{it} = y_{it} - y_{i,t-1}$, $\dot{x}_{it} = x_{it} - x_{i,t-1}$ and $\dot{u}_{it} = u_{it} - u_{i,t-1}$. Note that the individual-specific effect $c_i$, the intercept $\alpha$ and the time-invariant regressors $z_i$ cancel. The first-difference estimator (FD) of the slope coefficient $\beta$ estimates the first-difference model by OLS.

\[ \hat{\beta}_{FD} = \left( \dot{X}' \dot{X} \right)^{-1} \dot{X}' \dot{y} \]

Note that the parameters $\alpha$ and $\gamma$ are not estimated by the FD estimator. In the special case $T = 2$, the FD estimator is numerically identical to the FE estimator.

Random effects model and fixed effects model: The FD estimator is a consistent estimator of $\beta$ under the same assumptions as the FE estimator. It is less efficient than the FE estimator if $u_{it}$ is not serially correlated (PL4a).

9 Fixed Effects vs. First Difference Estimation

Given the fixed effects model (PL1, PL2, PL3, FE3), both the fixed effects and the first difference estimator of $\beta$ are consistent. Hence, the two estimators should be similar in large samples. In practice, however, the two estimators often differ substantially. The reason for this is typically a misspecification of the timing in the linear model. PL1 assumes that changes in $x_{it}$ have only an instantaneous effect on $y_{it}$ at time $t$. In practice, effects often need several periods to materialize. Such patterns are called dynamic treatment effects. In this situation, the first difference estimator will only pick up the instantaneous effect at time $t$ while the fixed effects estimator picks up an average of the dynamic treatment effects.

10 Time Fixed Effects

We often also suspect that there are time-specific effects $\delta_t$ which affect all individuals in the same way

\[ y_{it} = \alpha + x_{it}'\beta + z_i'\gamma + \delta_t + c_i + u_{it} \]

We can estimate this extended model by including a dummy variable for $T - 1$ time periods with one period serving as the reference period. Assuming a fixed number of time periods $T$ and the number of individuals $N \to \infty$, both the RE estimator and the FE estimator are consistent using time dummy variables under the above conditions. Estimation with both individual fixed effects and time fixed effects is called two-way fixed effects estimation.

11 Heterogeneous Effects

PL1 assumes that the parameters $\beta_k$ are constant across individuals $i$ and time $t$. However, in reality, effects likely differ across $i$ and $t$, i.e. the effects are heterogeneous and researchers seek to estimate an average treatment effect $ATE_k = E[\beta_{itk}]$. Unfortunately, the linear panel estimators $\hat{\beta}_k$ discussed in this handout are in general not unbiased estimators for $ATE_k$ (see e.g. de Chaisemartin and D'Haultfœuille, 2020).

An exception is the two-way fixed effects estimation in a panel with two time periods $t = 1, 2$ with a dependent variable $y_{it}$ and a single explanatory variable $d_{it}$ which takes the value $d_{i2} = 1$ if an individual $i$ is treated in period 2 and $d_{it} = 0$ otherwise: $y_{it} = \beta_0 + \beta_1 d_{it} + \delta_t + c_i + u_{it}$ with $\delta_1 = 0$. In this case, the two-way fixed effects estimator is equivalent to the average first difference $\Delta y_{i2} = y_{i2} - y_{i1}$ in the treated group minus the average difference in the control group (differences-in-differences estimator). $\hat{\beta}_1$ can be interpreted as the average treatment effect on the treated ($ATET$) even if the individual effects $\beta_{it1}$ are heterogeneous provided that the expected change from period 1 to 2 in the treated group would have been identical to the expected change in the control group (common trends assumption).
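The two-period case can be checked numerically. The Python sketch below, using made-up data, computes the differences-in-differences estimate from group means and compares it with the OLS slope from regressing the first-differenced outcome on the first-differenced treatment dummy with an intercept. For $T = 2$ this first-difference regression coincides with the fixed effects estimator, as stated in section 8. The function names `did_estimate` and `fd_ols_slope` and all numbers are hypothetical and serve only as an illustration.

```python
def mean(values):
    return sum(values) / len(values)


def did_estimate(y1, y2, treated):
    """Average first difference of the treated minus that of the controls."""
    dy = [b - a for a, b in zip(y1, y2)]
    dy_treated = [d for d, g in zip(dy, treated) if g == 1]
    dy_control = [d for d, g in zip(dy, treated) if g == 0]
    return mean(dy_treated) - mean(dy_control)


def fd_ols_slope(y1, y2, treated):
    """OLS slope of the first-differenced outcome on the first-differenced
    treatment dummy, with an intercept (the period-2 effect delta_2)."""
    dy = [b - a for a, b in zip(y1, y2)]
    dd = [float(g) for g in treated]  # d_i2 - d_i1 = d_i2 because d_i1 = 0
    dd_bar, dy_bar = mean(dd), mean(dy)
    sxy = sum((d - dd_bar) * (y - dy_bar) for d, y in zip(dd, dy))
    sxx = sum((d - dd_bar) ** 2 for d in dd)
    return sxy / sxx


# Made-up data: four individuals, two periods; the last two are treated in t = 2.
y1 = [1.0, 2.0, 3.0, 4.0]
y2 = [2.0, 3.5, 6.0, 8.0]
treated = [0, 0, 1, 1]

did = did_estimate(y1, y2, treated)
fd = fd_ols_slope(y1, y2, treated)
print(did, fd)
```

Both numbers agree because, with a single binary regressor and an intercept, the OLS slope is the difference between the two group means of the dependent variable, here the mean first differences of the treated and the control group.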
Stata reports asymptotic z- and Wald-tests with random effects estimation. Cluster-robust standard errors are reported with:

    xtreg ln_wage grade ttl_exp ttl_exp2, re vce(cluster idcode)

Since version 10, Stata always assumes clustering with robust standard errors in random and fixed effects estimations. So we could also just use [...]

Note that the time averages are generated with the sample used in both the random effects and the pooled OLS estimation.
References
Introductory textbooks
Advanced textbooks
Articles