Short Guides to Microeconometrics
Kurt Schmidheiny
Fall 2024, University of Basel
Version: 21-11-2024, 17:29

Panel Data: Fixed and Random Effects

1 Introduction

In panel data, individuals (persons, firms, cities, ...) are observed at several points in time (days, years, before and after treatment, ...). This handout focuses on panels with relatively few time periods (small $T$) and many individuals (large $N$).

This handout introduces the two basic models for the analysis of panel data, the fixed effects model and the random effects model, and presents consistent estimators for these two models. The handout does not cover so-called dynamic panel data models.

Panel data are most useful when we suspect that the outcome variable depends on explanatory variables which are not observable but correlated with the observed explanatory variables. If such omitted variables are constant over time, panel data estimators allow us to consistently estimate the effect of the observed explanatory variables.

2 The Econometric Model

Consider the multiple linear regression model for individual $i = 1, ..., N$ who is observed at several time periods $t = 1, ..., T$

\[
y_{it} = \alpha + x_{it}' \beta + z_i' \gamma + c_i + u_{it}
\]

where $y_{it}$ is the dependent variable, $x_{it}'$ is a $K$-dimensional row vector of time-varying explanatory variables and $z_i'$ is an $M$-dimensional row vector of time-invariant explanatory variables excluding the constant, $\alpha$ is the intercept, $\beta$ is a $K$-dimensional column vector of parameters, $\gamma$ is an $M$-dimensional column vector of parameters, $c_i$ is an individual-specific effect and $u_{it}$ is an idiosyncratic error term.

We will assume throughout this handout that each individual $i$ is observed in all time periods $t$. This is a so-called balanced panel. The treatment of unbalanced panels is straightforward but tedious.

The $T$ observations for individual $i$ can be summarized as

\[
y_i = \begin{bmatrix} y_{i1} \\ \vdots \\ y_{it} \\ \vdots \\ y_{iT} \end{bmatrix}_{T \times 1}
\quad
X_i = \begin{bmatrix} x_{i1}' \\ \vdots \\ x_{it}' \\ \vdots \\ x_{iT}' \end{bmatrix}_{T \times K}
\quad
Z_i = \begin{bmatrix} z_i' \\ \vdots \\ z_i' \\ \vdots \\ z_i' \end{bmatrix}_{T \times M}
\quad
u_i = \begin{bmatrix} u_{i1} \\ \vdots \\ u_{it} \\ \vdots \\ u_{iT} \end{bmatrix}_{T \times 1}
\]

and $NT$ observations for all individuals and time periods as

\[
y = \begin{bmatrix} y_1 \\ \vdots \\ y_i \\ \vdots \\ y_N \end{bmatrix}_{NT \times 1}
\quad
X = \begin{bmatrix} X_1 \\ \vdots \\ X_i \\ \vdots \\ X_N \end{bmatrix}_{NT \times K}
\quad
Z = \begin{bmatrix} Z_1 \\ \vdots \\ Z_i \\ \vdots \\ Z_N \end{bmatrix}_{NT \times M}
\quad
u = \begin{bmatrix} u_1 \\ \vdots \\ u_i \\ \vdots \\ u_N \end{bmatrix}_{NT \times 1}
\]

The data generation process (dgp) is described by:

PL1: Linearity

\[
y_{it} = \alpha + x_{it}' \beta + z_i' \gamma + c_i + u_{it} \quad \text{where } E[u_{it}] = 0 \text{ and } E[c_i] = 0
\]

The model is linear in parameters $\alpha$, $\beta$, $\gamma$, effect $c_i$ and error $u_{it}$.

PL2: Independence

\[
\{X_i, z_i, y_i\}_{i=1}^{N} \text{ i.i.d. (independent and identically distributed)}
\]

The observations are independent across individuals but not necessarily across time. This is guaranteed by random sampling of individuals.

PL3: Strict Exogeneity

\[
E[u_{it} \mid X_i, z_i, c_i] = 0 \quad \text{(mean independent)}
\]
The idiosyncratic error term $u_{it}$ is assumed uncorrelated with the explanatory variables of all past, current and future time periods of the same individual. This is a strong assumption which e.g. rules out lagged dependent variables. PL3 also assumes that the idiosyncratic error is uncorrelated with the individual specific effect.

PL4: Error Variance

a) $V[u_i \mid X_i, z_i, c_i] = \sigma_u^2 I$, $\sigma_u^2 > 0$ and finite (homoscedastic and no serial correlation)

b) $V[u_{it} \mid X_i, z_i, c_i] = \sigma_{u,it}^2 > 0$, finite and $Cov[u_{it}, u_{is} \mid X_i, z_i, c_i] = 0 \ \forall s \neq t$ (no serial correlation)

c) $V[u_i \mid X_i, z_i, c_i] = \Omega_{u,i}(X_i, z_i)$ is p.d. and finite

The remaining assumptions are divided into two sets of assumptions: the random effects model and the fixed effects model.

2.1 The Random Effects Model

In the random effects model, the individual-specific effect is a random variable that is uncorrelated with the explanatory variables.

RE1: Unrelated effects

\[
E[c_i \mid X_i, z_i] = 0
\]

RE1 assumes that the individual-specific effect is a random variable that is uncorrelated with the explanatory variables of all past, current and future time periods of the same individual.

RE2: Effect Variance

a) $V[c_i \mid X_i, z_i] = \sigma_c^2 < \infty$ (homoscedastic)

b) $V[c_i \mid X_i, z_i] = \sigma_{c,i}^2(X_i, z_i) < \infty$ (heteroscedastic)

RE2a assumes constant variance of the individual specific effect.

RE3: Identifiability

a) $rank(W) = K + M + 1 < NT$ and $E[W_i' W_i] = Q_{WW}$ is p.d. and finite. The typical element $w_{it}' = [1 \ x_{it}' \ z_i']$.

b) $rank(W) = K + M + 1 < NT$ and $E[W_i' \Omega_{v,i}^{-1} W_i] = Q_{WOW}$ is p.d. and finite.

where $W = [\iota_{NT} \ X \ Z]$, $W_i = [\iota_T \ X_i \ Z_i]$, $\iota_{NT}$ a $NT \times 1$ vector of ones, $\iota_T$ a $T \times 1$ vector of ones and $\Omega_{v,i}$ is defined below. RE3 assumes that the regressors including a constant are not perfectly collinear, that all regressors (but the constant) have non-zero variance and not too many extreme values.

The random effects model can be written as

\[
y_{it} = \alpha + x_{it}' \beta + z_i' \gamma + v_{it}
\]

where $v_{it} = c_i + u_{it}$. Assuming PL2, PL4 and RE1 in the special versions PL4a and RE2a leads to

\[
\Omega_v = V[v \mid X, Z] = \begin{bmatrix}
\Omega_{v,1} & \cdots & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & & \vdots \\
0 & \cdots & \Omega_{v,i} & \cdots & 0 \\
\vdots & & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & \cdots & \Omega_{v,N}
\end{bmatrix}_{NT \times NT}
\]

with typical element

\[
\Omega_{v,i} = V[v_i \mid X_i, z_i] = \begin{bmatrix}
\sigma_v^2 & \sigma_c^2 & \cdots & \sigma_c^2 \\
\sigma_c^2 & \sigma_v^2 & \cdots & \sigma_c^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_c^2 & \sigma_c^2 & \cdots & \sigma_v^2
\end{bmatrix}_{T \times T}
\]

where $\sigma_v^2 = \sigma_c^2 + \sigma_u^2$. This special case under PL4a and RE2a is therefore called the equicorrelated random effects model.
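
As an illustration of why this case is called equicorrelated, the following sketch writes out $\Omega_{v,i}$ for $T = 3$ and the implied correlation between any two composite errors of the same individual. The choice $T = 3$, the symbol $\rho$ and the numerical values $\sigma_c^2 = 1$, $\sigma_u^2 = 3$ are assumptions made here for illustration only; they do not appear in the handout.

```latex
\[
\Omega_{v,i} = \sigma_u^2 I_3 + \sigma_c^2 \iota_3 \iota_3'
= \begin{bmatrix}
\sigma_c^2 + \sigma_u^2 & \sigma_c^2 & \sigma_c^2 \\
\sigma_c^2 & \sigma_c^2 + \sigma_u^2 & \sigma_c^2 \\
\sigma_c^2 & \sigma_c^2 & \sigma_c^2 + \sigma_u^2
\end{bmatrix}
\]
\[
\rho = Corr[v_{it}, v_{is} \mid X_i, z_i]
= \frac{\sigma_c^2}{\sigma_c^2 + \sigma_u^2} \quad \text{for all } s \neq t,
\qquad \text{e.g. } \sigma_c^2 = 1,\ \sigma_u^2 = 3 \ \Rightarrow\ \rho = 0.25
\]
```

The correlation is the same for every pair of periods, however far apart, because the only source of dependence over time is the time-constant effect $c_i$.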
2.2 The Fixed Effects Model

In the fixed effects model, the individual-specific effect is a random variable that is allowed to be correlated with the explanatory variables.

FE1: Related effects

–

FE1 explicitly states the absence of the unrelatedness assumption in RE1.

FE2: Effect Variance

–

FE2 explicitly states the absence of the assumption in RE2.

FE3: Identifiability

\[
rank(\ddot{X}) = K < NT \text{ and } E(\ddot{X}_i' \ddot{X}_i) \text{ is p.d. and finite}
\]

where the typical element $\ddot{x}_{it} = x_{it} - \bar{x}_i$ and $\bar{x}_i = 1/T \sum_t x_{it}$.

FE3 assumes that the time-varying explanatory variables are not perfectly collinear, that they have non-zero within-variance (i.e. variation over time for a given individual) and not too many extreme values. Hence, $x_{it}$ cannot include a constant or any time-invariant variables. Note that only the parameters $\beta$ but neither $\alpha$ nor $\gamma$ are identifiable in the fixed effects model.

3 Estimation with Pooled OLS

The pooled OLS estimator ignores the panel structure of the data and simply estimates $\alpha$, $\beta$ and $\gamma$ as

\[
\begin{bmatrix} \hat{\alpha}_{POLS} \\ \hat{\beta}_{POLS} \\ \hat{\gamma}_{POLS} \end{bmatrix} = (W'W)^{-1} W'y
\]

where $W = [\iota_{NT} \ X \ Z]$ and $\iota_{NT}$ is a $NT \times 1$ vector of ones.

Random effects model: The pooled OLS estimator of $\alpha$, $\beta$ and $\gamma$ is unbiased under PL1, PL2, PL3, RE1, and RE3a in small samples. Additionally assuming PL4 and normally distributed idiosyncratic and individual-specific errors, it is normally distributed in small samples. It is consistent and approximately normally distributed under PL1, PL2, PL3, PL4, RE1, and RE3a in samples with a large number of individuals ($N \to \infty$). However, the pooled OLS estimator is not efficient. More importantly, the usual standard errors of the pooled OLS estimator are incorrect and tests (t-, F-, z-, Wald-) based on them are not valid. Correct standard errors can be estimated with the so-called cluster-robust covariance estimator treating each individual as a cluster. The cluster-robust covariance estimator is consistent when the number of clusters $N \to \infty$. In practice we should have at least 50 clusters (see the handout on “Clustering in the Linear Model”).

Fixed effects model: The pooled OLS estimators of $\alpha$, $\beta$ and $\gamma$ are biased and inconsistent, because the variable $c_i$ is omitted and potentially correlated with the other regressors.

4 Random Effects Estimation

The random effects estimator is the feasible generalized least squares (GLS) estimator

\[
\begin{bmatrix} \hat{\alpha}_{RE} \\ \hat{\beta}_{RE} \\ \hat{\gamma}_{RE} \end{bmatrix}
= \left( W' \hat{\Omega}_v^{-1} W \right)^{-1} W' \hat{\Omega}_v^{-1} y
\]

where $W = [\iota_{NT} \ X \ Z]$ and $\iota_{NT}$ is a $NT \times 1$ vector of ones.

The error covariance matrix $\Omega_v$ is assumed block-diagonal with equicorrelated diagonal elements $\Omega_{v,i}$ as in section 2.1 which depend on the two unknown parameters $\sigma_v^2$ and $\sigma_c^2$ only. There are many different ways to estimate these two parameters. For example,

\[
\hat{\sigma}_v^2 = \frac{1}{NT} \sum_{t=1}^{T} \sum_{i=1}^{N} \hat{v}_{it}^2 , \qquad
\hat{\sigma}_c^2 = \hat{\sigma}_v^2 - \hat{\sigma}_u^2
\]
where

\[
\hat{\sigma}_u^2 = \frac{1}{NT - N} \sum_{t=1}^{T} \sum_{i=1}^{N} \left( \hat{v}_{it} - \bar{\hat{v}}_i \right)^2
\]

and $\hat{v}_{it} = y_{it} - \hat{\alpha}_{POLS} - x_{it}' \hat{\beta}_{POLS} - z_i' \hat{\gamma}_{POLS}$ and $\bar{\hat{v}}_i = 1/T \sum_{t=1}^{T} \hat{v}_{it}$. The degree of freedom correction in $\hat{\sigma}_u^2$ is also asymptotically important when $N \to \infty$.

Random effects model: We cannot establish small sample properties for the RE estimator. The RE estimator is consistent and asymptotically normally distributed under PL1 - PL4, RE1, RE2 and RE3b when the number of individuals $N \to \infty$ even if $T$ is fixed. It can therefore be approximated in samples with many individual observations $N$ as

\[
\begin{bmatrix} \hat{\alpha}_{RE} \\ \hat{\beta}_{RE} \\ \hat{\gamma}_{RE} \end{bmatrix}
\stackrel{A}{\sim}
N \left( \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix} ,
Avar \begin{bmatrix} \hat{\alpha}_{RE} \\ \hat{\beta}_{RE} \\ \hat{\gamma}_{RE} \end{bmatrix} \right)
\]

Assuming the equicorrelated model (PL4a and RE2a), $\hat{\sigma}_v^2$ and $\hat{\sigma}_c^2$ are consistent estimators of $\sigma_v^2$ and $\sigma_c^2$, respectively. Then $\hat{\alpha}_{RE}$, $\hat{\beta}_{RE}$ and $\hat{\gamma}_{RE}$ are asymptotically efficient and the asymptotic variance can be consistently estimated as

\[
\widehat{Avar} \begin{bmatrix} \hat{\alpha}_{RE} \\ \hat{\beta}_{RE} \\ \hat{\gamma}_{RE} \end{bmatrix}
= \left( W' \hat{\Omega}_v^{-1} W \right)^{-1}
\]

Allowing for arbitrary conditional variances and for serial correlation in $\Omega_{v,i}$ (PL4c and RE2b), the asymptotic variance can be consistently estimated with the so-called cluster-robust covariance estimator treating each individual as a cluster (see the handout on “Clustering in the Linear Model”). In both cases, the usual tests (z-, Wald-) for large samples can be performed.

In practice, we can rarely be sure about equicorrelated errors and should therefore always use cluster-robust standard errors for the RE estimator.

Fixed effects model: Under the assumptions of the fixed effects model (FE1, i.e. RE1 violated), the random effects estimators of $\alpha$, $\beta$ and $\gamma$ are biased and inconsistent, because the variable $c_i$ is omitted and potentially correlated with the other regressors.

5 Fixed Effects Estimation

Subtracting time averages $\bar{y}_i = 1/T \sum_t y_{it}$ from the initial model

\[
y_{it} = \alpha + x_{it}' \beta + z_i' \gamma + c_i + u_{it}
\]

yields the within model

\[
\ddot{y}_{it} = \ddot{x}_{it}' \beta + \ddot{u}_{it}
\]

where $\ddot{y}_{it} = y_{it} - \bar{y}_i$, $\ddot{x}_{itk} = x_{itk} - \bar{x}_{ik}$ and $\ddot{u}_{it} = u_{it} - \bar{u}_i$. Note that the individual-specific effect $c_i$, the intercept $\alpha$ and the time-invariant regressors $z_i$ cancel.

The fixed effects estimator or within estimator of the slope coefficient $\beta$ estimates the within model by OLS

\[
\hat{\beta}_{FE} = \left( \ddot{X}' \ddot{X} \right)^{-1} \ddot{X}' \ddot{y}
\]

Note that the parameters $\alpha$ and $\gamma$ are not estimated by the within estimator.

Random effects model and fixed effects model: The fixed effects estimator of $\beta$ is unbiased under PL1, PL2, PL3, and FE3 in small samples. Additionally assuming PL4 and normally distributed idiosyncratic errors, it is normally distributed in small samples. Assuming homoscedastic errors with no serial correlation (PL4a), the variance $V[\hat{\beta}_{FE} \mid X]$ can be unbiasedly estimated as

\[
\hat{V}[\hat{\beta}_{FE} \mid X] = \hat{\sigma}_u^2 \left( \ddot{X}' \ddot{X} \right)^{-1}
\]

where $\hat{\sigma}_u^2 = \hat{\ddot{u}}' \hat{\ddot{u}} / (NT - N - K)$ and $\hat{\ddot{u}}_{it} = \ddot{y}_{it} - \ddot{x}_{it}' \hat{\beta}_{FE}$. Note the non-usual degrees of freedom correction. The usual z- and F-tests can be performed.

The FE estimator is consistent and asymptotically normally distributed under PL1 - PL4 and FE3 when the number of individuals $N \to \infty$ even
if $T$ is fixed. It can therefore be approximated in samples with many individual observations $N$ as

\[
\hat{\beta}_{FE} \stackrel{A}{\sim} N \left( \beta, Avar[\hat{\beta}_{FE}] \right)
\]

Assuming homoscedastic errors with no serial correlation (PL4a), the asymptotic variance can be consistently estimated as

\[
\widehat{Avar}[\hat{\beta}_{FE}] = \hat{\sigma}_u^2 \left( \ddot{X}' \ddot{X} \right)^{-1}
\]

where $\hat{\sigma}_u^2 = \hat{\ddot{u}}' \hat{\ddot{u}} / (NT - N)$.

Allowing for heteroscedasticity and serial correlation of unknown form (PL4c), the asymptotic variance $Avar[\hat{\beta}_k]$ can be consistently estimated with the so-called cluster-robust covariance estimator treating each individual as a cluster (see the handout on “Clustering in the Linear Model”). In both cases, the usual tests (z-, Wald-) for large samples can be performed.

In practice, the idiosyncratic errors are often serially correlated (violating PL4a) when $T > 2$. Bertrand, Duflo and Mullainathan (2004) show that the usual standard errors of the fixed effects estimator are drastically understated in the presence of serial correlation. It is therefore advisable to always use cluster-robust standard errors for the fixed effects estimator.

6 Random Effects vs. Fixed Effects Estimation

The random effects model can be consistently estimated by either the RE estimator or the FE estimator. We would prefer the RE estimator if we can be sure that the individual-specific effect really is an unrelated effect (RE1). This is usually tested by a (Durbin-Wu-)Hausman test. However, the Hausman test is only valid under homoscedasticity and cannot include time fixed effects.

The unrelatedness assumption (RE1) is better tested by running an auxiliary regression (Wooldridge 2010, p. 332, eq. 10.88, Mundlak, 1978):

\[
y_{it} = \alpha + x_{it}' \beta + z_i' \gamma + \bar{x}_i' \lambda + \delta_t + u_{it}
\]

where $\bar{x}_i = 1/T \sum_t x_{it}$ are the time averages of all time-varying regressors. Include time fixed effects $\delta_t$ if they are included in the RE and FE estimation. A joint Wald-test (or F-test) on $H_0: \lambda = 0$ tests RE1. Use cluster-robust standard errors to allow for heteroscedasticity and serial correlation.

Note: Assumption RE1 is an extremely strong assumption and the FE estimator is almost always much more convincing than the RE estimator. Not rejecting RE1 does not mean accepting it. Interest in the effect of a time-invariant variable is not a sufficient reason to use the RE estimator.

7 Least Squares Dummy Variables Estimator (LSDV)

The least squares dummy variables (LSDV) estimator is pooled OLS including a set of $N - 1$ dummy variables which identify the individuals and hence an additional $N - 1$ parameters. Note that one of the individual dummies is dropped because we include a constant. Time-invariant explanatory variables, $z_i$, are dropped because they are perfectly collinear with the individual dummy variables.

The LSDV estimator of $\beta$ is numerically identical with the FE estimator and therefore consistent under the same assumptions. The LSDV estimators of the additional parameters for the individual-specific dummy variables, however, are inconsistent as the number of parameters goes to infinity as $N \to \infty$. This so-called incidental parameters problem generally biases all parameters in non-linear fixed effects models like the probit model.

8 First Difference Estimator

Subtracting the lagged value $y_{i,t-1}$ from the initial model

\[
y_{it} = \alpha + x_{it}' \beta + z_i' \gamma + c_i + u_{it}
\]

yields the first-difference model

\[
\dot{y}_{it} = \dot{x}_{it}' \beta + \dot{u}_{it}
\]
where $\dot{y}_{it} = y_{it} - y_{i,t-1}$, $\dot{x}_{it} = x_{it} - x_{i,t-1}$ and $\dot{u}_{it} = u_{it} - u_{i,t-1}$. Note that the individual-specific effect $c_i$, the intercept $\alpha$ and the time-invariant regressors $z_i$ cancel. The first-difference estimator (FD) of the slope coefficient $\beta$ estimates the first-difference model by OLS.

\[
\hat{\beta}_{FD} = \left( \dot{X}' \dot{X} \right)^{-1} \dot{X}' \dot{y}
\]

Note that the parameters $\alpha$ and $\gamma$ are not estimated by the FD estimator. In the special case $T = 2$, the FD estimator is numerically identical to the FE estimator.

Random effects model and fixed effects model: The FD estimator is a consistent estimator of $\beta$ under the same assumptions as the FE estimator. It is less efficient than the FE estimator if $u_{it}$ is not serially correlated (PL4a).

9 Fixed Effects vs. First Difference Estimation

Given the fixed effects model (PL1, PL2, PL3, FE3), both the fixed effects and the first difference estimator of $\beta$ are consistent. Hence, the two estimators should be similar in large samples. In practice, however, the two estimators often differ substantially. The reason for this is typically a misspecification of the timing in the linear model. PL1 assumes that changes in $x_{it}$ have only an instantaneous effect on $y_{it}$ at time $t$. In practice, effects often need several periods to materialize. Such patterns are called dynamic treatment effects. In this situation, the first difference estimator will only pick up the instantaneous effect at time $t$ while the fixed effects estimator picks up an average of the dynamic treatment effects.

10 Time Fixed Effects

We often also suspect that there are time-specific effects $\delta_t$ which affect all individuals in the same way

\[
y_{it} = \alpha + x_{it}' \beta + z_i' \gamma + \delta_t + c_i + u_{it} .
\]

We can estimate this extended model by including a dummy variable for $T - 1$ time periods with one period serving as the reference period. Assuming a fixed number of time periods $T$ and the number of individuals $N \to \infty$, both the RE estimator and the FE estimator are consistent using time dummy variables under the above conditions. Estimation with both individual fixed effects and time fixed effects is called two-way fixed effects estimation.

11 Heterogeneous Effects

PL1 assumes that the parameters $\beta_k$ are constant across individuals $i$ and time $t$. However, in reality, effects likely differ across $i$ and $t$, i.e. the effects are heterogeneous and researchers seek to estimate an average treatment effect $ATE_k = E[\beta_{itk}]$. Unfortunately, the linear panel estimators $\hat{\beta}_k$ discussed in this handout are in general not unbiased estimators for $ATE_k$ (see e.g. de Chaisemartin and D’Haultfœuille, 2020).

An exception is the two-way fixed effects estimation in a panel with two time periods $t = 1, 2$ with a dependent variable $y_{it}$ and a single explanatory variable $d_{it}$ which takes the value $d_{i2} = 1$ if an individual $i$ is treated in period 2 and $d_{it} = 0$ otherwise: $y_{it} = \beta_0 + \beta_1 d_{it} + \delta_t + c_i + u_{it}$ with $\delta_1 = 0$. In this case, the two-way fixed effects estimator is equivalent to the average first difference $\Delta y_{i2} = y_{i2} - y_{i1}$ in the treated group minus the average difference in the control group (differences-in-differences estimator). $\hat{\beta}_1$ can be interpreted as the average treatment effect on the treated ($ATET$) even if the individual effects $\beta_{it1}$ are heterogeneous provided that the expected change from period 1 to 2 in the treated group would have been identical to the expected change in the control group (common trends assumption).
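
The two-period result of Section 11 can be traced through first differences, using the fact from Section 8 that the FD and FE estimators coincide when $T = 2$. The sketch below assumes the two-period model stated there (so $d_{i1} = 0$ for all $i$ and $\delta_1 = 0$); the symbols $N_1$ and $N_0$ for the numbers of treated and control individuals are notation introduced here for illustration.

```latex
\begin{align*}
y_{i1} &= \beta_0 + c_i + u_{i1} \\
y_{i2} &= \beta_0 + \beta_1 d_{i2} + \delta_2 + c_i + u_{i2} \\
\Delta y_{i2} = y_{i2} - y_{i1} &= \delta_2 + \beta_1 d_{i2} + (u_{i2} - u_{i1})
\end{align*}
% OLS of \Delta y_{i2} on a constant and the dummy d_{i2} gives group means:
\begin{align*}
\hat{\delta}_2 &= \frac{1}{N_0} \sum_{i:\, d_{i2} = 0} \Delta y_{i2} \\
\hat{\beta}_1 &= \frac{1}{N_1} \sum_{i:\, d_{i2} = 1} \Delta y_{i2}
 - \frac{1}{N_0} \sum_{i:\, d_{i2} = 0} \Delta y_{i2}
\end{align*}
```

The intercept of the differenced regression absorbs the common time effect $\delta_2$, so $\hat{\beta}_1$ is exactly the difference between the two average changes.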


12 Implementation in Stata 17

Stata provides a series of commands that are especially designed for panel data. See help xt for an overview.

Stata requires panel data in the so-called long form: there is one line for every individual and every time observation. The very powerful Stata command reshape helps transform data into this format. Before working with panel data commands, we have to tell Stata the variables that identify the individual and the time period. For example, load data and define individuals (variable idcode) and time periods (variable year)

    webuse nlswork.dta
    xtset idcode year

Stata provides descriptive statistics for panel data with the commands

    xtdescribe
    xtsum

The pooled OLS estimator with corrected standard errors is calculated with the standard ols command regress:

    generate ttl_exp2 = ttl_exp^2
    reg ln_wage grade ttl_exp ttl_exp2, vce(cluster idcode)

where the vce option was used to report correct cluster-robust standard errors. This command multiplies $\widehat{Avar}$ with $(NT - 1)/(NT - M - K - 1) \cdot N/(N - 1)$ as a small sample correction and uses $N - 1$ degrees of freedom for t- and F-tests.

The random effects estimator is calculated by the Stata command xtreg with the option re:

    xtreg ln_wage grade ttl_exp ttl_exp2, re

Stata reports asymptotic z- and Wald-tests with random effects estimation. Cluster-robust standard errors are reported with:

    xtreg ln_wage grade ttl_exp ttl_exp2, re vce(cluster idcode)

Since version 10, Stata always assumes clustering with robust standard errors in random and fixed effects estimations. So we could also just use

    xtreg ln_wage grade ttl_exp ttl_exp2, re vce(robust)

The fixed effects estimator is calculated by the Stata command xtreg with the option fe:

    xtreg ln_wage ttl_exp ttl_exp2, fe

Note that the effect of time-constant variables like grade is not identified by the fixed effects estimator. The parameter reported as _cons in the Stata output is the average fixed effect $1/N \sum_i c_i$. This command uses $NT - N - K$ degrees of freedom for t- and F-tests. Cluster-robust standard errors are reported with the vce option:

    xtreg ln_wage ttl_exp ttl_exp2, fe vce(cluster idcode)

This command multiplies $\widehat{Avar}$ with $(NT - 1)/(NT - N - K) \cdot N/(N - 1)$ as a small sample correction and reports cluster-robust t- and F-tests with $N - 1$ degrees of freedom. The latter is particularly useful with large $T$ (see Stock and Watson, 2008).

The Hausman test is calculated by

    xtreg ln_wage grade ttl_exp ttl_exp2, re
    estimates store b_re
    xtreg ln_wage ttl_exp ttl_exp2, fe
    estimates store b_fe
    hausman b_fe b_re, sigmamore

and the auxiliary regression version by

    regress ln_wage grade ttl_exp ttl_exp2
    egen ttl_exp_mean = mean(ttl_exp) if e(sample), by(idcode)
    egen ttl_exp2_mean = mean(ttl_exp2) if e(sample), by(idcode)
    regress ln_wage grade ttl_exp ttl_exp2 ///
        ttl_exp_mean ttl_exp2_mean, vce(cluster idcode)
    test ttl_exp_mean ttl_exp2_mean

Note that the time averages are generated with the sample used in both the random effects and the pooled OLS estimation.
13 Implementation in R 4.2.3

The R package plm provides a series of functions and data structures that are especially designed for panel data.

The plm package works with data stored in a dataframe in the so-called long form. Long form data means that there is one line for every individual and every time observation. For example, load data

    library(haven)
    nlswork <- read_dta("https://www.stata-press.com/data/r17/nlswork.dta")

where individuals are defined by idcode and time periods by year.

Pooled OLS with cluster-robust standard errors can be estimated with a standard regression and the packages lmtest and sandwich

    pols1 <- lm(ln_wage~grade+ttl_exp+I(ttl_exp^2), data = nlswork)
    library(lmtest)
    library(sandwich)
    coeftest(pols1, vcov = vcovCL, cluster = ~idcode)

This command multiplies $\widehat{Avar}$ with $(NT - 1)/(NT - M - K - 1) \cdot N/(N - 1)$ as a small sample correction.

Alternatively, pooled OLS with corrected standard errors is estimated by the package plm with the function plm and its model option pooling:

    library(plm)
    pols2 <- plm(ln_wage~grade+ttl_exp+I(ttl_exp^2), model="pooling",
                 data = nlswork, index=c("idcode", "year"))
    summary(pols2)
    coeftest(pols2, vcov=vcovHC(pols2, cluster="group", type="HC1"))

where coeftest reports cluster-robust standard errors. cluster="group" defines the clusters by the individual identifier set by the option index in plm, i.e. the variable idcode in the example. This command multiplies $\widehat{Avar}$ with $(NT - 1)/(NT - M - K - 1)$ but not with $N/(N - 1)$ and uses $NT - M - K - 1$ degrees of freedom for t- and F-tests.

The random effects estimator is calculated by plm option random:

    re <- plm(ln_wage~grade+ttl_exp+I(ttl_exp^2), model="random",
              data = nlswork, index=c("idcode", "year"))
    summary(re)

Cluster-robust standard errors are reported with

    coeftest(re, vcov=vcovHC(re, cluster="group", type="HC1"))

The fixed effects estimator is calculated by plm option within

    fe <- plm(ln_wage ~ grade + ttl_exp + I(ttl_exp^2), model="within",
              data=nlswork, index=c("idcode", "year"))
    summary(fe)

Note that effects of time-constant variables like grade are not identified by the fixed effects estimator. This command uses $NT - N - K$ degrees of freedom for t- and F-tests. Cluster-robust standard errors are given by:

    coeftest(fe, vcov=vcovHC(fe, cluster="group", type="HC1"))

This command multiplies $\widehat{Avar}$ with $(NT - 1)/(NT - K - 1)$ as a small sample correction and uses $NT - K - 1$ degrees of freedom for t- and F-tests.

The Hausman test is calculated by estimating RE and FE and then comparing the estimates:

    phtest(fe, re)

and the auxiliary regression version by

    vars <- c("idcode", "year", "ln_wage", "grade", "ttl_exp")
    sample <- nlswork[complete.cases(nlswork[,vars]),vars]
    sample$ttl_exp_mean <- ave(sample$ttl_exp, sample$idcode, FUN = mean)
    sample$ttl_exp2_mean <- ave(sample$ttl_exp^2, sample$idcode, FUN = mean)
    aux <- plm(ln_wage~grade+ttl_exp+I(ttl_exp^2)+ttl_exp_mean+ttl_exp2_mean,
               model="pooling", data = sample, index=c("idcode", "year"))
    summary(aux)
    waldtest(aux, .~.-ttl_exp_mean-ttl_exp2_mean,
             vcov=vcovHC(aux, cluster="group", type="HC1"))

Note that the dataset was reduced to the sample used in the random effects and the pooled OLS estimation before generating the time averages.
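
The numerical equivalence of the within (FE) estimator and the LSDV estimator from Section 7, and the role of the $NT - N - K$ degrees of freedom correction from Section 5, can be checked on simulated data. The following is a minimal sketch in base R only; the simulated panel, all variable names and all parameter values are hypothetical and not taken from the handout or from nlswork.

```r
# Simulated balanced panel; all numbers are illustrative assumptions.
set.seed(1)
n_id <- 50                               # number of individuals N
n_t  <- 4                                # number of time periods T
id   <- rep(seq_len(n_id), each = n_t)
c_i  <- rep(rnorm(n_id), each = n_t)     # individual-specific effect
x    <- 0.5 * c_i + rnorm(n_id * n_t)    # regressor correlated with c_i
y    <- 1 + 2 * x + c_i + rnorm(n_id * n_t)

# Within estimator: OLS on deviations from individual time averages.
y_dd <- y - ave(y, id)
x_dd <- x - ave(x, id)
within_fit <- lm(y_dd ~ x_dd - 1)
b_fe <- unname(coef(within_fit))

# LSDV estimator: pooled OLS with a constant and N - 1 individual dummies.
lsdv_fit <- lm(y ~ x + factor(id))
b_lsdv <- unname(coef(lsdv_fit)["x"])

# Standard error with the N T - N - K degrees of freedom correction (K = 1).
k <- 1
sigma2_u <- sum(resid(within_fit)^2) / (n_id * n_t - n_id - k)
se_fe <- sqrt(sigma2_u / sum(x_dd^2))
se_lsdv <- unname(coef(summary(lsdv_fit))["x", "Std. Error"])

print(c(b_fe = b_fe, b_lsdv = b_lsdv))
print(c(se_fe = se_fe, se_lsdv = se_lsdv))
```

Both printed pairs should agree up to floating-point error. The standard error is computed by hand because lm applied to the demeaned data would divide by $NT - K$ rather than $NT - N - K$, ignoring the $N$ individual means that were estimated in the within transformation.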
References

Introductory textbooks

Stock, James H. and Mark W. Watson (2020), Introduction to Econometrics, 4th Global ed., Pearson. Chapter 10.
Wooldridge, Jeffrey M. (2009), Introductory Econometrics: A Modern Approach, 4th ed., South-Western Cengage Learning. Ch. 13 and 14.
Angrist, Joshua D. and Jörn-Steffen Pischke (2009), Mostly Harmless Econometrics: An Empiricist’s Companion, Princeton University Press. Chapter 5.

Advanced textbooks

Cameron, A. Colin and Pravin K. Trivedi (2005), Microeconometrics: Methods and Applications, Cambridge University Press. Chapter 21.
Wooldridge, Jeffrey M. (2010), Econometric Analysis of Cross Section and Panel Data, MIT Press. Chapter 10.

Articles

Arellano, Manuel (1987), Computing Robust Standard Errors for Within-Group Estimators, Oxford Bulletin of Economics and Statistics, 49, 431–434.
Bertrand, M., E. Duflo and S. Mullainathan (2004), How Much Should We Trust Differences-in-Differences Estimates?, Quarterly Journal of Economics, 119(1), 249–275.
de Chaisemartin, Clément and Xavier D’Haultfœuille (2020), Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects, American Economic Review, 110(9), 2964–2996.
Mundlak, Y. (1978), On the pooling of time series and cross section data, Econometrica, 46, 69–85.
Stock, James H. and Mark W. Watson (2008), Heteroskedasticity-Robust Standard Errors for Fixed Effects Panel Data Regression, Econometrica, 76(1), 155–174. [advanced]
