SurveyData 3
Outline
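1 Analysis of Survey Data in STATA
2 Pooled Cross-Sections
3 Panel Estimators
4 Analysis of Binary and Ordinal (Qualitative) Variables
5 Sample Selection and Attrition Bias
6 Sampling Weights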
Analysis of Survey Data in STATA
In general:
Implications of Truncation
Micro survey data of macroeconomic expectations is typically
truncated, at least for non-experts
Pooled Cross-Sections
Pooled OLS
Assumptions:
\[ y_t = x_t \beta + u_t, \qquad t = 1, 2, \dots, T \tag{1} \]
⇒ Homoscedasticity
4 $E(u_t u_s x_t' x_s) = 0, \quad t \neq s, \quad t, s = 1, 2, \dots, T$
⇒ No serial correlation
Diff-in-diff Estimation
Natural experiment: One control group (A) and one treatment group
(B, dummy dB)
Two time periods: One before the policy change, one afterwards
(dummy d2)
Research question: What is the effect of the change in unemployment
insurance on the individual labor market experience?
Estimation equation:
\[ y_{i,t} = \beta_0 + \beta_2\, dB_{i,t} + \beta_3\, d2_t + \delta_1\, d2_t \cdot dB_{i,t} + u_{i,t} \tag{2} \]
OLS estimator of $\delta_1$:
\[ \hat{\delta}_1 = (\bar{y}_{B,2} - \bar{y}_{B,1}) - (\bar{y}_{A,2} - \bar{y}_{A,1}) \tag{3} \]
where $\bar{y}_{A,1}$, $\bar{y}_{A,2}$, $\bar{y}_{B,1}$ and $\bar{y}_{B,2}$ are the sample averages of y for the
control (A) and the treatment (B) group before (period 1) and after (period 2)
the policy change
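Because equation (2) contains only dummies, the OLS estimate of the interaction coefficient can be computed from the four group-by-period sample means. The following is a minimal sketch of that calculation; the function name `did_estimate` and the toy numbers are made up for illustration and are not taken from the slides.

```python
from statistics import mean

def did_estimate(records):
    """Difference-in-differences from group/period sample means.

    records: iterable of (y, dB, d2) tuples, where dB = 1 marks the
    treatment group B and d2 = 1 marks the second (post-policy) period.
    """
    cells = {}
    for y, dB, d2 in records:
        cells.setdefault((dB, d2), []).append(y)
    ybar = {cell: mean(values) for cell, values in cells.items()}
    change_treated = ybar[(1, 1)] - ybar[(1, 0)]  # group B: after minus before
    change_control = ybar[(0, 1)] - ybar[(0, 0)]  # group A: after minus before
    return change_treated - change_control

# Hypothetical toy data, invented for illustration only: (y, dB, d2)
toy = [
    (10.0, 0, 0), (12.0, 0, 0),  # control, before: mean 11
    (11.0, 0, 1), (13.0, 0, 1),  # control, after:  mean 12
    (9.0, 1, 0), (11.0, 1, 0),   # treated, before: mean 10
    (13.0, 1, 1), (15.0, 1, 1),  # treated, after:  mean 14
]
delta1_hat = did_estimate(toy)
print(delta1_hat)  # 3.0
```

In the toy data the treated group's mean rises by 4 and the control group's by 1, so the estimated policy effect is 3.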
Clustering: Data may differ across certain groups, either different time
periods or different cross-section groups
Panel Estimators
In principle, we can test for fixed vs. random effects (Hausman test),
but intuitively, individual fixed effects make more sense ⇒ they account for
unobserved constant individual effects
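For reference, the Hausman statistic in its usual textbook form is sketched below. The notation is an assumption of this note, not taken from the slides: $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ are the fixed and random effects estimates of the coefficients on the $M$ time-varying regressors, and $\widehat{\operatorname{Avar}}$ denotes their estimated asymptotic variances.

```latex
\[
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[ \widehat{\operatorname{Avar}}(\hat{\beta}_{FE})
         - \widehat{\operatorname{Avar}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
  \;\overset{a}{\sim}\; \chi^2_M
\]
```

Under the null hypothesis that the individual effects are uncorrelated with the regressors, both estimators are consistent and $H$ should be small; a large value speaks against random effects.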
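1 Exogeneity Assumption:
\[ E(u_{it} \mid x_i, c_i) = 0, \qquad t = 1, 2, \dots, T \tag{8} \]
⇒ Strict exogeneity of $x_{it}$ conditional on $c_i$
2 Rank condition:
\[ \operatorname{rank}\left( \sum_{t=1}^{T} E\left[ (x_{it} - \bar{x}_i)' (x_{it} - \bar{x}_i) \right] \right) = K \tag{9} \]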
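The rank condition is stated in terms of time-demeaned regressors, which is the transformation behind the fixed effects (within) estimator. A minimal single-regressor sketch follows; the function name `within_estimator` and the toy panel are hypothetical and chosen only to show that demeaning removes the constant individual effect.

```python
from collections import defaultdict
from statistics import mean

def within_estimator(panel):
    """Fixed effects (within) slope for a single regressor.

    panel: iterable of (unit_id, x, y) rows. Each unit's time averages are
    subtracted from x and y, which removes the constant effect c_i; the
    slope is then pooled OLS on the demeaned data.
    """
    by_unit = defaultdict(list)
    for unit, x, y in panel:
        by_unit[unit].append((x, y))
    sxx = sxy = 0.0
    for rows in by_unit.values():
        xbar = mean(x for x, _ in rows)
        ybar = mean(y for _, y in rows)
        for x, y in rows:
            sxx += (x - xbar) ** 2
            sxy += (x - xbar) * (y - ybar)
    if sxx == 0.0:
        # No within-unit variation in x: the rank condition fails.
        raise ValueError("x does not vary over time within any unit")
    return sxy / sxx

# Hypothetical toy panel with y = c_i + 2 * x, c_a = 1 and c_b = 10
toy_panel = [
    ("a", 1.0, 3.0), ("a", 2.0, 5.0), ("a", 3.0, 7.0),
    ("b", 2.0, 14.0), ("b", 4.0, 18.0),
]
print(within_estimator(toy_panel))  # 2.0
```

The very different levels of the two units do not affect the estimate, because only deviations from each unit's own mean enter the calculation.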
Between estimator:
\[ \bar{y}_i = \bar{x}_i \beta + c_i + \bar{u}_i \tag{12} \]
⇒ Eliminates the time dimension by taking time averages for each cross-section
unit; estimation uses only the variation between units
⇒ In STATA: xtreg depvar indepvars [if] [in] [weight], be [options]
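As an illustration of equation (12), the sketch below collapses a panel to unit means and runs a simple regression on them. It is a single-regressor sketch under stated assumptions: the name `between_estimator`, the inclusion of an intercept, and the toy data are choices made here, not taken from the slides.

```python
from collections import defaultdict
from statistics import mean

def between_estimator(panel):
    """Between estimator for a single regressor.

    panel: iterable of (unit_id, x, y) rows. Returns (intercept, slope)
    from OLS of the unit means of y on the unit means of x.
    """
    by_unit = defaultdict(list)
    for unit, x, y in panel:
        by_unit[unit].append((x, y))
    xbars = [mean(x for x, _ in rows) for rows in by_unit.values()]
    ybars = [mean(y for _, y in rows) for rows in by_unit.values()]
    mx, my = mean(xbars), mean(ybars)
    sxx = sum((x - mx) ** 2 for x in xbars)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xbars, ybars))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical toy panel: unit means are (2, 5), (4, 9) and (6, 13)
toy_panel = [
    ("a", 1.0, 4.0), ("a", 3.0, 6.0),
    ("b", 3.0, 8.0), ("b", 5.0, 10.0),
    ("c", 5.0, 12.0), ("c", 7.0, 14.0),
]
intercept, slope = between_estimator(toy_panel)
print(intercept, slope)  # 1.0 2.0
```

In this toy panel the slope within each unit is 1 while the slope across unit means is 2, which shows that the between estimator picks up differences between units rather than changes over time.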
Generally: Serial correlation not a big issue with micro survey data,
applies more to aggregated (macro) panel models
xtcsd includes further tests by Frees (1995, 2004) and Friedman (1937)
Solution:
Use OLS regression with clustered standard errors, robust to
heteroscedasticity across cross-sections, and within cross-section
correlation
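The cluster-robust variance estimator behind such standard errors has the familiar sandwich form sketched below. The notation is an assumption of this note: there are $G$ clusters (here the cross-section units), $X_g$ and $\hat{u}_g$ stack the regressors and OLS residuals of cluster $g$, and finite-sample corrections are omitted.

```latex
\[
\widehat{\operatorname{Var}}(\hat{\beta})
  = (X'X)^{-1}
    \left( \sum_{g=1}^{G} X_g' \hat{u}_g \hat{u}_g' X_g \right)
    (X'X)^{-1}
\]
```

The middle term allows arbitrary correlation and heteroscedasticity within a cluster but treats different clusters as independent. In Stata, estimation commands typically request this through the `vce(cluster clustvar)` option.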
Analysis of Binary and Ordinal (Qualitative) Variables
Binary Variables
\[ P(y = 1 \mid x) = P(y^* > 0 \mid x) = P(e > -x\beta \mid x) = 1 - G(-x\beta) = G(x\beta) \tag{16} \]
Estimation by maximum likelihood
Robust or clustered standard errors to account for heteroscedasticity
or clustered correlations
\[ G(x\beta) = \Lambda(x\beta) \equiv \frac{\exp(x\beta)}{1 + \exp(x\beta)} \tag{18} \]
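The two usual choices of G can be written down directly. The sketch below assumes that the probit case uses the standard normal CDF (its equation is not shown in this section) and checks numerically the symmetry step $1 - G(-x\beta) = G(x\beta)$ used in equation (16); the function names and the index value are hypothetical.

```python
import math

def probit_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_cdf(z):
    """Logistic CDF, Lambda(z) = exp(z) / (1 + exp(z)), written stably."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

z = 0.7  # hypothetical value of the index x*beta
for G in (probit_cdf, logit_cdf):
    # Both densities are symmetric around zero, so 1 - G(-z) = G(z).
    assert abs((1.0 - G(-z)) - G(z)) < 1e-12

print(probit_cdf(0.0), logit_cdf(0.0))  # 0.5 0.5
```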
Marginal Effects
\[ \frac{\partial p(x)}{\partial x_j} = g(x\beta)\,\beta_j, \qquad \text{where } g(x\beta) \equiv \frac{dG}{d(x\beta)}(x\beta) \tag{19} \]
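Equation (19) says that every marginal effect is the coefficient scaled by the density at the index, so the effects depend on the point of evaluation. A minimal sketch for continuous regressors follows; the coefficients and the evaluation point are hypothetical values chosen so that the index equals zero.

```python
import math

def probit_pdf(z):
    """Standard normal density, the derivative of the probit G."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def logit_pdf(z):
    """Logistic density, Lambda(z) * (1 - Lambda(z))."""
    lam = 1.0 / (1.0 + math.exp(-z))
    return lam * (1.0 - lam)

def marginal_effects(beta, x, pdf):
    """Return g(x*beta) * beta_j for each regressor j."""
    index = sum(b * v for b, v in zip(beta, x))
    scale = pdf(index)
    return [scale * b for b in beta]

beta = [0.5, -1.0]  # hypothetical coefficients
x = [2.0, 1.0]      # hypothetical evaluation point, index = 0
print(marginal_effects(beta, x, logit_pdf))  # [0.125, -0.25]
```

At an index of zero the logistic density is 0.25 and the normal density is about 0.399, so logit and probit coefficients of the same size imply different marginal effects.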
Bi-Probit Estimators
Ordinal Variables
The dependent variable is ordinal, e.g. it takes values from 1 to 5 and the
ordering matters
This is often the case for qualitative survey data, e.g. a Likert scale: (1 – like),
(2 – like somewhat), (3 – neutral), (4 – dislike somewhat), (5 – dislike)
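The model for such data is not spelled out in this section. A common choice is the ordered probit, in which a latent index is cut into categories by increasing thresholds. The sketch below assumes that model; the function name, the index value and the cutpoints are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(index, cutpoints):
    """P(y = j | x) for j = 1..J, given x*beta and J-1 increasing cutpoints.

    Category j is observed when the latent variable falls between
    cutpoint j-1 and cutpoint j (with -inf and +inf at the ends).
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [normal_cdf(b - index) for b in bounds]
    return [cdf[j + 1] - cdf[j] for j in range(len(bounds) - 1)]

# Hypothetical index and cutpoints for a five-point Likert item
probs = ordered_probit_probs(index=0.3, cutpoints=[-1.5, -0.5, 0.5, 1.5])
print(len(probs), round(sum(probs), 12))  # 5 1.0
```

Unlike a linear regression on the category codes, this uses only the ordering of the answers and not the distance between the codes.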
Sample Selection and Attrition Bias
Regression equation:
\[ y_i = x_i'\beta + \varepsilon_i \quad \text{observed only if } z_i = 1 \tag{25} \]
\[ (u_i, \varepsilon_i) \sim \text{bivariate normal}[0, 0, 1, \sigma_\varepsilon, \rho] \]
2 Estimate $y_i = x_i'\beta + \beta_\lambda \hat{\lambda} + \varepsilon_i$ by least squares to obtain estimates of
$\beta$ and $\beta_\lambda$
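The first step is not shown in this section. In the standard two-step procedure of Heckman (1979), $\hat{\lambda}$ is the inverse Mills ratio $\phi(\cdot)/\Phi(\cdot)$ evaluated at the fitted index of a first-step probit for $z_i$. The sketch below assumes that setup; the index values and regressors are hypothetical, and the least squares step itself is left out.

```python
import math

def inverse_mills(index):
    """Inverse Mills ratio phi(index) / Phi(index) for a standard normal."""
    pdf = math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))
    return pdf / cdf

# Hypothetical fitted selection indices for three observed units (z_i = 1)
selection_index = [-1.0, 0.0, 1.5]
lambda_hat = [inverse_mills(v) for v in selection_index]

# Hypothetical regressors x_i (constant and one covariate); the second-step
# regression uses x_i augmented by lambda_hat as an extra column.
x_rows = [[1.0, 0.2], [1.0, 0.5], [1.0, 0.9]]
augmented = [row + [lam] for row, lam in zip(x_rows, lambda_hat)]
print([round(lam, 3) for lam in lambda_hat])
```

Units with a low selection index receive a large $\hat{\lambda}$, so the added regressor is largest where selection into the sample was least likely.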
Sampling Weights
Literature I
Bruno, G. S. (2005).
Approximating the bias of the LSDV estimator for dynamic unbalanced panel
data models.
Economics Letters 87 (3), 361–366.
Literature II
Frees, E. W. (1995).
Assessing cross-sectional correlations in panel data.
Journal of Econometrics 64, 393–414.
Frees, E. W. (2004).
Longitudinal and Panel Data: Analysis and Applications in the Social
Sciences.
Cambridge University Press.
Friedman, M. (1937).
The use of ranks to avoid the assumption of normality implicit in the analysis
of variance.
Journal of the American Statistical Association 32, 675–701.
Literature III
Greene, W. (2012).
Econometric Analysis (7th ed.).
Pearson Education.
Heckman, J. J. (1979).
Sample selection bias as a specification error.
Econometrica 47 (1), 153–161.
Nickell, S. (1981).
Biases in dynamic models with fixed effects.
Econometrica 49 (6), 1417–1426.
Pesaran, M. H. (2004).
General diagnostic tests for cross section dependence in panels.
Cambridge Working Paper in Economics 0435.
Literature IV
Wooldridge, J. M. (2010).
Econometric Analysis of Cross Section and Panel Data (2nd ed.).
Cambridge, MA: MIT Press.