Panel Data Regression Models-Seminar
Kabul University
Economics Faculty
MFB program
Presenter: Hassib
Supervisor: Dr. Prof. Ajmal Arian
What is Panel Data?
Panel data are data collected on multiple subjects (individuals, firms, or countries) over multiple
points in time, allowing analysis of both individual and time-related effects.
They provide more information, and potentially more reliable estimates, than purely
cross-sectional or purely time-series data.
They combine cross-sectional and time-series data.
They are commonly used to study economic phenomena such as the effects of policies, market trends, and
individual behavior over time.
Other names: pooled data, combination of time series and cross-section data, micro panel
data, longitudinal data, and cohort analysis.
Some of the well-known panel data sets are:
Panel Study of Income Dynamics (PSID): conducted by the University of Michigan; a survey tracking
American families and their descendants since 1968.
National Longitudinal Survey of Youth (NLSY): conducted by the Bureau of Labor Statistics; follows
the lives of several cohorts of Americans born between 1957 and 1984.
European Community Household Panel (ECHP): a longitudinal survey in European
countries from 1994 to 2001.
German Socio-Economic Panel (SOEP): a longitudinal survey of German households covering
various aspects of social and economic life since 1984.
Panel Study of Belgian Households (PSBH): a survey of Belgian households tracking
socioeconomic changes over time.
Why Panel Data?
Advantages of panel data over purely cross-sectional or purely time-series
data:
Accounting for heterogeneity across individuals
Combining cross-sectional and time-series observations yields more informative and more efficient data
Panel data are better suited to studying the dynamics of change
Panel data can better detect and measure effects that cannot be observed in pure cross-section or pure time-series data
Enabling the study of more complicated models
Minimizing bias
Enriching empirical analysis
Illustrative Example
Pooling all 80 observations, we can write the Grunfeld investment function as:
Yit = β1 + β2X2it + β3X3it + uit
where i indexes the firms and t indexes the years.
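The pooled specification, with one intercept and one set of slopes shared by all firms and years, can be estimated by ordinary least squares on the stacked observations. The sketch below is only an illustration: it uses synthetic data (not the Grunfeld data), and the "true" coefficient values, the variable scales, and the 4 x 20 layout are assumptions chosen to give 80 observations.

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 4, 20                         # 4 units x 20 periods = 80 observations (synthetic)
beta = np.array([-60.0, 0.1, 0.3])   # hypothetical true beta1, beta2, beta3

# Synthetic regressors and disturbance, stacked unit by unit.
X2 = rng.normal(1000.0, 200.0, size=N * T)
X3 = rng.normal(300.0, 50.0, size=N * T)
u = rng.normal(0.0, 10.0, size=N * T)
Y = beta[0] + beta[1] * X2 + beta[2] * X3 + u

# Pooled OLS: a single constant and common slopes for every observation.
X = np.column_stack([np.ones(N * T), X2, X3])
b_pooled, *_ = np.linalg.lstsq(X, Y, rcond=None)

print("pooled OLS estimates (beta1, beta2, beta3):", b_pooled)
```

Because the pooled regression ignores which unit each row belongs to, any unit-specific differences end up in the error term.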
Estimation of Panel Data Regression Model
Depending on the assumptions made about the intercept, the slope coefficients, and the error term, there are several possibilities:
1. The intercept and slope coefficients are constant across time and space, and the
error term captures differences over time and individuals.
2. The slope coefficients are constant but the intercept varies over individuals.
3. The slope coefficients are constant but the intercept varies over individuals and time.
4. All coefficients (the intercept as well as slope coefficients) vary over individuals.
5. The intercept as well as slope coefficients vary over individuals and time.
All Coefficients Constant across Time and
Individuals
Slope Coefficients Constant but the Intercept Varies across
Individuals: The Fixed Effects or Least-Squares Dummy Variable
(LSDV) Regression Model
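A minimal sketch of the LSDV idea, assuming synthetic data and hypothetical parameter values: each unit gets its own dummy variable, so the intercept can differ across units while the slopes stay common. For simplicity the sketch uses one dummy per unit and no common constant; using a constant plus N - 1 dummies is an equivalent parameterisation. The same slope estimates are obtained from the within transformation (subtracting each unit's time mean), which the last lines check.

```python
import numpy as np

rng = np.random.default_rng(1)

N, T = 4, 20
unit = np.repeat(np.arange(N), T)              # unit index i for each stacked row
alpha = np.array([-50.0, 10.0, 80.0, 150.0])   # hypothetical unit intercepts beta_1i
b2, b3 = 0.1, 0.3                              # hypothetical common slopes

X2 = rng.normal(1000.0, 200.0, size=N * T)
X3 = rng.normal(300.0, 50.0, size=N * T)
Y = alpha[unit] + b2 * X2 + b3 * X3 + rng.normal(0.0, 10.0, size=N * T)

# LSDV: one dummy per unit, no common constant (avoids the dummy-variable trap).
D = (unit[:, None] == np.arange(N)).astype(float)
coef_lsdv, *_ = np.linalg.lstsq(np.column_stack([D, X2, X3]), Y, rcond=None)
intercepts_lsdv, slopes_lsdv = coef_lsdv[:N], coef_lsdv[N:]


def demean(v):
    """Subtract each unit's time mean from its observations."""
    means = np.bincount(unit, weights=v) / np.bincount(unit)
    return v - means[unit]


# Within transformation: OLS on unit-demeaned data, no constant.
Xw = np.column_stack([demean(X2), demean(X3)])
slopes_within, *_ = np.linalg.lstsq(Xw, demean(Y), rcond=None)

print("LSDV intercepts:", intercepts_lsdv)
print("LSDV slopes:    ", slopes_lsdv)
print("within slopes:  ", slopes_within)
```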
An alternative view: if the dummy variables in fact represent a lack of knowledge about the true
model, this ignorance can instead be expressed through the disturbance term. This approach is
called the Error Components Model (ECM) or Random Effects Model (REM).
The basic idea starts from the following equation:
Yit = β1i + β2X2it + β3X3it + uit
Instead of treating β1i as fixed, we assume that it is a random variable with mean value β1 (no subscript i):
β1i = β1 + εi,  i = 1, 2, ..., N
where εi is a random error term with zero mean.
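Substituting β1i = β1 + εi into the equation gives Yit = β1 + β2X2it + β3X3it + wit, where wit = εi + uit is a composite error with a unit-specific part and an idiosyncratic part. If εi and uit are independent with variances σ²ε and σ²u (the usual ECM assumptions, taken as given here), two errors from the same unit in different periods have correlation σ²ε / (σ²ε + σ²u). The sketch below checks this on simulated errors; the standard deviations and sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

N, T = 50_000, 2                 # many units, two periods each (synthetic)
sigma_eps, sigma_u = 2.0, 1.0    # hypothetical standard deviations

eps = rng.normal(0.0, sigma_eps, size=(N, 1))   # unit-specific component eps_i
u = rng.normal(0.0, sigma_u, size=(N, T))       # idiosyncratic component u_it
w = eps + u                                     # composite error w_it = eps_i + u_it

# Correlation between w_i1 and w_i2 implied by the error-components structure.
rho_theory = sigma_eps**2 / (sigma_eps**2 + sigma_u**2)
rho_sample = np.corrcoef(w[:, 0], w[:, 1])[0, 1]

print("theoretical correlation:", rho_theory)
print("sample correlation:     ", rho_sample)
```

This within-unit correlation is why OLS on the pooled data is inefficient under ECM, and why a generalized least-squares approach is normally used instead.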
FIXED EFFECTS (LSDV) VERSUS
RANDOM EFFECTS MODEL
One of the challenges for a researcher is choosing between FEM and ECM.
FEM is appropriate when εi and the X's are correlated; when they are uncorrelated,
ECM is appropriate.
If T (the number of time-series observations) is large and N (the number of cross-sectional units) is
small, there is likely to be little difference in the parameter estimates from FEM and ECM. The choice
can then be based on computational convenience, and FEM may be preferable.
If T is small and N is large, the estimates obtained by the two approaches can differ
significantly. Here statistical inference matters: if we believe the cross-sectional units are not random
drawings from a larger sample, FEM is appropriate; if they can be regarded as random drawings, ECM is appropriate.
If the individual error component εi and one or more regressors are correlated, then the ECM estimators
are biased, whereas those obtained from FEM are unbiased.
If N is large and T is small, and if the assumptions underlying ECM hold, ECM estimators are more
efficient than FEM estimators.
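The bias point can be illustrated with a small simulation. The sketch below is not a full ECM implementation: it assumes a single regressor, synthetic data, and known variance components (in practice these are estimated), and it computes the random-effects estimate by quasi-demeaning with θ = 1 - sqrt(σ²u / (σ²u + Tσ²ε)). The parameter gamma is a hypothetical knob that makes the regressor depend on εi.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, beta = 2000, 5, 1.0          # large N, small T; hypothetical true slope
sigma_eps, sigma_u = 1.0, 1.0      # hypothetical error-component std. deviations


def estimate(gamma):
    """Return (FE slope, RE slope) for one synthetic panel.

    gamma = 0 means eps_i is uncorrelated with x; gamma != 0 makes them correlated.
    """
    eps = rng.normal(0.0, sigma_eps, size=(N, 1))
    x = gamma * eps + rng.normal(size=(N, T))
    y = beta * x + eps + rng.normal(0.0, sigma_u, size=(N, T))

    # FE (within) estimator: remove unit means, then OLS without a constant.
    xw = x - x.mean(axis=1, keepdims=True)
    yw = y - y.mean(axis=1, keepdims=True)
    b_fe = (xw * yw).sum() / (xw**2).sum()

    # RE estimator: quasi-demean with theta, then OLS with a constant.
    theta = 1.0 - np.sqrt(sigma_u**2 / (sigma_u**2 + T * sigma_eps**2))
    xq = x - theta * x.mean(axis=1, keepdims=True)
    yq = y - theta * y.mean(axis=1, keepdims=True)
    xq_c = xq - xq.mean()
    b_re = (xq_c * (yq - yq.mean())).sum() / (xq_c**2).sum()
    return b_fe, b_re


b_fe0, b_re0 = estimate(gamma=0.0)   # ECM assumptions hold
b_fe1, b_re1 = estimate(gamma=1.0)   # eps_i correlated with the regressor

print("uncorrelated: FE =", b_fe0, " RE =", b_re0)
print("correlated:   FE =", b_fe1, " RE =", b_re1)
```

In the correlated case the FE estimate should stay close to the true slope while the RE estimate drifts away from it; in the uncorrelated case both should be close.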
Tests for choosing between FEM and ECM
Hausman Test: The null hypothesis is that the FEM and ECM estimators do not differ substantially.
The test statistic developed by Hausman has an asymptotic χ2 distribution.
If the null hypothesis is rejected, the conclusion is that ECM is not appropriate and that we
may be better off using FEM, in which case statistical inferences will be conditional on the
εi in the sample.
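As a rough sketch of how the statistic is usually computed, assume the FEM and ECM slope estimates and their estimated covariance matrices are already available. The common textbook form is H = d' [Var(b_FE) - Var(b_RE)]^(-1) d with d = b_FE - b_RE, compared against a χ2 distribution with as many degrees of freedom as there are slope coefficients. The function name and the input numbers below are made up purely for illustration.

```python
import numpy as np


def hausman_statistic(b_fe, b_re, v_fe, v_re):
    """Hausman statistic for the slope coefficients common to FEM and ECM.

    b_fe, b_re : slope estimates (length k)
    v_fe, v_re : their estimated covariance matrices (k x k)
    Returns (H, k); under the null, H is asymptotically chi-square with k df.
    """
    d = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    v_diff = np.asarray(v_fe, dtype=float) - np.asarray(v_re, dtype=float)
    H = float(d @ np.linalg.solve(v_diff, d))
    return H, d.size


# Made-up inputs (not estimates from any real data set).
b_fe = [1.0, 2.0]
b_re = [0.5, 2.0]
v_fe = np.diag([0.5, 0.5])
v_re = np.diag([0.25, 0.25])

CHI2_CRIT_5PCT_DF2 = 5.991   # 5% critical value of chi-square with 2 df

H, k = hausman_statistic(b_fe, b_re, v_fe, v_re)
reject_ecm = H > CHI2_CRIT_5PCT_DF2
print("H =", H, "df =", k, "reject ECM at 5%:", reject_ecm)
```

In finite samples the difference of the two covariance matrices is not guaranteed to be positive definite, so a real application would need to check for that before trusting the statistic.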
Johnston and DiNardo: "there is no simple rule to help the researcher navigate past the
Scylla of fixed effects and the Charybdis of measurement error and dynamic selection.
Although they are an improvement over cross-section data, panel data do not provide a
cure-all for all of an econometrician's problems."
Some concluding comments on panel
data regression models
Hypothesis testing with panel data.
Heteroscedasticity and autocorrelation in ECM.
Unbalanced panel data.
Dynamic panel data models in which the lagged value(s) of the regressand (Yit) appear as
an explanatory variable.
Simultaneous equations involving panel data.
Qualitative dependent variables and panel data.