Intro To Panel Data Analysis Using Stata-UiTM Perlis-Mei2015

The document outlines a two-day workshop on introducing panel data analysis using Stata 11. The workshop covers basic Stata commands, importing and working with panel data, and estimating several panel data models, including pooled OLS, fixed effects, random effects, and tests for model selection. Participants will learn how to interface with Stata, import data, use basic commands and panel data commands, and conduct hands-on analysis of panel data models.


INTRODUCTION TO PANEL DATA ANALYSIS USING STATA 11

By:
Mahyudin Ahmad, PhD
UiTM Perlis.

Introduction to Stata11 & Panel Data Analysis –UiTM Perlis, 8 Mei 2015 Page 1
Outline of Panel Data workshop

Day 1, Morning – Session 1
1. Stata 11: an introduction
2. Setting up
3. Importing data to Stata 11
4. Basic commands
5. Panel data commands
6. Hands-on with Stata 11

Day 1, Evening – Session 2
1. Intro to Panel Data Analysis
2. Model 1: POLS
3. Model 2: FE-LSDV
4. Exercises

Day 2, Morning – Session 2 contd.
1. Model 3: FE-Within
2. Model 4: RE
3. Exercises

Day 2, Evening – Session 2 contd.
1. Test and model selection
2. Conclusion
3. Exercises

Session 1: INTRODUCTION TO STATA 11

Learning outcomes
At the end of Session 1, participants will be able to:

1. familiarise themselves with the Stata 11 interface,
2. perform the setting-up steps before starting any work in Stata 11,
3. import data from an Excel or *.csv file into the Stata data editor,
4. use the basic Stata 11 commands for creating variables, labelling variables, defining value labels and summarising data, and
5. familiarise themselves with the *.do file and understand and appreciate its importance in Stata.

1. Stata 11: an introduction

[Figure: the Stata 11 interface]

[Figure: menu tabs and buttons, with callouts for the "Create/open do file", "Log begin" and "Data editor" buttons]

2. Setting up - basics

Basic setting up steps:

set mem 1000m

set more off

cd "D:\<your working directory name>"

log using <log file name>.log

It is advisable to run these steps before starting any regression. The log file
records all the work you do, including the regression results/outputs.

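The setting-up steps can be collected at the top of a do file. The following is a minimal sketch only: the folder D:\panelworkshop and the log name session1.log are hypothetical, and the capture log close line and the replace option are additions that are not part of the workshop slides.

```stata
* Basic setting-up steps (folder and file names are hypothetical)
capture log close                  // close a log left open by an earlier run, if any
set mem 1000m                      // needed in Stata 11 and earlier; later versions manage memory automatically
set more off                       // do not pause long output
cd "D:\panelworkshop"              // assumed working directory; it must already exist
log using "session1.log", replace  // replace overwrites an existing log with the same name
```

When the work is finished, log close stops the recording.
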
2. Setting up - basics

[Figure: the Stata 11 main window. Callouts: the list of previously executed commands; the panel where output/results appear; the box where we write commands; the list of variables, which appears once we load the data.]

Once your directory is specified, it is shown in the bottom left panel of the Stata interface.

2. Setting up – do file

Some people may prefer the “click and execute” style, but keeping a record of
your work is paramount, since research is a long-term process. You may want to
reuse your preferred method in future with different sets of data.

It is IMPORTANT to have a do file because it lets you:

1. Keep track of your work
2. Plan your regressions properly
3. Avoid repeating the same regressions over and over again
4. Record the necessary information when running a regression (such as its
purpose, the variables included, the sample used, etc.)
5. Finally, reuse your work easily in future

2. Setting up – do file

[Figure: creating a do file with the "Create/open do file" button]

[Figures: sample do files]

2. Setting up – do file

Once you have written down your notes and commands, you can execute them.

Highlight the commands, and click the “Execute Selection (do)” button.

So next time, if you have already meticulously planned your work, you can run the whole regression with only one click!

2. Setting up – do file

Stata output when you execute commands using a do file:

Lines in the do file that start with * are treated as notes and are simply echoed in the output panel.

Commands in the do file are executed and their results are shown in the output panel.

An unsuccessful command appears in red; we can correct it based on the message given. In the example, r(170) is returned because there is no folder with that name (it has not yet been created).

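A short do file written in this style might look like the sketch below. It assumes an internet connection and uses the Princeton sample dataset and the variables y and x1 that appear later in this workshop; the notes and the choice of commands are illustrative only.

```stata
* Purpose  : first look at the sample panel data
* Variables: y (outcome), x1 (predictor)
* Lines starting with * are notes; Stata only echoes them in the output panel
use http://dss.princeton.edu/training/Panel101.dta, clear
describe              // general description of the data
summarize y x1        // basic statistics
regress y x1          // a command: it is executed and its results appear in the output panel
```

Highlighting all the lines and clicking “Execute Selection (do)” runs them in order.
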
3. Importing data to Stata 11

First time loading data from Excel:

REMEMBER: Sort your data vertically according to panel group (column 1),
time (column 2), followed by the variables. In the example below, the ten rows
for Albania form the first cross-sectional unit, and the year column is the
time dimension.

country   year   gdp            sav            pop
Albania   1990   6.75179343     20.9783993     1.6
Albania   1991   -11.4142038    -13.0284996    -0.2
Albania   1992   -27.5896031    -75.4131012    -1.6
Albania   1993   -5.69153612    -33.6716003    -1.4
Albania   1994   11.1974627     -9.88263035    0.2
Albania   1995   9.1941036      -3.94799995    1.2
Albania   1996   7.55757392     -11.8118       1.3
Albania   1997   7.73893405     -9.25912952    1.2
Albania   1998   -8.06352119    -6.69585991    1.1
Albania   1999   missing        -1.66910005    1.1
Algeria   1990   2.29575915     27.4666996     2.5
Algeria   1991   -3.72084675    36.6562004     2.4
Algeria   1992   -3.55414336    32.3755989     2.4
Algeria   1993   -0.79384221    27.8384991     2.3
Algeria   1994   -4.35723136    27.0359993     2.2
Algeria   1995   -3.31007521    28.4333992     2.2
Algeria   1996   1.59040861     31.4230003     2.2
Algeria   1997   1.58921549     32.1985016     2.2
Algeria   1998   -1.03429441    27.0669003     2.1
Algeria   1999   1.44857954     31.6912003     2.1

3. Importing data to Stata 11
Once you have sorted your data, follow these steps to import data to Stata 11 from Excel.
1. From your Excel file, copy all the data, including the variable names
2. Click the data editor button
3. The data editor window will pop up
4. Highlight the top left cell, and press Ctrl+V to paste your data
5. You’ll be asked how to treat the first row: as data or as variable names
6. Choose: Treat first row as variable names
7. Save it as <your dta filename>.dta. The data will be saved in the working folder set earlier by the cd command.

3. Importing data to Stata 11

This is how your data will appear in the data editor window.

Check the pasted data: if your data appear in red, Stata has not recognized them as numeric and is treating them as text. Go back to Excel and check the format of the cells where you put the data.

Variables such as obs, id or yr will appear in blue once you have labelled them and defined their value labels. We’ll learn the commands later.

3. Importing data to Stata 11

You may start labelling the variables, checking descriptive statistics, etc. by using the commands in the next slides.

Next time, when coming back to your regression, load the saved data via the “Open” button or simply run the following command:

use "<your dta filename>.dta"

(Note: you must run the basic setting-up steps first; refer to “2. Setting up - basics”.)

[Figure callout: variable names appear in the variables panel; right-click to see the options to edit a variable.]

3. Importing data to Stata 11
Importing *.csv data file using point and click menu:
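
The same import can also be done from the command line. A minimal sketch for Stata 11, assuming a hypothetical comma-separated file mydata.csv in the working directory whose first row holds the variable names:

```stata
* Import a comma-separated file (mydata.csv is a hypothetical file name)
insheet using "mydata.csv", comma names clear
describe                        // check that numeric columns were not read as strings
save "mydata.dta", replace      // keep a .dta copy in the working directory
```

In Stata 13 and later, import delimited supersedes insheet.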

4. Basic commands

Commands normally used (the list below is not exhaustive; you can learn more
from sources widely available on the internet):

use "<your dta filename>.dta" (Note: the data must be in the specified folder, in .dta format)
help, or for help on a specific command, help <command>
describe or des: to obtain a general description of the data
summarize or sum: to obtain basic statistics of the data
tab <variable id>: to check the frequency of a specific variable
gen <newvar> = <expression>: to generate a new variable based on the expression given.
Example: gen laggrowth = l.growth (the lag operator l. only works after the data have been declared as panel or time series, e.g. with xtset)
Example: gen squaredwage = wage^2

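The basic commands can be tried together as in the sketch below. It assumes a hypothetical file mydata.dta holding the columns of the sample table shown earlier (country as text, year, gdp, sav, pop); the new variable names are illustrative.

```stata
use "mydata.dta", clear            // hypothetical file: country, year, gdp, sav, pop
describe
summarize gdp sav pop
tab country
gen gdp_sq = gdp^2                 // squared term
encode country, gen(countryid)     // numeric id built from the text country name
xtset countryid year               // declare the panel so that lag operators work
gen lag_gdp = L.gdp                // previous year's gdp within each country
```

Without the xtset line, the last command stops with an error because Stata does not know the time variable.
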
4. Basic commands

More basic commands:

label var <var id> "label": to label a variable (see “3. Importing data to Stata 11”)

For categorical data coded as numbers, e.g. the airline variable i, which is
coded 1, 2, …, 6 to reflect 6 airlines, we can define value labels for the
codes for easier reference.

Once the data are loaded, we first label the variable i as "Airline in US" and
then attach an appropriate label to each of its codes. The commands are:

label var i "Airline in US"
label define airlinename 1 "Lufthansa" 2 "Delta" 3 "US Airways" 4 "Atlantic" 5 "Ryanair" 6 "Skyway"
label value i airlinename

4. Basic commands
To define value labels for a variable using the Variables Manager menu:

[Figure: the Variables Manager window]

More basic commands can be found here: http://dss.princeton.edu/training/StataTutorial.pdf

Hands-on with Stata 11

• Please use the dataset and do file given at this download link: http://goo.gl/TX0ybe
• Start with “uum do file 1 handson” first.

5. Panel data command

Commands normally used in panel data analysis. Panel data commands start
with xt.
help xt: to obtain the help files on the xt commands
xtset id year: to inform Stata that our data is a panel
xtsum: to summarize the data; gives overall, between and within statistics and the number of observations
xtdes: to describe the pattern of the panel (short for xtdescribe)
xtreg: to run a panel regression. We’ll look at this in more detail later.
xtabond: difference GMM command
xtdpdsys: system GMM command
xtabond2: a user-written command that is a lot better than the above two; it is capable of doing both difference GMM (dGMM) and system GMM (sGMM). We’ll look at this in more detail later.

We’ll learn more commands from time to time during this workshop.

5. Panel data command

Other commands frequently used in panel data analysis are:

encode country, gen(countryid): to convert the country names in our dataset into a numeric country id (because xtset needs a numeric panel identifier)
testparm i.country: to test for country-specific effects; the null hypothesis is that there is no country-specific effect (run after the LSDV command)
testparm i.year: to test for time-specific effects; the null hypothesis is that there is no time effect (run after a regression command that includes the i.year variables)
vif: variance inflation factor, to detect collinearity; run after the regression command. As a rule of thumb, a mean VIF below 10 indicates no serious collinearity.
xttest3: to test for heteroskedasticity; run after the FE command. The null hypothesis is no heteroskedasticity. This is a user-written command.
xtserial <varlist>: to test for serial correlation (sc); the null hypothesis is no sc. This is a user-written command.

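Declaring and inspecting the panel can be sketched as follows, using the Princeton sample dataset referred to elsewhere in this workshop. The sketch assumes an internet connection and that country is already stored as a numeric variable in that file; with a text identifier, encode would be needed first.

```stata
use http://dss.princeton.edu/training/Panel101.dta, clear
xtset country year     // panel identifier first, then the time variable
xtdescribe             // number of entities, time span and participation pattern
```

User-written commands such as xtabond2, xttest3 and xtserial are not shipped with Stata and must be installed before use, for example by typing findit xttest3 and following the installation link.
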
6. Simple example

Source: http://www.princeton.edu/~otorres/Panel101.pdf

Session 2: INTRODUCTION TO PANEL DATA ANALYSIS USING STATA 11

Outline

1. Introduction to Panel data
2. CLRM assumptions
3. Static Panel data models
   - Pooled OLS
   - Fixed effects-LSDV
   - Fixed effects-Within estimation
   - Random Effects
   - Other models, i.e. BE, FD.
4. Tests and model selection
5. Conclusions & Exercises

Main references: Badi Baltagi (2008), Cameron & Trivedi (2005),
http://www.princeton.edu/~otorres/Panel101.pdf

Learning outcomes
At the end of the workshop, participants will be able to:

1. elaborate on the advantages of panel data,
2. identify biases in panel data estimation due to individual heterogeneity,
3. perform empirical analyses for panel data and interpret the results,
4. distinguish differences in results due to different estimation techniques, and
5. decide on the appropriate estimation model for their panel data.

1. Introduction

• 3 types of data structure: cross section, time series, and panel or longitudinal data
  - Cross section: a sample of individuals, countries, etc. at a specific point in time (i = 1, ..., N; T = 1).
  - Time series: observations of one or more variables over time, in chronological order (N = 1; t = 1, ..., T).
  - Panel data: combines both dimensions! They are repeated observations on the same cross section for several time periods.
    - Classical panel data: N > T, also called a short or micro panel
    - Macro panel: T > N, also called a long panel
• Balanced panel: data are available for all individuals in all periods, so the number of observations is n = NT
• Unbalanced panel: T differs across individuals, but Stata can take care of this issue.

1. Introduction

• The choice of econometric method depends on the type of data:
  - Least Squares Regression: normally applied to cross-sectional data sets (e.g. OLS).
  - Time-Series Econometrics: normally applied to time series data, to uncover long-run relations and short-run dynamics (e.g. unit root tests, cointegration, VECM)
  - Panel Data Modeling: normally used to capture heterogeneity across units and when a bigger sample size is needed
    - Static panel data models: FE, RE, POLS, BE
    - Dynamic panel data models: Difference GMM/System GMM
    - Panel unit root and cointegration (macro panel)

1. Introduction

• Our focus: short or micro panel data, where N > T. In the example below, the ten rows for Albania form the first cross-sectional unit, and the year column is the time dimension.

name      code   year   gdp            sav            pop
Albania   ALB    1990   6.75179343     20.9783993     1.6
Albania   ALB    1991   -11.4142038    -13.0284996    -0.2
Albania   ALB    1992   -27.5896031    -75.4131012    -1.6
Albania   ALB    1993   -5.69153612    -33.6716003    -1.4
Albania   ALB    1994   11.1974627     -9.88263035    0.2
Albania   ALB    1995   9.1941036      -3.94799995    1.2
Albania   ALB    1996   7.55757392     -11.8118       1.3
Albania   ALB    1997   7.73893405     -9.25912952    1.2
Albania   ALB    1998   -8.06352119    -6.69585991    1.1
Albania   ALB    1999   missing        -1.66910005    1.1
Algeria   DZA    1990   2.29575915     27.4666996     2.5
Algeria   DZA    1991   -3.72084675    36.6562004     2.4
Algeria   DZA    1992   -3.55414336    32.3755989     2.4
Algeria   DZA    1993   -0.79384221    27.8384991     2.3
Algeria   DZA    1994   -4.35723136    27.0359993     2.2
Algeria   DZA    1995   -3.31007521    28.4333992     2.2
Algeria   DZA    1996   1.59040861     31.4230003     2.2
Algeria   DZA    1997   1.58921549     32.1985016     2.2
Algeria   DZA    1998   -1.03429441    27.0669003     2.1
Algeria   DZA    1999   1.44857954     31.6912003     2.1

2. Advantages and Disadvantages

• Panel data allows you to control for variables you cannot observe or measure, such as:
  - time-invariant factors, like a country's access to the sea (trade-growth analysis), ability (income analysis), firm management characteristics (firm profitability analysis), etc.
  - individual-invariant factors, i.e. variables that change over time but not across entities, like national policies, federal regulations, international agreements, etc., or macro data like growth and the inflation rate in an analysis of firms’ profitability
• In other words, panel data is able to account for individual heterogeneity (which naturally exists in pooled data), thereby giving efficient estimates.

2. Advantages and Disadvantages

• Advantages:
  - Larger sample size → the data are more informative, with more variation (normally between variation > within variation) and less collinearity (which is often a problem in time series) → leading to increased precision of the estimates.
  - Ability to study dynamics (adjustment over time), because the same cross-sectional units are observed repeatedly
  - Ability to account for heterogeneity across individuals, which is often ignored in pooled data; the estimates are more robust against misspecification due to omitted variable bias
• Disadvantages:
  - Data availability/maintenance
  - Measurement error/distortions
  - Self-selection bias, i.e. factors other than the group heterogeneity

3. Recall: CLRM assumptions

Table 3.1 above is taken from page 37, “Applied Econometrics”, Asteriou & Hall, 2nd ed. 2011, Palgrave Macmillan.

3. Recall: CLRM assumptions

• But what if the assumptions do not hold?
• Say $\mathrm{Cov}(x_{it}, u_{it}) \neq 0$; what will happen?
  - There is correlation between the entity’s error term and the predictor variables.
  - We assume that something within the error term (unique characteristics of the panel units) may affect or bias the predictor or outcome variables.
  - E.g. schooling affects the level of income; however, ability (an unobserved factor) also affects one’s schooling level and earnings.
  - We need to control for this factor.
  - $x_{it}$ is therefore “endogenous” (other sources of endogeneity are reverse causation and measurement error in an explanatory variable).
  - OLS will be biased in small samples and inconsistent in large samples.

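The size of the problem can be read off the probability limit of the OLS slope. The expression below is a sketch for the simple one-regressor model $y_{it} = \beta_0 + \beta_1 x_{it} + u_{it}$, which is assumed here for illustration:

```latex
\operatorname{plim}\ \hat{\beta}_1^{OLS}
  = \beta_1 + \frac{\operatorname{Cov}(x_{it}, u_{it})}{\operatorname{Var}(x_{it})}
```

The second term vanishes only when $\mathrm{Cov}(x_{it}, u_{it}) = 0$. In the schooling example, if unobserved ability raises both schooling and earnings, the covariance is positive and OLS overstates the return to schooling, no matter how large the sample is.
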
4. Panel data models

• Pooled OLS
• Fixed Effects
  - Least Square Dummy Variables
  - Within estimator
  - Between estimator
• Random Effects

• While we try to understand these methods, we can use sample data to run a simple regression with each of them.
• To load the data, execute this command in Stata:
use http://dss.princeton.edu/training/Panel101.dta

4. Panel data models

• Variations in panel data:
  - Time-invariant regressors (race, gender, etc.) have zero within variation.
  - Individual-invariant regressors (trend, policy, macro data, etc.) have zero between variation.

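The split into between and within variation can be inspected with xtsum. A minimal sketch, assuming the Princeton sample dataset with the variables country, year, y and x1:

```stata
use http://dss.princeton.edu/training/Panel101.dta, clear
xtset country year
xtsum y x1 year      // reports overall, between and within standard deviations
```

In a balanced panel, year should show a between standard deviation of zero, since every country has the same average year; a variable such as gender in a panel of individuals would instead show zero within variation.
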
4. Panel data models

• Pooled OLS

$y_{it} = \beta_0 + \beta_1 x_{it} + \varepsilon_{it}$ (1)

• Assumption: the intercept and slope coefficients are constant
  - $x_{it}$ is assumed exogenous, i.e. uncorrelated with $\varepsilon_{it}$
  - $\varepsilon_{it}$ is an iid error
• In this case, we simply pool the data (arranged as a panel, hence a higher number of observations) and run OLS.
• Pooled OLS is subject to heterogeneity bias if the assumption that $\varepsilon_{it}$ is an iid error does not hold
• After loading the data, run the command: regress y x1

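A minimal sketch of the pooled OLS step on the sample data follows. The second regression, with standard errors clustered by country, is an addition that is not in the workshop slides; it is shown only because repeated observations on the same country are unlikely to be independent.

```stata
use http://dss.princeton.edu/training/Panel101.dta, clear
regress y x1                          // pooled OLS, Eq. (1)
regress y x1, vce(cluster country)    // same coefficients; standard errors allow correlation within a country
```
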
4. Panel data models

[Figure: pooled OLS regression line] The intercept and slope are constant across units and time.

4. Panel data models

• What if $x_{it}$ correlates with $\varepsilon_{it}$ due to the presence of fixed effects, since $\varepsilon_{it} = v_i + u_{it}$? → heterogeneity bias!
• Fixed effects may come from the individual or entity (ability, a country's access to the sea) and from time (macro variables, policies, etc.)
• In a panel data model, we can represent both fixed effects as:
  - One-way fixed effect:
    o entity fixed effect: $y_{it} = \beta_1 x_{it} + v_i + u_{it}$, or
    o time fixed effect: $y_{it} = \beta_1 x_{it} + \lambda_t + u_{it}$
  - Two-way fixed effects, i.e. include both entity AND time fixed effects: $y_{it} = \beta_1 x_{it} + v_i + \lambda_t + u_{it}$
• Solution: we need to remove the effect of these time-invariant characteristics ($v_i$) on the predictor variables ($x_{it}$) so that we can assess the predictors’ true effect.

4. Panel data models

• Assume a one-way entity fixed effect:

$y_{it} = v_i + \beta_1 x_{it} + u_{it}$ (2)

• The model in Eq. (2) shows that there will be different intercepts for different entities, but their slopes are constant (homogeneous) across time/entities.
• $\mathrm{Corr}(x_{it}, v_i) \neq 0$. Thus we need to control for $v_i$ to get an unbiased estimate of the coefficient on $x_{it}$.
• $v_i$ is assumed unique to the individual and should not be correlated with other individual characteristics (orthogonal across entities).
• $u_{it}$ is an iid error

4. Panel data models

• Fixed effects estimation can be done in two ways: Least Square Dummy Variables (LSDV) and Within estimation (WE)

• LSDV

$y_{it} = \sum_{i=1}^{N} \alpha_i D_i + \beta_1 x_{it} + u_{it}$ (3)

  - where $D_i$ is a dummy variable for each entity. N+K-1 regressors.
  - Run OLS on Eq. (3)
  - In Stata, type this command: regress y x1 i.country
  - Run testparm i.country to check for country-specific effects
  - Problem? → more dummy variables reduce the degrees of freedom. Cumbersome if the number of entities is large.

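The LSDV steps can be sketched on the sample data as follows, assuming that country is a numeric variable so that the factor-variable prefix i. can be applied to it:

```stata
use http://dss.princeton.edu/training/Panel101.dta, clear
regress y x1 i.country     // one dummy per country; the base country is omitted automatically
testparm i.country         // H0: all country dummies are jointly zero (no country-specific effect)
```

Factor variables such as i.country were introduced in Stata 11; in earlier versions the xi: prefix is needed, as in xi: regress y x1 i.country.
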
4. Panel data models

[Figure: LSDV regression lines by entity] The intercepts are different, but the slopes are constant across units and time.

4. Panel data models

• Within estimation; refer to Eq. (2) again:

$y_{it} = v_i + \beta_1 x_{it} + u_{it}$ (2)

• Take the time average of each variable for each cross section:

$\bar{y}_i = v_i + \beta_1 \bar{x}_i + \bar{u}_i$ (4)

where $\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$, $\bar{x}_i = \frac{1}{T}\sum_{t=1}^{T} x_{it}$ and $\bar{u}_i = \frac{1}{T}\sum_{t=1}^{T} u_{it}$

4. Panel data models

• Within estimation:
  - Now take Eq. (2) – Eq. (4):
$y_{it} - \bar{y}_i = (v_i + \beta_1 x_{it} + u_{it}) - (v_i + \beta_1 \bar{x}_i + \bar{u}_i)$
$y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + (u_{it} - \bar{u}_i)$
  - Now $v_i$ is eliminated; there is no heterogeneity bias anymore
  - “Deviation from individual mean” form, where $\tilde{y}_{it} = y_{it} - \bar{y}_i$, $\tilde{x}_{it} = x_{it} - \bar{x}_i$ and $\tilde{u}_{it} = u_{it} - \bar{u}_i$:
$\tilde{y}_{it} = \beta_1 \tilde{x}_{it} + \tilde{u}_{it}$ (5)
  - Run OLS on Eq. (5)

• Within estimation in Stata can be done via:
xtreg y x1, fe
areg y x1, absorb(country)

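The within transformation can also be done by hand to see that it reproduces the fixed effects slope. A sketch on the sample data; the names y_bar, x1_bar, y_dm and x1_dm are made up for this illustration:

```stata
use http://dss.princeton.edu/training/Panel101.dta, clear
xtset country year
bysort country: egen y_bar  = mean(y)     // time average of y for each country, as in Eq. (4)
bysort country: egen x1_bar = mean(x1)    // time average of x1 for each country
gen y_dm  = y  - y_bar                    // deviation from the individual mean
gen x1_dm = x1 - x1_bar
regress y_dm x1_dm, noconstant            // OLS on Eq. (5)
xtreg y x1, fe                            // built-in within estimator
```

The coefficient on x1_dm should equal the coefficient on x1 from xtreg, fe. The standard errors of the manual regression are too small, because it does not count the estimated country means as used degrees of freedom; xtreg, fe makes that correction.
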
Hands-on with Stata 11

• Hands-on 2 with Stata 11: use the “uum do file 2 handson” file

4. Panel data models

Random Effects
• The rationale behind the random effects model is that, unlike in the fixed effects model, the variation across entities is assumed to be random and uncorrelated with the predictor or independent variables included in the model.
“…the crucial distinction between fixed and random effects is whether the unobserved individual effect embodies elements that are correlated with the regressors in the model, not whether these effects are stochastic or not” [Greene, 2008, p.183]
• In other words: heterogeneity in FE is due to individual differences that correlate with X, BUT in RE the differences are random and uncorrelated with X.
• If you have reason to believe that differences across entities have some influence on your dependent variable, and that these differences are uncorrelated with the regressors, then you should use random effects.

4. Panel data models

• An advantage of random effects is that you can include time-invariant variables (e.g. gender) and obtain an estimate of their coefficients. In the fixed effects model these variables are absorbed by the intercept.
• RE – recall Eq. (2):

$y_{it} = \beta_1 x_{it} + v_i + u_{it}$ (2)

• Now assume that $v_i$ is a random effect and $\mathrm{Corr}(x_{it}, v_i) = 0$
  - Error component model (composite error: $\varepsilon_{it} = v_i + u_{it}$)
  - Variance of the error term: $\sigma^2_{\varepsilon} = \sigma^2_v + \sigma^2_u$
• In RE, $\varepsilon_{it}$ is therefore serially correlated due to the presence of the time-invariant component $v_i$.
• OLS will be inefficient due to this autocorrelation, and the OLS standard errors will be invalid → we need to use the GLS estimator.
• See the next slide for the transformation from OLS to GLS.

4. Panel data models
• RE model – the transformation:

$y_{it} - \theta \bar{y}_i = \beta_1 (x_{it} - \theta \bar{x}_i) + (1 - \theta) v_i + (u_{it} - \theta \bar{u}_i)$ (6)

where $\theta = 1 - \sqrt{\dfrac{\sigma^2_u}{T \sigma^2_v + \sigma^2_u}}$

  - If θ = 1, the RE estimator is identical to the FE (within) estimator
  - If θ = 0, the RE estimator is identical to the Pooled OLS estimator (same intercept)
  - In other words, the RE estimator lies between the Pooled OLS and FE within estimators, depending on the value of θ.
  - Run OLS on Eq. (6)
  - In Stata, the command is xtreg y x1, re
  - Run xtreg y x1, re theta to know the value of θ

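The value of θ can also be recomputed from the results that xtreg stores. This is a sketch under two assumptions: the sample panel is balanced, and the stored results e(sigma_u), e(sigma_e) and e(g_avg) are used as documented for xtreg. Note the clash of notation: Stata's sigma_u is the standard deviation of the entity effect (written $v_i$ in these slides), while sigma_e is that of the idiosyncratic error (written $u_{it}$ here).

```stata
use http://dss.princeton.edu/training/Panel101.dta, clear
xtset country year
xtreg y x1, re theta           // reports theta in the output header
scalar T = e(g_avg)            // average number of periods per country (equals T if balanced)
display 1 - sqrt(e(sigma_e)^2 / (T*e(sigma_u)^2 + e(sigma_e)^2))
```

If the panel is balanced, the displayed value should agree with the theta reported by xtreg.
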
4. Panel data models

• First difference estimator

$y_{it} - y_{i,t-1} = (v_i - v_i) + \beta_1 (x_{it} - x_{i,t-1}) + (u_{it} - u_{i,t-1})$

  - Now $v_i$ is also eliminated. But how does this differ from the within estimator?
  - It uses the deviation from the immediate lag instead of the deviation from the individual mean.
  - The first observation of each entity is lost.

• Between estimator

$\bar{y}_i = v_i + \beta_1 \bar{x}_i + \bar{u}_i$

  - Not so useful, as it wipes out the time variation completely.
  - Stata command to estimate the between estimator:
xtreg y x1 x2 x3, be

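Both estimators can be sketched on the sample data. The first-difference regression below uses Stata's difference operator D., which is an illustration added here and is not a command from the slides; it needs the panel to be declared with xtset first.

```stata
use http://dss.princeton.edu/training/Panel101.dta, clear
xtset country year
regress D.y D.x1, noconstant   // first-difference estimator; each country's first year drops out
xtreg y x1 x2 x3, be           // between estimator: regression on the country means
```
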
4. Panel data models

[Figure: results of the between estimation in Stata]

5. Tests for Model Selection

[Figures: Stata output of the model selection tests, including testparm i.year]

5. Tests

• Run the LSDV command: regress y x1 i.country, then run vif to check for collinearity. As a rule of thumb, a mean VIF below 10 indicates no serious multicollinearity.
• Run xtreg y x1 i.year, fe and then xttest3 to test for heteroskedasticity in the error term.
• If heteroskedasticity is present, run xtreg y x1 i.year, fe robust to deal with the problem.
• Run xtserial <all variables> to test for serial correlation (sc); the null hypothesis is no sc.

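The sequence of tests can be put into one do file, as in the sketch below. It assumes the sample dataset and that the user-written commands xttest3 and xtserial have already been installed (for example via findit xttest3 and findit xtserial).

```stata
use http://dss.princeton.edu/training/Panel101.dta, clear
xtset country year
regress y x1 i.country          // LSDV
vif                             // variance inflation factors
xtreg y x1 i.year, fe           // FE with year dummies
testparm i.year                 // H0: no time effect
xttest3                         // H0: no heteroskedasticity across countries
xtreg y x1 i.year, fe robust    // FE with robust standard errors
xtserial y x1                   // H0: no first-order serial correlation
```

xttest3 reads the results of the most recent xtreg, fe, so it has to be run before the model is re-estimated.
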
Hands-on with Stata 11

• Continue hands-on 2 with Stata 11; we are still using the “uum do file 2 handson” file

6. The best methods

The fixed effects estimator of β remains consistent even if the true model is not a fixed effects model.

7. Summary

• It’s all about how we treat $v_i$ in the equation:

$y_{it} = \beta_0 + \beta_1 x_{it} + \varepsilon_{it}$ (1), with $\varepsilon_{it} = v_i + u_{it}$

• If $v_i$ is assumed to be absent, then $\varepsilon_{it}$ is iid: run POLS
• If $v_i$ is assumed to be present and:
  - correlated with $x_{it}$: run fixed effects
  - uncorrelated with $x_{it}$: run random effects
• Test to find the best method using the Hausman or BP-LM tests

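The BP-LM test mentioned above is available in Stata as xttest0, run immediately after a random effects regression. A minimal sketch on the sample data:

```stata
use http://dss.princeton.edu/training/Panel101.dta, clear
xtset country year
xtreg y x1, re
xttest0          // Breusch-Pagan LM test; H0: Var(v_i) = 0, i.e. pooled OLS is adequate
```

Rejecting the null points to random effects rather than pooled OLS; the choice between random and fixed effects is then made with the Hausman test.
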
8. Exercise

• Exercise 1 files are downloadable from here: http://goo.gl/eYS9Fx

[Figures: Stata output for Pooled OLS and for Fixed effect LSDV]

8. Exercise

Comparing POLS and LSDV coefficients

The commands are:

regress c q pf lf
estimates store ols
xi: regress c q pf lf i.i
estimates store ols_dum
estimates table ols ols_dum, star stats(N)

8. Exercise

[Figure: Stata output for FE - Within estimation]

[Figure: Stata output for FE - Within estimation using the areg command]

[Figure: Stata output for Random effect estimation]

Note: if you want the value of theta, add the option theta at the end of the random effects command.

8. Exercise
Comparing POLS, LSDV, WE and RE coefficients

The commands are:

regress c q pf lf
estimates store ols
xi: regress c q pf lf i.i
estimates store lsdv
xtreg c q pf lf, fe
estimates store we
xtreg c q pf lf, re
estimates store re
estimates table ols lsdv we re, star stats(N)

8. Exercise
Hausman test

The commands are:

xtreg c q pf lf, fe
estimates store fe
xtreg c q pf lf, re
estimates store re
hausman fe re

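For reference, the statistic behind the hausman command compares the two sets of coefficient estimates. The standard textbook form is sketched below, where $k$ denotes the number of coefficients being compared (notation introduced here, not in the slides):

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[ \widehat{\operatorname{Var}}(\hat{\beta}_{FE}) - \widehat{\operatorname{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \ \sim\ \chi^2_k \quad \text{under } H_0
```

Under the null hypothesis, $v_i$ is uncorrelated with the regressors, so both estimators are consistent and RE is efficient. A large $H$ (small p-value) rejects the null and favours fixed effects.
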
8. Exercise
Testing for time fixed effects:

[Figure: Stata output of the test for time fixed effects]

8. Exercise 2

• If we have time, we can do Exercise 2 using the files downloadable from: http://goo.gl/ZFymdz

THE END
Thank you
