
8. Specification of models; by K. M. Abadir

These slides highlight some essential ingredients in modelling. Technical details
are found throughout the earlier lecture notes; this lecture illustrates the
methodology with a time-series example (the “NP” data file). Further reference:
the PC-GIVE manual or D.F. Hendry’s (1995) Dynamic Econometrics, OUP.

A. Exploratory data analysis (EDA) and parametric model specification

The first thing to do is to look at the data and their descriptive properties.
Nonparametric (NP) analysis can be used to explore whether there is a nonlinear
relation between the variables and their past, and/or between the variables and
each other.
Time plots can be very informative in the quest for properties and/or to narrow
down possible models. For example, if there is a clear break or a clear trend,
then these should be included in the dynamic model for the series.
The partial ACF (PACF) plot helps identify the lag length of univariate AR
models: an AR($p$) with coefficients $\{\phi_j\}_{j=1}^{p}$ gives rise to a PACF with only
the first $p$ values significant, equal to estimates of $\{\phi_j\}_{j=1}^{p}$. Therefore, PACFs help
determine the lag length $p$. The PACF provides only rough guidance if the model
is more elaborate (e.g. the fractional I($d$) case, where the ACF is more informative).
For the lag length of volatility models (like GARCH), use the (P)ACF of $x_t^2$
instead of $x_t$.

However, both the ACF and PACF do not allow for a break and other such features,
which should be removed before calculating the (P)ACF. For example, in the “NP”
file, the PACF indicates an AR(1) for consumption (and also a unit root); while
if a break is allowed for, we get an AR($p$) with $p > 1$ (and a stationary series
with business cycles).
The regression for unit-root analysis should include a trend or break if the
possibility of one has been identified in the time plot. The tests by Pierre
Perron and coauthors are a formal way to discriminate between a unit root
and a break, including the case of breaks at unspecified points in time (the
procedure estimates the timing of the breaks). Outliers should also be identified
and dealt with, e.g. by means of dummies; this is the subject of the recent
“impulse saturation” work by Hendry, Ericsson, and coauthors, implemented
in PC-GIVE. A sketch of a break-augmented unit-root regression follows.
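The sketch below is illustrative only: the break date t0 is assumed known (Perron-type procedures estimate it), the series name is a placeholder, and the t-ratio on the lagged level must be compared with Perron-type rather than standard ADF critical values:

```python
# Sketch: ADF-type regression augmented with a trend and a level-shift dummy.
import numpy as np
import statsmodels.api as sm

def adf_with_break(y, t0, p=1):
    """Regress dy_t on a constant, trend, post-break dummy (t >= t0),
    the lagged level and p lagged differences; returns fitted OLS results."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[t] = y[t+1] - y[t]
    T = len(dy)
    t = np.arange(p, T)
    X = np.column_stack(
        [t,                               # linear trend
         (t >= t0).astype(float),         # level-shift dummy at assumed break
         y[p:T]]                          # lagged level
        + [dy[p - j:T - j] for j in range(1, p + 1)])  # lagged differences
    return sm.OLS(dy[p:], sm.add_constant(X)).fit()

res = adf_with_break(consumption, t0=100, p=2)   # names/values hypothetical
print(res.summary())
```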

When putting together two or more variables in an equation (like ECM), they
may be linked through other lags. Even if the PACFs indicate AR(1) for each
variable, the interaction of the two variables may require further lags; e.g.
current consumption changes may be linked to many lags of income changes.
The frequency of the data may dictate the longest lag length to start with;
economic or finance theory may be another source for this choice. In either
case, the lags should be sufficiently long and, if the choice is wrong, the tests
in the next section will detect it. (A mechanical way to build such a starting
specification is sketched below.)
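A minimal pandas/statsmodels sketch of building a generous starting specification from the data frequency; the series names dc and dinc (consumption and income changes) are hypothetical:

```python
# Sketch: build a starting ADL with a year of lags for monthly data.
import pandas as pd
import statsmodels.api as sm

p = 12                                  # monthly data: start with 12 lags
X = pd.DataFrame({f"dc_lag{j}": dc.shift(j) for j in range(1, p + 1)})
for j in range(0, p + 1):
    X[f"dinc_lag{j}"] = dinc.shift(j)   # income changes enter from lag 0
res = sm.OLS(dc, sm.add_constant(X), missing="drop").fit()
```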
B. Testing for misspecification (diagnostic tests)
The previous section can help identify a starting model, e.g. of the ECM type.
In general, the resulting regression errors, $\varepsilon_t$, should be ‘clean’ and contain:
1. no AR: check DW ≈ 2 (only a rough guide if RHS includes lagged dependent
variable) and LM test;
2. no heteroskedasticity: check general heteroskedasticity tests (e.g. White’s)
or specific ARCH etc. tests, use HAC SEs if necessary;
3. no omitted nonlinearities: check Ramsey’s RESET.
Look at the residuals and their ACF: do you see any remaining patterns (e.g.
cycle)? If so, then something is missing from the model; e.g. the lag length
was insufficient in the ADL or ECM.
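The same battery can be run outside PC-GIVE; a sketch using recent versions of statsmodels, where `res` is assumed to be the fitted OLS results of the starting regression:

```python
# Sketch: standard misspecification tests on a fitted OLS model "res".
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import (acorr_breusch_godfrey, het_white,
                                          het_arch, linear_reset)
from statsmodels.graphics.tsaplots import plot_acf

print(durbin_watson(res.resid))              # 1. DW, approx. 2 under no AR(1)
print(acorr_breusch_godfrey(res, nlags=4))   # 1. LM test for residual AR
print(het_white(res.resid, res.model.exog))  # 2. White's heteroskedasticity test
print(het_arch(res.resid, nlags=4))          # 2. ARCH LM test on residuals
print(linear_reset(res, power=3))            # 3. Ramsey's RESET
plot_acf(res.resid, lags=20)                 # look for remaining patterns/cycles
# If needed: HAC standard errors
res_hac = res.get_robustcov_results(cov_type="HAC", maxlags=4)
```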
For y = Xβ + ε, apart from ensuring a clean ε, another source of problems to
avoid is in the estimate of β. You should therefore check parameter stability,
e.g. by plotting recursive parameter estimates or by a Chow test.
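A sketch of the stability check, with y and X assumed to be the dependent variable and regressor matrix from above:

```python
# Sketch: recursive least squares for parameter-stability diagnostics.
import statsmodels.api as sm

rls = sm.RecursiveLS(y, sm.add_constant(X)).fit()
rls.plot_recursive_coefficient(variables=0)  # recursive path of one coefficient
rls.plot_cusum()                             # CUSUM plot for stability
```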
Then, you can start testing the significance of estimated coefficients (t-ratios
etc.). If the model fails the above, the usual significance tests will be misleading.
Equations with many regressors are unreliable: high SEs that make most estimates
seem insignificant (remove some regressors and this changes), multicollinearity, etc.
We have seen this problem in the first application of Lecture 4.
You can reduce the equation by significance tests or information criteria (IC)
rankings. If using significance, start with estimates that have small t-ratios
(e.g. $|t| < 1$) and, if possible, corresponding to the highest lag (to reduce the
loss of initial data points, as in Lecture 4).
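A hand-rolled sketch of this reduction rule, assuming the candidate regressors sit in a pandas DataFrame df alongside the dependent variable dy (both names hypothetical):

```python
# Sketch: drop the regressor with the smallest |t| until all |t| >= 1,
# monitoring the information criteria along the way.
import statsmodels.api as sm

cols = [c for c in df.columns if c != "dy"]
while True:
    res = sm.OLS(df["dy"], sm.add_constant(df[cols])).fit()
    tvals = res.tvalues.drop("const").abs()
    if tvals.min() >= 1.0 or len(cols) == 1:
        break
    cols.remove(tvals.idxmin())   # least significant regressor goes first
print(res.summary())
print(res.aic, res.bic)           # IC should improve as insignificant terms go
```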
If you use “automatic model selection” in PC-GIVE:

• Before doing so, test special hypotheses; e.g. in Lecture 4 we started by
testing a unit-root factorization in application 1, or a point restriction on a
parameter in application 2.
• Make sure you select the 10% (or even 20%), instead of the default 5% joint
level, if you want to keep marginally-significant estimates in the regression.
• The procedure assumes approximate orthogonality (uncorrelatedness) of the
regressors. It therefore matters how you specify the model to estimate.
For example, ADL and ECM are mathematically equivalent formulations,
but the latter has variables that are closer to orthogonality than the former
specification. Therefore, the numerically-chosen final model is more reliable
in ECM than in ADL format, especially if the series are persistent.

If you choose the ADL (not ECM) specification, you can make PC-GIVE factor
out the ECM and print it (and the long-run relation) at the end. The algebra is
sketched below for the ADL(1,1) case.
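This reparameterization is standard; written out for reference:

```latex
% ADL(1,1):
%   y_t = \alpha + \phi y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + \varepsilon_t .
% Subtract y_{t-1} from both sides and add/subtract \beta_0 x_{t-1}:
\Delta y_t = \alpha + \beta_0 \Delta x_t
           - (1-\phi)\Bigl(y_{t-1} - \tfrac{\beta_0+\beta_1}{1-\phi}\,x_{t-1}\Bigr)
           + \varepsilon_t ,
% an ECM whose long-run relation is
y = \frac{\alpha}{1-\phi} + \frac{\beta_0+\beta_1}{1-\phi}\,x .
```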
C. Model selection
The general-to-specific methodology is one where you reduce a general model
by removing components that have insignificant coefficients. The logic behind
it is that starting from the simpler model will usually result in omitted-variable
bias and incorrect inferences.
For example, if the DGP (with 2 sets of fixed regressors $X_1$ and $X_2$) is
$$ y = X_1\beta_1 + X_2\beta_2 + \varepsilon, \qquad \varepsilon \sim \mathrm{D}\left(0,\, \sigma^2 I\right), $$
using the LS (same as ML if $\varepsilon \sim \mathrm{N}$, i.e. $\mathrm{D} = \mathrm{N}$) estimator $\widetilde{\beta}_1 \equiv (X_1'X_1)^{-1}X_1'y$
leads to a biased estimator: $\widetilde{\beta}_1$ is obtained by regressing $y$ on $X_1$ only, omitting
$X_2$ erroneously from the estimated model. The bias arises because
$$ \mathrm{E}(\widetilde{\beta}_1) = \mathrm{E}\left[(X_1'X_1)^{-1}X_1'(X_1\beta_1 + X_2\beta_2 + \varepsilon)\right]
= \beta_1 + (X_1'X_1)^{-1}X_1'X_2\,\beta_2 + (X_1'X_1)^{-1}X_1'\,\mathrm{E}(\varepsilon)
= \beta_1 + (X_1'X_1)^{-1}X_1'X_2\,\beta_2 . $$
For general $X_2$, the estimator $\widetilde{\beta}_1$ is unbiased only when $\beta_2 = 0$ (hence $X_2$
is absent from the model and DGP) or when $X_1'X_2 = O$ (hence $X_1$ and $X_2$ are
orthogonal to one another; e.g. like the variables indexed $1, \dots, 12$ in Lecture 4).
We have seen an illustration of this bias problem at the start of this lecture,
when we fitted an AR model (“NP” file) without allowing for the possibility of
a break, then got very different AR parameters when we allowed for a break.
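A quick simulation (illustrative, with randomly generated rather than fixed regressors) confirms the bias formula above:

```python
# Sketch: Monte Carlo check of omitted-variable bias. With x2 = 0.8*x1 + noise,
# the mean of the short regression's estimate is about b1 + 0.8*b2, not b1.
import numpy as np

rng = np.random.default_rng(1)
n, reps, b1, b2 = 200, 2000, 1.0, 0.5
est = np.empty(reps)
for r in range(reps):
    x1 = rng.standard_normal(n)
    x2 = 0.8 * x1 + 0.6 * rng.standard_normal(n)  # correlated with x1
    y = b1 * x1 + b2 * x2 + rng.standard_normal(n)
    est[r] = (x1 @ y) / (x1 @ x1)                 # LS of y on x1 alone
print(est.mean())   # about 1.4 = b1 + 0.8 * b2
```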

A second component of model selection is encompassing analysis: whether one
model statistically encompasses the explanatory power of the other model.
When models are nested (one is a special case of the other, e.g. AR(3)⊂AR(4)),
we can do the same as before: significance tests or IC rankings, though the
latter is only a ranking without testing if the difference in ICs is significant
(this has been solved in Sin and White, JoE, 1996).
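For the AR(3) ⊂ AR(4) example, a sketch of both routes in statsmodels rather than PC-GIVE (x is an assumed series):

```python
# Sketch: nested comparison of AR(3) vs AR(4) by LR test and by IC ranking.
from scipy.stats import chi2
from statsmodels.tsa.arima.model import ARIMA

r3 = ARIMA(x, order=(3, 0, 0)).fit()
r4 = ARIMA(x, order=(4, 0, 0)).fit()
lr = 2 * (r4.llf - r3.llf)             # LR statistic, chi2(1) under AR(3)
print(lr, chi2.sf(lr, df=1))           # significance test of the reduction
print(r3.aic, r4.aic, r3.bic, r4.bic)  # IC ranking: no significance statement
```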
ICs improve when you remove insignificant coefficients (because the likelihood
is almost unaffected and there is a reduction of penalty on the model’s dimen-
sion), but removing a significant coefficient can have an ambiguous effect.
When the models are non-nested, use ICs or encompassing tests such as Vuong’s
(1989, Econometrica) test.
Alternatively, in simple cases, it may be possible to nest both models into a
general one (take the weighted average of both RHSs), then test reductions.
For example, to improve the dynamic regression of consumption on income, one
may add either inflation or dummies for the 1973 break (two competing models);
a general model can be set up by adding both factors and testing which
component is more significant.
However, this approach may not lead to a conclusive answer.
Furthermore, it may not be feasible for comparing some models; e.g. compet-
ing non-nested ARCH-type models.

Do not compare models using $R^2$ or $\bar{R}^2$, unless the LHS variables are the same.
This is because
$$ R^2 \equiv 1 - \frac{\sum_t \hat{\varepsilon}_t^{\,2}}{\sum_t (y_t - \bar{y})^2} = 1 - \frac{\widehat{\mathrm{var}}(\hat{\varepsilon})}{\widehat{\mathrm{var}}(y)} , $$
with $y$ the LHS variable: just rearranging the variables to make the LHS
variance larger, we get a higher $R^2$ (compare the $R^2$ of log prices vs returns).
Appendix to the course
Essentials to memorize
The course contains results that you should be able to recognize and analyze,
but there are a few essential formulae that should be memorized and you should
be able to write them down:
1. Basic math, such as the binomial expansion, geometric progression, etc.
2. Basic rules of probability, the main ones being:
• $\Pr(X > a) = 1 - \Pr(X \le a)$
• $F_X(a) \equiv \Pr(X \le a)$
• For $X$ continuous, $f_X(x) = \mathrm{d}F_X(x)/\mathrm{d}x$; for $X$ discrete, $f_X(x) = \Pr(X = x)$
• $\mathrm{E}(g(X)) = \int_{\text{all } x} g(x)\,\mathrm{d}F_X(x) = \int_{x \in \mathcal{X}} g(x) f_X(x)\,\mathrm{d}x$ (for $X$ continuous)
• $\mathrm{E}(X) = \int_{\text{all } x} x f_X(x)\,\mathrm{d}x$ (for $X$ continuous)
• $\Pr(A \text{ and } B) = \Pr(A \mid B)\Pr(B)$
• $\mathrm{E}_{X,Y} = \mathrm{E}_X \mathrm{E}_{Y|X}$ (if $X, Y$ independent, then $\mathrm{E}_{Y|X} = \mathrm{E}_Y$ and $\mathrm{E}_{X,Y} = \mathrm{E}_X \mathrm{E}_Y$)
• $\mathrm{E}(Y) = \int_{\text{all } y} y f_Y(y)\,\mathrm{d}y = \int_{\text{all } x} \mathrm{E}(Y \mid x) f_X(x)\,\mathrm{d}x = \mathrm{E}_X(\mathrm{E}(Y \mid X))$ (the first two
equalities hold for $Y$ continuous, but the last equality holds for any $Y$).
3. Formulas for the basic p.d.f.s $f_X(x)$ of:
• Normal
• Uniform
• Bernoulli,
as well as basic statistical concepts like moments (mean, variance, skewness,
kurtosis), MSE, etc.
4. Basic models:
• ARMA($p, q$)
• VAR($p$), ADL($p, q$), ECM, ICM
• I($d$)
• GARCH($p, q$).

You will not need to write down other distributions or models (e.g. Bin($n, p$)
or EGARCH(1,1)), but you should be able to recognize them if given to you.
