Empirical Finance 8
Abadir
The first thing to do is to look at the data and their descriptive properties.
NP analysis can be used to explore if there is a nonlinear relation between the
variables and their past, and/or between the variables and each other.
Time plots can be very informative in the quest for properties and/or to narrow
down possible models. For example, if there is a clear break or a clear trend,
then these should be included in the dynamic model for the series.
The partial ACF (PACF) plot helps identify the lag length of univariate AR
models: an AR(p) with coefficients {φ_j}, j = 1, …, p, gives rise to a PACF whose
only significant values are the first p, equal to estimates of the {φ_j}. Therefore,
PACFs help determine the lag length p. The PACF provides only rough guidance if the
model is more elaborate (e.g. the fractional I(d) case, where the ACF is more
informative). For the lag length of volatility models (like GARCH), use the (P)ACF
of ε̂_t² instead of ε̂_t.
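As a sketch of how the PACF cutoff works in practice, the following Python snippet (numpy only; the function name `pacf` and the simulated AR(1) are illustrative, not from the lecture) computes sample partial autocorrelations via the Durbin-Levinson recursion:

```python
import numpy as np

def pacf(x, nlags):
    """Sample PACF via the Durbin-Levinson recursion."""
    x = np.asarray(x, float)
    x = x - x.mean()
    n = len(x)
    # sample autocovariances and autocorrelations up to nlags
    acov = np.array([x[:n - k] @ x[k:] / n for k in range(nlags + 1)])
    rho = acov / acov[0]
    phi = np.zeros((nlags + 1, nlags + 1))
    phi[1, 1] = rho[1]
    for k in range(2, nlags + 1):
        prev = phi[k - 1, 1:k]
        phi[k, k] = (rho[k] - prev @ rho[k - 1:0:-1]) / (1 - prev @ rho[1:k])
        phi[k, 1:k] = prev - phi[k, k] * prev[::-1]
    return phi.diagonal()[1:]   # phi[k, k] is the lag-k partial autocorrelation

# simulate an AR(1) with coefficient 0.7: its PACF should cut off after lag 1
rng = np.random.default_rng(0)
y = np.zeros(5000)
for t in range(1, 5000):
    y[t] = 0.7 * y[t - 1] + rng.standard_normal()

p = pacf(y, 5)   # p[0] near 0.7; p[1:] inside the ±2/√n band
```

The first value estimates φ = 0.7 and the remaining lags are insignificant, mirroring how a PACF plot flags the AR lag length.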
However, neither the ACF nor the PACF allows for a break or other such features,
which should be removed before calculating the (P)ACF. For example, in the “NP”
file, the PACF indicates an AR(1) for consumption (and also a unit root); while
if a break is allowed for, we get an AR(p) with p > 1 (and a stationary series
with business cycles).
The regression for unit root analysis should include trend or break, if the
possibility of one has been identified in the time plot. The tests by Pierre
Perron and coauthors are a formal way to discriminate between a unit root
and a break, including the case of breaks at unspecified points in time (the
procedure estimates the timing of the breaks). Outliers should also be identified
and dealt with, e.g. by means of dummies. These are dealt with in the recent
“impulse saturation” work by Hendry, Ericsson, and coauthors, implemented
in PC-GIVE.
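As a minimal sketch of such a regression (numpy only; this is the basic Dickey-Fuller regression with constant and trend, without the augmentation lags or break dummies a full Perron-type procedure would add):

```python
import numpy as np

def df_trend_tstat(y):
    """Dickey-Fuller regression with constant and linear trend:
    Δy_t = a + b·t + γ·y_{t-1} + e_t.  Returns the t-ratio on γ.
    Compare against Dickey-Fuller critical values (roughly -3.41 at
    5% with trend), not the usual normal/t tables."""
    y = np.asarray(y, float)
    dy = np.diff(y)
    n = len(dy)
    X = np.column_stack([np.ones(n), np.arange(n), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    e = dy - X @ beta
    s2 = e @ e / (n - 3)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[2, 2])
    return beta[2] / se

# trend-stationary AR(1): gamma = phi - 1 = -0.5, so a unit root is rejected
rng = np.random.default_rng(1)
u = np.zeros(500)
for t in range(1, 500):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()
y = 0.05 * np.arange(500) + u
t_stat = df_trend_tstat(y)   # well below the -3.41 critical value
```

Omitting the trend (or a needed break dummy) from this regression would bias the test toward not rejecting the unit root, which is why the time plot should guide the specification.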
When putting together two or more variables in an equation (like ECM), they
may be linked through other lags. Even if the PACFs indicate AR(1) for each
variable, the interaction of the two variables may require further lags; e.g.
current consumption changes may be linked to many lags of income changes.
The frequency of the data may dictate the longest lag length to start with.
Economic or Finance Theory may be another source of this choice. In either
case, the lags should be sufficiently long and, if the choice is wrong, the tests
on the next slide will detect it.
B. Testing for misspecification (diagnostic tests)
The previous section can help identify a starting model, e.g. of the ECM type.
In general, the resulting regression errors, ε̂_t, should be ‘clean’ and contain:
1. no AR: check DW ≈ 2 (only a rough guide if RHS includes lagged dependent
variable) and LM test;
2. no heteroskedasticity: check general heteroskedasticity tests (e.g. White’s)
or specific ARCH etc. tests, use HAC SEs if necessary;
3. no omitted nonlinearities: check Ramsey’s RESET.
Look at the residuals and their ACF: do you see any remaining patterns (e.g.
cycle)? If so, then something is missing from the model; e.g. the lag length
was insufficient in the ADL or ECM.
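A minimal numpy sketch of two of these checks, the DW statistic and a Breusch-Godfrey-style LM test (function names and the simulated residuals are illustrative):

```python
import numpy as np

def durbin_watson(e):
    """DW ≈ 2(1 - ρ̂1): values near 2 suggest no first-order residual AR."""
    e = np.asarray(e, float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def bg_lm(e, X, p=1):
    """Breusch-Godfrey LM test: regress ê_t on X and p lags of ê_t;
    n·R² is asymptotically χ²(p) under the null of no residual AR."""
    e = np.asarray(e, float)
    n = len(e)
    lags = np.column_stack([np.concatenate([np.zeros(k), e[:-k]])
                            for k in range(1, p + 1)])
    Z = np.column_stack([X, lags])
    b, *_ = np.linalg.lstsq(Z, e, rcond=None)
    u = e - Z @ b
    r2 = 1 - (u @ u) / np.sum((e - e.mean()) ** 2)
    return n * r2

rng = np.random.default_rng(2)
n = 1000
X = np.ones((n, 1))                # intercept-only regressors for the sketch
clean = rng.standard_normal(n)     # white-noise residuals: DW near 2, LM small
ar = np.zeros(n)                   # AR(1) residuals with coefficient 0.8
for t in range(1, n):
    ar[t] = 0.8 * ar[t - 1] + rng.standard_normal()
# durbin_watson(ar) is far below 2 and bg_lm(ar, X) far above the
# chi-squared(1) 5% critical value of 3.84, flagging the residual AR
```

White's test and RESET follow the same auxiliary-regression pattern, with squares/cross-products of regressors (or powers of fitted values) in place of the lagged residuals.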
For y = Xβ + ε, apart from ensuring a clean ε, another source of problems to
avoid is in the estimate of β. You should therefore check parameter stability,
e.g. by plotting recursive parameter estimates or by a Chow test.
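A sketch of the Chow test in numpy, with the break point taken as known (names and data illustrative):

```python
import numpy as np

def chow_f(y, X, split):
    """Chow test for a break at observation `split`:
    F = [(RSS_pooled - RSS1 - RSS2)/k] / [(RSS1 + RSS2)/(n - 2k)],
    distributed F(k, n - 2k) under no break (classical assumptions)."""
    def rss(yy, XX):
        b, *_ = np.linalg.lstsq(XX, yy, rcond=None)
        e = yy - XX @ b
        return e @ e
    n, k = X.shape
    r_pool = rss(y, X)
    r1 = rss(y[:split], X[:split])
    r2 = rss(y[split:], X[split:])
    return ((r_pool - r1 - r2) / k) / ((r1 + r2) / (n - 2 * k))

rng = np.random.default_rng(3)
n = 200
x = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
slope = np.where(np.arange(n) < 100, 1.0, 2.0)   # slope shifts at obs 100
y = slope * x + rng.standard_normal(n)
f_break = chow_f(y, X, 100)                      # large: break detected
f_none = chow_f(x + rng.standard_normal(n), X, 100)  # small: no break
```

Recursive parameter plots serve the same purpose graphically: estimates that drift or jump as the sample grows point to the same instability the F-statistic quantifies.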
Then, you can start testing the significance of estimated coefficients (t-ratios
etc.). If the model fails the above, the usual significance tests will be misleading.
Equations with many regressors are unreliable: high SEs make most estimates
seem insignificant (remove some regressors and this changes), multicollinearity,
etc.
We have seen this problem in the first application of Lecture 4.
You can reduce the equation by significance tests or information criteria (IC)
rankings. If using significance, start with estimates that have small t-ratios
(e.g. |t| < 1) and, if possible, corresponding to the highest lag (to reduce the
loss of initial data points as in Lecture 4).
If you use “automatic model selection” in PC-GIVE and choose the ADL (not
ECM) specification, you can make PC-GIVE factor out the ECM and print it (and
the long-run relation) at the end.
C. Model selection
The general-to-specific methodology is one where you reduce a general model
by removing components that have insignificant coefficients. The logic behind
it is that starting from the simpler model will usually result in omitted-variable
bias and incorrect inferences.
For example, if the DGP (with 2 sets of fixed regressors X1 and X2) is

y = X1β1 + X2β2 + ε,  ε ∼ D(0, σ²I),

using the LS (same as ML if ε ∼ N, i.e. D = N) estimator β̃1 ≡ (X1′X1)⁻¹X1′y
leads to a biased estimator: β̃1 is obtained by regressing y on X1 only, omitting
X2 erroneously from the estimated model. The bias arises because

E(β̃1) = E((X1′X1)⁻¹X1′(X1β1 + X2β2 + ε))
       = β1 + (X1′X1)⁻¹X1′X2β2 + (X1′X1)⁻¹X1′E(ε)
       = β1 + (X1′X1)⁻¹X1′X2β2.

For general X2, the estimator β̃1 is unbiased only when β2 = 0 (hence X2
is absent from the model and the DGP) or when X1′X2 = O (hence X1 and X2 are
orthogonal to one another; e.g. as in Lecture 4).
We have seen an illustration of this bias problem at the start of this lecture,
when we fitted an AR model (“NP” file) without allowing for the possibility of
a break, then got very different AR parameters when we allowed for a break.
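The omitted-variable bias formula can be checked by simulation (numpy sketch; the coefficient values and correlation are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.standard_normal(n)
x2 = 0.8 * x1 + 0.6 * rng.standard_normal(n)   # correlated with x1: X1'X2 != 0
beta1, beta2 = 1.0, 2.0

# theoretical bias term (X1'X1)^{-1} X1'X2 * beta2 for these fixed regressors
bias = (x1 @ x2) / (x1 @ x1) * beta2

# average short-regression estimate of beta1 over many error draws
draws = [(x1 @ (beta1 * x1 + beta2 * x2 + rng.standard_normal(n))) / (x1 @ x1)
         for _ in range(2000)]
mean_est = np.mean(draws)   # close to beta1 + bias, not beta1
```

The Monte Carlo mean matches β1 + (X1′X1)⁻¹X1′X2β2 rather than β1, exactly as the derivation predicts.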
Do not compare models using R² or adjusted R̄² unless the LHS variables are the
same. This is because

R² ≡ 1 − Σt ε̂t² / Σt (yt − ȳ)² = 1 − var̂(ε̂) / var̂(y)

varies with the LHS variable: just rearranging the variables to make the LHS
variance larger, we get a higher R² (compare the R² of log prices vs returns).
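The point can be illustrated with simulated “log prices” and “returns” (numpy sketch; the scale and sample size are made up):

```python
import numpy as np

def r_squared(y, x):
    """R-squared from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return 1 - (e @ e) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(5)
r = 0.01 * rng.standard_normal(2000)   # i.i.d. "returns"
p = np.cumsum(r)                        # "log prices": a random walk

r2_prices = r_squared(p[1:], p[:-1])    # near 1: LHS variance is huge in levels
r2_returns = r_squared(r[1:], r[:-1])   # near 0: same data, different LHS
```

Both regressions use exactly the same information, yet the levels equation reports an R² near 1 while the returns equation reports one near 0, so the two R² values are not comparable.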
Appendix to the course
Essentials to memorize
The course contains results that you should be able to recognize and analyze,
but there are a few essential formulae that should be memorized and you should
be able to write them down:
1. Basic math, such as the binomial expansion, geometric progression, etc.
2. Basic rules of probability, the main ones being:
• Pr(x > a) = 1 − Pr(x ≤ a)
• F(u) ≡ Pr(x ≤ u)
• For x continuous, f(u) = dF(u)/du; for x discrete, f(u) = Pr(x = u)
• E(g(x)) = ∫_{u∈X} g(u)f(u) du (for x continuous)
• E(g(x)) = Σ_{u∈X} g(u)f(u) (for x discrete)
• Pr(x and y) = Pr(x | y) Pr(y)
• E_{x,y} = E_y E_{x|y} (if x and y are independent, then E_{x|y} = E_x and
E(xy) = E(x)E(y))
• E(x) = ∫ u f(u) du = ∫∫ u f(u | v) f_y(v) du dv = E_y(E(x | y)) (the first two
equalities hold for continuous x, y, but the last equality holds for any x, y).
3. Formula for basic p.d.f.s () of:
• Normal
• Uniform
• Bernoulli,
as well as basic statistical concepts like moments (mean, variance, skewness,
kurtosis), MSE, etc.
4. Basic models:
• ARMA(p, q)
• VAR(p), ADL(p, q), ECM, ICM
• I(d)
• GARCH(p, q).
You will not need to write down other distributions or models (e.g. Bin(n, p)
or EGARCH(1,1)), but you should be able to recognize them if given to you.
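The law of iterated expectations from the probability rules above can be verified on a small discrete example (numpy sketch; the joint probabilities are made up):

```python
import numpy as np

# joint pmf P[i, j] = Pr(x = i, y = j) for x, y in {0, 1} (illustrative numbers)
P = np.array([[0.1, 0.2],
              [0.3, 0.4]])
vals = np.array([0.0, 1.0])

p_y = P.sum(axis=0)                 # marginal pmf of y
E_x = vals @ P.sum(axis=1)          # E(x) from the marginal pmf of x
E_x_given_y = (vals @ P) / p_y      # E(x | y = j) for each j
lie = p_y @ E_x_given_y             # E_y(E(x | y)): equals E(x) exactly
```

Here E(x) = 0.7 both directly and via E_y(E(x | y)), confirming the rule holds for discrete variables as well as continuous ones.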