1 Introduction: Why Time Series Analysis
Consider the stationary AR(1) model
$$y_t = \rho y_{t-1} + u_t, \qquad |\rho| < 1, \tag{1}$$
and the unit-root (random walk) model
$$y_t = y_{t-1} + u_t, \qquad \rho = 1, \tag{2}$$
where $t = 1, 2, \ldots, T$; $y_0 = 0$; and $\{u_t\}$ is a white noise process, i.e.
$$E(u_t) = 0, \qquad E(u_t^2) = \sigma^2, \qquad E(u_t u_\tau) = 0 \ \text{for} \ t \neq \tau.$$
The OLS estimator is
$$\hat\rho = \left(\sum_{t=1}^{T} y_{t-1}^2\right)^{-1}\sum_{t=1}^{T} y_t y_{t-1} = \rho + \left(\sum_{t=1}^{T} y_{t-1}^2\right)^{-1}\sum_{t=1}^{T} u_t y_{t-1}.$$
For Model (1), as $T \to \infty$,
$$\sqrt{T}(\hat\rho - \rho) \xrightarrow{d} N(0, 1-\rho^2).$$
However, for Model (2), as $T \to \infty$,
$$\sqrt{T}(\hat\rho - \rho) \to 0 \ \text{in probability},$$
which is of no use for testing. How about $T(\hat\rho - 1)$ for Model (2)? From Model (2),
$$y_t = u_1 + u_2 + \cdots + u_t,$$
and hence $y_t \sim N(0, t\sigma^2)$, or $y_t/(\sigma\sqrt{t}) \sim N(0,1)$. Note that $y_t^2 = (y_{t-1} + u_t)^2 = y_{t-1}^2 + 2y_{t-1}u_t + u_t^2$, so that
$$\frac{1}{\sigma^2 T}\sum_{t=1}^{T} y_{t-1}u_t = \frac{1}{2}\left(\frac{y_T^2}{\sigma^2 T} - \frac{1}{\sigma^2 T}\sum_{t=1}^{T} u_t^2\right) \xrightarrow{d} \frac{1}{2}\left(\chi^2(1) - 1\right).$$
Since $y_{t-1} \sim N(0, (t-1)\sigma^2)$, we have $E(y_{t-1}^2) = (t-1)\sigma^2$ and
$$E\sum_{t=1}^{T} y_{t-1}^2 = \sum_{t=1}^{T}(t-1)\sigma^2 = \frac{T(T-1)\sigma^2}{2} = O(T^2).$$
Therefore,
$$T(\hat\rho - 1) = \frac{(1/T)\sum_{t=1}^{T} y_{t-1}u_t/\sigma^2}{(1/T^2)\sum_{t=1}^{T} y_{t-1}^2/\sigma^2} \xrightarrow{d} \frac{\int_0^1 W(r)\,dW(r)}{\int_0^1 W(r)^2\,dr}, \tag{3}$$
where $W(r)$ is the standard Brownian motion. The limiting distribution is nonstandard: the asymptotic theory differs fundamentally between the stationary and nonstationary cases.
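To see the nonstandard limit in (3) concretely, here is a minimal Monte Carlo sketch in Python (the course's own computations use EViews; the sample size, replication count, and seed below are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
T, reps = 500, 2000
stats = np.empty(reps)
for r in range(reps):
    u = rng.standard_normal(T)
    y = np.cumsum(u)                          # random walk, y_0 = 0
    ylag, ycur = y[:-1], y[1:]
    rho_hat = (ylag @ ycur) / (ylag @ ylag)   # OLS without intercept
    stats[r] = T * (rho_hat - 1.0)

print(np.mean(stats), np.percentile(stats, [5, 50, 95]))

The simulated distribution is skewed to the left with a negative mean, quite unlike the symmetric normal limit of the stationary case.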
Examine the following simple model:
$$y_t = a_0 + a_1 z_t + e_t, \tag{4}$$
where $\{y_t\}$ and $\{z_t\}$ are two independent random walk processes, i.e.
$$y_t = y_{t-1} + \varepsilon_{yt}, \tag{5}$$
$$z_t = z_{t-1} + \varepsilon_{zt}. \tag{6}$$
The error term $e_t = y_t - a_0 - a_1 z_t = \sum_{i=1}^{t}\varepsilon_{yi} - a_0 - a_1\sum_{i=1}^{t}\varepsilon_{zi}$ is then nonstationary, whereas valid OLS inference requires that the error term be white noise. Phillips (1986) shows that the larger the sample, the more likely we are to falsely reject the null of no relationship (i.e., the more "significant" the OLS t-test appears). Therefore, before estimation, we should pretest for nonstationarity of the time series variables in the regression model (the unit root test). A further study of some cases of Model (4):
1. If $\{y_t\}$ and $\{z_t\}$ are both stationary, the regression model is appropriate.
2. If $\{y_t\}$ and $\{z_t\}$ are integrated of different orders, the regression is meaningless.
3. If $\{y_t\}$ and $\{z_t\}$ are integrated of the same order and the residual sequence is nonstationary, the regression is spurious.
4. If $\{y_t\}$ and $\{z_t\}$ are integrated of the same order and the residual sequence is stationary, then they are cointegrated.
Example 1 (see ex1): graph time series generated by an AR(1) process and by a random walk process, and conduct the OLS estimation of Model (4); a simulation sketch follows.
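A minimal Python sketch of the second part of the exercise (the course tool is EViews; the seed and sample size here are arbitrary): regressing one simulated random walk on another, independent one typically yields a spuriously "significant" slope:

import numpy as np

rng = np.random.default_rng(1)
T = 200
y = np.cumsum(rng.standard_normal(T))   # random walk, as in (5)
z = np.cumsum(rng.standard_normal(T))   # independent random walk, as in (6)

# OLS of y on a constant and z, as in Model (4)
X = np.column_stack([np.ones(T), z])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b
s2 = e @ e / (T - 2)
se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
print("a1_hat =", b[1], "t-stat =", b[1] / se)   # |t| >> 2 is common here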
The purpose of this course is to introduce the basic theory and applications of time series econometrics. The course requires basic knowledge of probability and statistics. Students are required to perform computations using EViews.
An outline of the course (tentative):
CH1 Basic Regression with Time Series
CH2 Stationary Autoregressive Process
CH3 ARCH and GARCH
CH4 Unstationary Autoregressive Process
CH5 Vector Autoregression (VAR) Models
CH6 Cointegration
There is no required text for this course. The following materials will be useful:
Enders, Walter (2003), Applied Econometric Time Series, Second Edition.
Hamilton, James D. (1994), Time Series Analysis, Princeton University Press.
Wooldridge, Jeffrey M. (2009), Introductory Econometrics: A Modern Approach, 4th Edition.
Consider the first-order difference equation
$$y_t = a_0 + a_1 y_{t-1} + u_t. \tag{7}$$
Iterating backward from the initial condition $y_0$,
$$y_t = a_0\sum_{i=0}^{t-1} a_1^i + a_1^t y_0 + \sum_{i=0}^{t-1} a_1^i u_{t-i}.$$
Iterating back a further $m+1$ periods to the initial value $y_{-m-1}$,
$$y_t = a_0\sum_{i=0}^{t+m} a_1^i + a_1^{t+m+1} y_{-m-1} + \sum_{i=0}^{t+m} a_1^i u_{t-i}.$$
If $|a_1| < 1$, as $m \to \infty$,
$$y_t = a_0\sum_{i=0}^{\infty} a_1^i + \sum_{i=0}^{\infty} a_1^i u_{t-i} = \frac{a_0}{1-a_1} + \sum_{i=0}^{\infty} a_1^i u_{t-i}.$$
This is only a special solution to Model (7). The general solution is given by
$$y_t = A a_1^t + \frac{a_0}{1-a_1} + \sum_{i=0}^{\infty} a_1^i u_{t-i}, \tag{8}$$
where $A$ is an arbitrary constant. Choosing $A = y_0 - a_0/(1-a_1) - \sum_{i=0}^{\infty} a_1^i u_{-i}$, we can derive the special solution above from the general one. Note in (8) that $Aa_1^t$ is the homogeneous solution, i.e. the solution of the homogeneous equation $y_t = a_1 y_{t-1}$, and the other part is a particular solution to the difference equation (7):
The general solution = the homogeneous solution + a particular solution.
Imposing the initial condition on the general solution gives a special solution satisfying the initial condition. If $|a_1| > 1$, given an initial condition $y_0$, $y_t$ can still be solved: $y_t = a_0\sum_{i=0}^{t-1} a_1^i + a_1^t y_0 + \sum_{i=0}^{t-1} a_1^i u_{t-i}$. If no initial condition is given, the sequence cannot be convergent. If $a_1 = 1$,
$$y_t = a_0 + y_{t-1} + u_t = a_0 t + y_0 + \sum_{i=1}^{t} u_i$$
and $\Delta y_t = a_0 + u_t$. This shows that each disturbance has a permanent, non-decaying effect on the value of $y_t$.
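The contrast between $|a_1| < 1$ and $a_1 = 1$ can be checked by direct iteration; a small Python sketch (coefficient values and horizon are arbitrary) traces the effect of a single unit shock:

import numpy as np

def path(a1, shock_at=1, T=30):
    # iterate y_t = a1*y_{t-1} + u_t with a single unit shock in u
    y, u = 0.0, np.zeros(T)
    u[shock_at] = 1.0
    out = []
    for t in range(T):
        y = a1 * y + u[t]
        out.append(y)
    return np.array(out)

print(path(0.8)[-1])  # shock has decayed geometrically (0.8**28, tiny)
print(path(1.0)[-1])  # shock persists: still 1.0 after 28 periods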
How to determine the homogeneous solution? The structure of the homogeneous solution is determined by the pattern of the characteristic roots. For the AR(1) Model (7), the characteristic equation gives the single root $\alpha = a_1$; hence the homogeneous solution is $y_t^h = Aa_1^t$, where $A$ is an arbitrary constant, interpreted as a deviation from long-run equilibrium. For the AR(2) model $y_t = a_1 y_{t-1} + a_2 y_{t-2} + u_t$, the homogeneous equation is $y_t = a_1 y_{t-1} + a_2 y_{t-2}$. Inserting $y_t = A\alpha^t$ shows that the characteristic equation is $\alpha^2 - a_1\alpha - a_2 = 0$. The two roots are
$$\alpha_1, \alpha_2 = \left(a_1 \pm \sqrt{a_1^2 + 4a_2}\right)\Big/2.$$
Note that any linear combination of $\alpha_1^t$ and $\alpha_2^t$ also solves the homogeneous equation. There are three cases according as $a_1^2 + 4a_2 > 0$, $= 0$, or $< 0$. The homogeneous solutions are, respectively,
$$y_t^h = A_1\alpha_1^t + A_2\alpha_2^t,$$
$$y_t^h = (A_1 + A_2 t)\,\alpha^t, \qquad \alpha = \alpha_1 = \alpha_2,$$
$$y_t^h = A_1 r^t\cos(\theta t + A_2), \qquad r = \sqrt{-a_2}, \quad \theta = \arctan\!\left(\sqrt{-(a_1^2 + 4a_2)}\,/\,a_1\right).$$
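The three cases are easy to check numerically. A small Python sketch (coefficient values are arbitrary examples) computes the characteristic roots, classifies the homogeneous solution, and reads off stability (roots inside the unit circle, as discussed below) from the root moduli:

import numpy as np

def ar2_roots(a1, a2):
    # roots of alpha^2 - a1*alpha - a2 = 0
    roots = np.roots([1.0, -a1, -a2])
    disc = a1**2 + 4*a2
    if disc > 0:
        case = "real distinct roots: A1*r1^t + A2*r2^t"
    elif disc == 0:
        case = "repeated root: (A1 + A2*t)*r^t"
    else:
        case = "complex roots: damped oscillation r^t * cos(theta*t + phi)"
    stable = bool(np.all(np.abs(roots) < 1))
    return roots, case, stable

for a1, a2 in [(0.6, 0.2), (1.0, -0.25), (0.5, -0.8)]:
    print((a1, a2), *ar2_roots(a1, a2))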
For the higher-order homogeneous equation $y_t = \sum_{i=1}^{p} a_i y_{t-i}$, the characteristic equation is $\alpha^t - \sum_{i=1}^{p} a_i\alpha^{t-i} = 0$, or $\alpha^p - \sum_{i=1}^{p} a_i\alpha^{p-i} = 0$.
How to determine particular solutions? Consider $y_t = \sum_{i=1}^{p} a_i y_{t-i} + x_t$. If $x_t$ is deterministic, e.g. $x_t = 0$, $bd^{rt}$, or $a_0 + bt^d$, set $y_t^p = c$, $c_0 + c_1 d^{rt}$, or $c_0 + c_1 t + \cdots + c_d t^d$, respectively, and solve for the constants by inserting $y_t^p$ into the equation. If $x_t = u_t$ is stochastic, set $y_t^s = \sum_{i=0}^{\infty}\beta_i u_{t-i}$, insert $y_t^s$ into the equation, and solve for the coefficients $\beta_i$ by the method of undetermined coefficients.
Stable solution and stability conditions: all the characteristic roots lie inside the unit circle. For the AR(2) model $y_t = a_1 y_{t-1} + a_2 y_{t-2} + \varepsilon_t$, the stability conditions are
$$a_2 + a_1 < 1, \qquad a_2 - a_1 < 1, \qquad |a_2| < 1.$$
The general solution of the difference equation is $y_t = y_t^p + y_t^h$, where $y_t^p$ is a particular solution of the difference equation and $y_t^h$ is the general solution of its homogeneous equation. Further, $y_t^p$ can be expressed as $y_t^d + y_t^s$, where $y_t^d$ is the deterministic part and $y_t^s$ is the stochastic part. $y_t^d$ is determined by the form of the deterministic part $x_t$ of the difference equation, e.g. $x_t = $ constant, $bd^{rt}$, or $bt^d$. $y_t^s$ is determined by the stochastic part of the difference equation, e.g. if $y_t = a_1 y_{t-1} + a_2 y_{t-2} + x_t$ and $x_t = \varepsilon_t$, set $y_t^s = \sum_{i=0}^{\infty}\beta_i\varepsilon_{t-i}$. The coefficients of $y_t^d$ and $y_t^s$ in $y_t = y_t^p + y_t^h$ can be determined by substituting $y_t$ into the original difference equation and using the method of undetermined coefficients.
The lag operator is a linear operator that is extensively applied in time series analysis. Note the expansion of $1/(1-aL)$ for $|a| < 1$ versus $|a| > 1$. If $|a| < 1$,
$$\frac{y_t}{1-aL} = \sum_{i=0}^{\infty} a^i L^i y_t = \sum_{i=0}^{\infty} a^i y_{t-i}.$$
If $|a| > 1$,
$$\frac{y_t}{1-aL} = -\frac{(aL)^{-1}y_t}{1-(aL)^{-1}} = -(aL)^{-1}\sum_{i=0}^{\infty}(aL)^{-i}y_t = -\sum_{i=0}^{\infty} a^{-(i+1)} y_{t+i+1},$$
a forward-looking expansion. In general, $A(L)y_t = a_0 + B(L)\varepsilon_t$ has the particular solution $y_t = (a_0 + B(L)\varepsilon_t)/A(L)$. The stability condition is that the inverse characteristic roots (i.e., the roots of the inverse characteristic equation $A(L) = 0$) lie outside the unit circle.
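A quick numerical check of the backward expansion for $|a| < 1$ (Python sketch; parameter values arbitrary): applying the operator recursively via $x_t = a x_{t-1} + y_t$ agrees exactly with the truncated geometric sum when the recursion is started at zero:

import numpy as np

rng = np.random.default_rng(2)
a, n = 0.7, 400
y = rng.standard_normal(n)

# x_t = y_t / (1 - aL), computed recursively: x_t = a*x_{t-1} + y_t
x = np.zeros(n)
for t in range(n):
    x[t] = (a * x[t-1] if t > 0 else 0.0) + y[t]

# compare with the truncated geometric expansion sum_i a^i y_{t-i}
t = n - 1
expansion = sum(a**i * y[t - i] for i in range(t + 1))
print(x[t], expansion)   # identical, since x was initialized at 0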
Here we have introduced three methods to express $y_t$ as the sum of a function of time $t$ and a moving average of the disturbances: the Iterative Method, the Method of Undetermined Coefficients, and the Lag Operator Approach. Whether such an expansion converges is precisely the question of stationarity, taken up next.
Stationary Processes
Notation: $E_t y_{t+i} \equiv E[y_{t+i}\,|\,y_t, y_{t-1}, \ldots, y_1]$, the expected value of $y_{t+i}$ conditional on the observed values of $y_1, y_2, \ldots, y_t$.
White-noise process: $\{\varepsilon_t\}$ with $E[\varepsilon_t] = 0$, $Var(\varepsilon_t) = \sigma^2$ (constant), and $E\varepsilon_t\varepsilon_{t-s} = E\varepsilon_{t-j}\varepsilon_{t-s-j} = 0$ for all $t$, all $j$, and all $s \neq 0$.
MA(q): a moving average of order $q$, $\{x_t\}$ satisfying $x_t = \sum_{i=0}^{q}\beta_i\varepsilon_{t-i}$, where $\beta_0 = 1$. If two or more of the coefficients $\beta_i$ differ from 0, $\{x_t\}$ is not white noise. (Consider $x_t = \varepsilon_t + 0.5\varepsilon_{t-1}$.)
AR(p): a $p$-th-order autoregression, $y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + \varepsilon_t$.
ARMA(p,q): $y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + \sum_{i=0}^{q}\beta_i\varepsilon_{t-i}$, $\beta_0 = 1$, or, in lag-operator form,
$$y_t = \left(a_0 + \sum_{i=0}^{q}\beta_i\varepsilon_{t-i}\right)\Big/\left(1 - \sum_{i=1}^{p} a_i L^i\right).$$
Covariance stationarity requires that the mean $Ey_t$, the variance $Var(y_t)$, and the autocovariances $Cov(y_t, y_{t-s})$ are all finite constants, which are time-invariant (i.e., unaffected by a change of time origin). Note the difference between (covariance) stationarity and strict stationarity (for any $j_1, j_2, \ldots, j_s$, the joint distribution of $(y_t, y_{t+j_1}, \ldots, y_{t+j_s})$ is determined by $j_1, j_2, \ldots, j_s$ but is unaffected by a change of time origin, i.e. it is invariant to the time $t$ at which the observations are made). A covariance stationary process requires that the mean and the covariances be time-invariant and finite, while a strictly stationary process requires that the mean, the covariances, and all higher moments be time-invariant, but not necessarily finite. If the mean and the covariances are finite, a strictly stationary process is covariance stationary; in this sense strict stationarity is the stronger notion.
The autocorrelation between $y_t$ and $y_{t-s}$ is
$$\rho_s \equiv \frac{\gamma_s}{\gamma_0} = \frac{Cov(y_t, y_{t-s})}{Cov(y_t, y_t)} = \frac{Cov(y_t, y_{t-s})}{\sigma_y^2}, \qquad \rho_0 = 1.$$
Stationarity restrictions for the AR(1) process (7): since
$$Ey_t = \frac{a_0(1-a_1^t)}{1-a_1} + a_1^t y_0, \qquad Ey_{t-s} = \frac{a_0(1-a_1^{t-s})}{1-a_1} + a_1^{t-s} y_0$$
are time-dependent and $Ey_t \neq Ey_{t-s}$, the process $\{y_t\}$ started from a fixed $y_0$ cannot be stationary. Add two restrictions: $|a_1| < 1$, and the process $\{y_t\}$ has been occurring for an infinitely long time. Then for any integer $m > 0$,
$$y_t = \frac{a_0(1-a_1^{t+m+1})}{1-a_1} + a_1^{t+m+1} y_{-m-1} + \sum_{i=0}^{t+m} a_1^i\varepsilon_{t-i}. \tag{9}$$
As $m \to \infty$,
$$y_t = \frac{a_0}{1-a_1} + \sum_{i=0}^{\infty} a_1^i\varepsilon_{t-i},$$
so that, for all $t$ and $s$,
$$Ey_t = \frac{a_0}{1-a_1} = Ey_{t-s}, \qquad Var(y_t) = Var(y_{t-s}) = \frac{\sigma^2}{1-a_1^2},$$
$$Cov(y_t, y_{t-s}) = \sigma^2\sum_{j=0}^{\infty} a_1^{2j+s} = \frac{\sigma^2 a_1^s}{1-a_1^2}.$$
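These three moments can be verified by simulation; a Python sketch (parameter values arbitrary, with a long burn-in standing in for the infinite past):

import numpy as np

rng = np.random.default_rng(3)
a0, a1, sigma, T = 1.0, 0.6, 1.0, 200_000

y = np.empty(T)
y[0] = a0 / (1 - a1)
for t in range(1, T):
    y[t] = a0 + a1 * y[t-1] + sigma * rng.standard_normal()
y = y[1000:]  # drop burn-in

print("mean:", y.mean(), "theory:", a0 / (1 - a1))
print("var :", y.var(),  "theory:", sigma**2 / (1 - a1**2))
s = 3
print("cov :", np.cov(y[s:], y[:-s])[0, 1],
      "theory:", sigma**2 * a1**s / (1 - a1**2))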
Stationarity of the ARMA(2,1) process $y_t = a_1y_{t-1} + a_2y_{t-2} + \varepsilon_t + \beta_1\varepsilon_{t-1}$: set $y_t = \sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-i}$ and substitute into the model, so that
$$\sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-i} = \varepsilon_t + (a_1\alpha_0 + \beta_1)\varepsilon_{t-1} + \sum_{i=2}^{\infty}(a_1\alpha_{i-1} + a_2\alpha_{i-2})\varepsilon_{t-i},$$
and hence
$$\alpha_0 = 1, \qquad \alpha_1 = a_1\alpha_0 + \beta_1 = a_1 + \beta_1, \qquad \alpha_i = a_1\alpha_{i-1} + a_2\alpha_{i-2}, \quad i \geq 2.$$
Therefore $y_t = \varepsilon_t + (a_1+\beta_1)\varepsilon_{t-1} + \sum_{i=2}^{\infty}\alpha_i\varepsilon_{t-i}$, where the $\alpha_i$ are determined by the difference equation $\alpha_i = a_1\alpha_{i-1} + a_2\alpha_{i-2}$, $i \geq 2$, with $\alpha_0 = 1$ and $\alpha_1 = a_1 + \beta_1$. If the characteristic roots of the ARMA(2,1) model lie within the unit circle, $\{\alpha_i\}$ constitutes a convergent sequence and $\{y_t\}$ is stationary. Check the stationarity properties of $\{y_t\}$ generated by the ARMA(2,1):
$$Ey_t = 0 = Ey_{t-s} \quad \forall t, s,$$
$$Var(y_t) = Var(y_{t-s}) = \sigma^2\sum_{i=0}^{\infty}\alpha_i^2 \quad \forall t, s,$$
$$Cov(y_t, y_{t-s}) = E\sum_{i,j=0}^{\infty}\alpha_i\alpha_j\varepsilon_{t-i}\varepsilon_{t-s-j} = \sigma^2\sum_{i,j=0}^{\infty}\alpha_i\alpha_j\,\delta_{i,s+j} = \sigma^2\sum_{j=0}^{\infty}\alpha_{s+j}\alpha_j,$$
where $\delta_{i,s+j}$ is the Kronecker delta.
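The $\alpha_i$ recursion and the variance/covariance sums above are straightforward to implement; a Python sketch with arbitrary stable coefficients:

import numpy as np

a1, a2, b1, sigma2 = 0.5, 0.3, 0.4, 1.0
n = 200  # truncation point for the infinite sums

alpha = np.empty(n)
alpha[0], alpha[1] = 1.0, a1 + b1
for i in range(2, n):
    alpha[i] = a1 * alpha[i-1] + a2 * alpha[i-2]

# roots of alpha^2 - a1*alpha - a2 = 0 must lie inside the unit circle
print("root moduli:", np.abs(np.roots([1.0, -a1, -a2])))
print("Var(y_t) ~", sigma2 * np.sum(alpha**2))            # sigma^2 * sum alpha_i^2
print("Cov at s=2 ~", sigma2 * np.sum(alpha[2:] * alpha[:-2]))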
MA(∞): $x_t = \sum_{i=0}^{\infty}\beta_i\varepsilon_{t-i}$, $\beta_0 = 1$. For all $t$ and $s$,
$$Ex_t = 0 = Ex_{t-s}, \qquad Var(x_t) = \sigma^2\sum_{i=0}^{\infty}\beta_i^2 = Var(x_{t-s}), \qquad Cov(x_t, x_{t-s}) = \sigma^2\sum_{i=0}^{\infty}\beta_i\beta_{i+s}.$$
If $\sum_{i=0}^{\infty}\beta_i^2 < \infty$ and $\sum_{i=0}^{\infty}\beta_i\beta_{i+s} < \infty$ for every $s$, the MA(∞) process is stationary. A direct implication is that MA(q) is always stationary for any finite $q$.
Find stationarity conditions for AR(p):
$$y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + \varepsilon_t.$$
If the characteristic roots of the homogeneous equation all lie inside the unit circle (and hence $1 - \sum_{i=1}^{p} a_i \neq 0$), the particular solution is the convergent expansion
$$y_t = \frac{a_0}{1 - \sum_{i=1}^{p} a_i} + \sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-i},$$
where $\alpha_0 = 1$ and the $\alpha_i$ satisfy $\alpha_i = \sum_{j=1}^{\min(i,p)} a_j\alpha_{i-j}$. Then, for all $t$ and $s$,
$$Ey_t = \frac{a_0}{1 - \sum_{i=1}^{p} a_i} = Ey_{t-s},$$
$$Var(y_t) = E\left(\sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-i}\right)\left(\sum_{j=0}^{\infty}\alpha_j\varepsilon_{t-j}\right) = \sigma^2\sum_{i=0}^{\infty}\alpha_i^2 < \infty,$$
$$Cov(y_t, y_{t-s}) = E\left(\sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-i}\right)\left(\sum_{j=0}^{\infty}\alpha_j\varepsilon_{t-s-j}\right) = \sigma^2\sum_{j=0}^{\infty}\alpha_{j+s}\alpha_j < \infty.$$
ARMA(p,q):
$$y_t = \sum_{i=1}^{p} a_i y_{t-i} + \sum_{i=0}^{q}\beta_i\varepsilon_{t-i}, \qquad \beta_0 = 1.$$
Since $\{\sum_{i=0}^{q}\beta_i\varepsilon_{t-i}\}$ is stationary for any finite $q$, only the characteristic roots of the autoregressive portion of the ARMA(p,q) process determine whether $\{y_t\}$ is stationary. Therefore, if the roots of the inverse characteristic equation $1 - a_1L - a_2L^2 - \cdots - a_pL^p = 0$ lie outside the unit circle, then $\{y_t\}$ is stationary.
The autocorrelation function (ACF)
$$\rho_s = \frac{\gamma_s}{\gamma_0} = \frac{Cov(y_t, y_{t-s})}{Var(y_t)}$$
serves as a useful tool to identify and estimate time-series models (the Box-Jenkins approach).
ACF for the AR(1) process $y_t = a_0 + a_1y_{t-1} + \varepsilon_t$: since $y_t = \frac{a_0}{1-a_1} + \sum_{i=0}^{\infty} a_1^i\varepsilon_{t-i}$, we have
$$\gamma_0 = Var(y_t) = \frac{\sigma^2}{1-a_1^2}, \qquad \gamma_s = Cov(y_t, y_{t-s}) = \sigma^2\sum_{j=0}^{\infty} a_1^{2j+s} = \frac{\sigma^2 a_1^s}{1-a_1^2},$$
so that $\rho_s = a_1^s$.
ACF for the ARMA(1,1) process $y_t = a_1y_{t-1} + \varepsilon_t + \beta_1\varepsilon_{t-1}$: here
$$\gamma_0 = \frac{1 + \beta_1^2 + 2a_1\beta_1}{1-a_1^2}\,\sigma^2, \qquad \gamma_1 = \frac{(1+a_1\beta_1)(a_1+\beta_1)}{1-a_1^2}\,\sigma^2.$$
Therefore,
$$\rho_1 = \frac{(1+a_1\beta_1)(a_1+\beta_1)}{1+\beta_1^2+2a_1\beta_1}, \qquad \rho_s = a_1\rho_{s-1}, \quad s \geq 2,$$
which can be solved from the initial condition $\rho_1$. The ACF $\rho_s$ converges to 0 geometrically (directly or oscillatorily according as $a_1 > 0$ or $a_1 < 0$) as $s \to \infty$, provided $|a_1| < 1$.
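A quick simulation check of $\rho_1$ and of the recursion $\rho_s = a_1\rho_{s-1}$ (Python sketch; coefficients and sample size arbitrary):

import numpy as np

rng = np.random.default_rng(4)
a1, b1, T = 0.7, 0.4, 100_000
eps = rng.standard_normal(T)

y = np.zeros(T)
for t in range(1, T):
    y[t] = a1 * y[t-1] + eps[t] + b1 * eps[t-1]

def sample_acf(x, s):
    x = x - x.mean()
    return (x[s:] @ x[:-s]) / (x @ x)

rho1_theory = (1 + a1*b1) * (a1 + b1) / (1 + b1**2 + 2*a1*b1)
for s in (1, 2, 3):
    print(s, sample_acf(y, s), rho1_theory * a1**(s-1))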
ACF for the ARMA(p,q) process $y_t = a_1y_{t-1} + \cdots + a_py_{t-p} + \varepsilon_t + \beta_1\varepsilon_{t-1} + \cdots + \beta_q\varepsilon_{t-q}$: the calculation of $\rho_s$ for $s = 1, 2, \ldots, q$ is complicated and thus omitted here. For $s > q$ the ACF satisfies (note that $\gamma_{-s} = \gamma_s$)
$$\rho_s = a_1\rho_{s-1} + \cdots + a_p\rho_{s-p}, \qquad s \geq q+1.$$
Under the stationarity restriction (all the characteristic roots of the model lie within the unit circle), the ACF converges to 0 as $s \to \infty$.
The partial autocorrelation function (PACF) $\phi_{ss}$ is obtained from the sequence of autoregressions
$$y_t = \phi_{11}y_{t-1} + e_t,$$
$$y_t = \phi_{21}y_{t-1} + \phi_{22}y_{t-2} + e_t, \quad \ldots$$
where $\phi_{22}$ is the PACF between $y_t$ and $y_{t-2}$; the same argument applies to $\phi_{33}, \phi_{44}, \ldots$. AR(p): there is no direct correlation between $y_t$ and $y_{t-s}$ for $s > p$, i.e. $\phi_{ss} = 0$ for $s \geq p+1$. MA(q): the model has an infinite-order AR representation, so the PACF exhibits decay. ARMA(p,q): the ACF begins to decay after lag $q$ (since the ACF of MA(q) cuts to 0 after lag $q$ and the ACF of AR(p) decays), while the PACF begins to decay after lag $p$ (since the PACF of AR(p) cuts to 0 after lag $p$ and the PACF of MA(q) exhibits decay).
A rule for selecting models is to compare the graphs of the sample ACF and PACF with these theoretical patterns. For example, if the ACF exhibits a single spike and the PACF exhibits monotonic decay, try an MA(1) model; if the ACF exhibits monotonic decay and the PACF exhibits a single spike, try an AR(1) model; if the ACF exhibits monotonic decay and the PACF exhibits two spikes, try an AR(2) model; if the PACF exhibits monotonic decay with no spikes, try an ARMA or MA model; and so on (see ex2 for the graphs of the ACF and PACF for some simple ARMA models; a simulation sketch follows).
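For graphs of the kind ex2 refers to, here is a Python sketch using statsmodels in place of EViews (model coefficients are arbitrary examples):

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
import matplotlib.pyplot as plt

np.random.seed(5)
# AR(1): (1 - 0.7L) y_t = eps_t   -> ACF decays, PACF has one spike
# MA(1): x_t = (1 + 0.7L) eps_t   -> ACF has one spike, PACF decays
ar1 = ArmaProcess(ar=[1, -0.7], ma=[1]).generate_sample(nsample=500)
ma1 = ArmaProcess(ar=[1], ma=[1, 0.7]).generate_sample(nsample=500)

fig, axes = plt.subplots(2, 2, figsize=(10, 6))
plot_acf(ar1, lags=20, ax=axes[0, 0], title="AR(1): ACF")
plot_pacf(ar1, lags=20, ax=axes[0, 1], title="AR(1): PACF")
plot_acf(ma1, lags=20, ax=axes[1, 0], title="MA(1): ACF")
plot_pacf(ma1, lags=20, ax=axes[1, 1], title="MA(1): PACF")
plt.tight_layout()
plt.show()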
The sample PACF can be computed recursively: $\hat\phi_{11} = r_1$ and, for $s \geq 2$,
$$\hat\phi_{ss} = \left(r_s - \sum_{j=1}^{s-1}\hat\phi_{s-1,j}\,r_{s-j}\right)\Big/\left(1 - \sum_{j=1}^{s-1}\hat\phi_{s-1,j}\,r_j\right),$$
$$\hat\phi_{s,j} = \hat\phi_{s-1,j} - \hat\phi_{ss}\hat\phi_{s-1,s-j}, \qquad j = 1, 2, \ldots, s-1,$$
where $\hat\phi_{ss}$ is a consistent approximation of the PACF. For example, if the true value of $r_s$ is zero (i.e. $\rho_s = 0$), there is no autoregressive part in the process and hence the process is MA($s-1$); if the true value of $\hat\phi_{ss}$ is zero (i.e. $\phi_{ss} = 0$), there is no moving average part in the process and hence the process is AR($s-1$).
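The recursion is easy to implement directly; a Python sketch with a simulated AR(1) stand-in series:

import numpy as np

def sample_pacf(x, max_lag):
    # sample ACF r_1, ..., r_max_lag
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = x @ x
    r = np.array([(x[s:] @ x[:-s]) / denom for s in range(1, max_lag + 1)])

    # Durbin-Levinson-style recursion for phi_hat_{ss}
    pacf = np.empty(max_lag)
    phi_prev = np.array([r[0]])               # phi_{1,1} = r_1
    pacf[0] = r[0]
    for s in range(2, max_lag + 1):
        num = r[s-1] - phi_prev @ r[s-2::-1]  # r_s - sum phi_{s-1,j} r_{s-j}
        den = 1.0 - phi_prev @ r[:s-1]        # 1 - sum phi_{s-1,j} r_j
        phi_ss = num / den
        phi_prev = np.append(phi_prev - phi_ss * phi_prev[::-1], phi_ss)
        pacf[s-1] = phi_ss
    return pacf

# stand-in data: AR(1) with a1 = 0.7, so the PACF should show one spike
rng = np.random.default_rng(10)
y = np.zeros(5000)
for t in range(1, 5000):
    y[t] = 0.7 * y[t-1] + rng.standard_normal()
print(np.round(sample_pacf(y, 6), 3))  # ~0.7 at lag 1, ~0 afterwards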
Under the null that $y_t \sim$ MA($s-1$) (i.e. $\rho_s = 0$) with normally distributed errors, $r_s \sim N(0, Var(r_s))$ asymptotically, where
$$Var(r_s) = \begin{cases} T^{-1}, & s = 1, \\ T^{-1}\left(1 + 2\sum_{j=1}^{s-1} r_j^2\right), & s > 1. \end{cases}$$
Under the null that $y_t \sim$ AR(p) (i.e. $\phi_{p+i,p+i} = 0$ for $i > 0$), the variance $Var(\hat\phi_{p+i,p+i})$ is approximately $1/T$. In EViews, the dotted lines in the plots of the ACF and PACF are the approximate two-standard-error bounds of $r_1$ or $\hat\phi_{11}$, computed as $\pm 2/\sqrt{T}$. If the value of the ACF or PACF lies within these bounds, it is not significantly different from zero at (approximately) the 5% significance level.
Two kinds of test: (1) t-test: from the sample ACF, construct the t-ratio $t = r_s/\sqrt{Var(r_s)}$ for the significance of the $s$-th-order autocorrelation for some $s > 0$ ($H_0$: $\rho_s = 0$, i.e. $y_t \sim$ MA($s-1$)); from the sample PACF, construct the analogous t-ratio $t = \hat\phi_{ss}/\sqrt{1/T}$. (2) Q-test: the portmanteau Q-statistic (in the Ljung-Box form, $Q = T(T+2)\sum_{k=1}^{s} r_k^2/(T-k)$) is asymptotically $\chi^2(s)$ under the null of no autocorrelation up to lag $s$.
Estimating a model with a moving average part requires recovering the errors from the observed series. For the MA(1) model $y_t = (1+\beta L)\varepsilon_t$, the (conditional) log-likelihood involves $\sum_{t=1}^{T}\varepsilon_t^2$, where, if $|\beta| < 1$,
$$\varepsilon_t = \frac{y_t}{1+\beta L} = \sum_{i=0}^{\infty}(-\beta)^i L^i y_t = \sum_{i=0}^{\infty}(-\beta)^i y_{t-i}.$$
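A minimal grid-search sketch of this estimation idea in Python (in practice EViews or a packaged optimizer would be used; the true β and the grid are arbitrary choices). The residuals are built recursively via $\varepsilon_t = y_t - \beta\varepsilon_{t-1}$ with the pre-sample error set to zero, which is the truncated version of the expansion above:

import numpy as np

rng = np.random.default_rng(6)
T, beta_true = 2000, 0.5
e = rng.standard_normal(T)
y = e + beta_true * np.concatenate([[0.0], e[:-1]])  # MA(1) data

def ssr(beta, y):
    # eps_t = y_t - beta*eps_{t-1}, i.e. y_t/(1 + beta*L) with eps_0 = 0
    eps, total = 0.0, 0.0
    for yt in y:
        eps = yt - beta * eps
        total += eps**2
    return total

grid = np.linspace(-0.95, 0.95, 381)
print("beta_hat =", grid[np.argmin([ssr(b, y) for b in grid])])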
3. Diagnosis: plot the residuals from the estimated model to look for outliers and for evidence of periods in which the model does not fit the data well. Construct the sample ACF and PACF of the residuals and conduct the t-test and Q-test (above) to see whether any one, or all, of the residual autocorrelations or partial autocorrelations are statistically significant. If several residual correlations are marginally significant (from the t-test) even though the Q-statistic is not significant at the 10% level, be wary: it may be possible to form a better-performing model. If there are sufficient observations, fit the same ARMA model to each of two subsamples (the standard F-test can be applied to test whether the data-generating process is unchanged).
Notes: (1) if all the plausible ARMA models estimated above show evidence of a poor fit during a reasonably long portion of the sample, consider multivariate estimation methods; (2) if the variance of the residuals is increasing or shows some tendency to change, use a logarithmic transformation or ARCH techniques.
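A sketch of the Q-based part of the diagnosis in Python, using the Ljung-Box form and the $\pm 2/\sqrt{T}$ bounds given earlier (the residual series here is only a white-noise stand-in):

import numpy as np

def ljung_box(resid, s):
    # Q = T(T+2) * sum_{k=1}^{s} r_k^2/(T-k), ~ chi2(s) under no autocorrelation
    x = resid - resid.mean()
    T, denom = len(x), x @ x
    r = np.array([(x[k:] @ x[:-k]) / denom for k in range(1, s + 1)])
    Q = T * (T + 2) * np.sum(r**2 / (T - np.arange(1, s + 1)))
    return Q, r

resid = np.random.default_rng(7).standard_normal(500)  # stand-in residuals
Q, r = ljung_box(resid, 10)
print("Q(10) =", Q, " 5% chi2(10) critical value ~ 18.31")
print("lags outside 2/sqrt(T):",
      np.where(np.abs(r) > 2 / np.sqrt(len(resid)))[0] + 1)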
Forecast
AR(1): $y_t = a_0 + a_1y_{t-1} + \varepsilon_t$. Given the actual data-generating process (i.e., $a_0$ and $a_1$ are known) and the current and past realizations of $\{\varepsilon_t\}$ and $\{y_t\}$, forward iteration gives
$$y_{t+1} = a_0 + a_1y_t + \varepsilon_{t+1}, \qquad E_ty_{t+1} = a_0 + a_1y_t,$$
$$y_{t+2} = a_0(1+a_1) + a_1^2y_t + \varepsilon_{t+2} + a_1\varepsilon_{t+1}, \qquad E_ty_{t+2} = a_0(1+a_1) + a_1^2y_t,$$
$$\vdots$$
$$y_{t+j} = a_0(1 + a_1 + \cdots + a_1^{j-1}) + a_1^jy_t + \varepsilon_{t+j} + a_1\varepsilon_{t+j-1} + \cdots + a_1^{j-1}\varepsilon_{t+1},$$
$$E_ty_{t+j} = a_0(1 + a_1 + \cdots + a_1^{j-1}) + a_1^jy_t \to \frac{a_0}{1-a_1} \ \text{as} \ j \to \infty, \ \text{if} \ |a_1| < 1.$$
For the stationary AR(1) process, the conditional forecast of $y_{t+j}$ converges to the unconditional mean as $j \to \infty$. Note that the $j$-step-ahead forecast error is
$$e_t(j) = y_{t+j} - E_ty_{t+j} = \varepsilon_{t+j} + a_1\varepsilon_{t+j-1} + \cdots + a_1^{j-1}\varepsilon_{t+1},$$
with $E_te_t(j) = 0$ and
$$Var[e_t(j)] = \sigma^2\left[1 + a_1^2 + \cdots + a_1^{2(j-1)}\right] = \sigma^2\,\frac{1-a_1^{2j}}{1-a_1^2} \to \frac{\sigma^2}{1-a_1^2} \ \text{as} \ j \to \infty.$$
The forecasts are unbiased, but the variance of the forecast error is an increasing function of $j$, implying that the quality of the forecasts declines as we forecast further into the future. If we assume that $\varepsilon_t$ is normally distributed, then the 95% confidence interval for the one-step-ahead forecast of $y_{t+1}$ is $a_0 + a_1y_t \pm 1.96\,\sigma$, and the 95% confidence interval for the $j$-step-ahead forecast of $y_{t+j}$ is
$$\frac{a_0(1-a_1^j)}{1-a_1} + a_1^jy_t \pm 1.96\,\sigma\left(\frac{1-a_1^{2j}}{1-a_1^2}\right)^{1/2}.$$
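These point forecasts and interval widths are immediate to compute; a Python sketch with arbitrary known parameters:

import numpy as np

a0, a1, sigma = 1.0, 0.8, 1.0
y_t = 8.0          # current observation (arbitrary example value)

for j in range(1, 6):
    point = a0 * (1 - a1**j) / (1 - a1) + a1**j * y_t
    half = 1.96 * sigma * np.sqrt((1 - a1**(2*j)) / (1 - a1**2))
    print(f"j={j}: forecast {point:.3f}, 95% CI +/- {half:.3f}")

# As j grows, the forecast tends to a0/(1-a1) = 5.0 and the half-width
# tends to 1.96*sigma/sqrt(1-a1^2) ~ 3.267.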
ARMA(2,1): $y_t = a_0 + a_1y_{t-1} + a_2y_{t-2} + \varepsilon_t + \beta_1\varepsilon_{t-1}$. Assume that all the coefficients are known, that all variables subscripted $t, t-1, \ldots$ are known at $t$, and that $E_t\varepsilon_{t+j} = 0$ for $j > 0$. Then
$$y_{t+1} = a_0 + a_1y_t + a_2y_{t-1} + \varepsilon_{t+1} + \beta_1\varepsilon_t, \qquad E_ty_{t+1} = a_0 + a_1y_t + a_2y_{t-1} + \beta_1\varepsilon_t,$$
$$\vdots$$
$$y_{t+j} = a_0 + a_1y_{t+j-1} + a_2y_{t+j-2} + \varepsilon_{t+j} + \beta_1\varepsilon_{t+j-1}, \qquad E_ty_{t+j} = a_0 + a_1E_ty_{t+j-1} + a_2E_ty_{t+j-2}, \quad j \geq 2.$$
Given the sample size $T$ and the estimated coefficients $\hat a_0$, $\hat a_1$, $\hat a_2$, and $\hat\beta_1$, the estimated ARMA(2,1) model is
$$y_t = \hat a_0 + \hat a_1y_{t-1} + \hat a_2y_{t-2} + \hat\varepsilon_t + \hat\beta_1\hat\varepsilon_{t-1}.$$
The out-of-sample forecasts can be easily constructed as follows:
$$E_Ty_{T+1} = \hat a_0 + \hat a_1y_T + \hat a_2y_{T-1} + \hat\beta_1\hat\varepsilon_T,$$
$$E_Ty_{T+2} = \hat a_0 + \hat a_1E_Ty_{T+1} + \hat a_2y_T,$$
$$E_Ty_{T+3} = \hat a_0 + \hat a_1E_Ty_{T+2} + \hat a_2E_Ty_{T+1},$$
and in general $E_Ty_{T+j} = \hat a_0 + \hat a_1E_Ty_{T+j-1} + \hat a_2E_Ty_{T+j-2}$ for $j \geq 2$, with $E_Ty_{T+j} = y_{T+j}$ for $j \leq 0$. However, the confidence intervals for these forecasts are difficult to construct.
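The forecast recursion is easy to implement once the coefficient estimates and end-of-sample values are given (all numbers below are arbitrary illustrations):

import numpy as np

# estimated coefficients and end-of-sample values (arbitrary example numbers)
a0_h, a1_h, a2_h, b1_h = 0.5, 0.6, 0.2, 0.3
y_T, y_Tm1, eps_T = 2.0, 1.5, 0.4

fcst = {0: y_T, -1: y_Tm1}
for j in range(1, 7):
    f = a0_h + a1_h * fcst[j-1] + a2_h * fcst[j-2]
    if j == 1:
        f += b1_h * eps_T          # the MA term only enters at horizon 1
    fcst[j] = f
    print(f"E_T y_(T+{j}) = {f:.4f}")
# The forecasts converge to a0/(1 - a1 - a2) = 2.5 as the horizon grows.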
To compare the out-of-sample forecast accuracy of two models over $H$ forecast periods, with forecast errors $e_{1i}$ and $e_{2i}$, one can use the ratio of mean squared prediction errors,
$$F = \frac{\sum_{i=1}^{H} e_{1i}^2}{\sum_{i=1}^{H} e_{2i}^2} = \frac{MSPE_1}{MSPE_2} \sim F(H, H).$$
The assumptions underlying the F-distribution are: $e_{it} \sim N(0, \sigma^2)$, $Ee_{it}e_{i,t-s} = 0$ ($s \neq 0$), and $Ee_{1t}e_{2t} = 0$. The violation of any one of these assumptions invalidates the use of the F-distribution.
3. The Granger-Newbold test: let $x_t = e_{1t} + e_{2t}$ and $z_t = e_{1t} - e_{2t}$, so that
$$\rho_{xz} = Ex_tz_t = Ee_{1t}^2 - Ee_{2t}^2.$$
Under the null of equal forecast accuracy for the two models, $\rho_{xz} = Ee_{1t}^2 - Ee_{2t}^2 = 0$, i.e. $x_t$ and $z_t$ are uncorrelated. Let $r_{xz}$ be the sample counterpart of $\rho_{xz}$; then
$$\frac{r_{xz}}{\sqrt{(1-r_{xz}^2)/(H-1)}} \sim t(H-1).$$
Examine the sign of this t-statistic and the significance of the t-test.
4. The Diebold-Mariano (1995) test (even the first two assumptions, $e_t \sim N(0,\sigma^2)$ and $Ee_te_{t-s} = 0$ for $s \neq 0$, are not required): use a more general loss function $g(e_i)$ of the forecast error instead of the quadratic $e_i^2$. Let
$$\bar d = \frac{1}{H}\sum_{i=1}^{H} d_i = \frac{1}{H}\sum_{i=1}^{H}\big(g(e_{1i}) - g(e_{2i})\big),$$
which by the CLT is asymptotically normally distributed: $\bar d/\sqrt{var(\bar d)} \sim N(0,1)$ under the hypothesis of equal forecast accuracy. If $\{d_i\}$ is serially uncorrelated (conduct ACF, PACF, and Q-statistic tests on $\{d_i\}$),
$$\frac{\bar d}{\sqrt{\sum_{i=1}^{H}(d_i - \bar d)^2\,/\,(H-1)}} \sim t(H-1).$$
For $j$-step-ahead forecasts, whose errors $\{d_i\}$ are generally serially correlated, use the sample autocovariances $\gamma_k$ of $\{d_i\}$:
$$DM = \frac{\bar d}{\sqrt{(\gamma_0 + 2\gamma_1 + \cdots + 2\gamma_{j-1})\,/\,\big(H + 1 - 2j + H^{-1}j(j-1)\big)}} \sim t(H-1).$$
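A compact implementation of the statistic as written above (Python sketch; the loss function and the two error series are arbitrary examples):

import numpy as np
from scipy import stats

def dm_test(e1, e2, j=1, loss=lambda e: e**2):
    # d_i = g(e_1i) - g(e_2i); H0: E d_i = 0 (equal forecast accuracy)
    d = loss(np.asarray(e1)) - loss(np.asarray(e2))
    H = len(d)
    dbar = d.mean()
    # sample autocovariances gamma_0, ..., gamma_{j-1} of d
    gam = [np.mean((d[k:] - dbar) * (d[:H-k] - dbar)) for k in range(j)]
    lrv = gam[0] + 2.0 * sum(gam[1:])
    denom = lrv / (H + 1 - 2*j + j*(j-1)/H)
    stat = dbar / np.sqrt(denom)
    pval = 2 * (1 - stats.t.cdf(abs(stat), df=H-1))
    return stat, pval

rng = np.random.default_rng(8)
e1 = rng.standard_normal(100)
e2 = 1.3 * rng.standard_normal(100)   # model 2 has larger errors
print(dm_test(e1, e2, j=1))           # significantly negative favors model 1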
Seasonality: forecasts that ignore seasonality will have a high variance, even when using deseasonalized or seasonally adjusted data. In practice, the seasonal pattern interacts with the nonseasonal pattern in the data, making identification difficult: the ACF and PACF of a combined seasonal/nonseasonal process reflect both elements. There are two methods for introducing the seasonal effect: additive seasonality and multiplicative seasonality. For example, the following are additive specifications (for quarterly data):
$$y_t = a_1y_{t-1} + \varepsilon_t + \beta_1\varepsilon_{t-1} + \beta_4\varepsilon_{t-4},$$
$$y_t = a_1y_{t-1} + a_4y_{t-4} + \varepsilon_t + \beta_1\varepsilon_{t-1},$$
while the following are multiplicative:
$$(1 - a_1L)y_t = (1 + \beta_1L)(1 + \beta_4L^4)\varepsilon_t,$$
$$(1 - a_1L)(1 - a_4L^4)y_t = (1 + \beta_1L)\varepsilon_t.$$
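A sketch simulating the first multiplicative specification (coefficients arbitrary): expanding $(1+\beta_1L)(1+\beta_4L^4) = 1 + \beta_1L + \beta_4L^4 + \beta_1\beta_4L^5$ shows it is an ARMA model with MA terms at lags 1, 4, and 5, which the sample ACF picks up:

import numpy as np

rng = np.random.default_rng(9)
a1, b1, b4, T = 0.5, 0.3, 0.6, 50_000
eps = rng.standard_normal(T)

# (1 - a1*L) y_t = (1 + b1*L)(1 + b4*L^4) eps_t
y = np.zeros(T)
for t in range(5, T):
    ma = eps[t] + b1*eps[t-1] + b4*eps[t-4] + b1*b4*eps[t-5]
    y[t] = a1*y[t-1] + ma

def acf(x, s):
    x = x - x.mean()
    return (x[s:] @ x[:-s]) / (x @ x)

print([round(acf(y[100:], s), 3) for s in range(1, 9)])
# spikes near lags 1 and 4 (and their interaction) on top of AR(1) decay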