Lecture 2 M
2.1 Representations
The Vector Autoregressive (VAR) Representation
We define C(L)−1 as an (n × n) lag polynomial such that C(L)−1 C(L) = I; i.e. when these
lag polynomial matrices are matrix-multiplied, all the lag terms cancel out. This operation
in effect converts lags of the errors into lags of the vector of dependent variables.
Define A(L) = C(L)−1. Then given the (invertible) MA coefficients, it is easy to map these
into the VAR coefficients:
Yt = C(L)εt
A(L)Yt = εt    (3)
To show that this matrix lag polynomial exists and how it maps into the coefficients in C(L), note that by definition A(L)C(L) = I; matching coefficients on each power of L (with C0 = I) gives
A0 = I
A1 = −A0 C1
...
Ak = −A0 Ck − A1 Ck−1 − ... − Ak−1 C1
As noted, the VAR is possibly of infinite order (i.e. infinite number of lags required to fully
represent joint density). In practice, the VAR is usually restricted for estimation by truncating
the lag-length.
Note: Here we are considering zero-mean processes. If the mean of Yt is not zero, a constant should be added to the VAR equations.
Alternative representations: VAR(1) Any VAR(p) can be rewritten as a VAR(1). To form a VAR(1) from the general model we define the stacked vectors et = [ε′t, 0, ..., 0]′ and Yt = [Y′t, Y′t−1, ..., Y′t−p+1]′, together with the companion matrix

A = [ A1  A2  ...  Ap−1  Ap
      In  0   ...  0     0
      0   In  ...  0     0
      ...           ...
      0   0   ...  In    0 ]

so that
Yt = AYt−1 + et
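As a sketch, the companion matrix of the VAR(1) form can be assembled as follows (the VAR(2) coefficient matrices below are hypothetical, chosen only for illustration):

```python
import numpy as np

def companion(coef_list):
    """Stack VAR(p) coefficient matrices [A1, ..., Ap] (each n x n)
    into the (n*p x n*p) companion matrix of the VAR(1) form."""
    n = coef_list[0].shape[0]
    p = len(coef_list)
    A = np.zeros((n * p, n * p))
    A[:n, :] = np.hstack(coef_list)      # top block row: [A1 A2 ... Ap]
    A[n:, :-n] = np.eye(n * (p - 1))     # identity blocks on the subdiagonal
    return A

# illustrative VAR(2) coefficients (made-up numbers)
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
A = companion([A1, A2])
```

The same matrix A is what the stability discussion below refers to: the VAR(p) is stable when all eigenvalues of this companion matrix lie inside the unit circle.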
The VAR can also be written in stacked regression form as
Y = XΓ + u

Vec representation Let vec denote the column-stacking operator, i.e. if

X = [ X11 X12
      X21 X22
      X31 X32 ]

then

vec(X) = [X11, X21, X31, X12, X22, X32]′

With γ = vec(Γ), each observation of the VAR can be written as
Yt = (In ⊗ Xt′)γ + εt
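In NumPy the vec operator corresponds to column-major (Fortran-order) reshaping; a small sketch reproducing the 3×2 example above:

```python
import numpy as np

def vec(X):
    """Stack the columns of X into one long vector (the vec operator)."""
    return X.reshape(-1, order="F")   # 'F' = column-major (Fortran) order

X = np.array([[11, 12],
              [21, 22],
              [31, 32]])
print(vec(X))   # [11 21 31 12 22 32]
```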
Stationarity of a VAR
Consider the VAR(1)
Yt = µ + AYt−1 + εt
Substituting backward,
Yt = µ + A(µ + AYt−2 + εt−1) + εt
   = (I + A)µ + A^2 Yt−2 + Aεt−1 + εt
   ...
   = (I + A + ... + A^{j−1})µ + A^j Yt−j + Σ_{i=0}^{j−1} A^i εt−i
If all the eigenvalues of A are smaller than one in modulus, then as j → ∞:
1. A^j = P Λ^j P^{−1} → 0;
2. (I + A + ... + A^{j−1})µ → (I − A)^{−1}µ;
3. the infinite sum Σ_{i=0}^{∞} A^i εt−i exists in mean square (see e.g. proposition C.10L).
Note that the eigenvalues λ of A satisfy det(λI − A) = 0. Therefore the eigenvalues correspond to the reciprocals of the roots of the determinant of A(z) = I − Az.
A VAR(1) is called stable if det(I − Az) ≠ 0 for |z| ≤ 1. Equivalently, stability requires that all the eigenvalues of A are smaller than one in absolute value.
A condition for stability: For a VAR(p) stability likewise requires that all the eigenvalues of A (the AR matrix of the companion form of Yt) are smaller than one in modulus, or equivalently that all the roots are larger than one. Therefore a VAR(p) is called stable if
det(I − A1 z − A2 z^2 − ... − Ap z^p) ≠ 0 for |z| ≤ 1.
Notice that the converse is not true. An unstable process can be stationary.
Notice that the vector M A(∞) representation of a stationary VAR satisfies the absolute
summability condition so that assumptions of 10.2H hold.
Example A stationary VAR(1)

Yt = AYt−1 + εt,   A = [ 0.5  0.3
                         0.02 0.8 ],   Ω = E(εt ε′t) = [ 1   0.3
                                                         0.3 0.1 ],   λ = (0.81, 0.48)
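This can be checked numerically: the eigenvalues of A come out at roughly 0.819 and 0.481, both inside the unit circle, so the VAR is stable.

```python
import numpy as np

A = np.array([[0.5, 0.3],
              [0.02, 0.8]])
lam = np.linalg.eigvals(A)
mags = np.sort(np.abs(lam))[::-1]
print(mags)   # approximately [0.819 0.481]
```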
Rewriting the VAR(p) as a VAR(1) is particularly useful in order to find the Wold representation of Yt.
We can proceed similarly for the VAR(1). Substituting backward in the companion form we have
Yt = A^j Yt−j + A^{j−1} et−j+1 + ... + A et−1 + et
If the conditions for stationarity are satisfied, A^j Yt−j → 0 and the series Σ_{j=0}^{∞} A^j converges, so Yt has a VMA(∞) representation in terms of the Wold shock et given by
Yt = (I − AL)^{−1} et
   = Σ_{j=0}^{∞} A^j et−j
   = C(L) et
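As a numerical sketch, for a VAR(1) the VMA(∞) coefficients are simply the matrix powers A^j; using the illustrative stable matrix from the example above, they die out geometrically because all eigenvalues are inside the unit circle:

```python
import numpy as np

A = np.array([[0.5, 0.3],
              [0.02, 0.8]])   # stable VAR(1) matrix from the example

# Wold / VMA(inf) coefficients C_j = A^j, truncated at J terms
J = 50
C = [np.linalg.matrix_power(A, j) for j in range(J)]

# the impulse responses decay geometrically
print(np.abs(C[49]).max())   # close to zero
```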
The unconditional variance of the companion-form VAR(1)
Yt = AYt−1 + et    (5)
satisfies
Σ̃ = E[Yt Yt′]
   = AΣ̃A′ + Ω̃    (6)
where Ω̃ = E[et et′]. A closed form solution to (6) can be obtained in terms of the vec operator. Let A, B, C be matrices such that the product ABC exists. A property of the vec operator is that
vec(ABC) = (C′ ⊗ A)vec(B)
Applying the vec operator to both sides of (6) we have
vec(Σ̃) = (A ⊗ A)vec(Σ̃) + vec(Ω̃)
so that
vec(Σ̃) = (I − A ⊗ A)^{−1} vec(Ω̃)
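A minimal numerical sketch of this closed form: solve vec(Σ) = (I − A ⊗ A)^{−1} vec(Ω) and check that the result satisfies Σ = AΣA′ + Ω. The innovation covariance Omega below is a hypothetical value chosen for illustration.

```python
import numpy as np

A = np.array([[0.5, 0.3],
              [0.02, 0.8]])
Omega = np.array([[1.0, 0.3],
                  [0.3, 1.0]])   # hypothetical innovation covariance

n = A.shape[0]
# vec(Sigma) = (I - kron(A, A))^{-1} vec(Omega)
vec_Sigma = np.linalg.solve(np.eye(n**2) - np.kron(A, A),
                            Omega.reshape(-1, order="F"))
Sigma = vec_Sigma.reshape(n, n, order="F")

# verify the fixed-point equation Sigma = A Sigma A' + Omega
print(np.allclose(Sigma, A @ Sigma @ A.T + Omega))   # True
```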
Specification of the VAR is key for empirical analysis. We have to decide about the following:
1. Number of lags p.
2. Which variables.
3. Type of transformations.
Number of lags As in the univariate case, care must be taken to account for all systematic
dynamics in multivariate models. In VAR models, this is usually done by choosing a sufficient
number of lags to ensure that the residuals in each of the equations are white noise.
AIC: Akaike information criterion Choose the p that minimizes
AIC(p) = ln det(Σ̂(p)) + (2/T) p n^2
HQ: Hannan–Quinn information criterion Choose the p that minimizes
HQ(p) = ln det(Σ̂(p)) + (2 ln(ln T)/T) p n^2
where Σ̂(p) is the residual covariance matrix estimated from a VAR(p).
AIC overestimates the true order with positive probability and underestimates the true order with zero probability.
Suppose a VAR(p) is fitted to Y1, ..., YT (Yt not necessarily stationary). In small samples the following relations hold:
p̂BIC ≤ p̂AIC if T ≥ 8
p̂HQ ≤ p̂AIC if T ≥ 16
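As a sketch of lag selection, the following simulates a hypothetical bivariate VAR(1) (zero mean, so no intercept is fitted), estimates VAR(p) by OLS for p = 1, ..., 4, and computes the AIC variant ln det Σ̂(p) + 2pn²/T; all coefficient values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 2, 400
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])   # hypothetical true VAR(1)
Y = np.zeros((T, n))
for t in range(1, T):
    Y[t] = A1 @ Y[t - 1] + rng.standard_normal(n)

def aic(Y, p):
    """Fit a zero-mean VAR(p) by OLS; return ln det(Sigma_hat) + 2*p*n^2/T."""
    T_eff = len(Y) - p
    # regressors: [Y_{t-1}, ..., Y_{t-p}] for t = p, ..., T-1
    X = np.hstack([Y[p - j - 1:len(Y) - j - 1] for j in range(p)])
    Yp = Y[p:]
    B, *_ = np.linalg.lstsq(X, Yp, rcond=None)
    U = Yp - X @ B
    Sigma = U.T @ U / T_eff
    return np.log(np.linalg.det(Sigma)) + 2 * p * n**2 / T_eff

scores = {p: aic(Y, p) for p in range(1, 5)}
p_hat = min(scores, key=scores.get)
```

In practice one would also compare BIC and HQ, which penalize extra lags more heavily and therefore select weakly smaller orders, as the inequalities above indicate.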
Type of variables Variable selection is a key step in the specification of the model.
VAR models are small-scale models, so usually 2 to 8 variables are used.
Variable selection depends on the particular application; in general, all variables conveying relevant information should be included.
Trend-stationary series
Yt = µ + bt + εt , εt ∼ W N.
Difference-stationary series
Yt = µ + Yt−1 + εt , εt ∼ W N.
Figure 1: blue: log(GDP). green: log(CPI)
These series can be thought of as generated by some nonstationary process. Here are some examples of unit-root test regressions:
∆xt = b + γxt−1 + εt
which is equivalent to
xt = b + axt−1 + εt
with a = 1 + γ; under the alternative of stationarity γ < 0, i.e. a = 1 + γ < 1. A deterministic trend can also be included:
∆xt = b + γxt−1 + ct + εt
With this specification, under the alternative the process is stationary around a deterministic linear trend.
Augmented Dickey-Fuller test. In the augmented version of the test, p lags of ∆xt are added, i.e.
A(L)∆xt = b + γxt−1 + εt
or
A(L)∆xt = b + γxt−1 + ct + εt
• If the test statistic is smaller (more negative) than the critical value, then the null hypothesis of a unit root is rejected.
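A minimal sketch of the (non-augmented) Dickey-Fuller regression on a simulated random walk, using plain OLS; note that the resulting t-statistic must be compared with Dickey-Fuller critical values (about −2.86 at the 5% level with a constant), not with the standard normal table. The data here are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500
x = np.cumsum(rng.standard_normal(T))   # pure random walk: true gamma = 0

# Dickey-Fuller regression  dx_t = b + gamma * x_{t-1} + e_t
dx = np.diff(x)
X = np.column_stack([np.ones(T - 1), x[:-1]])
beta, *_ = np.linalg.lstsq(X, dx, rcond=None)
resid = dx - X @ beta
s2 = resid @ resid / (T - 1 - 2)                  # residual variance
se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])   # std error of gamma_hat
t_stat = beta[1] / se
# compare t_stat with Dickey-Fuller critical values, not the normal table
```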
Transformations I: first differences Let ∆ = 1 − L be the first-difference filter, i.e. a filter such that ∆Yt = Yt − Yt−1, and let us consider the simple case of a random walk with drift
Yt = µ + Yt−1 + εt
where εt is WN. By applying the first-difference filter (1 − L) the process is transformed into a stationary process
∆Yt = µ + εt
Let us now consider a process with a deterministic trend
Yt = µ + δt + εt
Differencing gives
∆Yt = δ + ∆εt
which is a stationary process but is not invertible because it contains a unit root in the MA part.
log(GDP) and log(CPI) in first differences
Transformations II: removing deterministic trends Removing a deterministic trend (linear or quadratic) from a trend-stationary variable is fine.
However, this is not enough if the process is a unit root with drift. To see this, consider again the process
Yt = µ + Yt−1 + εt
This can be written as
Yt = µt + Y0 + Σ_{j=1}^{t} εj
By removing the deterministic trend the mean of the process becomes constant, but the variance grows over time, so the process is not stationary.
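A small simulation illustrating this point: subtracting the known deterministic trend µt from many simulated random walks with drift leaves a process with mean near zero at every date, but with cross-sectional variance growing linearly in t (Var = tσ²). The drift value and sample sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, mu = 5000, 200, 0.5
eps = rng.standard_normal((N, T))
# N random walks with drift, Y_0 = 0: Y_t = mu*t + sum of shocks
Y = mu * np.arange(1, T + 1) + np.cumsum(eps, axis=1)

Z = Y - mu * np.arange(1, T + 1)   # remove the deterministic trend mu*t
# mean is ~0 at every t, but Var(Z_t) = t * sigma^2 keeps growing
print(Z[:, 19].var(), Z[:, 199].var())   # roughly 20 and 200
```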
log(GDP) and log(CPI) linearly detrended
Transformations of trending variables: Hodrick-Prescott filter The filter separates the trend from the cyclical component of a scalar time series. Suppose yt = gt + ct, where gt is the trend component and ct is the cycle. The trend is obtained by solving the following minimization problem

min_{gt, t=1,...,T}  Σ_{t=1}^{T} ct^2 + λ Σ_{t=2}^{T−1} [(gt+1 − gt) − (gt − gt−1)]^2

The parameter λ is a positive number (for quarterly data usually λ = 1600) which penalizes variability in the growth (trend) component, while the first term penalizes the cyclical component. The larger λ, the smoother the trend component.