
Class Notes VAR Models Freddy Espino

Macroeconometrics

Vector Autoregressive Models (VAR)

Freddy Espino

Readings

• Enders, Ch. 5

• Sims (1980) "Macroeconomics and Reality", Econometrica, Vol. 48, No. 1, pp. 1–48.

• Watson, Mark (1994) "Vector Autoregressions and Cointegration", Handbook of Econometrics, Vol. IV, Ch. 47, pp. 2843–2915.

• Stock and Watson (2001) "Vector Autoregressions", Journal of Economic Perspectives, Vol. 15, No. 4, pp. 101–115.


1. Introduction

• Multivariate time series methods are widely used by empirical economists, and econometricians have devoted a great deal of attention to refining and extending these techniques so that they are well suited to answering economic questions.

• Vector Autoregressions (VARs) were introduced into empirical

economics by Sims (1980), who demonstrated that VARs provide a

flexible and tractable framework for analyzing economic time series.

• According to Sims (1980), if there is true simultaneity among a set of

variables, they should all be treated on an equal footing; there should

not be any a priori distinction between endogenous and exogenous

variables.

• In the two-variable case, we can let the time path of the 𝑦1,𝑡 sequence be affected by current and past realizations of the 𝑦2,𝑡 sequence, and vice versa.


2. Vector Autoregressive (VAR)

• Consider the simple bivariate model

1. $y_{1,t} = \gamma_{10} - b_{12} y_{2,t} + \gamma_{11} y_{1,t-1} + \gamma_{12} y_{2,t-1} + \varepsilon_{1,t}$

2. $y_{2,t} = \gamma_{20} - b_{21} y_{1,t} + \gamma_{21} y_{1,t-1} + \gamma_{22} y_{2,t-1} + \varepsilon_{2,t}$

• Where it is assumed that

1. both 𝑦1,𝑡 and 𝑦2,𝑡 are stationary,

2. 𝜀𝑡′ s are white-noise disturbances with variance 𝜎12 and 𝜎22 ; and

3. 𝜀1,𝑡 and 𝜀2,𝑡 are uncorrelated.

• Equations (1) and (2) constitute a first-order vector autoregression

VAR(1), because it has 1 lag.

• Note that 𝜀 ′ 𝑠 are pure innovations (or shocks) in 𝑦1,𝑡 and 𝑦2,𝑡 .

• If 𝑏21 (𝑏12 ) is not equal to zero, 𝜀1,𝑡 (𝜀2,𝑡 ) has an indirect

contemporaneous effect on 𝑦2,𝑡 (𝑦1,𝑡 ).

• Using matrix algebra, we can write the system in the compact form:

3. $\begin{bmatrix} 1 & b_{12} \\ b_{21} & 1 \end{bmatrix} \begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} \gamma_{10} \\ \gamma_{20} \end{bmatrix} + \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{bmatrix}$

4. $B y_t = \Gamma_0 + \Gamma_1 y_{t-1} + \varepsilon_t$

5. $y_t = A_0 + A_1 y_{t-1} + u_t$

• Notice that 𝑢𝑡 = 𝐵−1 𝜀𝑡 or 𝜀𝑡 = 𝐵𝑢𝑡

• We do not observe 𝜀𝑡 but 𝑢𝑡

• Thus, if we know 𝐵, we can identify 𝜀𝑡
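The mapping between the structural and reduced forms can be checked numerically. A minimal sketch in Python/NumPy; all parameter values below are illustrative, not estimates:

```python
import numpy as np

# Sketch of moving from the structural form B y_t = Gamma0 + Gamma1 y_{t-1} + eps_t
# to the reduced form y_t = A0 + A1 y_{t-1} + u_t (illustrative parameters).
B = np.array([[1.0, 0.4],
              [0.5, 1.0]])
Gamma0 = np.array([0.2, 0.1])
Gamma1 = np.array([[0.6, 0.1],
                   [0.2, 0.5]])
Binv = np.linalg.inv(B)

A0 = Binv @ Gamma0           # reduced-form intercept
A1 = Binv @ Gamma1           # reduced-form slope matrix

eps = np.array([0.3, -0.2])  # a hypothetical structural shock
u = Binv @ eps               # u_t = B^{-1} eps_t
assert np.allclose(B @ u, eps)   # knowing B recovers eps_t from u_t
print(np.round(A1, 4))
```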


• Using the new notation, we can rewrite in the equivalent form:

6. 𝑦1,𝑡 = 𝑎10 + 𝑎11 𝑦1,𝑡−1 + 𝑎12 𝑦2,𝑡−1 + 𝑢1,𝑡

7. 𝑦2,𝑡 = 𝑎20 + 𝑎21 𝑦1,𝑡−1 + 𝑎22 𝑦2,𝑡−1 + 𝑢2,𝑡

• The system of equations (1) and (2) is called a structural VAR (SVAR), or the primitive system; the system of equations (6) and (7) is called a reduced-form VAR, or the VAR in standard form.

• SVAR:

$\begin{bmatrix} 1 & b_{12} \\ b_{21} & 1 \end{bmatrix} \begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} \gamma_{10} \\ \gamma_{20} \end{bmatrix} + \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{bmatrix}$

$\varepsilon_t = \begin{bmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{bmatrix} \sim iid(0, \Sigma_\varepsilon), \qquad \Sigma_\varepsilon = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}$

• VAR:

$\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} a_{10} \\ a_{20} \end{bmatrix} + \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix}$

$u_t = \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix} \sim iid(0, \Omega_u), \qquad \Omega_u = \begin{bmatrix} \omega_1^2 & \omega_{12} \\ \omega_{21} & \omega_2^2 \end{bmatrix}$

• A VAR with p lags, VAR(p):

$y_t = A_0 + A_1 y_{t-1} + \cdots + A_p y_{t-p} + u_t$

$y_t = A_0 + A(L) y_t + u_t, \qquad A(L) = A_1 L + A_2 L^2 + \cdots + A_p L^p$

Page 4 of 39
Class Notes VAR Models Freddy Espino

• It is important to note that the error terms 𝑢1,𝑡 and 𝑢2,𝑡 are

composites of the two shocks 𝜀1,𝑡 and 𝜀2,𝑡 :

• 𝑢1,𝑡 = (𝜀1,𝑡 − 𝑏12 𝜀2,𝑡 )/(1 − 𝑏12 𝑏21 )

• 𝑢2,𝑡 = (𝜀2,𝑡 − 𝑏21 𝜀1,𝑡 )/(1 − 𝑏12 𝑏21 )

• Thus:

• $E[u_{1,t}] = E[u_{2,t}] = 0$

• the variances $\omega_1^2$ and $\omega_2^2$ are time-invariant

• $Cov[u_{1,t}, u_{1,t-i}] = Cov[u_{2,t}, u_{2,t-i}] = 0$ for $i \neq 0$

• A critical point to note is that 𝑢1,𝑡 and 𝑢2,𝑡 are correlated. The

covariance of the two terms is not zero:

• $E[u_{1,t} u_{2,t}] = -\dfrac{b_{21}\sigma_1^2 + b_{12}\sigma_2^2}{(1 - b_{12} b_{21})^2} = \omega_{12} = \omega_{21}$

• Only in the case that 𝑏12 = 𝑏21 = 0, shocks will be uncorrelated
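The composite-error algebra above is easy to verify numerically. A minimal sketch, where 𝑏12, 𝑏21 and the structural variances are illustrative values:

```python
import numpy as np

# Verify the composite-error algebra for the bivariate SVAR
# (b12, b21 and the shock variances are illustrative, not estimates).
b12, b21 = 0.4, 0.5
den = 1.0 - b12 * b21
B = np.array([[1.0, b12],
              [b21, 1.0]])
Binv = np.linalg.inv(B)

# B^{-1} reproduces u1 = (eps1 - b12*eps2)/(1 - b12*b21), and likewise for u2
expected = np.array([[1.0, -b12],
                     [-b21, 1.0]]) / den
assert np.allclose(Binv, expected)

# covariance of the reduced-form errors implied by uncorrelated shocks
s1, s2 = 1.0, 2.0                      # sigma1^2, sigma2^2
Sigma_eps = np.diag([s1, s2])
Omega = Binv @ Sigma_eps @ Binv.T
omega12 = -(b21 * s1 + b12 * s2) / den**2   # formula from the text
assert np.isclose(Omega[0, 1], omega12)
print(np.round(Omega, 4))
```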

• The SVAR is one example of the Simultaneous Equation Model (SEM).

• We cannot estimate the SVAR by per-equation OLS because of simultaneity bias: there is contemporaneous interaction between 𝑦1,𝑡 and 𝑦2,𝑡, both variables are endogenous, and the regressors of the structural form include current values of the endogenous variables.

• We can estimate the VAR using per-equation OLS.


• Then, we can recover the SVAR from the VAR once (identification) restrictions are imposed.

Reduced-Forms and Structural Equations

• To illustrate the key issues involved, consider a stochastic version of

Samuelson’s (1939) classic model:

$y_t = \gamma_1 + c_t + i_t + \varepsilon_{yt}$

$c_t = \gamma_2 + \alpha y_{t-1} + \varepsilon_{ct}, \qquad 0 < \alpha < 1$

$i_t = \gamma_3 + \beta(c_t - c_{t-1}) + \varepsilon_{it}, \qquad \beta > 0$

• Where 𝑦𝑡 , 𝑐𝑡 , and 𝑖𝑡 denote real GDP, consumption, and investment in

period 𝑡, respectively.

• In this Keynesian model, 𝑦𝑡 , 𝑐𝑡 , and 𝑖𝑡 are endogenous variables.

• The terms 𝜀𝑦𝑡, 𝜀𝑐𝑡 and 𝜀𝑖𝑡 are zero-mean random disturbances for real GDP, consumption, and investment, and the coefficients 𝛼 and 𝛽 are parameters to be estimated.

• A structural model is one expressing the endogenous variables as functions of current and lagged realizations of the other endogenous variables and of disturbance terms.

$\begin{bmatrix} 1 & -1 & -1 \\ 0 & 1 & 0 \\ 0 & -\beta & 1 \end{bmatrix} \begin{bmatrix} y_t \\ c_t \\ i_t \end{bmatrix} = \begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ \alpha & 0 & 0 \\ 0 & -\beta & 0 \end{bmatrix} \begin{bmatrix} y_{t-1} \\ c_{t-1} \\ i_{t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{yt} \\ \varepsilon_{ct} \\ \varepsilon_{it} \end{bmatrix}$


• A reduced-form model is one expressing the value of a variable in

terms of its own lags, lags of other endogenous variables, current and

past values of exogenous variables, and disturbance terms.

$\begin{bmatrix} y_t \\ c_t \\ i_t \end{bmatrix} = \begin{bmatrix} a_{10} \\ a_{20} \\ a_{30} \end{bmatrix} + \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} y_{t-1} \\ c_{t-1} \\ i_{t-1} \end{bmatrix} + \begin{bmatrix} u_{yt} \\ u_{ct} \\ u_{it} \end{bmatrix}$


3. Lag Selection

• In a VAR, long lag lengths quickly consume degrees of freedom.

• If the lag length is 𝑝, each of the 𝐾 equations contains 𝐾 × 𝑝 coefficients plus an intercept, so the system has 𝑁 = 𝐾 + 𝐾²𝑝 parameters.

• Appropriate lag-length selection can be critical:

1) If p is too small, the model is misspecified.

2) If p is too large, degrees of freedom are wasted.

• There are basically three methods that have been employed to

determine what p should be:

1) By using some theoretical model.

▪ For example, if one has a DSGE model of an economy in mind,

one will know what the potential set of variables to appear in

a VAR would be, as well as the likely order of it.

▪ Generally, if data is quarterly a VAR(2) would probably

suffice.

2) By using a rule of thumb.

▪ In practice, one used to see people choosing p = 4 when working with quarterly data and p = 6 with monthly data.

▪ When T is small, these probably serve as upper limits on the likely order.

3) By using statistical criteria:


▪ The idea of imposing a penalty for adding regressors to the

model has been carried further in the Akaike Information

Criterion (AIC) and Schwarz Information Criterion

(SIC), for example.

• Lag Selection: Lag Exclusion Tests

▪ Suppose we want to test ℓ against ℓ − 1 lags in a VAR system.

▪ The proper test for this cross-equation restriction is a

Likelihood Ratio (LR):

▪ $LR = (T - m)\left[\log|\Sigma_{\ell-1}| - \log|\Sigma_\ell|\right] \sim \chi^2(q)$

o 𝑇 is the number of usable observations

o 𝑚 is the number of parameters estimated in each equation of the unrestricted model; with 𝐾 variables and ℓ lags, 𝑚 = 1 + 𝐾ℓ.

o 𝑞 = (#𝑈𝑛𝑟𝑒𝑠𝑡𝑟𝑖𝑐𝑡𝑒𝑑 − #𝑅𝑒𝑠𝑡𝑟𝑖𝑐𝑡𝑒𝑑) in terms of

parameters.

▪ Under the Null that H0: 𝑝 = ℓ − 1 lags are appropriate,

reestimate the VAR over the same sample period using ℓ − 1

lags and obtain the variance/covariance matrix of the

residuals: 𝛴ℓ−1

▪ If the calculated value of the statistic is less than the 𝜒²(𝑞) critical value at a prespecified significance level (e.g., 5%), we cannot reject H0.
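The LR statistic above can be sketched in a few lines of Python/NumPy. The residual covariance matrices below are hypothetical, chosen only to illustrate the computation:

```python
import numpy as np

# Sketch of the LR lag-exclusion test for a bivariate VAR (K = 2), testing
# l = 2 against l = 1 lags. The residual covariances are hypothetical.
T = 100                       # usable observations
K, ell = 2, 2
m = 1 + K * ell               # parameters per equation, unrestricted model
Sigma_rest = np.array([[1.10, 0.32], [0.32, 0.95]])    # l - 1 lags
Sigma_unrest = np.array([[1.00, 0.30], [0.30, 0.90]])  # l lags
LR = (T - m) * (np.log(np.linalg.det(Sigma_rest))
                - np.log(np.linalg.det(Sigma_unrest)))
q = K * K                     # K^2 coefficients excluded when dropping one lag
print(round(LR, 2), q)        # compare LR with the chi2(q) critical value
```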


4. Stability and Stationarity

• The VAR(p)

𝑦𝑡 = 𝐴0 + 𝐴1 𝑦𝑡−1 + ⋯ + 𝐴𝑝 𝑦𝑡−𝑝 + 𝑢𝑡

𝑦𝑡 = 𝐴0 + 𝐴(𝐿) 𝑦𝑡 + 𝑢𝑡

• Can be written as:

[𝐼 − 𝐴(𝐿)]𝑦𝑡 = 𝐴0 + 𝑢𝑡

𝐶(𝐿)𝑦𝑡 = 𝐴0 + 𝑢𝑡

• Like in the ARIMA model, stability requires that the characteristic

roots of 𝐶(𝐿) lie outside the unit circle.

• If we can abstract from the initial conditions, the 𝑦𝑡 sequences will be jointly covariance stationary if the stability condition holds.

• Each sequence 𝑦𝑡 has a finite and time-invariant mean, and a finite

and time-invariant variance.

• The VAR is covariance stationary if all values of 𝐿 satisfying

$|I - A_1 L - A_2 L^2 - \cdots - A_p L^p| = 0$

lie outside the unit circle.

• Bivariate VAR(1) example:

$A_1 = \begin{bmatrix} 0.3 & 0.7 \\ 0.8 & 0.6 \end{bmatrix}$

• The characteristic roots solve:

$\left| \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.3 & 0.7 \\ 0.8 & 0.6 \end{bmatrix} L \right| = 0$


$\left| \begin{bmatrix} 1 - 0.3L & -0.7L \\ -0.8L & 1 - 0.6L \end{bmatrix} \right| = 0$

$(1 - 0.3L)(1 - 0.6L) - (0.8)(0.7)L^2 = 0$

$1 - 0.3L - 0.6L + (0.3)(0.6)L^2 - (0.8)(0.7)L^2 = 0$

$1 - 0.9L - 0.38L^2 = 0$

$L_1 = -3.1927, \qquad L_2 = 0.8243$

• It is often convenient to rewrite the pth-order VAR model as a first-order difference equation in a stacked vector 𝜉𝑡.

• Define the (𝐾𝑝 × 1) vector 𝜉𝑡 by:

$\xi_t = \begin{bmatrix} y_t \\ y_{t-1} \\ y_{t-2} \\ \vdots \\ y_{t-p+1} \end{bmatrix}$

• Define the (𝐾𝑝 × 𝐾𝑝) matrix ℱ by:

$\mathcal{F} = \begin{bmatrix} A_1 & A_2 & A_3 & \cdots & A_{p-1} & A_p \\ I & 0 & 0 & \cdots & 0 & 0 \\ 0 & I & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & I & 0 \end{bmatrix}$

• Define the (𝐾𝑝 × 1) vector 𝑣𝑡 by:

$v_t = \begin{bmatrix} u_t \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$


• Consider the first-order AR model, which is sometimes called the

companion form:

𝜉𝑡 = ℱ𝜉𝑡−1 + 𝑣𝑡

• Or

$\begin{bmatrix} y_t \\ y_{t-1} \\ y_{t-2} \\ \vdots \\ y_{t-p+1} \end{bmatrix} = \begin{bmatrix} A_1 & A_2 & A_3 & \cdots & A_{p-1} & A_p \\ I & 0 & 0 & \cdots & 0 & 0 \\ 0 & I & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & I & 0 \end{bmatrix} \begin{bmatrix} y_{t-1} \\ y_{t-2} \\ y_{t-3} \\ \vdots \\ y_{t-p} \end{bmatrix} + \begin{bmatrix} u_t \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$

• This is a system of p (block) equations.

• The first block equation is identical to the VAR(p) model; the remaining equations are simply identities.

• The eigenvalues (𝜆) of the matrix ℱ must be less than one in

absolute value for the model to be stable or stationary.

• The eigenvalues of the matrix ℱ satisfy:

|𝐼𝜆𝑝 − 𝐴1 𝜆𝑝−1 − 𝐴2 𝜆𝑝−2 − ⋯ − 𝐴𝑝 | = 0

• Hence, a VAR(p) is covariance stationary as long as |𝜆| < 1 for all

values of 𝜆.

• Example:

$A_1 = \begin{bmatrix} 0.3 & 0.7 \\ 0.8 & 0.6 \end{bmatrix}$

• The eigenvalues are:


$\left| \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\lambda - \begin{bmatrix} 0.3 & 0.7 \\ 0.8 & 0.6 \end{bmatrix} \right| = 0$

$\left| \begin{bmatrix} \lambda - 0.3 & -0.7 \\ -0.8 & \lambda - 0.6 \end{bmatrix} \right| = 0$

$(\lambda - 0.3)(\lambda - 0.6) - (0.8)(0.7) = 0$

$\lambda^2 - 0.9\lambda - 0.38 = 0$

$\lambda_1 = 1.2132, \qquad \lambda_2 = -0.3132$

• Notice that the eigenvalues are the reciprocals of the characteristic roots. (Since |𝜆1| > 1, this particular example is not stationary; it merely illustrates the computation.)
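The two computations above can be confirmed with NumPy; this sketch reuses the example's 𝐴1:

```python
import numpy as np

# Check the example: eigenvalues of A1 vs. roots of 1 - 0.9L - 0.38L^2 = 0.
A1 = np.array([[0.3, 0.7],
               [0.8, 0.6]])
eigvals = np.linalg.eigvals(A1)            # roots of |I*lambda - A1| = 0
poly_roots = np.roots([-0.38, -0.9, 1.0])  # roots of the lag polynomial
print(np.sort(eigvals))                    # approx [-0.3132, 1.2132]
print(np.sort(poly_roots))                 # approx [-3.1927, 0.8243]
# the eigenvalues are the reciprocals of the characteristic roots
assert np.allclose(np.sort(eigvals), np.sort(1.0 / poly_roots))
# note |1.2132| > 1: this particular example is explosive, not stationary
```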


5. Identification

• Suppose that you want to recover the SVAR from your estimated VAR.

• Due to the feedback inherent in SVAR system, these equations

cannot be estimated directly.

o In our example, the reason is that 𝑦2,𝑡 is correlated with the error

term 𝜀1,𝑡 and 𝑦1,𝑡 with the error term 𝜀2,𝑡 .

• Standard estimation techniques require that the regressors be

uncorrelated with the error term.

• Note there is no such problem in estimating the VAR: OLS can provide estimates of 𝐴0 and 𝐴1.

• The issue is whether it is possible to recover all the information

present in the SVAR from VAR.

o Is the SVAR form identifiable given the OLS estimates of the

VAR model?

o The answer to this question is “No, unless we are willing to

appropriately restrict the SVAR”

o The reason is clear if we compare the number of parameters in

the SVAR with the number of parameters recovered from the

VAR model.

o Following our bivariate VAR(1) model, the SVAR has 10 parameters: two 𝑏’s, six 𝛾’s, 𝜎1² and 𝜎2².



o The VAR yields 9 parameters: six 𝑎’s, 𝑉𝑎𝑟[𝑢1𝑡], 𝑉𝑎𝑟[𝑢2𝑡] and 𝐶𝑜𝑣[𝑢1𝑡, 𝑢2𝑡].

o Thus, unless one is willing to restrict one of the parameters, it is

not possible to identify the SVAR.

o One way to identify the model is to use the type of recursive

system proposed by Sims (1980).

• Suppose that you are willing to impose a restriction on SVAR such that

the coefficient 𝑏12 equals zero:

1. $y_{1,t} = \gamma_{10} - (0)y_{2,t} + \gamma_{11} y_{1,t-1} + \gamma_{12} y_{2,t-1} + \varepsilon_{1,t}$

2. $y_{2,t} = \gamma_{20} - b_{21} y_{1,t} + \gamma_{21} y_{1,t-1} + \gamma_{22} y_{2,t-1} + \varepsilon_{2,t}$

• Imposing the restriction 𝑏12 = 0 means that:

$B = \begin{bmatrix} 1 & 0 \\ b_{21} & 1 \end{bmatrix} \;\Rightarrow\; B^{-1} = \begin{bmatrix} 1 & 0 \\ -b_{21} & 1 \end{bmatrix}$

• We know that:

𝑢𝑡 = 𝐵−1 𝜀𝑡

$\begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -b_{21} & 1 \end{bmatrix} \begin{bmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{bmatrix}$

• Thus

$\begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix} = \begin{bmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} - b_{21}\varepsilon_{1,t} \end{bmatrix}$


• The restriction manifests itself such that both 𝜀1𝑡 and 𝜀2𝑡 shocks affect the contemporaneous value of 𝑦2,𝑡, but only 𝜀1𝑡 affects the contemporaneous value of 𝑦1,𝑡.

• The observed values of 𝑢1,𝑡 are completely attributed to pure shocks to the 𝑦1,𝑡 sequence.

• Decomposing the residuals in this triangular fashion is called a

Cholesky Decomposition.1

• Note both structural shocks can now be identified from the

residuals of the standard VAR:

𝑢𝑡 = 𝐵−1 𝜀𝑡

𝜀𝑡 = 𝐵𝑢𝑡

• The Cholesky Decomposition consists of finding a lower triangular matrix P such that:

$\Omega_u = PP'$

• Then, define a new error vector 𝜀𝑡 = 𝑃⁻¹𝑢𝑡 as a linear transformation of the old error vector 𝑢𝑡. Thus:

$B = P^{-1}$

1
In linear algebra, the Cholesky Decomposition or Cholesky Factorization is a decomposition of a
Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate
transpose:
• 𝑀 = 𝐿𝐿′
• Where L is a lower triangular matrix with real and positive diagonal entries, and 𝐿′ denotes the
conjugate transpose of L.
It was discovered by André-Louis Cholesky for real matrices, and posthumously published in 1924.


• Sims (1980) shows that this produces orthogonal error terms:

$E[\varepsilon_t \varepsilon_t'] = E[Bu_t (Bu_t)'] = E[Bu_t u_t' B']$

$E[\varepsilon_t \varepsilon_t'] = B E[u_t u_t'] B' = B \Omega_u B'$

$E[\varepsilon_t \varepsilon_t'] = P^{-1} P P' (P^{-1})' = I$
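The decomposition and the orthogonality result can be checked directly. A minimal sketch, where the reduced-form covariance Ω𝑢 is illustrative:

```python
import numpy as np

# Cholesky identification sketch: given an (illustrative) reduced-form
# covariance Omega_u, find lower-triangular P with Omega_u = P P' and check
# that eps = P^{-1} u has identity covariance.
Omega = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
P = np.linalg.cholesky(Omega)          # lower triangular
assert np.allclose(P @ P.T, Omega)

Pinv = np.linalg.inv(P)                # B = P^{-1} in the notes' notation
cov_eps = Pinv @ Omega @ Pinv.T        # E[eps eps'] = P^{-1} Omega (P^{-1})'
assert np.allclose(cov_eps, np.eye(2))
print(np.round(P, 4))
```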

• There are other methods used to identify structural shocks, like

Blanchard and Quah (1989) decomposition, as we will see later.


6. Impulse Response Function (IRF)

• The main purpose of IRFs is to describe the evolution of a model’s variables in reaction to a shock in one or more variables.

• This feature allows one to trace the transmission of a single shock within an otherwise noisy system of equations and thus makes IRFs very useful tools in the assessment of economic policies.

• The VAR

𝑦𝑡 = 𝐴0 + 𝐴(𝐿) 𝑦𝑡 + 𝑢𝑡

𝑢𝑡 ~𝑖𝑖𝑑(0, Ω𝑢 )

$\Omega_u = \begin{bmatrix} \omega_1^2 & \omega_{12} & \cdots & \omega_{1n} \\ \omega_{21} & \omega_2^2 & \cdots & \omega_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \omega_{n1} & \omega_{n2} & \cdots & \omega_n^2 \end{bmatrix}$

• Can be written as:

𝐶(𝐿)𝑦𝑡 = 𝐴0 + 𝑢𝑡

• Premultiplying by Ψ(𝐿) = 𝐶(𝐿)⁻¹, we get the Vector Moving Average (VMA) or Wold MA representation:

𝑦𝑡 = 𝜇 + Ψ(𝐿)𝑢𝑡

• In general, 𝑢𝑡 are contemporaneously correlated (not orthogonal), i.e.,

𝜔𝑖𝑗 ≠ 0 for all 𝑖 ≠ 𝑗.

• Therefore, we cannot say, hold 𝑢𝑖𝑡 constant and let only 𝑢𝑗𝑡 vary for all

𝑖 ≠ 𝑗.


• We must identify structural shocks:

𝑢𝑡 = 𝐵−1 𝜀𝑡

𝜀𝑡 = 𝐵𝑢𝑡

• Then

𝑦𝑡 = 𝜇 + Ψ(𝐿)𝐵−1 𝐵𝑢𝑡

𝑦𝑡 = 𝜇 + Θ(𝐿)𝜀𝑡

• In the bivariate model, the VMA representation can be written in

terms of ε’s shocks:



$\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \sum_{k=0}^{\infty} \begin{bmatrix} \theta_{11}(k) & \theta_{12}(k) \\ \theta_{21}(k) & \theta_{22}(k) \end{bmatrix} \begin{bmatrix} \varepsilon_{1,t-k} \\ \varepsilon_{2,t-k} \end{bmatrix}$

$y_{1,t} = \mu_1 + \theta_{11}(0)\varepsilon_{1,t} + \theta_{12}(0)\varepsilon_{2,t} + \cdots$

$y_{2,t} = \mu_2 + \theta_{21}(0)\varepsilon_{1,t} + \theta_{22}(0)\varepsilon_{2,t} + \cdots$

• The coefficients of 𝜃𝑖𝑗 (𝑘) can be used to generate the effects of 𝜀𝑡 on the

entire path of the 𝑦𝑡 sequences.

$\frac{\partial y_t}{\partial \varepsilon_{t-k}} = \frac{\partial y_{t+k}}{\partial \varepsilon_t} = \theta(k)$

• The elements 𝜃𝑖𝑗(0) are the impact multipliers.

• For example, the coefficient 𝜃12 (0) is the instantaneous impact of a

one-unit change in 𝜀2𝑡 on 𝑦1𝑡 .


• The accumulated effects of unit impulses in 𝜀1𝑡 and/or 𝜀2𝑡 can be obtained by appropriate summation of the impulse response coefficients.

• Letting the horizon approach infinity yields the long-run multiplier, i.e., Θ(𝐿) evaluated at 𝐿 = 1:

$\Theta(1)$

• Notice that the results of the IRF depends on the identification of the

VAR.

• For a VAR(p) model, we can consider the companion form:

$\xi_t = \mathcal{F}\xi_{t-1} + v_t$

• For any horizon ℎ ≥ 0, the response to a unit shock at time 𝑡 is given by the upper-left 𝐾 × 𝐾 block of ℱʰ.

• In the case of bivariate stable VAR(1):

$\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} a_{10} \\ a_{20} \end{bmatrix} + \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix}$
2,𝑡 20 21

• The VMA representation is:



$\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} \mu_{10} \\ \mu_{20} \end{bmatrix} + \sum_{i=0}^{\infty} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}^i \begin{bmatrix} u_{1,t-i} \\ u_{2,t-i} \end{bmatrix}$

• Thus, the impulse-response function is calculated for any horizon 𝑖

• To calculate the structural impulse-response function, a 𝐵 matrix is

needed.
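For a VAR(1), the reduced-form IRF at horizon 𝑖 is just 𝐴1ⁱ, which is easy to compute directly. A sketch with illustrative, stable coefficients:

```python
import numpy as np

# Reduced-form IRFs for a stable bivariate VAR(1): the horizon-i response
# matrix is A1^i (a structural IRF would additionally require B).
A1 = np.array([[0.5, 0.2],
               [0.1, 0.4]])           # illustrative, chosen to be stable
assert np.max(np.abs(np.linalg.eigvals(A1))) < 1   # stability check

horizons = 5
irf = [np.linalg.matrix_power(A1, i) for i in range(horizons)]
assert np.allclose(irf[0], np.eye(2))  # impact response to reduced-form errors
print(np.round(irf[2], 4))             # response at horizon 2 = A1 @ A1
```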


7. Forecast Error Variance Decomposition (FEVD)

• The forecast error variance decomposition tells us the proportion

of the movements in a sequence due to its “own” shocks versus shocks

to the other variable.

• It determines how much of the forecast error variance of each of the

variables can be explained by exogenous shocks to the other variables.

• Given the model:

𝑦𝑡 = 𝐴0 + 𝐴1 𝑦𝑡−1 + 𝑢𝑡

• To simplify the discussion, it is assumed that the actual data-

generating process and the current and past realizations of the {𝜀𝑡 }

and {𝑦𝑡 } sequences are known.

• We want to forecast one period ahead:

𝐸𝑡 𝑦𝑡+1 = 𝐴0 + 𝐴1 𝑦𝑡

• The observed 𝑦𝑡+1 is:

𝑦𝑡+1 = 𝐴0 + 𝐴1 𝑦𝑡 + 𝑢𝑡+1

• Thus, the forecast error is:

𝑒𝑡 (1) ≡ 𝑦𝑡+1 − 𝐸𝑡 𝑦𝑡+1 = 𝑢𝑡+1

• Then, we want to forecast two periods ahead.

$E_t y_{t+2} = A_0 + A_1 E_t y_{t+1}$

$E_t y_{t+2} = A_0 + A_1(A_0 + A_1 y_t)$

$E_t y_{t+2} = (I + A_1)A_0 + A_1^2 y_t$



• The observed 𝑦𝑡+2 is:

𝑦𝑡+2 = 𝐴0 + 𝐴1 𝑦𝑡+1 + 𝑢𝑡+2

• And the forecast error is:

𝑒𝑡 (2) ≡ 𝑦𝑡+2 − 𝐸𝑡 𝑦𝑡+2 = 𝐴0 + 𝐴1 𝑦𝑡+1 + 𝑢𝑡+2 − 𝐴0 − 𝐴1 𝐸𝑡 𝑦𝑡+1

𝑒𝑡 (2) ≡ 𝑦𝑡+2 − 𝐸𝑡 𝑦𝑡+2 = 𝑢𝑡+2 + 𝐴1 (𝑦𝑡+1 − 𝐸𝑡 𝑦𝑡+1 ) = 𝑢𝑡+2 + 𝐴1 𝑢𝑡+1

• The n-step-ahead forecast:

$E_t y_{t+n} = (I + A_1 + A_1^2 + \cdots + A_1^{n-1})A_0 + A_1^n y_t$

• And the associated n-step-ahead forecast error:

$e_t(n) \equiv y_{t+n} - E_t y_{t+n} = u_{t+n} + A_1 u_{t+n-1} + A_1^2 u_{t+n-2} + \cdots + A_1^{n-1} u_{t+1}$
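The closed-form forecast can be verified against direct iteration. A sketch with illustrative parameters:

```python
import numpy as np

# Verify the n-step-ahead forecast formula for the VAR(1)
# y_t = A0 + A1 y_{t-1} + u_t (illustrative parameters).
A0 = np.array([0.1, 0.2])
A1 = np.array([[0.5, 0.1],
               [0.2, 0.4]])
y_t = np.array([1.0, -0.5])

n = 3
forecast = y_t.copy()
for _ in range(n):
    forecast = A0 + A1 @ forecast      # iterate E_t y_{t+j}

# closed form: (I + A1 + ... + A1^{n-1}) A0 + A1^n y_t
S = sum(np.linalg.matrix_power(A1, j) for j in range(n))
closed = S @ A0 + np.linalg.matrix_power(A1, n) @ y_t
assert np.allclose(forecast, closed)
print(np.round(forecast, 4))
```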

• From the structural VMA representation:



$\begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \sum_{k=0}^{\infty} \begin{bmatrix} \theta_{11}(k) & \theta_{12}(k) \\ \theta_{21}(k) & \theta_{22}(k) \end{bmatrix} \begin{bmatrix} \varepsilon_{1,t-k} \\ \varepsilon_{2,t-k} \end{bmatrix}$

• The conditional expectation of 𝑦𝑡+𝑛 is:



$E_t\begin{bmatrix} y_{1,t+n} \\ y_{2,t+n} \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \sum_{k=n}^{\infty} \begin{bmatrix} \theta_{11}(k) & \theta_{12}(k) \\ \theta_{21}(k) & \theta_{22}(k) \end{bmatrix} \begin{bmatrix} \varepsilon_{1,t+n-k} \\ \varepsilon_{2,t+n-k} \end{bmatrix}$

• Thus, the n-period forecast error is:

$\begin{bmatrix} y_{1,t+n} \\ y_{2,t+n} \end{bmatrix} - E_t\begin{bmatrix} y_{1,t+n} \\ y_{2,t+n} \end{bmatrix} = \sum_{k=0}^{n-1} \begin{bmatrix} \theta_{11}(k) & \theta_{12}(k) \\ \theta_{21}(k) & \theta_{22}(k) \end{bmatrix} \begin{bmatrix} \varepsilon_{1,t+n-k} \\ \varepsilon_{2,t+n-k} \end{bmatrix}$

• Focusing solely on the {𝑦1,𝑡 } sequence, we see that the n-step-ahead

forecast error is:

$y_{1,t+n} - E_t y_{1,t+n} = \theta_{11}(0)\varepsilon_{1,t+n} + \theta_{11}(1)\varepsilon_{1,t+n-1} + \cdots + \theta_{11}(n-1)\varepsilon_{1,t+1}$
$\qquad\qquad + \theta_{12}(0)\varepsilon_{2,t+n} + \theta_{12}(1)\varepsilon_{2,t+n-1} + \cdots + \theta_{12}(n-1)\varepsilon_{2,t+1}$

• Denote the n-step-ahead forecast error variance of 𝑦1,𝑡+𝑛 as 𝜎1(𝑛)²:

$\sigma_1(n)^2 = E[y_{1,t+n} - E_t y_{1,t+n}]^2 = \sigma_1^2\left[\theta_{11}(0)^2 + \theta_{11}(1)^2 + \cdots + \theta_{11}(n-1)^2\right] + \sigma_2^2\left[\theta_{12}(0)^2 + \theta_{12}(1)^2 + \cdots + \theta_{12}(n-1)^2\right]$

• Because all values of 𝜃𝑗𝑘 (𝑖)2 are necessarily nonnegative, the variance

of the forecast error increases as the forecast horizon n increases.

• Note that it is possible to decompose the n-step-ahead forecast error

variance into the proportions due to each shock.

• The proportions of 𝜎1 (𝑛)2 due to shocks in the {𝜀1𝑡 } and {𝜀2𝑡 } sequences

are:

$\frac{\sigma_1^2\left[\theta_{11}(0)^2 + \theta_{11}(1)^2 + \cdots + \theta_{11}(n-1)^2\right]}{\sigma_1(n)^2}$

• And

$\frac{\sigma_2^2\left[\theta_{12}(0)^2 + \theta_{12}(1)^2 + \cdots + \theta_{12}(n-1)^2\right]}{\sigma_1(n)^2}$
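The variance shares above can be sketched numerically. The structural MA coefficients below are hypothetical and assume unit-variance shocks:

```python
import numpy as np

# FEVD sketch for y1 in the bivariate model, with hypothetical structural
# MA coefficients theta_11(k), theta_12(k) and unit-variance shocks.
theta11 = np.array([1.0, 0.5, 0.25])   # theta_11(0), theta_11(1), theta_11(2)
theta12 = np.array([0.0, 0.3, 0.15])   # eps2 has no impact effect on y1

n = 3
var_total = np.sum(theta11[:n]**2) + np.sum(theta12[:n]**2)
share_own = np.sum(theta11[:n]**2) / var_total     # due to eps1 shocks
share_other = np.sum(theta12[:n]**2) / var_total   # due to eps2 shocks
assert np.isclose(share_own + share_other, 1.0)
print(round(share_own, 3), round(share_other, 3))
```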

• If 𝜀2𝑡 shocks explain none of the forecast error variance of 𝑦1,𝑡 at all

forecast horizons, we can say that the 𝑦1,𝑡 sequence is exogenous.

• At the other extreme, 𝜀2𝑡 could explain all the forecast error variance

in the 𝑦1𝑡 at all forecast horizons, so that 𝑦1𝑡 would be entirely

endogenous.


• It is typical for a variable to explain almost all of its forecast error variance at short horizons and smaller proportions at longer horizons.

• We would expect this pattern if 𝜀2𝑡 shocks had little contemporaneous effect on 𝑦1,𝑡 but acted to affect the 𝑦1,𝑡 sequence with a lag.

• Notice that the results of the FEVD depends on the identification of

the VAR.


8. Granger Causality

• This definition of causality was developed by Granger (1969) and Sims

(1972).

• A test of causality is whether the lags of one variable enter the

equation for another variable.

• In a two-equation model with 1 lag, 𝑦1,𝑡 does not Granger-cause 𝑦2,𝑡 if and only if 𝑎21 is equal to zero.

• Thus, if 𝑦1,𝑡 does not improve the forecasting performance of 𝑦2,𝑡, then 𝑦1,𝑡 does not Granger-cause 𝑦2,𝑡.

𝑦1,𝑡 = 𝑎10 + 𝑎11 𝑦1,𝑡−1 + 𝑎12 𝑦2,𝑡−1 + 𝑢1,𝑡

𝑦2,𝑡 = 𝑎20 + 𝑎21 𝑦1,𝑡−1 + 𝑎22 𝑦2,𝑡−1 + 𝑢2,𝑡

• The direct way to determine Granger causality is to use a standard

F-test to test that restriction.

𝐻0 : 𝑎21 = 0

• If 𝑦2,𝑡 is some sort of forecast of the future, such as a future price, then 𝑦2,𝑡 may help to forecast 𝑦1,𝑡 even though it does not Granger-cause 𝑦1,𝑡.
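The F-test for the restriction 𝑎21 = 0 can be sketched on simulated data. Everything below (the data-generating process, the seed, the sample size) is hypothetical:

```python
import numpy as np

# Granger-causality F-test on simulated data (hypothetical DGP in which
# y1 Granger-causes y2 because a21 = 0.5).
rng = np.random.default_rng(0)
T = 500
y = np.zeros((T, 2))
A1 = np.array([[0.5, 0.0],
               [0.5, 0.3]])
for t in range(1, T):
    y[t] = A1 @ y[t - 1] + rng.normal(size=2)

Y = y[1:, 1]                                                   # y2,t
X_u = np.column_stack([np.ones(T - 1), y[:-1, 0], y[:-1, 1]])  # unrestricted
X_r = np.column_stack([np.ones(T - 1), y[:-1, 1]])             # a21 = 0 imposed

def ssr(X, Y):
    # sum of squared residuals from an OLS fit
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    return resid @ resid

q, k = 1, X_u.shape[1]
F = ((ssr(X_r, Y) - ssr(X_u, Y)) / q) / (ssr(X_u, Y) / (T - 1 - k))
print(round(F, 1))   # a large F statistic rejects H0: a21 = 0
```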


9. Structural Decomposition

• Sims’s (1980) VAR approach has the desirable property that all

variables are treated symmetrically so that all variables are jointly

endogenous, and the econometrician does not rely on any “incredible

identification restrictions.”

• However, given the somewhat ad hoc nature of the Cholesky

Decomposition, proposed by Sims (1980), the beauty of the approach

seems diminished when constructing impulse response functions and

forecast error variance decompositions.

• Unless the underlying structural model can be identified from the

reduced-form VAR model, the innovations in a triangular form do not

have a direct economic interpretation.

• The aim of a structural VAR is to use economic theory to recover the structural innovations 𝜀1𝑡 and 𝜀2𝑡 from the residuals 𝑢1𝑡 and 𝑢2𝑡.

• From VAR model, we know that 𝑢𝑡 = 𝐵−1 𝜀𝑡 .

• The problem, then, is to take the observed values of 𝑢𝑡 and to restrict

the system to identify structural shocks 𝜀𝑡 = 𝐵𝑢𝑡 .

• In the case of 𝐾 variables, the selection of the various 𝑏𝑖𝑗 cannot be

completely arbitrary.


• The issue is to restrict the system to (i) identify structural shocks 𝜀𝑡

and (ii) preserve the assumed error structure concerning the

independence of the various 𝜀𝑖𝑡 shocks.

• For example, in the bivariate VAR model:

$E[u_t u_t'] = \Omega_u = \begin{bmatrix} \omega_1^2 & \omega_{12} \\ \omega_{21} & \omega_2^2 \end{bmatrix}$

• Given that 𝑢𝑡 = 𝐵−1 𝜀𝑡 , it must be the case that:

Ω𝑢 = 𝐸[(𝐵−1 𝜀𝑡 )(𝐵−1 𝜀𝑡 )′ ] = 𝐸[(𝐵−1 )𝜀𝑡 𝜀𝑡′ (𝐵−1 )′ ] = (𝐵−1 )𝐸[𝜀𝑡 𝜀𝑡′ ](𝐵−1 )′

• Notice that:

$\Sigma_\varepsilon = E[\varepsilon_t \varepsilon_t'] = D = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}$

• Thus:

$\Omega_u = \begin{bmatrix} \omega_1^2 & \omega_{12} \\ \omega_{21} & \omega_2^2 \end{bmatrix} = (B^{-1}) \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix} (B^{-1})'$

• The symmetry of the system is such that 𝜔21 = 𝜔12, so there are only three independent equations to determine the four unknown values 𝑏12, 𝑏21, 𝜎1² and 𝜎2².

• As such, identification is not possible unless another restriction

is imposed.

• It is necessary to impose (𝐾 2 − 𝐾)/2 additional restrictions on 𝐵−1 to

completely identify the system.


• The Cholesky Decomposition requires all elements above the principal

diagonal to be zero.

• Hence, there are a total of (𝐾 2 − 𝐾)/2 restrictions; the system is

exactly identified.

• For example, define matrix 𝐶 = 𝐵−1 with elements 𝑐𝑖𝑗 . Hence, 𝑢𝑡 = 𝐶𝜀𝑡 .

• Consider the following Cholesky Decomposition in a three-variable

VAR:

𝑢1𝑡 = 𝜀1𝑡

𝑢2𝑡 = 𝑐21 𝜀1𝑡 + 𝜀2𝑡

𝑢3𝑡 = 𝑐31 𝜀1𝑡 + 𝑐32 𝜀2𝑡 + 𝜀3𝑡

$\begin{bmatrix} u_{1t} \\ u_{2t} \\ u_{3t} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ c_{21} & 1 & 0 \\ c_{31} & c_{32} & 1 \end{bmatrix} \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \\ \varepsilon_{3t} \end{bmatrix}$

• An alternative way to model the relationship between the forecast

errors and the structural innovations is:

𝑢1𝑡 = 𝜀1𝑡 + 𝑐13 𝜀3𝑡

𝑢2𝑡 = 𝑐21 𝜀1𝑡 + 𝜀2𝑡

𝑢3𝑡 = 𝑐31 𝜀1𝑡 + 𝜀3𝑡

$\begin{bmatrix} u_{1t} \\ u_{2t} \\ u_{3t} \end{bmatrix} = \begin{bmatrix} 1 & 0 & c_{13} \\ c_{21} & 1 & 0 \\ c_{31} & 0 & 1 \end{bmatrix} \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \\ \varepsilon_{3t} \end{bmatrix}$

• However, imposing (𝐾 2 − 𝐾)/2 restrictions is not a sufficient condition

for exact identification.


10. Short Run Identifications

• Identify a P matrix by placing restrictions on the contemporaneous

correlations between the variables

10.1 A coefficient restriction

o Coefficient restrictions are necessarily short-run restrictions on

the dynamics of the model.

o The most common restriction is a zero restriction such that one

variable has no contemporaneous effect on another.

o However, unlike a Cholesky decomposition, there is no need to

rely on a triangular formulation.

o Another common type of coefficient restriction involves setting a

coefficient to unity.

o Suppose that we know that a one-unit innovation 𝜀2𝑡 has a one-

unit effect on 𝑦1𝑡 ; hence, suppose we know that 𝑏12 = 1.

10.2 A variance restriction

o One natural restriction is that 𝜎1² = 𝜎2² = 1.

o Rigobon and Sack (2004) illustrate how a volatility break can be used to identify a structural VAR.


10.3 Symmetry restrictions

o A linear combination of the coefficients and variances can be

used for identification purposes.

o Symmetry restrictions are popular in open-economy models in

that they allow a shock to have equal effects across countries.

10.4 Sign restrictions

o For example, suppose it is known that an oil price shock does not

affect GDP for the first two quarters after the shock.

o Mountford and Uhlig (2008) show how such sign restrictions can

be used in identification.


11. Long Run Identifications

• Place restrictions on the long-term accumulated effects of the

innovations.

• Blanchard and Quah (1989), hereafter BQ, is an example of SVAR

with combinations of I(1) and I(0) Data.

• Consider two observed series 𝑦1𝑡 ~𝐼(1) and 𝑦2𝑡 ~𝐼(0), and define 𝑦𝑡 = (𝛥𝑦1𝑡, 𝑦2𝑡)′ so that 𝑦𝑡 ~𝐼(0). In BQ, 𝑦1𝑡 is the log of real GNP and 𝑦2𝑡 is the unemployment rate.

• From the VMA representation



$\begin{bmatrix} \Delta y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \sum_{k=0}^{\infty} \begin{bmatrix} \theta_{11}(k) & \theta_{12}(k) \\ \theta_{21}(k) & \theta_{22}(k) \end{bmatrix} \begin{bmatrix} \varepsilon_{1,t-k} \\ \varepsilon_{2,t-k} \end{bmatrix}$

• Regarding the structural innovations, BQ interpret fluctuations in

GNP and unemployment as due to two types of disturbances:

disturbances that have a permanent effect on output and disturbances

that do not.

• BQ interpret the first as supply disturbances (𝜀1𝑡 ), the second as

demand disturbances (𝜀2𝑡 ).

• BQ achieve identification of the SVAR by assuming that demand

shocks (𝜀2𝑡 ) have no long-run impact on the level of output or

unemployment.


• They allow supply shocks (𝜀1𝑡 ) to have a long-run impact on the level

of output but not on the level of unemployment.

• In terms of the long-run impacts, the BQ long-run restriction may be represented as:

$\sum_{s=0}^{\infty} \theta_{12}(s) = 0$

• The restriction that shocks to 𝜀1𝑡 and 𝜀2𝑡 have no long-run effect on the

level of 𝑦2𝑡 is just a restatement of the result because 𝑦2𝑡 ~𝐼(0).

• The long-run restriction makes the long-run impact matrix 𝛩(1) lower

triangular:

$\Theta(1) = \begin{bmatrix} \theta_{11}(1) & 0 \\ \theta_{21}(1) & \theta_{22}(1) \end{bmatrix}$

• To see how the lower triangularity of 𝛩(1) can be used to identify B in

the SVAR, consider the long-run covariance matrix (𝜦) of 𝑦𝑡 defined

from the Wold MA representation:

𝛬 = 𝛹(1)Ω𝑢 𝛹(1)′

• Where 𝛹(𝐿) = 𝐶(𝐿)−1

𝑦𝑡 = 𝜇 + 𝛹(𝐿)𝑢𝑡

𝑦𝑡 = 𝜇 + 𝛹(𝐿)𝐵−1 𝐵𝑢𝑡

𝑦𝑡 = 𝜇 + Θ(𝐿)𝜀𝑡

• From VAR model, we know that 𝑢𝑡 = 𝐵−1 𝜀𝑡 . Thus:

Ω𝑢 = 𝐵−1 𝐷𝐵−1′


• In addition, we have that 𝛩(1) = 𝛹(1)𝐵−1 , thus:

𝛹(1) = 𝛩(1)𝐵

• The long-run covariance matrix 𝛬 may be re-expressed as:

𝛬 = 𝛹(1)Ω𝑢 𝛹(1)′

𝛬 = 𝛩(1)𝐵Ω𝑢 𝐵′𝛩(1)′

𝛬 = 𝛩(1)𝐵𝐵−1 𝐷𝐵−1′ 𝐵′𝛩(1)′

𝛬 = 𝛩(1)𝐷𝛩(1)′

• To identify matrix B, BQ make the additional assumption:

𝐷=𝐼

• So that the structural shocks 𝜀1𝑡 and 𝜀2𝑡 have unit variances. Thus, the

long-run variance becomes

𝛬 = 𝛩(1)𝛩(1)′

• Notice that since 𝛩(1) is lower triangular, the factorization of 𝛬 can be

obtained using the Cholesky factorization; that is, 𝛩(1) can be

computed as the lower triangular Cholesky factor of 𝛬.

• Given that 𝛩(1) can be computed, B can then be computed using

𝛩(1) = 𝛹(1)𝐵−1 = 𝐶(1)−1 𝐵−1

• So that:

𝐵 = [𝐶(1)𝛩(1)]−1

𝐵 = [(𝐼 − 𝐴(1))𝛩(1)]−1

• Then, compute IRF and FEVD.
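The key computational step — obtaining Θ(1) as the lower triangular Cholesky factor of the long-run covariance — can be sketched directly. The numbers in 𝛬 below are illustrative:

```python
import numpy as np

# BQ-style sketch: with unit-variance structural shocks (D = I), Theta(1) is
# the lower-triangular Cholesky factor of the long-run covariance Lambda.
Lambda = np.array([[2.0, 0.6],
                   [0.6, 1.0]])          # illustrative long-run covariance
Theta1 = np.linalg.cholesky(Lambda)      # lower triangular Theta(1)
assert np.allclose(Theta1 @ Theta1.T, Lambda)
assert np.isclose(Theta1[0, 1], 0.0)     # demand shock: no long-run output effect
print(np.round(Theta1, 4))
```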


12. The A-B Model

• SVAR:

$A y_t = A_1^S y_{t-1} + \cdots + A_p^S y_{t-p} + B\varepsilon_t$

• Reduced VAR:

$y_t = A^{-1}A_1^S y_{t-1} + \cdots + A^{-1}A_p^S y_{t-p} + A^{-1}B\varepsilon_t$

$y_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + u_t$

$u_t = A^{-1}B\varepsilon_t \quad\Leftrightarrow\quad A u_t = B\varepsilon_t$

$S = A^{-1}B \;\Rightarrow\; u_t = S\varepsilon_t$

• 𝐴: assumptions about the structure of contemporaneous feedback of

variables.

• 𝐵: assumptions about the correlation structure of the errors.

• 𝑆: restrictions on the composite loading matrix 𝑆 = 𝐴⁻¹𝐵 alone offer no insight into its decomposition into the contemporaneous-feedback component 𝐴 and the error-loading component 𝐵.

• The number of parameters of the reduced form VAR (leaving out the

parameters attached to the lagged variables) is given by the number

of nonredundant elements of the covariance matrix 𝛺𝑢 , that is, (𝐾 2 +

𝐾)/2.

• Accordingly, it is not possible to identify more than (𝐾 2 + 𝐾)/2

parameters of the structural form.


• However, the overall number of elements of the structural form

matrices A and B is 2𝐾 2 .

• It follows that:

$2K^2 - \frac{K^2 + K}{2} = K^2 + \frac{K^2 - K}{2}$

• Restrictions are required to identify the full model.

• If we set one of the matrices A or B equal to the identity matrix, then

(𝐾 2 − 𝐾)/2 restrictions remain to be imposed.
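The parameter counting above can be checked with trivial arithmetic; a sketch for 𝐾 = 3:

```python
# Parameter counting for the A-B model with K variables (here K = 3):
K = 3
identifiable = (K**2 + K) // 2   # nonredundant elements of Omega_u
total_AB = 2 * K**2              # free elements of A and B
needed = total_AB - identifiable # restrictions required
assert needed == K**2 + (K**2 - K) // 2
print(identifiable, total_AB, needed)   # 6 18 12
```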

Blanchard and Perotti (2002)

• The paper characterizes the dynamic effects of shocks in government spending and taxes on US activity in the postwar period.

• It does so by using a mixed SVAR/event-study approach.

• Identification is achieved by using institutional information about the tax and transfer systems to identify the automatic response of taxes and spending to activity and, by implication, to infer fiscal shocks.

$Y_t = A(L, q) Y_{t-1} + U_t$

$Y_t = [T_t, G_t, X_t]', \qquad U_t = [t_t, g_t, x_t]'$

1. $t_t = a_1 x_t + a_2 e_t^g + e_t^t$


2. $g_t = b_1 x_t + b_2 e_t^t + e_t^g$

3. $x_t = c_1 t_t + c_2 g_t + e_t^x$

$\begin{bmatrix} 1 & 0 & -a_1 \\ 0 & 1 & -b_1 \\ -c_1 & -c_2 & 1 \end{bmatrix} \begin{bmatrix} t_t \\ g_t \\ x_t \end{bmatrix} = \begin{bmatrix} 1 & a_2 & 0 \\ b_2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} e_t^t \\ e_t^g \\ e_t^x \end{bmatrix}$

$E_t = [e_t^t, e_t^g, e_t^x]', \qquad A U_t = B E_t$
• Extending the system with an additional activity variable 𝑥𝑖,𝑡:

1. $t_t = a_1 x_t + a_2 e_t^g + e_t^t$

2. $g_t = b_1 x_t + b_2 e_t^t + e_t^g$

3. $x_t = c_1 t_t + c_2 g_t + e_t^x$

4. $x_{i,t} = d_1 t_t + d_2 g_t + e_t^{x(i)}$

$\begin{bmatrix} 1 & 0 & -a_1 & 0 \\ 0 & 1 & -b_1 & 0 \\ -c_1 & -c_2 & 1 & 0 \\ -d_1 & -d_2 & 0 & 1 \end{bmatrix} \begin{bmatrix} t_t \\ g_t \\ x_t \\ x_{i,t} \end{bmatrix} = \begin{bmatrix} 1 & a_2 & 0 & 0 \\ b_2 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} e_t^t \\ e_t^g \\ e_t^x \\ e_t^{x(i)} \end{bmatrix}$

$E_t = [e_t^t, e_t^g, e_t^x, e_t^{x(i)}]', \qquad A U_t = B E_t$


13. Overidentified Systems

• It may be that economic theory suggests more than (𝐾² − 𝐾)/2 restrictions.

• However, imposing (𝐾² − 𝐾)/2 restrictions is not a sufficient condition for exact identification.

• Unfortunately, the presence of nonlinearities means that there are no simple rules that guarantee exact identification.

• The procedure for identifying an overidentified system entails

the following steps:

i. Obtain the unrestricted variance/covariance matrix Ω𝑢 .

ii. Restricting B and / or 𝛴𝜀 will affect the estimate of Ω𝑢 .

Ω𝑢 = (𝐵−1 )𝛴𝜀 (𝐵−1 )′

Select the appropriate restrictions and maximize the likelihood

function with respect to the free parameters of B and 𝛴𝜀 .

This will lead to an estimate of the restricted variance / covariance

matrix. Denote this second estimate by Ω𝑅 .

iii. The statistic

$LR = T\left[\log|\Omega_R| - \log|\Omega_u|\right] \sim \chi^2(R)$

has a 𝜒² distribution with degrees of freedom equal to the number of overidentifying restrictions; 𝑅 is the number of restrictions exceeding (𝐾² − 𝐾)/2.


Under the null hypothesis (H0) that restrictions are valid, if the

calculated value exceeds that in a 𝜒2 table, the restrictions can

be rejected.

iv. Now, allow for two sets of over identifying restrictions such that the

number of restrictions in 𝑅2 exceeds that in 𝑅1 .

v. In fact, if 𝑅2 > 𝑅1 > (𝐾² − 𝐾)/2, the significance of the extra 𝑅2 − 𝑅1 restrictions can be tested as:

$LR = T\left[\log|\Omega_{R_2}| - \log|\Omega_{R_1}|\right] \sim \chi^2(R_2 - R_1)$


14. Empirical Puzzles

• The Exchange Rate Puzzle

▪ Grilli and Roubini (1995); Sims (1992): In an open-economy environment, a positive innovation in interest rates seems to result in a depreciation of the local currency rather than an appreciation.

▪ This is “the exchange rate puzzle”.

• Price Puzzle

▪ Sims (1992): In various empirical VAR studies, a contractionary monetary shock causes a persistent increase in the price level rather than a decrease.

▪ This odd response of the price level to a restrictive monetary policy

shock is called “the price puzzle”

• The Liquidity Puzzle

▪ Leeper and Gordon (1992): A similar anomaly has been observed in the response of interest rates to a shock to monetary aggregates.

▪ Following an expansionary shock to the money variable, the

interest rate exhibits a positive response creating “the liquidity

puzzle”.
