Open Macroeconomics

This document is an outline for a course on open macroeconomics. It covers several topics related to international finance and asset pricing, including the intertemporal approach to current accounts, investment and productivity shocks, growth shocks, international asset pricing in discrete time, general equilibrium models with one or two goods, and demand shocks. Each section breaks down various subtopics and models in detail with mathematical notation.

Uploaded by

Martin Zapata

Open Macro Economics

Roberto Rigobon
MIT

Fall 2010
Contents

1 The intertemporal approach to the current account 1


1.1 Endowment Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Quadratic Utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1.1 AR1 case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1.2 AR2 case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.1.3 Sheffrin and Woo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Investment and Productivity Shocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 Temporary Shocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2.2 Persistent shock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Growth Shocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.1 Aguiar and Gopinath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.2 Algebra behind Aguiar and Gopinath . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2 International Asset Pricing: Discrete Time 19


2.1 Small Open Economies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.1.1 Basic Production Economy under Certainty . . . . . . . . . . . . . . . . . . . . . . . . 19
2.1.1.1 External Accounts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1.2 Asset Pricing in a Small Open Economy . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.1.2.1 Risk Free Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.1.2.2 Risk Neutrality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.1.2.3 Asset Prices under risk neutrality . . . . . . . . . . . . . . . . . . . . . . . . 24
2.1.2.4 Message from this model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.1.2.5 Challenging Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 General Equilibrium: Two Countries, Single Good . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.1 Pareto Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.2 Competitive Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.2.1 Single Good Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.3 Asset Prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.3.1 Risk Free Bond . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.3.2 Stock Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2.4 Log Utility Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2.4.1 Pareto Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2.4.2 Competitive Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2.4.3 Interest Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2.4.4 Stock Prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2.4.4.1 Veronesi’s method: . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2.4.4.2 Cochrane-Longstaff-Santa Clara (two trees): . . . . . . . . . . . . . 32
2.2.4.5 Research Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2.4.6 Portfolio Holdings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

2.2.5 Issues with this model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33


2.2.6 Problem Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.2.6.1 Stock markets and Output: The Bernoulli World . . . . . . . . . . . . . . . . 33
2.2.6.2 Non-tradables in a single good model . . . . . . . . . . . . . . . . . . . . . . 34
2.3 General Equilibrium: Two Countries, Two Goods. . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.1 Utilities and Pareto Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.1.1 Terms of Trade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.1.1.1 Ricardian Effect: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.1.1.2 Dependent Economy Effect: . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.1.1.3 Wealth Transfer Effect: . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3.1.2 Goods’ Prices and Exchange Rates . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3.2 Competitive Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3.3 Asset Prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3.3.1 Issues with this model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.4 Demand Shocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.4.1 Social Planner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.4.2 Competitive Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.4.3 Asset Prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.4.4 Interest Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3 Introduction to Brownian Motion and Stochastic Calculus: Some Applications 49


3.1 Basic Continuous Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.1.1 Brownian Motion: Random Walk representation. . . . . . . . . . . . . . . . . . . . . . 49
3.1.1.1 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.1.1.2 Some approximations (from the Random Walk) . . . . . . . . . . . . . . . . 51
3.1.2 Brownian Motion: Continuous Time Representation. . . . . . . . . . . . . . . . . . . . 51
3.1.2.1 Itô’s lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.1.2.2 Bellman Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.1.2.2.1 Stationary problem: . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.1.2.2.2 Non-Stationary Problem: . . . . . . . . . . . . . . . . . . . . . . . . 54
3.1.2.2.3 What makes Brownian motion so special? . . . . . . . . . . . . . . . 54
3.1.3 Constraints and Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.1.3.1 Absorbing Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.1.3.2 Reflecting Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.1.3.3 Resetting Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.1.3.4 Shifting Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.1.4 Distributions and paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.1.5 Control problem: defining optimal barriers . . . . . . . . . . . . . . . . . . . . . . . . 60
3.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.2.1 Target Zones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.2.1.1 The Differential Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.2.1.2 Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.2.1.3 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.3 Cochrane-Longstaff-Santa Clara . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.3.1 Evolution of the Share . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.3.2 Solving for Stock Prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.3.3 Problem Set: Numerical Cochrane, Longstaff, and Santa-Clara. . . . . . . . . . . . . . 67
3.4 Sticky prices models in continuous time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.4.1 Menu Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.4.2 Observation Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

4 Balance of Payment Crises in a Simple Monetary Model 73


4.1 Balance of Payments Crises: First Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.1.1 Stochastic Fiscal Reform and Crises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.1.2 Model without Debt Constraints: Optimal Monetary Policy . . . . . . . . . . . . . . . 75
4.1.2.1 Environment and Consumers . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.1.2.2 Government . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.1.2.3 Central Bank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.1.2.4 Optimal Monetary and Exchange Rate Policy . . . . . . . . . . . . . . . . . 78
4.1.2.4.1 Flexible exchange rate. . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.1.2.4.2 Optimal interest rate path . . . . . . . . . . . . . . . . . . . . . . . 79
4.1.2.4.3 Solution: Formal Derivation . . . . . . . . . . . . . . . . . . . . . . 81
4.1.3 Model with Debt Constraints: Balance of payments crisis . . . . . . . . . . . . . . . . 82
4.1.3.1 Optimal Monetary and Exchange Rate Policy . . . . . . . . . . . . . . . . . 82
4.1.3.1.1 Solution: A heuristic approach . . . . . . . . . . . . . . . . . . . . . 83
4.1.3.1.2 Solution: Formal Derivation . . . . . . . . . . . . . . . . . . . . . . 84
4.1.4 Sequence of Stabilization Programs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.1.5 Solution for the Money in the Utility model. . . . . . . . . . . . . . . . . . . . . . . . 88
4.2 Balance of Payments Crises: Second Generation . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.3 Balance of Payments Crises: Third Generation . . . . . . . . . . . . . . . . . . . . . . . . . . 91

5 Identification in Macroeconomics: Problem 95


5.1 Problems and Biases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.1.1 Simultaneous equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.1.2 Omitted variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.1.3 Error-in-variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.2 Lack of Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.2.1 General set-up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.3 Standard solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.3.1 Parameter Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.3.1.1 Exclusion Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.3.1.1.1 Contemporaneous coefficients: Assuming the problem away . . . . . 103
5.3.1.1.2 Exogenous Variables: Indirect Least Squares . . . . . . . . . . . . . 104
5.3.1.1.3 Instrumental Variables . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.3.1.2 Long Run Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.3.2 Variance Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.3.2.1 Near Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.3.2.2 Relative variance restriction . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.3.3 Sign Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.3.4 Reversed Regressions and “Bounds” . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

6 Identification through Heteroskedasticity: Theory. 113


6.1 Preliminary Intuition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.2 Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.2.1 Identification under two regimes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.2.2 Identification under more than two regimes. . . . . . . . . . . . . . . . . . . . . . . . . 117
6.3 Identification with common shocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.3.1 Related literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.4 Consistency under misspecification of the heteroskedasticity. . . . . . . . . . . . . . . . . . . . 121
6.4.1 Misspecification of the regime windows. . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.4.2 Under-specified number of regimes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

7 Purchasing Power Parity: Empirical Issues. 129


7.1 Some small history of PPP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.2 Theoretical remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.2.1 A Small Open Economy PPP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.2.2 The Dependent Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
7.3 Early empirical discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.3.1 Directly testing PPP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.3.2 Tests based on the real exchange rate . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.3.2.1 AR(1) distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.3.2.2 More data! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
7.3.3 Cointegration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.3.4 Panel Data tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.4 Current discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.4.1 Aggregation Bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
7.4.1.1 Some theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
7.4.1.2 What they find? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
7.4.2 The critique! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
7.4.2.1 How does the data look? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
7.4.2.2 Micro regression and Error-in-variables . . . . . . . . . . . . . . . . . . . . . 144
7.4.3 Who is right? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.4.4 Using aggregate data to correct the errors in variables. . . . . . . . . . . . . . . . . . . 146
7.4.4.1 Errors-in-variables and Common shocks . . . . . . . . . . . . . . . . . . . . . 147
7.4.4.2 Error-in-variables at both aggregate and individual level . . . . . . . . . . . . 149
7.4.5 Long and short differences. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
7.4.5.1 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
7.4.5.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Preface

I started writing these notes while I was visiting the Graduate Institute of International Studies in Genève in 2004. I began with the section on identification in macroeconomics; after that, I had far too much free time (they gave me tenure), and I have continued the notes throughout the years while visiting PUC in Rio, the University of Indiana, the University of Wisconsin at Madison, the Bank of England, the European Central Bank, the Inter-American Development Bank, Universidad Los Andes in Bogota, and now the Kiel Institute. I thank all of them for their tremendous hospitality and for motivating me to organize my thoughts in this area.
Before starting, it is absolutely crucial to begin with a disclaimer. Most people have disclaimers and I have never been able to write one. So, here it goes... Although I do not work at a central bank or a multilateral organization such as the IDB, IMF or WB, my opinions do not reflect the views of those organizations, nor their board members, nor their staff members, nor their respective significant others, nor their pets either. Just in case you were wondering.
Now turning to more serious issues, there are three important characteristics that define these notes. First, there is, probably, a continuum of mistakes, especially because the notes were written in Spanish and then translated into English by someone who has a very limited knowledge of both languages, i.e., me. Second, these notes try to summarize an extremely extensive literature. I cannot cite everybody who deserves to be cited. The main reason is that I cannot type all those citations in BibTeX. That is quite embarrassing because a lot of them are actually colleagues who are still kicking around. If by any chance you think that I have forgotten to cite 24 of your papers, I apologize, and I can only offer you these words of profound sympathy: “you are in very good company”. I promise you, though, that I will not forget a single one of my papers. So, at least someone will be well represented.

Chapter 5

Identification in Macroeconomics: Problem

The problem of identification in macroeconomics is one of the most studied issues in theoretical and applied work. Problems of simultaneous equations, omitted variables, and errors in variables have motivated a large econometric literature. In these notes, my objective is to describe how these problems affect the estimation of macro models and to study some of the new methodologies that have been developed to solve them. We, as a profession, are still far from having a satisfactory answer, but we are clearly moving in the right direction.
This chapter describes the three problems we are interested in analyzing. First, we discuss the biases that arise in each case and their properties. Second, we reinterpret the biases by relating the problem of recovering the “true” coefficients from the data (i.e., the lack of identification) to these three problems. This puts all the problems within the same framework. The third section analyzes the standard solutions that the literature has offered to these problems. Its purpose is to provide a concise summary of the “favorite” techniques within a single framework; it is by no means intended as a survey of the literature.

5.1 Problems and Biases


Economic data suffer from several problems. Indeed, if you have worked on empirical projects, the list of problems you have faced seems infinite; and it probably is. There are problems of simultaneous equations, of omitted variables, of aggregation, of noisy data, of truncated variables, etc. In these brief notes I would like to concentrate on three main problems: simultaneous equations, omitted variables, and error-in-variables. One reason is that I consider them the most important problems; another is that I find them the most interesting ones.¹
¹ Although lately I have been working on aggregation issues, so the next version of these notes will probably claim that aggregation is also crucial. In any case, the choice of topics reflects my preferences, and not the aggregate opinion of the profession.


5.1.1 Simultaneous equations

The problem of simultaneous equations is perhaps one of the most common issues we face in applied work. In fact, it is the preferred objection of any referee who wants to protect his/her personal agenda and reject any (of my) papers. In any case, it is also common in practice. For instance, estimating the slope of the demand curve when the researcher does not know whether the observed price-quantity pairs result from shifts in the supply schedule or in the demand curve is one of the benchmark models in most econometrics classes. The problem is more general than this. For example, estimating the central bank reaction function, the fiscal policy reaction function, savings and investment behavior, the linkages among asset prices or among countries (contagion), the choice of education and wages, participation in the labor market and taxes, the impact of the quality of institutions on income, or the Q theory are just a few of the questions where endogeneity is a crucial issue.
In this sub-section, we study the general problem of simultaneous equations in the standard supply and demand framework. Assume that we are interested in estimating the following relationship:

$$y_t = \alpha x_t + \varepsilon_t \tag{5.1}$$

For simplicity, let us assume that the two variables have mean zero (so there is no constant in the regression), and that both are univariate with dimensions $T \times 1$. It is well known that the OLS estimate of the coefficient takes the form

$$\hat{\alpha}_{OLS} = (x_t' x_t)^{-1} (x_t' y_t).$$
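As a minimal sketch, with a single regressor and no constant the formula above reduces to a ratio of two sums. The function name `ols_slope` and the toy data are illustrative assumptions, not part of the notes.

```python
from typing import Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """No-constant OLS slope (x'x)^{-1} (x'y) for a single regressor."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    xx = sum(xi * xi for xi in x)               # x'x
    xy = sum(xi * yi for xi, yi in zip(x, y))   # x'y
    if xx == 0:
        raise ValueError("x'x is zero; the slope is not defined")
    return xy / xx


# Toy data (hypothetical): y is exactly 2 * x, so the slope is 2.
print(ols_slope([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 2.0
```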

The problem of simultaneous equations, however, is that the variable $x$ also depends on $y$. Assume that they satisfy the following relationship:

$$x_t = \beta y_t + \eta_t. \tag{5.2}$$

Equations (5.1) and (5.2) form a system of equations that is known as the structural model:

$$\begin{aligned} y_t &= \alpha x_t + \varepsilon_t \\ x_t &= \beta y_t + \eta_t \end{aligned} \tag{Structural Model}$$

where $\varepsilon_t$ and $\eta_t$ are known as the structural shocks. In most macro applications the following moments are usually assumed:

Assumption 10 Assume that the structural errors have mean zero,

$$E(\varepsilon_t) = 0, \quad E(\eta_t) = 0,$$

finite variance,

$$E(\varepsilon_t' \varepsilon_t) = \sigma_\varepsilon^2, \quad E(\eta_t' \eta_t) = \sigma_\eta^2,$$

and are uncorrelated,

$$E(\varepsilon_t' \eta_t) = 0.$$

The assumptions imply that unconditionally the errors have mean zero, that their variances are finite, and that their covariance is zero. This last assumption is not required, but it is used in most macroeconomic models. The main reason is that we would like to think of the structural shocks as economically meaningful innovations, such as demand versus supply shocks, nominal versus real shocks, or permanent versus transitory shocks. In general, it is easier to understand the implications of these shocks when they are treated as independent or orthogonal. We will relax this assumption later. For the moment, it is innocuous to our discussion, and we keep it because it simplifies the algebra tremendously.

Additionally, this covariance assumption implies that all the joint co-movement between the observed variables ($x$ and $y$) is explained by the endogenous coefficients ($\alpha$ and $\beta$) and not by the correlation in their disturbances.²
The structural model implies that the observed variables are given by what is known as the reduced form

$$\begin{aligned} y_t &= \frac{1}{1-\alpha\beta}\left(\alpha\eta_t + \varepsilon_t\right) \\ x_t &= \frac{1}{1-\alpha\beta}\left(\eta_t + \beta\varepsilon_t\right) \end{aligned} \tag{Reduced Form Model}$$

where, in order to ensure that the observed variables have finite variance, we impose the following assumption:

Assumption 11 Assume that the structural parameters satisfy

$$|\alpha\beta| < 1.$$
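For completeness, the reduced form can be obtained by substituting each structural equation into the other and solving. The following is only a sketch of the intermediate algebra, using the symbols already defined above.

```latex
\begin{aligned}
y_t &= \alpha(\beta y_t + \eta_t) + \varepsilon_t
  \;\Longrightarrow\; (1-\alpha\beta)\,y_t = \alpha\eta_t + \varepsilon_t, \\
x_t &= \beta(\alpha x_t + \varepsilon_t) + \eta_t
  \;\Longrightarrow\; (1-\alpha\beta)\,x_t = \eta_t + \beta\varepsilon_t .
\end{aligned}
```

Dividing both lines by $1-\alpha\beta$, which is nonzero under Assumption 11, gives the two reduced-form equations.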

Under Assumptions 10 and 11 there are only three relevant moments that can be estimated in the sample: the variance of $y$, the variance of $x$, and their covariance. If the distributions are not normal, higher-order moments can also be estimated and are relevant, but those issues are left for later, mainly because identification becomes a much easier problem to solve when the distributions are not normal. We would like to put ourselves in the toughest of all positions and discuss how to solve the problem there. Furthermore, the assumption that the variables are normal, so that their sum is also normal, is standard in macro applications. The moments are:

$$\mathrm{Var}(y_t) = \frac{1}{(1-\alpha\beta)^2}\left(\alpha^2\sigma_\eta^2 + \sigma_\varepsilon^2\right) \tag{5.3a}$$

$$\mathrm{Covar}(x_t, y_t) = \frac{1}{(1-\alpha\beta)^2}\left(\alpha\sigma_\eta^2 + \beta\sigma_\varepsilon^2\right) \tag{5.3b}$$

$$\mathrm{Var}(x_t) = \frac{1}{(1-\alpha\beta)^2}\left(\sigma_\eta^2 + \beta^2\sigma_\varepsilon^2\right) \tag{5.3c}$$
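The three moments can be checked numerically. The sketch below draws normal structural shocks, builds $y_t$ and $x_t$ from the reduced form, and compares sample second moments with (5.3a) to (5.3c). The parameter values, the seed, and the function names are assumptions chosen only for illustration.

```python
import random


def simulate_moments(alpha, beta, sd_eps, sd_eta, n=100_000, seed=0):
    """Draw from the reduced form and return sample (Var y, Covar xy, Var x)."""
    rng = random.Random(seed)
    k = 1.0 / (1.0 - alpha * beta)
    syy = sxy = sxx = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, sd_eps)   # structural shock of the y equation
        eta = rng.gauss(0.0, sd_eta)   # structural shock of the x equation
        y = k * (alpha * eta + eps)
        x = k * (eta + beta * eps)
        syy += y * y
        sxy += x * y
        sxx += x * x
    # The variables have mean zero by construction, so raw second moments suffice.
    return syy / n, sxy / n, sxx / n


def theoretical_moments(alpha, beta, sd_eps, sd_eta):
    """Population moments (5.3a), (5.3b), (5.3c), in that order."""
    scale = 1.0 / (1.0 - alpha * beta) ** 2
    var_eps, var_eta = sd_eps ** 2, sd_eta ** 2
    return (
        scale * (alpha ** 2 * var_eta + var_eps),
        scale * (alpha * var_eta + beta * var_eps),
        scale * (var_eta + beta ** 2 * var_eps),
    )


# Hypothetical parameter values satisfying |alpha * beta| < 1.
params = dict(alpha=0.5, beta=0.5, sd_eps=1.0, sd_eta=1.0)
print("sample:     ", simulate_moments(**params))
print("theoretical:", theoretical_moments(**params))
```

With a large number of draws the two triples should agree up to sampling error.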

The estimate from equation (5.1) is:

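$$\hat{\alpha}_{OLS} = \frac{\mathrm{Covar}(x_t, y_t)}{\mathrm{Var}(x_t)} = \alpha + \beta\,(1-\alpha\beta)\,\frac{\sigma_\varepsilon^2}{\sigma_\eta^2 + \beta^2\sigma_\varepsilon^2} \tag{5.4}$$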

Equation (5.4) shows that the OLS estimate has an additional term, which is the bias introduced by simultaneous equations. Some properties of this bias are worth discussing. First, under the assumption that $|\alpha\beta| < 1$, the sign of the bias is the sign of $\beta$. So, if $x$ is a decreasing function of $y$ ($\beta < 0$), the OLS estimate is smaller than the true coefficient (downward biased), while the converse occurs if $\beta$ is positive. Notice that nothing prevents the bias from reversing the sign of the coefficient. In other words, the fact that $\alpha$ is positive (for instance) does not necessarily force the OLS estimate to be positive. We will see that this is not the case with some of the other problems discussed below.
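The sign reversal can be illustrated with population moments alone. The sketch below computes the OLS limit as the ratio of (5.3b) to (5.3c) and, separately, through the bias term in (5.4). The parameter values are hypothetical, chosen so that $\alpha > 0$, $\beta < 0$ and $|\alpha\beta| < 1$.

```python
def ols_plim(alpha, beta, var_eps, var_eta):
    """Population OLS coefficient Covar(x, y) / Var(x) from (5.3b) and (5.3c)."""
    # The common factor 1 / (1 - alpha*beta)^2 cancels in the ratio.
    return (alpha * var_eta + beta * var_eps) / (var_eta + beta ** 2 * var_eps)


def simultaneity_bias(alpha, beta, var_eps, var_eta):
    """Bias term of equation (5.4)."""
    return beta * (1.0 - alpha * beta) * var_eps / (var_eta + beta ** 2 * var_eps)


# Hypothetical parameters: alpha > 0, beta < 0, |alpha * beta| = 0.4 < 1.
alpha, beta, var_eps, var_eta = 0.2, -2.0, 1.0, 1.0

via_moments = ols_plim(alpha, beta, var_eps, var_eta)
via_formula = alpha + simultaneity_bias(alpha, beta, var_eps, var_eta)

print(via_moments)  # negative, although the true alpha is positive
print(via_formula)  # same value, obtained through (5.4)
```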
² This does not imply that the disturbances are not correlated in most applications. But any such correlation can be transformed into a setup similar to the one studied here.



Second, the bias is exactly zero if $\beta$ is zero. Obviously, assuming that $\beta$ is zero and that the covariance of the structural shocks is also zero eliminates the problem of simultaneous equations: it is just assuming the problem away.³ In any case, it is important to highlight this case because some of the solutions that are widely used in the literature are in fact making this assumption.
Third, notice that there is another condition under which the bias goes to zero:

$$\frac{\sigma_\eta^2}{\sigma_\varepsilon^2} \to \infty,$$

which can happen if the innovations to the first equation are zero ($\sigma_\varepsilon^2 = 0$) or if the innovations to the second equation are infinitely large ($\sigma_\eta^2 \to \infty$), which is the case when the variables are integrated but also cointegrated.
Finally, the bias is small (and goes toward zero) when $\sigma_\eta^2 \gg \sigma_\varepsilon^2$. This is known in the literature as near identification, and we will return to this issue in the next chapter.
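To make the near-identification condition concrete, the following sketch evaluates the bias term of (5.4) for increasing values of the variance ratio $\sigma_\eta^2 / \sigma_\varepsilon^2$. The coefficients and the grid of ratios are hypothetical.

```python
def simultaneity_bias(alpha, beta, var_eps, var_eta):
    """Bias term of equation (5.4)."""
    return beta * (1.0 - alpha * beta) * var_eps / (var_eta + beta ** 2 * var_eps)


def bias_path(alpha, beta, var_eps, ratios):
    """Bias for each variance ratio var_eta / var_eps in `ratios`."""
    return [simultaneity_bias(alpha, beta, var_eps, r * var_eps) for r in ratios]


# Hypothetical coefficients and variance ratios.
ratios = (1.0, 10.0, 100.0, 1000.0)
for ratio, bias in zip(ratios, bias_path(0.5, 0.5, 1.0, ratios)):
    print(f"var_eta / var_eps = {ratio:7.1f}   bias = {bias:.5f}")
```

The bias shrinks roughly in proportion to the inverse of the ratio, which is why a very volatile $\eta_t$ makes OLS nearly consistent for $\alpha$.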

5.1.2 Omitted variables

Omitted variable bias is perhaps the second most important issue afflicting applied macro work. Because it is almost impossible to control for all the relevant variables, most of our specifications have some degree of misspecification. Obviously, this should not be taken as a justification for never doing applied work. On the contrary, as in the case of endogeneity, this problem should just make our claims less ambitious.
One of the most important and most studied omitted variable problems is the estimation of the return to one more year of schooling. The idea is that there exists an unobservable variable, the individual's ability, that is correlated both with the decision to participate in the school system and with the salary received. It could be argued that an individual of higher ability would be willing to study more years and, for the same level of education, might receive a higher wage.
As before, we study a simplified model to highlight the problems of estimation. Assume that we are interested in estimating the following relationship:

$$y_t = \alpha x_t + \varepsilon_t$$

but in this case, the true model is the following:

$$\begin{aligned} y_t &= \alpha x_t + \gamma z_t + \varepsilon_t \\ x_t &= z_t + \eta_t \end{aligned} \tag{Omitted Variable Model}$$

where $\varepsilon_t$ and $\eta_t$ are the structural shocks, and $z_t$ is an unobservable omitted variable. The following moments are usually assumed:

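Assumption 12 Assume that the structural errors have mean zero,

$$E(\varepsilon_t) = 0, \quad E(\eta_t) = 0, \quad E(z_t) = 0,$$

finite variance,

$$E(\varepsilon_t' \varepsilon_t) = \sigma_\varepsilon^2, \quad E(\eta_t' \eta_t) = \sigma_\eta^2, \quad E(z_t' z_t) = \sigma_z^2,$$

and are uncorrelated,

$$E(\varepsilon_t' \eta_t) = 0, \quad E(\varepsilon_t' z_t) = 0, \quad E(\eta_t' z_t) = 0.$$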
³ By the way, this is indeed very common: denial is the most important source of happiness. If you have a problem, a solution is to assume that you do not have one.



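These are equivalent to the assumptions made before (Assumption 10). The reduced form is the following:

$$\begin{aligned} y_t &= (\alpha + \gamma)\, z_t + \alpha\eta_t + \varepsilon_t \\ x_t &= z_t + \eta_t \end{aligned}$$

As before, there are only three relevant moments that can be estimated in the data:

$$\mathrm{Var}(y_t) = (\alpha + \gamma)^2 \sigma_z^2 + \alpha^2\sigma_\eta^2 + \sigma_\varepsilon^2 \tag{5.5a}$$

$$\mathrm{Covar}(x_t, y_t) = (\alpha + \gamma)\, \sigma_z^2 + \alpha\sigma_\eta^2 \tag{5.5b}$$

$$\mathrm{Var}(x_t) = \sigma_z^2 + \sigma_\eta^2 \tag{5.5c}$$

The OLS estimate is:

$$\hat{\alpha}_{OLS} = \alpha + \gamma\,\frac{\sigma_z^2}{\sigma_\eta^2 + \sigma_z^2} \tag{5.6}$$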

Equation (5.6) shows the bias introduced by omitted variables. The remarks here parallel those for the simultaneous equations problem. First, the sign of the bias is the sign of $\gamma$. As before, nothing prevents the bias from reversing the sign of the coefficient, and if $\gamma$ is zero, the omitted variable does not enter the $y$ equation and hence the bias disappears.
Second, if

$$\frac{\sigma_\eta^2}{\sigma_z^2} \to \infty$$

the bias goes to zero. Finally, the bias is small when $\sigma_\eta^2 \gg \sigma_z^2$, which is exactly the same condition as before for near identification.
This parallel will continue to be present, and part of the purpose of this section is to show that these different problems are indeed all related in some form.
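The three remarks on the omitted variable bias can be checked with a few lines of arithmetic. The sketch below computes the OLS limit as the ratio of (5.5b) to (5.5c) and the bias term of (5.6); the three parameter sets are hypothetical.

```python
def omitted_variable_ols(alpha, gamma, var_z, var_eta):
    """Population OLS coefficient Covar(x, y) / Var(x) from (5.5b) and (5.5c)."""
    covar_xy = (alpha + gamma) * var_z + alpha * var_eta
    var_x = var_z + var_eta
    return covar_xy / var_x


def omitted_variable_bias(gamma, var_z, var_eta):
    """Bias term of equation (5.6)."""
    return gamma * var_z / (var_eta + var_z)


# Hypothetical cases: (alpha, gamma, var_z, var_eta).
cases = [
    (1.0, 0.5, 1.0, 1.0),     # positive gamma: upward bias
    (0.5, -2.0, 1.0, 1.0),    # negative gamma large enough to flip the sign
    (1.0, 0.5, 1.0, 100.0),   # var_eta >> var_z: bias close to zero
]
for alpha, gamma, var_z, var_eta in cases:
    estimate = omitted_variable_ols(alpha, gamma, var_z, var_eta)
    bias = omitted_variable_bias(gamma, var_z, var_eta)
    print(f"alpha = {alpha:4.1f}   OLS limit = {estimate:8.5f}   bias = {bias:8.5f}")
```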

5.1.3 Error-in-variables

Finally, let us study the problem of errors in variables. Assume we are interested in estimating the exact same relationship but that the true model is

$$\begin{aligned} y_t &= \alpha x_t^* + \varepsilon_t \\ x_t &= x_t^* + \eta_t \end{aligned} \tag{Error-in-variables Model}$$

where $x_t^*$ is the true variable, which cannot be observed. We only observe a noisy and unbiased measure of it ($x_t$). As before, $\varepsilon_t$ and $\eta_t$ are the structural shocks, and the following moments are usually assumed:

Assumption 13 Assume that the structural errors have mean zero,

$$E(\varepsilon_t) = 0, \quad E(\eta_t) = 0, \quad E(x_t^*) = 0,$$

finite variance,

$$E(\varepsilon_t' \varepsilon_t) = \sigma_\varepsilon^2, \quad E(\eta_t' \eta_t) = \sigma_\eta^2, \quad E(x_t^{*\prime} x_t^*) = \sigma_{x^*}^2,$$

and are uncorrelated,

$$E(\varepsilon_t' \eta_t) = 0, \quad E(\varepsilon_t' x_t^*) = 0, \quad E(\eta_t' x_t^*) = 0.$$

These are the conditions that make this a ”classical” error-in-variables problem. The non-classical error-
in-variables produces different implications to the ones discussed here. These are important extensions, but
beyond out scope.
Assumption 13 is equivalent to the assumptions made in the previous two sub-sections. As before, there
are only three relevant moments that can be estimated in the data:

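$$\mathrm{Var}(y_t) = \alpha^2\sigma_{x^*}^2 + \sigma_\varepsilon^2 \tag{5.7a}$$
$$\mathrm{Cov}(x_t,y_t) = \alpha\sigma_{x^*}^2 \tag{5.7b}$$
$$\mathrm{Var}(x_t) = \sigma_{x^*}^2 + \sigma_\eta^2 \tag{5.7c}$$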

The OLS estimate is:


$$\hat\alpha_{OLS} = \alpha - \alpha\,\frac{\sigma_\eta^2}{\sigma_{x^*}^2+\sigma_\eta^2} \tag{5.8}$$

Equation (5.8) shows the bias introduced by error-in-variables. Although the form of equation (5.8) is similar to (5.6), their properties are not exactly the same. First, the sign of the bias depends on the coefficient in the equation to be estimated: the bias is negative if the coefficient is positive, and positive if the coefficient is negative. Second, because the ratio of the variances in the right-hand term is always smaller than one, the bias is always smaller in absolute value than |α|. This implies that the bias (in this case) can never change the sign of the coefficient: it moves the coefficient toward zero without ever reaching it.[4]
The only circumstance in which the bias is zero is when[5]

$$\frac{\sigma_{x^*}^2}{\sigma_\eta^2} \to \infty.$$

Finally, as before, the bias is small when $\sigma_{x^*}^2 \gg \sigma_\eta^2$, which is exactly the same condition implied by near identification.
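The attenuation result in equation (5.8) can be illustrated with a small simulation. This is a minimal sketch under Assumption 13, with normally distributed shocks and arbitrary parameter values chosen only for illustration; the function names are hypothetical.

```python
import random


def simulate_eiv_ols(alpha=2.0, s2_xstar=1.0, s2_eta=0.5, s2_eps=1.0,
                     T=100_000, seed=0):
    """OLS slope of y on the noisy regressor x (no constant, all means are zero)."""
    rng = random.Random(seed)
    s_xy = s_xx = 0.0
    for _ in range(T):
        x_star = rng.gauss(0.0, s2_xstar ** 0.5)      # unobserved true regressor
        x = x_star + rng.gauss(0.0, s2_eta ** 0.5)    # observed, with measurement error
        y = alpha * x_star + rng.gauss(0.0, s2_eps ** 0.5)
        s_xy += x * y
        s_xx += x * x
    return s_xy / s_xx


def plim_from_5_8(alpha, s2_xstar, s2_eta):
    """Probability limit of OLS according to equation (5.8)."""
    return alpha - alpha * s2_eta / (s2_xstar + s2_eta)


estimate = simulate_eiv_ols()
print(estimate, plim_from_5_8(2.0, 1.0, 0.5))
```

The simulated slope should lie close to the value implied by (5.8) and strictly between zero and the true α, which is the "toward zero" property described above.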

5.2 Lack of Identification


The previous section discussed the biases introduced by the three problems under analysis. There is a deeper issue that we would like to highlight in this section: the lack of identification.
The previous section says that OLS estimates are biased, which only tells us that OLS is not the appropriate estimation technique. If a different procedure allowed us to recover the true coefficients from the data, the issues highlighted in the previous section would be just a curiosity, and mostly a warning: “Beware of OLS”. But this is not the case. The main problem when any of these three problems is present in the data is that the true coefficients cannot be recovered from the data with any procedure, without further assumptions. This means that, absent such assumptions, no technique or methodology can solve the problem of estimating equation (5.1).
This is easily seen by counting the number of equations, or moments, that can be computed in the data, and comparing it to the number of parameters that describe them. The way we have set up all three problems, there are only three moments that can be computed in the data: the variance of y, the variance of x, and their covariance. These moments are given in equations (5.3), (5.5), and (5.7) for the cases of simultaneous equations, omitted variables, and error-in-variables problems, respectively.

[4] This is known in the earlier literature as the “iron law” of econometrics. See Hausman (19XX). Obviously this is the case in the linear bivariate setting (as the one described here). If the model is non-linear or there are more regressors then the bias can go in any direction.
[5] There is another circumstance: α = 0, but this is not an interesting case.
In the case of simultaneous equations there are four parameters: α, β, $\sigma_\varepsilon^2$, and $\sigma_\eta^2$; that is, three equations and four unknowns. For omitted variables we have five parameters: α, γ, $\sigma_\varepsilon^2$, $\sigma_\eta^2$, and $\sigma_z^2$; three equations and five unknowns. Finally, for the error-in-variables problem we have four parameters: α, $\sigma_\varepsilon^2$, $\sigma_\eta^2$, and $\sigma_{x^*}^2$; three equations and four unknowns. In all three problems the number of parameters to be estimated is larger than the number of equations. Furthermore, not only is the number of equations smaller than the number of unknowns, but there is no linear or non-linear combination of the equations that can solve for any of the parameters, and especially not for the parameter of interest, α.
Therefore, without further assumptions there is a continuum of solutions that satisfy the sample moments. In other words, we cannot recover the true parameters from the data, which is known as an identification problem.
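The continuum of solutions can be made concrete for the error-in-variables case. The sketch below takes the three moments in (5.7) as given and, for any admissible guess of the measurement-error variance, backs out a full parameter set that reproduces them exactly. The function names and numbers are illustrative assumptions.

```python
def eiv_moments(alpha, s2_xstar, s2_eta, s2_eps):
    """Moments (5.7a)-(5.7c) implied by a parameter set."""
    var_y = alpha ** 2 * s2_xstar + s2_eps
    cov_xy = alpha * s2_xstar
    var_x = s2_xstar + s2_eta
    return var_y, cov_xy, var_x


def eiv_params_given_noise(var_y, cov_xy, var_x, s2_eta):
    """Parameters consistent with the moments for an assumed noise variance s2_eta."""
    s2_xstar = var_x - s2_eta          # from (5.7c)
    alpha = cov_xy / s2_xstar          # from (5.7b)
    s2_eps = var_y - alpha * cov_xy    # from (5.7a)
    return alpha, s2_xstar, s2_eta, s2_eps


true_params = (1.0, 1.0, 1.0, 1.0)     # alpha, s2_xstar, s2_eta, s2_eps
moments = eiv_moments(*true_params)
for guess in (0.0, 0.5, 1.0):          # each guess yields a different alpha
    candidate = eiv_params_given_noise(*moments, guess)
    print(candidate, eiv_moments(*candidate))
```

Every candidate matches the same three moments, yet the implied α differs across guesses, so the data alone cannot choose among them.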

5.2.1 General set-up

The problem of identification described before can be generalized. In this section we discuss the exact same issues and introduce the standard terminology of systems of equations, in particular the rank and order conditions. We will come back to these concepts several times, and therefore this is a good time to develop them.
Assume the model to be estimated is

$$y_t = \alpha x_t + \varepsilon_t$$
$$E(\varepsilon_t' x_t) \neq 0$$

In this model, the OLS estimate is given by

$$\hat\alpha_{OLS} = \alpha + \frac{E(\varepsilon_t' x_t)}{\mathrm{Var}(x_t)}.$$

This again indicates that the bias comes from the fact that the right hand side variable is correlated with the residual.
In this model, the identification problem is due to the same aspect as in the previous examples. In the data we can only compute three moments, $\mathrm{Var}(y_t)$, $\mathrm{Var}(x_t)$, and $\mathrm{Cov}(y_t, x_t)$, but there are four parameters: α, $\sigma_\varepsilon^2$, $\mathrm{Var}(x_t)$, and $E(\varepsilon_t' x_t)$.
In the standard literature on systems of equations, when the number of equations is smaller than the number of unknowns it is said that the system of equations does not satisfy the order condition. As should be expected, a system of equations where the order condition is not satisfied has no hope of actually recovering the parameters without further assumptions, or equations.
It is important to highlight that the fact that the order condition is satisfied does not guarantee that the system of equations has a solution. In other words, we can have enough equations, but they may not be independent. The independence of the equations is a condition known as the rank condition. It states that the number of independent equations has to be at least as large as the number of unknowns. The name “rank” condition comes from the literature on linear systems of equations, where the independence of the equations is computed using the rank of the matrix describing the system. Most of the systems of equations we face when estimating parameters involve non-linear relationships, and checking their independence is much harder than just calculating the column rank of a matrix. Nevertheless, the econometric literature has adopted these definitions since the seminal contribution by (Fisher 1976).

The purpose of this section is to show (and try to convince the reader) that all these problems can be described as part of a more general issue in which the number of coefficients that have to be estimated is larger than the number of equations or moments that can be computed in the data. The next section deals with the methods that have been proposed in the literature to solve this problem.

5.3 Standard solutions


In this section, we study the standard methods that have been proposed to solve the problem of identification.
Because most of the analysis can be done in the simultaneous equations set up we concentrate entirely on
this framework.
To clarify the intuition, let us remember the set up that we have been using. Consider the following
standard problem of simultaneous equations:

$$y_t = \alpha x_t + \varepsilon_t, \tag{5.9}$$
$$x_t = \beta y_t + \eta_t, \tag{5.10}$$

where (5.9) is the demand equation, (5.10) is the supply equation, $y_t$ and $x_t$ are the observed price and quantity, and $\varepsilon_t$ and $\eta_t$ are the structural shocks. The parameters of interest are α, β, and the variances of the shocks: $\sigma_\varepsilon^2$, $\sigma_\eta^2$. For the moment, assume that the structural shocks are not correlated: $\sigma_{\varepsilon\eta} = 0$. This assumption is relaxed below.
It is well known that if α and β are different from zero, equations (5.9) and (5.10) cannot be consistently estimated without further information. Actually, one can only estimate the covariance matrix of the reduced form ($\hat\Omega$), given by

$$\hat\Omega = \frac{1}{(1-\alpha\beta)^2}\begin{bmatrix} \alpha^2\sigma_\eta^2+\sigma_\varepsilon^2 & \alpha\sigma_\eta^2+\beta\sigma_\varepsilon^2 \\ \cdot & \sigma_\eta^2+\beta^2\sigma_\varepsilon^2 \end{bmatrix}.$$

The problem of identification is that the covariance matrix only provides three moments (the variance of $y_t$, the variance of $x_t$, and the covariance between $y_t$ and $x_t$) while there are four unknowns: α, β, $\sigma_\eta^2$, $\sigma_\varepsilon^2$.
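To see the lack of identification numerically, the sketch below builds the reduced-form covariance matrix from one parameter set and then, for an arbitrary assumed value of β, backs out a different parameter set that reproduces the same three moments. The helper names and the numbers are illustrative assumptions; the algebra only rearranges the entries of the reduced-form covariance matrix.

```python
def reduced_form_cov(alpha, beta, s2_eps, s2_eta):
    """Distinct entries (w11, w12, w22) of the reduced-form covariance matrix."""
    d = (1.0 - alpha * beta) ** 2
    w11 = (alpha ** 2 * s2_eta + s2_eps) / d   # Var(y)
    w12 = (alpha * s2_eta + beta * s2_eps) / d  # Cov(x, y)
    w22 = (s2_eta + beta ** 2 * s2_eps) / d     # Var(x)
    return w11, w12, w22


def params_given_beta(w11, w12, w22, beta):
    """Parameter set that matches the three moments for an assumed beta."""
    alpha = (w12 - beta * w11) / (w22 - beta * w12)
    scale = 1.0 - alpha * beta
    s2_eta = scale * (w22 - beta * w12)
    s2_eps = scale * (w11 - alpha * w12)
    return alpha, beta, s2_eps, s2_eta


omega = reduced_form_cov(alpha=-0.5, beta=0.8, s2_eps=1.0, s2_eta=1.0)
for assumed_beta in (0.0, 0.3, 0.8):
    candidate = params_given_beta(*omega, assumed_beta)
    print(candidate, reduced_form_cov(*candidate))
```

Each assumed β delivers positive variances and the same covariance matrix, but a different α. Only the last value (β = 0.8) recovers the parameters used to generate the moments.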
The literature has solved the problem of identification by imposing additional parameter constraints. This amounts to creating or assuming additional equations for the system we are studying. These restrictions can be divided into the following classes: parameter restrictions, variance restrictions, sign restrictions, and reverse regressions. In this section we summarize the implications of these assumptions.
The objective of this section, therefore, is to describe briefly most of the assumptions that have been used in the literature. This is by no means an exhaustive survey; it is just a summary of some of the most used techniques. As will become clear, I will oversimplify what each of the methodologies does, and indeed, I will present a critical perspective on all of them. It is important to mention that, even though I will address the methods through this “critical lens”, these assumptions have proven to be extremely useful in applied work. We have learned a great deal by using them, and several of the agreements we have in the profession are the outcome of empirical studies using one or more of these techniques. There are other economic problems, however, in which none of them can be rationalized and we are still in search of the answers.

5.3.1 Parameter Restrictions

By far, the assumption that has been most extensively used in the literature is parameter restrictions in the form of exclusion restrictions or long run restrictions. For instance, (i) when we estimate VARs and compute a Cholesky decomposition to estimate the structural equations, we are indeed using an exclusion restriction that is implied by the ordering in the VAR; (ii) when we solve the problem by using instrumental variables, we are imposing an exclusion restriction; (iii) when we solve the problem of error-in-variables by using lags, we are using an exclusion restriction, etc.

5.3.1.1 Exclusion Restrictions

5.3.1.1.1 Contemporaneous coefficients: Assuming the problem away

The first type of exclusion restriction is one in which we assume that either β = 0 or α = 0. In my view this is just assuming the problem of endogeneity or omitted variables away. Said like this, the assumption does not sound that reasonable, does it? But this is exactly the implicit assumption that we are making when we use the triangular decomposition (or Cholesky decomposition) in a VAR! This is exactly the assumption implied when we claim that a certain variable is a valid instrument, etc.
The assumption indicated here implies that (let us concentrate on β = 0)

$$y_t = \alpha x_t + \varepsilon_t,$$
$$x_t = \eta_t,$$

which implies that $x_t$ is orthogonal to $\varepsilon_t$ and we can run OLS on the first equation to recover the true coefficient.
For the multivariate setup the assumptions are very similar. Assume that there are N endogenous variables, and that the contemporaneous relationship is described by the matrix A:

$$A X_t = \varepsilon_t$$

where $\varepsilon_t$ are the structural shocks, assumed to be uncorrelated with covariance matrix Σ, and where A is a dense matrix with ones on the diagonal. For example, for the bivariate case, the matrix is

$$A = \begin{bmatrix} 1 & -\alpha \\ -\beta & 1 \end{bmatrix}$$

and the structural shocks covariance matrix is

$$\Sigma = \begin{bmatrix} \sigma_\varepsilon^2 & 0 \\ 0 & \sigma_\eta^2 \end{bmatrix}$$

Returning to the multivariate setup, the reduced form model is

$$X_t = A^{-1}\varepsilon_t.$$
Notice that in this model we can compute the covariance matrix of the observed variables ($X_t$), which contributes N(N+1)/2 moments. These moments are explained by the theoretical covariance matrix given by $A^{-1}\,\Sigma\,A^{-1\prime}$, which has N variances (the elements of Σ) and N(N−1) parameters (the off-diagonal elements of A; remember that the diagonal is set equal to one). Clearly, there are more unknowns than equations, which is the identification problem we have been discussing all along. The solution requires several exclusion restrictions; let us see how many (r):

$$\frac{N(N+1)}{2} + r \geq N + N(N-1) = N^2$$
$$r \geq \frac{N(N-1)}{2}$$

Observe that we need to impose as many exclusion restrictions as the ones that will guarantee that A has zeros in the lower (or upper) triangle. The Cholesky decomposition does exactly that.
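The counting argument, and the bivariate triangular case with β = 0, can be written out as follows. This is only a sketch with hypothetical function names; the second function uses the fact that, when β = 0, the reduced-form entries are Cov(x, y) = ασ²_η and Var(x) = σ²_η.

```python
def identification_count(n):
    """Moments available, unknowns, and minimum number of restrictions for N = n."""
    moments = n * (n + 1) // 2        # distinct entries of the covariance matrix
    unknowns = n + n * (n - 1)        # N variances plus off-diagonal elements of A
    return moments, unknowns, unknowns - moments


def recover_alpha_triangular(w12, w22):
    """alpha when beta = 0 is imposed: Cov(x, y) / Var(x), i.e. the OLS slope."""
    return w12 / w22


for n in (2, 3, 4):
    print(n, identification_count(n))

# Bivariate check with beta = 0, alpha = 0.7, s2_eps = 1, s2_eta = 2 (illustrative):
alpha, s2_eta = 0.7, 2.0
print(recover_alpha_triangular(alpha * s2_eta, s2_eta))
```

The shortfall returned by `identification_count` equals N(N−1)/2, the number of zeros in one triangle of A.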

5.3.1.1.2 Exogenous Variables: Indirect Least Squares

A different set of exclusion restrictions appears when the excluded variable is exogenous rather than endogenous. Assume the model is the following:

$$y_t = \alpha x_t + \pi w_t + \varepsilon_t,$$
$$x_t = \beta y_t + \gamma w_t + \eta_t,$$

where $w_t$ is observed. We still make the same Assumption 10 in addition to

Assumption 14 Assume that the observed variable ($w_t$) has mean zero

$$E(w_t) = 0,$$

finite variance

$$E(w_t'w_t) = \sigma_w^2,$$

and is uncorrelated with all the other shocks

$$E(w_t'\varepsilon_t) = 0, \quad E(w_t'\eta_t) = 0.$$

The zero-mean assumption is innocuous. We need it in this setup to ensure that we do not have to estimate a constant term. This is obviously a simplification. The reduced form model is

$$y_t = \frac{1}{1-\alpha\beta}\left((\alpha\gamma+\pi)w_t + \varepsilon_t + \alpha\eta_t\right)$$
$$x_t = \frac{1}{1-\alpha\beta}\left((\gamma+\beta\pi)w_t + \beta\varepsilon_t + \eta_t\right)$$

Although in this model we can compute six moments (three variances, one for each observable variable, and three covariances), there are seven unknowns: three variances ($\sigma_\varepsilon^2$, $\sigma_\eta^2$, and $\sigma_w^2$) and four coefficients (α, β, γ, and π). Furthermore, there is no way of recovering even some of the coefficients.
However, it is easy to show that one exclusion restriction is enough to solve the problem of identification.

Assumption 15 Assume that π = 0.

In this case, we are assuming that the variable $w_t$ enters the second equation but does not enter the first one. The reduced form is

$$y_t = \frac{1}{1-\alpha\beta}\left(\alpha\gamma w_t + \varepsilon_t + \alpha\eta_t\right)$$
$$x_t = \frac{1}{1-\alpha\beta}\left(\gamma w_t + \beta\varepsilon_t + \eta_t\right)$$

Notice that the ratio between the coefficients on the exogenous variable identifies α. The regression coefficient of $y_t$ on $w_t$ is $\frac{\alpha\gamma}{1-\alpha\beta}$, while the coefficient of $x_t$ on $w_t$ is $\frac{\gamma}{1-\alpha\beta}$. The ratio is exactly α. This methodology was developed by (Haavelmo 1947).
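The role of the exclusion restriction in indirect least squares can be checked with the reduced-form coefficients on $w_t$. The sketch below (hypothetical function names, illustrative numbers) computes the ratio of the two coefficients with and without π = 0.

```python
def reduced_form_coeffs(alpha, beta, gamma, pi):
    """Population coefficients of y_t and x_t on w_t in the reduced form."""
    d = 1.0 - alpha * beta
    coef_y = (alpha * gamma + pi) / d
    coef_x = (gamma + beta * pi) / d
    return coef_y, coef_x


def ils_ratio(alpha, beta, gamma, pi=0.0):
    """Indirect least squares: ratio of the two reduced-form coefficients."""
    coef_y, coef_x = reduced_form_coeffs(alpha, beta, gamma, pi)
    return coef_y / coef_x


print(ils_ratio(-0.5, 0.8, 1.0))          # exclusion restriction holds
print(ils_ratio(-0.5, 0.8, 1.0, pi=0.3))  # exclusion restriction violated
```

When π = 0 the ratio equals α regardless of β and γ (as long as γ is non-zero); when π is not zero the ratio is (αγ + π)/(γ + βπ), which in general differs from α.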

5.3.1.1.3 Instrumental Variables

Instrumental variables is similar to the indirect least squares we have seen, but the required assumptions are weaker, which also explains why instrumental variables has been used so much in the literature. The setup is the following:

$$y_t = \alpha x_t + \pi w_t + \varepsilon_t,$$
$$x_t = \beta y_t + \gamma w_t + \eta_t,$$

where $w_t$ is observed and from now on will be denoted as the “instrument”. We change Assumption 10 to the following:

Assumption 16 Assume that the structural errors and the observed variable ($w_t$) have mean zero

$$E(\varepsilon_t) = 0, \quad E(\eta_t) = 0, \quad E(w_t) = 0,$$

finite variance

$$E(\varepsilon_t'\varepsilon_t) = \sigma_\varepsilon^2, \quad E(\eta_t'\eta_t) = \sigma_\eta^2, \quad E(w_t'w_t) = \sigma_w^2,$$

and the instrument is uncorrelated with the residual in the first equation:

$$E(w_t'\varepsilon_t) = 0.$$

Lemma 17 The coefficient α can be estimated consistently if and only if the shocks satisfy Assumption 16, and the exclusion restriction

$$\pi = 0$$

is imposed. Furthermore, one of the possible ways to estimate α is the following:

$$\alpha_{IV} = (w_t'x_t)^{-1}(w_t'y_t)$$

Notice that in this case the structural shocks are not required to be uncorrelated: $E(\eta_t'\varepsilon_t) \neq 0$ is allowed. Moreover, the instrument can be correlated with the residuals in the second equation, $E(w_t'\eta_t) \neq 0$. In these circumstances, even though there are fewer equations than unknowns, we can still solve the problem of estimating α. This is the beauty of the instrumental variables approach.
First, let us make clear that the number of equations is in principle not enough to solve the problem. This means that even though one of the coefficients is identified, the other coefficients cannot be recovered without further assumptions. Second, we derive the instrumental variable estimates. The reduced form is

$$y_t = \frac{1}{1-\alpha\beta}\left(\alpha\gamma w_t + \varepsilon_t + \alpha\eta_t\right)$$
$$x_t = \frac{1}{1-\alpha\beta}\left(\gamma w_t + \beta\varepsilon_t + \eta_t\right)$$

In this model there are six moments that can be computed in the sample, but there are eight unknowns: three variances ($\sigma_\varepsilon^2$, $\sigma_\eta^2$, and $\sigma_w^2$), three coefficients (α, β, and γ), and two covariances of the structural shocks ($E(\eta_t'\varepsilon_t)$ and $E(w_t'\eta_t)$). This means that not all the coefficients can be recovered from the data.
However, the amazing implication of instrumental variables is that even though this system is underidentified, in the sense that the total number of equations is smaller than the total number of unknowns, one of the parameters (actually the parameter of interest) can still be recovered from the moments.
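Notice that

$$\operatorname{plim}\frac{1}{T}w_t'x_t = \frac{\gamma}{1-\alpha\beta}\sigma_w^2 + \frac{\beta}{1-\alpha\beta}\operatorname{plim}\frac{1}{T}w_t'\varepsilon_t + \frac{1}{1-\alpha\beta}\operatorname{plim}\frac{1}{T}w_t'\eta_t = \frac{1}{1-\alpha\beta}\left[\gamma\sigma_w^2 + \operatorname{plim}\frac{1}{T}w_t'\eta_t\right]$$

and

$$\operatorname{plim}\frac{1}{T}w_t'y_t = \frac{\alpha\gamma}{1-\alpha\beta}\sigma_w^2 + \frac{1}{1-\alpha\beta}\operatorname{plim}\frac{1}{T}w_t'\varepsilon_t + \frac{\alpha}{1-\alpha\beta}\operatorname{plim}\frac{1}{T}w_t'\eta_t = \frac{\alpha}{1-\alpha\beta}\left[\gamma\sigma_w^2 + \operatorname{plim}\frac{1}{T}w_t'\eta_t\right]$$

where the second equality in each line uses $\operatorname{plim}\frac{1}{T}w_t'\varepsilon_t = 0$ from Assumption 16. This means that even when $\operatorname{plim}\frac{1}{T}w_t'\eta_t \neq 0$, the ratio between these two plims is still α.[6] These assumptions are much weaker than the ones required by ILS; no wonder IV made such an incredible impact in our profession while the impact of ILS has been significantly smaller.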
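The following simulation sketches the point of the derivation: the instrument is allowed to be correlated with the second-equation shock, and the two structural shocks are correlated with each other, yet the IV ratio still approaches α while OLS does not. The data-generating process, the loadings `rho_w` and `rho_eps`, and all numerical values are assumptions made for illustration.

```python
import random


def simulate_iv_and_ols(alpha=0.5, beta=-0.4, gamma=1.0,
                        rho_w=0.5, rho_eps=0.8, T=100_000, seed=0):
    """Return (IV estimate, OLS estimate) of alpha from simulated data with pi = 0."""
    rng = random.Random(seed)
    d = 1.0 - alpha * beta
    s_wx = s_wy = s_xx = s_xy = 0.0
    for _ in range(T):
        w = rng.gauss(0.0, 1.0)                 # instrument
        eps = rng.gauss(0.0, 1.0)               # first-equation shock, independent of w
        # eta is correlated with both w and eps (allowed under Assumption 16)
        eta = rho_w * w + rho_eps * eps + rng.gauss(0.0, 1.0)
        x = (gamma * w + beta * eps + eta) / d  # reduced form for x_t
        y = alpha * x + eps                     # structural equation with pi = 0
        s_wx += w * x
        s_wy += w * y
        s_xx += x * x
        s_xy += x * y
    return s_wy / s_wx, s_xy / s_xx


iv, ols = simulate_iv_and_ols()
print("IV:", iv, "OLS:", ols)
```

In this design the only requirement that matters for IV is that `w` and `eps` are independent; setting `rho_w` or `rho_eps` to other values changes the OLS bias but not the IV limit.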
Before turning our attention to the next subject it is important to remember the implicit assumptions of IV for a much more general setup, one that allows random coefficients, for example. We will use this in future chapters and it is worth including these concepts right away.
Assume we are interested in estimating

$$y_t = \alpha_t x_t + \varepsilon_t$$

where

$$\alpha_t = \bar\alpha + \eta_t$$

and where

$$E(x_t'\varepsilon_t) \neq 0.$$

Assume we have an instrumental variable denoted as $w_t$ that satisfies the following assumptions

Assumption 18 The instrumental variable is correlated with the right hand side variable

$$E(w_t'x_t) \neq 0$$

but uncorrelated with the residual in the first equation, as well as with the random coefficient

$$E(w_t'\varepsilon_t) = 0$$
$$E(w_t'\eta_t) = 0.$$

Then the average of the random coefficients can be recovered by using the standard instrumental variable estimator.

It is important to spell out what these assumptions state: the instrument affects both endogenous variables, but its effect on $y_t$ operates entirely through its impact on $x_t$. This means that the residuals in the first equation are unaffected by the instrument, as are the coefficients. Under these circumstances IV is a consistent estimate of the average effect ($\bar\alpha$).

5.3.1.2 Long Run Restrictions

One of the most used restrictions in VARs is the one that was popularized by (Blanchard and Quah 1989). If it is known that one shock does not have permanent effects, then, under some conditions, it is possible to obtain identification. For example, assume that nominal shocks are short lived, while real shocks are permanent. Imposing this constraint, (Blanchard and Quah 1989) and (Shapiro and Watson 1988) were able to estimate the effects of aggregate shocks on aggregate activity and unemployment.
The idea is that we can impose that the long run effect of some shock is zero, creating one additional equation for the system and achieving identification. Obviously, this assumption can be used only when the system includes lagged dependent variables; otherwise it is equivalent to an exclusion restriction.
XXXXXX
[6] Usually it is assumed that $\operatorname{plim}\frac{1}{T}w_t'\eta_t = 0$ and that $\operatorname{plim}\frac{1}{T}\varepsilon_t'\eta_t = 0$, but as this derivation has shown this is not a requirement.

5.3.2 Variance Restrictions

Finally, we can impose constraints on the variances,[7] for example, that $\sigma_\eta^2/\sigma_\varepsilon^2$ is equal to some constant, or to infinity. The case in which the relative variance is restricted to equal a constant has not been (frequently) used in applied work, while the assumption that the ratio goes to zero or to infinity is one of the underlying assumptions of most event studies.

5.3.2.1 Near Identification

Near identification refers to the case in which one of the variances is infinitely large in comparison to the others. In that case, as has been discussed earlier in this chapter, the problem of identification is solved.
Most event studies indeed appeal to this assumption. For example, in corporate finance, when we are evaluating the impact of earnings announcements on stock prices, the idea is to pool all earnings announcements together in one single day, and the argument is that this process of averaging makes all other shocks in the economy, such as changes in risk premia, interest rates, confidence, etc., smaller. Therefore, it is possible to measure the impact of the earnings exclusively.
This is the original intuition developed by (Wright 1928) to solve the problem of identification. See (Fisher 1976) for a general discussion.

5.3.2.2 Relative variance restriction

Setting
$$\sigma_\eta^2/\sigma_\varepsilon^2 = \lambda$$
solves the problem of identification in the simultaneous equations problem. In general, this assumption is
hard to justify and therefore, it has not received a lot of attention in applied work. However, it is important
to highlight that in principle, this assumption is as hard to justify as those based on exclusion restrictions.

5.3.3 Sign Restrictions

Sign restrictions: constraining the sign of the slopes of the structural equations can achieve partial identification because the two inequalities imply a region of admissible parameters.
Even though a unique estimate cannot be obtained, at least an admissible interval is derived. See (Fisher 1976) and (Blanchard and Diamond 1989).
[to be completed XXXX]

5.3.4 Reversed Regressions and ”Bounds”

In the standard simultaneous equations problem, it is possible to determine, under certain conditions, the range in which the true coefficients lie. The method was developed by Gini (1926) and it was later recovered by (?) and (?).[8] The purpose of the bounds is to show the extent of the misspecification, and to offer a range of coefficients that is valid under any possible identification scheme. A regression in which the bounds are tight implies that the biases introduced by simultaneous equations are small.[9]

[7] See (Rothenberg and Ruud 1990) for a detailed study where covariance restrictions are imposed in linear simultaneous equation models.
[8] See (?) for a discussion along the same lines as here.
This method was developed for the general problem of misspecification. Assume we are interested in estimating the simple relationship

$$y_t = a x_t + \nu_{y,t} \tag{5.11}$$

where the right hand side variable is correlated with the residual because there is a problem of simultaneous equations. Notice that this is exactly the first equation in our system of equations. It is well known, and as we have already argued, that in the presence of misspecification we cannot estimate a consistently. Indeed, because regression (5.11) is misspecified, it is important to realize that there are two ways of estimating a.

$$y_t = a x_t + \nu_{y,t} \tag{5.12}$$
$$x_t = \frac{1}{a} y_t + \nu_{x,t} \tag{5.13}$$
Observe that under endogeneity both regressions are equally wrong! Gini studied this problem and realized
that depending on the sources of the misspecification, the OLS estimates in these two regressions provide
bounds for the true coefficient. The estimate in equation (5.12) provides one bound, and the inverse of the
estimate on equation (5.13) provides the other bound. The case of simultaneous equations implies that the
OLS estimate in equation (5.12) is (the same as before):

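$$\hat\alpha_{\text{eq. 5.12}} = \alpha + \beta(1-\alpha\beta)\frac{\sigma_\varepsilon^2}{\sigma_\eta^2+\beta^2\sigma_\varepsilon^2}$$

while the estimate of 1/a in equation (5.13) is (note that the two expressions are similar):

$$\widehat{\left(\frac{1}{\alpha}\right)}_{\text{eq. 5.13}} = \frac{1}{\alpha} - \frac{1}{\alpha}(1-\alpha\beta)\frac{\sigma_\varepsilon^2}{\alpha^2\sigma_\eta^2+\sigma_\varepsilon^2}$$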

We are interested in the estimation of α; hence, we solve for α in the second equation. We can in fact use both estimates and compute the range where the true coefficient α must lie if the model is correct. To illustrate the range, consider the two possible cases, where α and β have different or similar signs.
If α and β have different signs, the bias in equation (5.12) makes the OLS coefficient smaller (in absolute value) than the true one. In other words,

$$\left|\hat\alpha_{\text{eq. 5.12}}\right| < |\alpha|$$

Similarly, under the same conditions, the estimate in equation (5.13) is also biased toward zero. Hence we can write

$$\left|\widehat{\left(\frac{1}{\alpha}\right)}_{\text{eq. 5.13}}\right| < \left|\frac{1}{\alpha}\right|$$

Therefore,

$$\left|\hat\alpha_{\text{eq. 5.12}}\right| < |\alpha| < \frac{1}{\left|\widehat{\left(\frac{1}{\alpha}\right)}_{\text{eq. 5.13}}\right|}$$

In other words, if the two schedules have different signs, then the true coefficient lies between these two estimates.
The intuition of this result is very simple. First, it is important to realize that equation (5.12) is the OLS regression run in one direction, while equation (5.13) is the OLS regression in the other direction. If the schedules have different signs, simultaneous equations will bias the OLS coefficients toward zero, because the OLS coefficient is a linear combination of the two coefficients, one positive and the other negative. Hence the OLS coefficients in both regressions are smaller in absolute terms than the true ones. However, the coefficient in the first equation (5.12) attempts to estimate α and the coefficient in the second equation (5.13) estimates 1/α. This is what determines the range.

[9] Although the bounds were developed for the general misspecification problem, here we concentrate on the simultaneous equations case.
When the two schedules have the same sign the range of coefficients is different. In this case, the bias in the OLS in both equations (5.12 and 5.13) is away from zero. So, if both coefficients are positive the OLS estimate is larger than the true one, and if the coefficients are negative the OLS estimates are smaller than the true ones. This means that in absolute terms the true coefficient has to satisfy the following relationship:

$$|\alpha| < \min\left\{ \left|\hat\alpha_{\text{eq. 5.12}}\right|,\; \frac{1}{\left|\widehat{\left(\frac{1}{\alpha}\right)}_{\text{eq. 5.13}}\right|} \right\}$$

Again, this implies a range of admissible coefficients. The intuition in this case follows the same reasoning as before; the difference is that in both equations the implied estimates of α are larger in absolute value than the true one.
These bounds have been extended to the multivariate case (here we have discussed only the bivariate case), and to cases in which the misspecification is not only due to simultaneous equations but takes other forms as well.
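The two population OLS expressions above can be evaluated for specific parameter values to see the bounds at work. This is a sketch of one numerical instance for each sign configuration, not a proof of the general result; the function names and numbers are illustrative assumptions.

```python
def forward_ols(alpha, beta, s2_eps, s2_eta):
    """Population OLS slope of y on x, equation (5.12): Cov(x, y) / Var(x)."""
    return (alpha * s2_eta + beta * s2_eps) / (s2_eta + beta ** 2 * s2_eps)


def reverse_ols(alpha, beta, s2_eps, s2_eta):
    """Population OLS slope of x on y, equation (5.13): Cov(x, y) / Var(y)."""
    return (alpha * s2_eta + beta * s2_eps) / (alpha ** 2 * s2_eta + s2_eps)


# Opposite signs (illustrative): alpha = -1, beta = 0.5
a_fwd = forward_ols(-1.0, 0.5, 1.0, 2.0)
a_rev = 1.0 / reverse_ols(-1.0, 0.5, 1.0, 2.0)
print(abs(a_fwd), "<", 1.0, "<", abs(a_rev))

# Same signs (illustrative): alpha = beta = 0.5
b_fwd = forward_ols(0.5, 0.5, 1.0, 1.0)
b_rev = 1.0 / reverse_ols(0.5, 0.5, 1.0, 1.0)
print(0.5, "<", min(abs(b_fwd), abs(b_rev)))
```

In the first case the true |α| sits between the forward estimate and the inverse of the reverse estimate; in the second case both lie above it.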
Bibliography

Blanchard, O., and D. Quah (1989): “The Dynamic Effects of Aggregate Demand and Aggregate Supply Disturbances,” American Economic Review, 79, 655–73.

Blanchard, O. J., and P. Diamond (1989): “The Beveridge Curve,” Brookings Papers on Economic Activity, 1, 1–76.

Fisher, F. M. (1976): The Identification Problem in Econometrics. Robert E. Krieger Publishing Co., New York, second edn.

Haavelmo, T. (1947): “Methods of Measuring the Marginal Propensity to Consume,” Journal of the American Statistical Association, 42, 105–122.

Rothenberg, T. J., and P. A. Ruud (1990): “Simultaneous Equations with Covariance Restrictions,” Journal of Econometrics, 44(1-2), 25–39.

Shapiro, M. D., and M. W. Watson (1988): “Sources of Business Cycle Fluctuations,” in NBER Macroeconomics Annual 1988, ed. by S. Fischer. MIT Press, Cambridge, Mass.

Wright, P. G. (1928): The Tariff on Animal and Vegetable Oils, The Institute of Economics. The Macmillan Company, New York.

Chapter 6

Identification through Heteroskedasticity: Theory

The question of identification when the model includes endogenous variables has been studied for several decades now.[1] The problem arises when the structural form cannot be directly estimated, and the parameters must be recovered from the reduced form, which has fewer equations than the number of unknowns. Thus, to solve for the original parameters, more information is required. The typical solution is to impose additional constraints based on economic knowledge about the particular model that is estimated. Indeed, as was discussed in the previous chapter, assumptions such as exclusion, sign, long-run, and covariance restrictions have been very useful in numerous applied problems. However, they cannot always be justified.
In this chapter we present an alternative method to solve the identification problem that is based on the heteroskedasticity that exists in the data. I show that if the structural shocks have a known correlation (usually zero), the identification problem can be solved by simply appealing to the heteroskedasticity of the structural shocks. For simplicity, I begin with a case in which there are two endogenous variables and two regimes. Subsequently, I study the cases in which there are more than two regimes, when there are multiple endogenous variables, and when common unobservable shocks are present.
The chapter is organized as follows: In section 6.1, we discuss the preliminary intuition of the method of identification based on heteroskedasticity. In section 6.2, the typical problem of identification is specified in the bivariate setting. The methodology based on heteroskedasticity is studied when the data exhibit two regimes, as well as when they exhibit more than two regimes. A GMM interpretation of the estimation problem is developed. In section 6.3, necessary conditions for identification are derived for multivariate processes with unobservable common shocks. In section 6.4, the question of consistency under misspecification of the heteroskedasticity is explored in the bivariate setup. Two cases are studied: first, when the number of regimes is correctly specified but not the timing of the regimes, or windows, and second, when the number of regimes is smaller than the actual number of regimes exhibited by the data.

6.1 Preliminary Intuition


The typical problem of identification is depicted in the first panel of Figure 6.1. Assume that in the standard supply and demand problem we are interested in estimating the slope of the demand curve. The realizations are the outcomes of shocks to both the supply and the demand schedules, so the OLS estimates would be biased. The instrumental variable approach solves the problem of identification by finding a variable that shifts the supply schedule without affecting the demand curve, thus measuring the slope of the demand. The heteroskedasticity of the structural shocks works in a similar fashion.

[1] See (Fisher 1976) for the most comprehensive treatment of the subject. See (Haavelmo 1947) and (Koopmans, Rubin, and Leipnik 1950) for the seminal contributions.
The simplest intuition can be developed by looking at a special case: Split the sample in two and assume
that in the second sub-sample the supply shocks are more volatile than in the first sub-sample, while the
demand shocks have a constant variance across the two sub-samples. This increase in the variance of the
supply shocks implies that the “cloud” of realizations enlarges through the demand schedule, as is shown in
the second panel of Figure 6.1. The residuals are distributed along an ellipse, and the shift in the variance
implies a rotation along the demand curve. From the instrumental variables point of view, this is equivalent
to having a “probabilistic” instrument; we cannot assure that the supply curve shifts (as in the standard IV
approach), but in the second sample shocks to the supply are more likely to occur. Thus, the joint behavior
approximates more closely the demand schedule.
In the limit, if the variance of the supply shocks goes to infinity, the ellipse collapses and becomes the
demand curve. In this case, the slope of the demand can be estimated by OLS. This intuition was put
forward by (Wright 1928). This paper extends the original methodology to the case in which the shifts in
the variances are finite, and the form of the heteroskedasticity is unknown. In fact, if the structural shocks
are not correlated, the system is identified just by knowing that there is a change in the relative variance of
the shocks. In particular, if both variances shift by the same amount, then the two ellipses are proportional,
and the system is not identified. On the other hand, if the relative importance changes, then the system will
be identified by the rotation of the ellipse.

6.2 Identification

6.2.1 Identification under two regimes.

Assume there are two regimes in the variances of the structural shocks: high and low volatility. Additionally,
assume that the structural parameters are stable across the regimes. Under these assumptions the two
reduced form covariance matrices have the same structure as before:
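\[
\hat{\Omega}_s \equiv
\begin{bmatrix} \omega_{11,s} & \omega_{12,s} \\ \cdot & \omega_{22,s} \end{bmatrix}
= \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2\sigma^2_{\eta,s} + \sigma^2_{\varepsilon,s} & \alpha\sigma^2_{\eta,s} + \beta\sigma^2_{\varepsilon,s} \\
\cdot & \sigma^2_{\eta,s} + \beta^2\sigma^2_{\varepsilon,s}
\end{bmatrix},
\quad s \in \{1,2\}, \tag{6.1}
\]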

where each regime is denoted as $s \in \{1, 2\}$, where the variances of the structural shocks in regime $s$ are
given by $\sigma^2_{\varepsilon,s}$ and $\sigma^2_{\eta,s}$, and where $\hat{\Omega}_s$ indicates the reduced form covariance matrix in regime $s$. In this new
system of equations there are six unknowns: $\alpha$, $\beta$, $\sigma^2_{\eta,1}$, $\sigma^2_{\varepsilon,1}$, $\sigma^2_{\eta,2}$, and $\sigma^2_{\varepsilon,2}$, and two covariance matrices
that provide six equations! If the equations are independent, the problem of identification has been solved.
It is essential to restate the assumptions that lead to the identification of the system: (i) the parameters
are stable across the heteroskedasticity regimes, and (ii) the structural shocks are not correlated. These
assumptions are implicit in much of the applied macro work and are further discussed below.
Solving for the variances in equation (6.1), α and β satisfy the following non-linear system of equations:

\[
\alpha = \frac{\omega_{12,s} - \beta\,\omega_{11,s}}{\omega_{22,s} - \beta\,\omega_{12,s}},
\quad s \in \{1,2\}. \tag{6.2}
\]

After some algebra, β solves the quadratic equation:

\[
[\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2}]\,\beta^2
- [\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2}]\,\beta
+ [\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2}] = 0 \tag{6.3}
\]

There are two solutions to the quadratic equation. It is easy to show that if α, β is one solution to the system
of equations, then β ∗ = 1/α, α∗ = 1/β, is the other solution. Indeed, the solutions are the two possible ways
in which the structural form can be written. In other words, the system is identified up to row permutations
of the original model.
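
The two-regime solution can be checked numerically. The following is a minimal sketch, assuming the bivariate model y = αx + ε, x = βy + η, which reproduces the covariance structure in (6.1). The function names and the parameter values are illustrative assumptions and do not come from the text. It builds the two population covariance matrices, solves the quadratic (6.3), and recovers α from (6.2) for each root.

```python
import numpy as np

def reduced_form_cov(alpha, beta, var_eta, var_eps):
    """Population reduced-form covariance matrix, as in equation (6.1)."""
    scale = 1.0 / (1.0 - alpha * beta) ** 2
    w11 = alpha**2 * var_eta + var_eps
    w12 = alpha * var_eta + beta * var_eps
    w22 = var_eta + beta**2 * var_eps
    return scale * np.array([[w11, w12], [w12, w22]])

def solve_two_regimes(omega1, omega2):
    """Return both (alpha, beta) pairs implied by (6.2) and (6.3), sorted by beta."""
    a = omega1[0, 0] * omega2[0, 1] - omega1[0, 1] * omega2[0, 0]
    b = omega1[0, 0] * omega2[1, 1] - omega1[1, 1] * omega2[0, 0]
    c = omega1[0, 1] * omega2[1, 1] - omega1[1, 1] * omega2[0, 1]
    # Quadratic (6.3): a * beta^2 - b * beta + c = 0.
    pairs = []
    for beta in np.real(np.roots([a, -b, c])):
        # Equation (6.2), evaluated with the first regime.
        alpha = (omega1[0, 1] - beta * omega1[0, 0]) / (omega1[1, 1] - beta * omega1[0, 1])
        pairs.append((float(alpha), float(beta)))
    return sorted(pairs, key=lambda pair: pair[1])

# Illustrative values: only the variance of eta shifts across regimes.
omega_low = reduced_form_cov(0.5, -0.8, var_eta=1.0, var_eps=1.0)
omega_high = reduced_form_cov(0.5, -0.8, var_eta=9.0, var_eps=1.0)
print(solve_two_regimes(omega_low, omega_high))
```

With these values one pair is the original (α, β) and the other is its permutation (1/β, 1/α), which is the sense in which the system is identified only up to a reordering of the equations.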

Proposition 19 Let yt and xt be described by equations (5.9) and (5.10), where the parameters (α and β)
determining the law of motion are stable and where the disturbances have finite variance, are not correlated,
and exhibit heteroskedasticity that can be described with two regimes. Then, if the covariance matrices satisfy

\[
\det\left[\hat{\Omega}_2 - \frac{\omega_{11,2}}{\omega_{11,1}}\,\hat{\Omega}_1\right] \neq 0 \tag{6.4}
\]

the structural form is just identified: α and β are consistently estimated from the two estimable covariance
matrices.

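Proof. Identification is achieved if equation (6.3) has real solutions. A real solution requires
\[
[\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2}]^2
- 4\,[\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2}]\,[\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2}] > 0.
\]

After some algebra this is equal to
\[
\left[\omega_{11,2}^2\,\omega_{22,2}^2\right][\theta_{11} - \theta_{22}]^2
- \left[2\,\omega_{11,2}\,\omega_{22,2}\,\omega_{12,2}^2\right]\left[2\,(\theta_{11} - \theta_{12})(\theta_{12} - \theta_{22})\right] > 0,
\]
where $\theta_{11} = \omega_{11,1}/\omega_{11,2}$, $\theta_{12} = \omega_{12,1}/\omega_{12,2}$, and $\theta_{22} = \omega_{22,1}/\omega_{22,2}$. A sufficient condition for this inequality to be positive is
\[
\left[\omega_{11,2}^2\,\omega_{22,2}^2 - 2\,\omega_{11,2}\,\omega_{22,2}\,\omega_{12,2}^2\right] > 0,
\]
\[
[\theta_{11} - \theta_{22}]^2 - \left[2\,(\theta_{11} - \theta_{12})(\theta_{12} - \theta_{22})\right] > 0.
\]

The first one is satisfied because of the positive definite properties of the covariance matrix:
\[
\omega_{11,2}\,\omega_{22,2}\left[\omega_{11,2}\,\omega_{22,2} - 2\,\omega_{12,2}^2\right] > 0.
\]

The second inequality is, after some algebra, equal to
\[
[\theta_{11} - \theta_{12}]^2 + [\theta_{22} - \theta_{12}]^2 > 0,
\]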

which is always positive. Therefore, if the coefficients in the quadratic equation are different from zero, then
the two roots are real.
The last requirement is to show when the quadratic equation does not have infinitely many solutions. This
requires that either
\[
\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2} \neq 0,
\]
or
\[
\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2} \neq 0.
\]
Given the model generating the data, these two conditions are not satisfied if the heteroskedasticity implies
a proportional change of both structural shocks' variances; in other words, when $\Omega_2 = a\,\Omega_1$ for some scalar
$a$. This is the only case in which the quadratic equation (6.3) has infinitely many solutions.
Note that if $\Omega_2 = a\,\Omega_1$ then $\det[\Omega_2 - a\,\Omega_1] = 0$, which can be tested by computing whether or not
$\det\left[\Omega_2 - \frac{\omega_{11,2}}{\omega_{11,1}}\Omega_1\right] = 0$. By construction this is equivalent to asking if the covariance of the normalized
difference is equal to zero:
\[
\omega_{11,1}\omega_{12,2} - \omega_{11,2}\omega_{12,1} \overset{?}{=} 0.
\]

The small sample properties of this statistic are better behaved than the ones from the determinant, and in
the empirical section this is what is implemented to check the rank condition.
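Consistent estimates of both covariance matrices imply that the estimate $\hat{\beta}$ of $\beta$ solves the following
quadratic equation:
\[
[\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2}]\,\hat{\beta}^2
- [\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2}]\,\hat{\beta}
+ [\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2}] = 0,
\]
where, because each entry of the covariance matrices in (6.1) carries the factor $1/(1-\alpha\beta)^2$,
\[
\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2} = \frac{1}{(1-\alpha\beta)^4}
\left[(\alpha^2\sigma^2_{\eta,1} + \sigma^2_{\varepsilon,1})(\alpha\sigma^2_{\eta,2} + \beta\sigma^2_{\varepsilon,2})
- (\alpha\sigma^2_{\eta,1} + \beta\sigma^2_{\varepsilon,1})(\alpha^2\sigma^2_{\eta,2} + \sigma^2_{\varepsilon,2})\right]
\]
\[
\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2} = \frac{1}{(1-\alpha\beta)^4}
\left[(\alpha^2\sigma^2_{\eta,1} + \sigma^2_{\varepsilon,1})(\sigma^2_{\eta,2} + \beta^2\sigma^2_{\varepsilon,2})
- (\sigma^2_{\eta,1} + \beta^2\sigma^2_{\varepsilon,1})(\alpha^2\sigma^2_{\eta,2} + \sigma^2_{\varepsilon,2})\right]
\]
\[
\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2} = \frac{1}{(1-\alpha\beta)^4}
\left[(\alpha\sigma^2_{\eta,1} + \beta\sigma^2_{\varepsilon,1})(\sigma^2_{\eta,2} + \beta^2\sigma^2_{\varepsilon,2})
- (\sigma^2_{\eta,1} + \beta^2\sigma^2_{\varepsilon,1})(\alpha\sigma^2_{\eta,2} + \beta\sigma^2_{\varepsilon,2})\right],
\]
which after some algebra are equal to
\[
\omega_{11,1}\omega_{12,2} - \omega_{12,1}\omega_{11,2} = \frac{1}{(1-\alpha\beta)^3}
\left[-\alpha\,\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \alpha\,\sigma^2_{\varepsilon,1}\sigma^2_{\eta,2}\right]
\]
\[
\omega_{11,1}\omega_{22,2} - \omega_{22,1}\omega_{11,2} = \frac{1}{(1-\alpha\beta)^3}
\left[-\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2}\,(1+\alpha\beta) + \sigma^2_{\varepsilon,1}\sigma^2_{\eta,2}\,(1+\alpha\beta)\right]
\]
\[
\omega_{12,1}\omega_{22,2} - \omega_{22,1}\omega_{12,2} = \frac{1}{(1-\alpha\beta)^3}
\left[-\beta\,\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \beta\,\sigma^2_{\varepsilon,1}\sigma^2_{\eta,2}\right].
\]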
Hence, the two solutions of the quadratic equation for the estimate $\hat{\beta}$ are
\[
\hat{\beta} = \frac{[(1+\alpha\beta) \pm (1-\alpha\beta)]\,\left[-\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \sigma^2_{\varepsilon,1}\sigma^2_{\eta,2}\right]}
{2\alpha\,\left[-\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \sigma^2_{\varepsilon,1}\sigma^2_{\eta,2}\right]},
\]
where, under the assumption that the rank condition is satisfied (equations (6.4) or (6.5)), the solution of
the system of equations is
\[
\hat{\beta} = \frac{(1+\alpha\beta) \pm (1-\alpha\beta)}{2\alpha},
\]
where one solution is $\hat{\beta} = \beta$ and the other one is $\hat{\beta} = 1/\alpha$, which are the two permutations of the system
of equations. Thus, if $\sigma^2_{\eta,1}$, $\sigma^2_{\varepsilon,2}$, $\sigma^2_{\varepsilon,1}$, and $\sigma^2_{\eta,2}$ are consistently estimated from the data, the consistency
of $\hat{\beta}$ is assured. But consistent estimates of the structural variances are indeed obtained from consistent
estimates of the reduced form covariance matrices if the system is linear, the parameters are stable, and the
residuals have finite variances.
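Furthermore, observe that the estimate of $\beta$ is consistent if the relative variances of the structural shocks shift:
\[
-\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} + \sigma^2_{\varepsilon,1}\sigma^2_{\eta,2} \neq 0
\;\Rightarrow\;
\frac{\sigma^2_{\eta,1}}{\sigma^2_{\varepsilon,1}} \neq \frac{\sigma^2_{\eta,2}}{\sigma^2_{\varepsilon,2}},
\]
which is the generalization of the intuition in (Wright 1928).

Equation (6.4) is equivalent to
\[
\omega_{11,1}\omega_{12,2} - \omega_{11,2}\omega_{12,1} \neq 0. \tag{6.5}
\]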
Note that conditions (6.4) and (6.5) are similar to testing the rank condition when the order condition
(number of equations) has been satisfied. In terms of the standard literature on linear systems of equations,
the order condition requires that the number of equations be at least as large as the number
of unknowns. The rank condition requires the number of linearly independent equations to be equal to or
larger than the number of unknowns. In linear systems of equations, this is checked by computing the rank
of the matrix. In the case studied here, the system is non-linear, and the rank condition takes the form of
equation (6.4).
Equation (6.4) fails if the two covariance matrices are proportional; i.e., the heteroskedasticity does not
identify the system if the relative variances are constant across regimes. Returning to the intuition given
in the introduction, imagine that the variance of both shocks doubles; then the shape of the ellipse across
the two regimes is the same, and nothing can be learned about the original system. Technically, this is the
case in which we have six equations and six unknowns, but the equations are not independent. On the other
hand, when the relative ratio of the variances shifts, then the heteroskedasticity changes the region in which
the errors are distributed, enlarging the ellipse along one of the structural equations. This rotation in the
ellipse can be estimated from the reduced form covariances allowing us to obtain the slope of the schedules.
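
To make the rank condition concrete, the sketch below computes both the determinant in (6.4) and the equivalent covariance statistic in (6.5) for a pair of covariance matrices. The function name and the numerical matrices are hypothetical and chosen only for illustration; this is not the testing procedure used in the empirical section.

```python
import numpy as np

def rank_statistics(omega1, omega2):
    """Return (determinant in (6.4), covariance statistic in (6.5))."""
    ratio = omega2[0, 0] / omega1[0, 0]
    det_stat = np.linalg.det(omega2 - ratio * omega1)
    cov_stat = omega1[0, 0] * omega2[0, 1] - omega2[0, 0] * omega1[0, 1]
    return float(det_stat), float(cov_stat)

# Hypothetical covariance matrix for the first regime.
omega1 = np.array([[1.0, 0.3], [0.3, 2.0]])

# Case 1: both structural variances scale by the same factor, so the
# second matrix is proportional to the first and both statistics vanish.
proportional = 3.0 * omega1

# Case 2: the relative variances shift, so both statistics are non-zero.
shifted = np.array([[2.0, 1.0], [1.0, 3.0]])

print(rank_statistics(omega1, proportional))
print(rank_statistics(omega1, shifted))
```

Because the normalization sets the (1,1) element of the difference matrix to zero, the determinant equals minus the square of the covariance statistic divided by the first regime's (1,1) variance, which is why the two checks agree on when the condition fails.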
The simplest intuition of how identification is achieved can be developed by first analyzing the case in
which the variance changes for only one shock. Assume that it is known that at some point in time there is
an increase in the variance of the supply shocks. During that period, the “cloud” of realizations is going to
widen along the demand curve as depicted in Figure 6.1. Comparing how the ellipse of the realizations has
changed across the two samples allows one to determine the slope of the demand curve. In this particular
case, because it has been assumed that the structural shocks have zero correlation, this is enough to estimate
the slope of the supply curve, too. Moreover, this explanation has an instrumental variable interpretation.
A valid instrument to estimate the demand schedule is one that moves the supply without affecting the
demand. In this example, the rise in the variance of the supply shocks becomes a probabilistic instrument
precisely because it increases the likelihood that the supply equation “moves”.
Finally, when both variances shift, there is an expansion along both schedules. So it is not necessary to
know which shock becomes more important across the regimes. It is enough if the relative variances shift -
equation (6.4) would be satisfied and both schedules identified.
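
The probabilistic-instrument intuition can also be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the model is taken to be y = αx + ε, x = βy + η with normal shocks, the variance of η rises in the second sub-sample while the variance of ε stays constant, and the parameter values, sample sizes, and function names are all hypothetical. The two roots of the estimated quadratic (6.3) should be close to β and 1/α.

```python
import numpy as np

def simulate_regime(rng, alpha, beta, sd_eta, sd_eps, n):
    """Draw n observations of (y, x) from y = alpha*x + eps, x = beta*y + eta."""
    eps = rng.normal(0.0, sd_eps, n)
    eta = rng.normal(0.0, sd_eta, n)
    # Reduced form obtained by solving the two structural equations.
    y = (eps + alpha * eta) / (1.0 - alpha * beta)
    x = (beta * eps + eta) / (1.0 - alpha * beta)
    return np.vstack([y, x])

def estimate_beta_roots(data1, data2):
    """Solve the sample analogue of the quadratic (6.3); returns sorted roots."""
    w1, w2 = np.cov(data1), np.cov(data2)
    a = w1[0, 0] * w2[0, 1] - w1[0, 1] * w2[0, 0]
    b = w1[0, 0] * w2[1, 1] - w1[1, 1] * w2[0, 0]
    c = w1[0, 1] * w2[1, 1] - w1[1, 1] * w2[0, 1]
    return np.sort(np.real(np.roots([a, -b, c])))

rng = np.random.default_rng(0)
calm = simulate_regime(rng, 0.5, -0.8, sd_eta=1.0, sd_eps=1.0, n=400_000)
volatile = simulate_regime(rng, 0.5, -0.8, sd_eta=3.0, sd_eps=1.0, n=400_000)
roots = estimate_beta_roots(calm, volatile)
print(roots)  # one root should be near beta = -0.8, the other near 1/alpha = 2
```

No instrument and no exclusion restriction is used here; the only information exploited is that the relative variance of the two shocks differs across the two sub-samples.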

6.2.2 Identification under more than two regimes.

It is easy to extend the previous results to the case where there are more than two regimes. Assume that
the data exhibit multiple finite heteroskedasticity regimes indexed by s ∈ {1, .., S}. For each regime, the
covariance matrix is
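\[
\hat{\Omega}_s \equiv
\begin{bmatrix} \omega_{11,s} & \omega_{12,s} \\ \cdot & \omega_{22,s} \end{bmatrix}
= \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2\sigma^2_{\eta,s} + \sigma^2_{\varepsilon,s} & \alpha\sigma^2_{\eta,s} + \beta\sigma^2_{\varepsilon,s} \\
\cdot & \sigma^2_{\eta,s} + \beta^2\sigma^2_{\varepsilon,s}
\end{bmatrix}. \tag{6.6}
\]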

This is a system that has 3S equations (one covariance matrix per regime) and 2S + 2 unknowns: S times
two structural variances for each regime, plus two parameters (α and β).
The order condition will be satisfied for any S larger than or equal to two. The rank condition takes the
same form as equations (6.4) and (6.5) for any pair of regimes. Indeed, the system is overidentified if there
are at least three regimes that satisfy the rank condition for all combinations.
Appealing to the probabilistic IV interpretation used before, each new heteroskedastic regime is a valid
instrument if and only if it satisfies the rank condition with respect to all the previous regimes. In this case,
each new covariance matrix adds three equations and only two unknowns. Otherwise, the new heteroskedastic
regime does not increase the number of restrictions on the structural coefficients. Hence, for S larger than
two, and for all covariance matrices satisfying the rank condition, the system of equations is overidentified,
and the underlying assumption - such as that α and β are stable through time - can be tested. The estimation
has a minimum distance interpretation where each heteroskedastic regime is equivalent to one instrument.2
2 The additional equations can also be interpreted as a factor regression model, where the left hand side variables of equation
(6.6) are the estimates (or observables), the variances ($\sigma^2_{\eta,s}$ and $\sigma^2_{\varepsilon,s}$) are the unobservable factors, and the coefficients are the
weights or factor loadings. Factor analysis usually assumes that the $\omega_{ij,s}$'s are independent. It is unlikely, however, that this
is the case in this setup. Therefore proper corrections have to be considered in the estimation procedure. In this paper, I use
the GMM interpretation.

6.3 Identification with common shocks

In the previous sections, the stochastic process is bivariate and there are no common shocks. In this section,
these assumptions are relaxed and the necessary conditions to achieve identification are discussed.3
It should be clear that if we allow for a common unobservable heteroskedastic shock in the bivariate
setting, the heteroskedasticity will not be sufficient to achieve identification. Each heteroskedastic regime
adds not only three equations, but also three unknowns. So it is essential to impose some constraints on the
covariances to be able to use the variation in the second moments to solve the problem of identification.
3 Including common shocks in the model is equivalent to relaxing the assumption on the correlation of the structural shocks.

Assume that there are $N$ endogenous variables, $K$ common unobservable shocks, and $s \in \{1, ..., S\}$
possible regimes or states. Denote the structural form as follows:
\[
A_{N\times N}
\begin{bmatrix} x_{1,t} \\ \vdots \\ x_{N,t} \end{bmatrix}
= \Gamma_{N\times K}
\begin{bmatrix} z_{1,t} \\ \vdots \\ z_{K,t} \end{bmatrix}
+ \begin{bmatrix} \varepsilon_{1,t} \\ \vdots \\ \varepsilon_{N,t} \end{bmatrix}, \tag{6.7}
\]
where all the shocks are assumed to have zero correlation at all leads and lags,
\[
\begin{aligned}
E[z_{i,t}\, z_{j,t}] &= 0 \quad \forall i \neq j,\; i, j \in \{1, ..., K\} \\
E[\varepsilon_{i,t}\, \varepsilon_{j,t}] &= 0 \quad \forall i \neq j,\; i, j \in \{1, ..., N\} \\
E[z_{i,t}\, \varepsilon_{j,t}] &= 0 \quad \forall i \in \{1, ..., K\},\; j \in \{1, ..., N\},
\end{aligned} \tag{6.8}
\]
and where $x_{n,t}$, $n \in \{1, ..., N\}$ are the $N$ endogenous (row vector) variables; where $z_{k,t}$, $k \in \{1, ..., K\}$ are
the $K$ unobservable common shocks, assumed to have no correlation, with variance $\sigma^2_{z,k,s}$ in state $s$; and
where $\varepsilon_{n,t}$ are the structural shocks, assumed not to be correlated, with variance $\sigma^2_{\varepsilon,n,s}$ in state $s$.
The matrix $A_{N\times N}$ describes the contemporaneous parameters,
\[
A_{N\times N} =
\begin{bmatrix}
1 & a_{12} & \cdots & a_{1N} \\
a_{21} & 1 & \cdots & a_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
a_{N1} & a_{N2} & \cdots & 1
\end{bmatrix}, \tag{6.9}
\]
where the assumption of normalization already has been imposed (coefficients along the diagonal are equal
to one). And $\Gamma_{N\times K}$ are the parameters from the common shocks, where normalization is also assumed; in
this case, it implies a unitary impact on the first equation,
\[
\Gamma_{N\times K} =
\begin{bmatrix}
1 & 1 & \cdots & 1 \\
\gamma_{21} & \gamma_{22} & \cdots & \gamma_{2K} \\
\vdots & \vdots & \ddots & \vdots \\
\gamma_{N1} & \gamma_{N2} & \cdots & \gamma_{NK}
\end{bmatrix}. \tag{6.10}
\]

Proposition 20 A multivariate system of N equations, with K unobservable common shocks, described by
equations (6.7), (6.8), (6.9), and (6.10) is identified if and only if, for N > 1,
(i) the number of states (S) satisfies
\[
S \geq 2\,\frac{(N+K)(N-1)}{N^2 - N - 2K}, \tag{6.11}
\]
(ii) there is a minimum number of endogenous variables (or maximum number of common shocks) that
satisfies
\[
N^2 - N - 2K > 0, \tag{6.12}
\]
(iii) and the covariance matrices constitute a system of equations that is linearly independent.

Proof. Note that the proposition states a necessary condition, but not a sufficient one. Thus it is stating
an order condition. From equation (6.7), the number of equations is given by the covariance matrix in each
regime. This provides $N(N+1)/2$ equations in each state. The total number of unknowns is as follows: the
matrix $A_{N\times N}$ has $N(N-1)$ parameters; the matrix $\Gamma_{N\times K}$ has $K(N-1)$ parameters; the variances of the
common shocks in each state are $K \cdot S$ ($K$ variances times $S$ regimes); and the variances of the structural
shocks in each regime are $N \cdot S$ ($N$ variances times $S$ regimes). Identification, then, requires
\[
S \cdot \frac{N(N+1)}{2} \geq N(N-1) + K(N-1) + S \cdot K + S \cdot N.
\]
Collecting the terms in $S$ gives $S\,(N^2 - N - 2K)/2 \geq (N+K)(N-1)$, so that
\[
S \geq 2\,\frac{(N+K)(N-1)}{N^2 - N - 2K}.
\]
Inequality (6.11) indicates the minimum number of states required to obtain identification. Finally, in order
for (6.11) to make sense, there is a minimum number of endogenous variables, which is given by
\[
N^2 - N - 2K > 0.
\]

Equation (6.12) is the “catch up” constraint. It indicates the conditions under which one additional
regime in the variance-covariance adds more equations than unknowns. In the example that motivated
this section, (N = 2 and K = 1) implies that the inequality is not satisfied and no further information
is obtained from the heteroskedasticity. Moreover, if the common shocks are interpreted as the sources of
correlation between the structural shocks, then this constraint indicates that some of the covariances of the
structural shocks must be restricted to be constant or zero. Solving for $K$ it is found that identification
requires $K < N(N-1)/2$, where the right hand side of this inequality is exactly the number of all possible
contemporaneous correlations among structural shocks.
There are two main implications of proposition 20: First, in the absence of common shocks only two states
are required to achieve identification, independently of the number of endogenous variables N . Second, if
K > 0 and N is finite, the number of states required to achieve identification is always larger than two.
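
The order condition is easy to tabulate. The helper below is a small sketch (the function name is an assumption, not from the text) that returns the smallest integer number of regimes satisfying (6.11), or `None` when the "catch up" constraint (6.12) fails.

```python
from fractions import Fraction
from math import ceil

def min_regimes(n_endog, k_common):
    """Smallest integer S satisfying (6.11); None if (6.12) fails or N < 2."""
    denom = n_endog**2 - n_endog - 2 * k_common
    if n_endog < 2 or denom <= 0:
        return None
    # Exact rational arithmetic avoids rounding issues before taking the ceiling.
    bound = Fraction(2 * (n_endog + k_common) * (n_endog - 1), denom)
    return ceil(bound)

for n, k in [(2, 0), (5, 0), (2, 1), (3, 1), (4, 1), (3, 2)]:
    print(n, k, min_regimes(n, k))
```

For example, with N = 3 and K = 1 the bound is four regimes: 4 x 6 = 24 covariance equations against 6 + 2 + 4 + 12 = 24 unknowns. With N = 2 and K = 1 the denominator is zero, which is the bivariate case discussed above in which extra regimes never catch up with the extra unknowns.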
The estimation of this model is performed by GMM where the moment conditions are

\[
A\,\Omega_s\,A' = \Gamma\,\Omega_{z,s}\,\Gamma' + \Omega_{\varepsilon,s}, \tag{6.13}
\]

where Ωs is the covariance matrix that can be estimated in the data from the observed variables (xt ) in
regime s, Ωz,s is the covariance matrix of the common unobservable shocks in regime s, which, given the
assumptions in equation (6.8), is a diagonal matrix, and Ωε,s is the covariance matrix of the structural shocks
in regime s, which given the assumptions in equation (6.8), is also diagonal. The parameters of interest are
A and Γ.
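
As a rough sketch of what the moment conditions (6.13) look like in practice, the code below evaluates their residual for one regime. It is not the GMM estimator itself: the function name, the dimensions (N = 3, K = 1), and all numerical values are illustrative assumptions, and the observed covariance matrix is constructed from the structural form (6.7) rather than estimated from data.

```python
import numpy as np

def moment_residual(A, Gamma, omega_s, var_z, var_eps):
    """Residual of (6.13): A Omega_s A' - Gamma Omega_z,s Gamma' - Omega_eps,s."""
    return A @ omega_s @ A.T - Gamma @ np.diag(var_z) @ Gamma.T - np.diag(var_eps)

# Hypothetical structural parameters, normalized as in (6.9) and (6.10).
A = np.array([[1.0, 0.2, -0.1],
              [0.3, 1.0, 0.4],
              [-0.2, 0.1, 1.0]])
Gamma = np.array([[1.0], [0.5], [-0.7]])

# Hypothetical variances of the common and structural shocks in one regime.
var_z = np.array([2.0])
var_eps = np.array([1.0, 0.5, 1.5])

# Covariance of x implied by A x = Gamma z + eps with uncorrelated shocks.
A_inv = np.linalg.inv(A)
omega_s = A_inv @ (Gamma @ np.diag(var_z) @ Gamma.T + np.diag(var_eps)) @ A_inv.T

residual = moment_residual(A, Gamma, omega_s, var_z, var_eps)
print(np.abs(residual).max())
```

A GMM routine would stack the distinct elements of this residual across all regimes and minimize a weighted quadratic form in them over A, Γ, and the regime-specific variances.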
As I hope is clear, the assumptions required to identify the model when there are common shocks are
much more demanding than in the case in which the covariance assumption on the structural shocks can be imposed
directly. In what follows I would like to discuss two methodologies that Brian Sack and I have used in other
papers to deal with the presence of common shocks. This is an extremely important problem when we are
dealing with macro asset pricing.

6.3.1 Related literature

At this point it is useful to discuss the relationship between this methodology and the literature on
identification using heteroskedasticity. As mentioned before, the use of second moments as a source of identification
was first introduced by Philip Wright (1928). He indicated that an increase in the variance of the shocks
in one equation reduces the bias introduced by simultaneous equation problems in the OLS estimate of
the other one. Taking the limit to infinity implies that OLS would estimate the coefficients consistently.
Relatively new research has extended the original intuition (i) to non-linear models, (ii) to
models with parametric representations of the heteroskedasticity (such as ARCH or GARCH models), and
(iii) to models that are partially identified.
(Klein and Vella 2000b) and (Klein and Vella 2000a) discuss the problem of identification and estimation
in a binary endogenous model when exclusion restrictions (or any other parameter restrictions) are not
available and the case of the triangular model, respectively. They estimate the heteroskedasticity semi-
parametrically and use the residual from the second equation as an additional regressor in the first equation
as the instrument.4
(Sentana 1992) and (Sentana and Fiorentini 2001) study the problem of estimation in factor regressions
when there is conditional heteroskedasticity. The simple case developed in this section (proposition 19) is
a special case of their proposition 3. They study the conditions in which identification is achieved in a
non-triangular system when the common latent factors exhibit heteroskedasticity.
There are important differences between those papers and the approach developed here. First, the
procedure highlighted in this paper requires only the knowledge that a shift in the relative variances has
occurred; that is, the regime shift comes from economic events, such as crises or policy shifts, or from other
characteristics of the data, such as heteroskedasticity across regions, time, or other cross-sectional characteristics.
The ARCH specification uses the time series heteroskedasticity in the data as a statistical vehicle to achieve
identification. Second, the procedure described in this paper allows us to test for some of the underlying
assumptions, such as parameter stability; the system is overidentified when there are more than two regimes.
The techniques based on conditional heteroskedasticity are unable to provide this test. Third, as is shown
below, if the heteroskedasticity is misspecified in this model, the coefficients are still consistent. This is not
the case when the heteroskedasticity is modeled parametrically; misspecification in those cases could bias
the contemporaneous coefficients as well. Furthermore, if the data exhibit conditional heteroskedasticity,
and the procedure described here is implemented, it is still the case that the coefficients will be consistent.
Fourth, models that rely on conditional heteroskedasticity to achieve identification require the number of
heteroskedastic shocks to be smaller than, or equal to, the number of endogenous variables. As is shown in
Section 6.3, this is not the case in the present procedure. If there are more than two regime shifts, there
exist conditions in which it is possible to have more latent factors than endogenous variables and still be
able to identify the structural system.
Though the estimation procedures among all these papers are very different, they share the same intuition
for solving the problem of endogenous variables: the heteroskedasticity adds equations to the system after
some covariance restrictions have been imposed. It is important to mention that these procedures require
that the system of equations be linear, or in other words, that the coefficients be stable to changes in the
volatility. Future research should consider extending the methodology to non-linear specifications.
Finally, in addition to the papers mentioned above, some applied papers already have used
heteroskedasticity to identify a system of equations. In the context of conditional heteroskedasticity, see (Caporale,
Cipollini, and Spagnolo 2002), (Dungey and Martin 2001), (King, Sentana, and Wadhwani 1994), and
(Rigobon 2002). In these papers a structural conditionally heteroskedastic model is estimated from a
reduced form GARCH model. In the context of regime switches see (?), (Rigobon and Sack 2003), and
(Rigobon and Sack 2004).
4 See also (Chen and Khan 1999) for a general solution of the problem of identification in sample selection models when the
data exhibit heteroskedasticity.


In the context of testing parameter stability see (Rigobon 2000) and (Rigobon 2003). I discuss partial
identification of simultaneous equation models with unobservable common shocks. That paper is more
concerned with developing a test for stability of parameters rather than with identifying the system of
equations. The procedure depends on the presence of a particular form of the heteroskedasticity, where in
the short run only a subset of the variances are allowed to shift.
Relatively new applications are arising in panel data questions. As far as I can tell, it seems that in
those applications the power of the panel is strong enough to produce very tight estimates. (Hogan and
Rigobon 2003) estimate the returns to education using the heteroskedasticity that exists among the different
regions in the U.K. See also, (Rigobon and Rodrik 2005), (Lee, Ricci, and Rigobon 2004), as well as others.

6.4 Consistency under misspecification of the heteroskedasticity.


An important question arising from the previous derivation is the issue of consistency when the
heteroskedasticity is misspecified. This section shows that the estimates are consistent even though the regimes might
be misspecified.
In this section two cases are evaluated: (i) when the windows of the heteroskedasticity are wrongly
specified but the number of regimes is correct, (ii) and when the data have more regimes than the ones
assumed in the specification. Without loss of generality, only the bivariate case in which there are no
common shocks is discussed.
The intuition about why consistency is achieved in these two cases is that the misspecified covariance
matrices are linear combinations of the true underlying ones. Therefore, the misspecified system of equations
is a linear transformation of the original problem. If this linear transformation does not drop the rank of the
system, the same solution is obtained. It is not proven in this section, but it should be intuitively clear
that the misspecification reduces the power of the test by eliminating the differences across regimes. For
example, in the limit, when the misspecification is so large that the system drops rank, the estimates
are inconsistent: there is a continuum of them.

6.4.1 Misspecification of the regime windows.

Assume the system is described by equations (5.9) and (5.10), and that the data exhibit heteroskedasticity
with only two regimes. If the windows are misspecified, the computed covariance matrices are linear
combinations of the true underlying covariance matrices. Denote
\[
\Omega_{r1} = \lambda_{r1}\,\Omega_1 + (1 - \lambda_{r1})\,\Omega_2,
\]
\[
\Omega_{r2} = (1 - \lambda_{r2})\,\Omega_1 + \lambda_{r2}\,\Omega_2,
\]
where $\Omega_1$ and $\Omega_2$ are the true covariance matrices describing the heteroskedasticity, $\Omega_{r1}$ and $\Omega_{r2}$ are the
estimated covariance matrices, and $\lambda_{r1}$ and $\lambda_{r2}$ are weights indicating how “correct” the windows are; when
they are equal to one, the windows coincide with the true regimes.

Proposition 21 Assume the original system satisfies the rank condition (6.4). If the misspecified
heteroskedasticity satisfies the rank condition (6.4), then the model is identified and its estimators are
consistent.

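Proof. After some algebra the two covariance matrices can be written in terms of the underlying variances:
\[
\Omega_{r1} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2\sigma^2_{\eta,r1} + \sigma^2_{\varepsilon,r1} & \alpha\sigma^2_{\eta,r1} + \beta\sigma^2_{\varepsilon,r1} \\
\cdot & \sigma^2_{\eta,r1} + \beta^2\sigma^2_{\varepsilon,r1}
\end{bmatrix},
\]
\[
\Omega_{r2} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2\sigma^2_{\eta,r2} + \sigma^2_{\varepsilon,r2} & \alpha\sigma^2_{\eta,r2} + \beta\sigma^2_{\varepsilon,r2} \\
\cdot & \sigma^2_{\eta,r2} + \beta^2\sigma^2_{\varepsilon,r2}
\end{bmatrix},
\]
where
\[
\sigma^2_{\eta,r1} = \lambda_{r1}\sigma^2_{\eta,1} + (1-\lambda_{r1})\,\sigma^2_{\eta,2}
\quad\text{and}\quad
\sigma^2_{\varepsilon,r1} = \lambda_{r1}\sigma^2_{\varepsilon,1} + (1-\lambda_{r1})\,\sigma^2_{\varepsilon,2} \tag{6.14}
\]
\[
\sigma^2_{\eta,r2} = (1-\lambda_{r2})\,\sigma^2_{\eta,1} + \lambda_{r2}\sigma^2_{\eta,2}
\quad\text{and}\quad
\sigma^2_{\varepsilon,r2} = (1-\lambda_{r2})\,\sigma^2_{\varepsilon,1} + \lambda_{r2}\sigma^2_{\varepsilon,2}. \tag{6.15}
\]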

Given that the original heteroskedasticity satisfied the rank condition ($\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} - \sigma^2_{\eta,2}\sigma^2_{\varepsilon,1} \neq 0$), there are
two questions to answer: (i) in which circumstances the misspecified model satisfies the rank condition, and
(ii) in which circumstances the estimates are consistent. After some algebra, $\Omega_{r1}$ and $\Omega_{r2}$ satisfy equation
(6.4) if and only if
\[
\sigma^2_{\eta,r1}\,\sigma^2_{\varepsilon,r2} \neq \sigma^2_{\eta,r2}\,\sigma^2_{\varepsilon,r1}.
\]
Substituting the definitions of the variances (equations (6.14) and (6.15)), the rank condition is not satisfied
if and only if
\[
\lambda_{r1} = 1 - \lambda_{r2}.
\]
In other words, the rank condition is not satisfied if the windows are so badly specified that they imply the
same weights on the true regimes. In that case, the two computed matrices are identical.
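Assume the rank condition is satisfied; then the question is whether the solution of the new system of
equations is consistent. Substituting equations (6.14) and (6.15) into equation (6.3), the estimated $\hat{\beta}$ solves
\[
\frac{\Phi\,\alpha}{(1-\alpha\beta)^3}
\left[\hat{\beta}^2 - \left(\frac{1}{\alpha} + \beta\right)\hat{\beta} + \frac{\beta}{\alpha}\right] = 0, \tag{6.16}
\]
where
\[
\Phi = \left(\sigma^2_{\eta,1}\sigma^2_{\varepsilon,2} - \sigma^2_{\eta,2}\sigma^2_{\varepsilon,1}\right)\left(1 - \lambda_{r1} - \lambda_{r2}\right).
\]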


Note that under the assumption that the original heteroskedasticity satisfies the rank condition and that
$\lambda_{r1} \neq 1 - \lambda_{r2}$, $\Phi$ is different from zero. Hence, equation (6.16) is exactly the same quadratic equation
as in the well-specified model, and consistency is assured if the covariance matrices are consistently estimated.
The two solutions are $\beta$ and $1/\alpha$. Therefore, if the regimes are misspecified and the system satisfies the rank
condition, then the estimates are consistent.
In other words, if the computed covariance matrices satisfy the rank condition, then the estimates are
consistent even if the regimes have been slightly misspecified. On the other hand, if the misspecification is so
large that the system fails the rank condition, then the coefficients are not identified. Hence, the estimated
coefficients should be consistent for small perturbations of the regime definitions.
Remember that the equivalent rank condition is testable. Therefore, the degree of misspecification can
be detected in the applications.
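
The argument above can be verified numerically with population matrices. The sketch below assumes the same bivariate structure as (6.1); the parameter values, the window weights, and the function names are hypothetical. It mixes the two true covariance matrices with weights λr1 and λr2 and solves the quadratic (6.3) on the mixed matrices.

```python
import numpy as np

ALPHA, BETA = 0.5, -0.8  # illustrative structural parameters (assumed)

def reduced_form_cov(var_eta, var_eps):
    """Population covariance matrix with the structure of equation (6.1)."""
    scale = 1.0 / (1.0 - ALPHA * BETA) ** 2
    w12 = ALPHA * var_eta + BETA * var_eps
    return scale * np.array([[ALPHA**2 * var_eta + var_eps, w12],
                             [w12, var_eta + BETA**2 * var_eps]])

def quadratic_coeffs(w1, w2):
    """Coefficients (a, b, c) of a*beta^2 - b*beta + c = 0 in equation (6.3)."""
    a = w1[0, 0] * w2[0, 1] - w1[0, 1] * w2[0, 0]
    b = w1[0, 0] * w2[1, 1] - w1[1, 1] * w2[0, 0]
    c = w1[0, 1] * w2[1, 1] - w1[1, 1] * w2[0, 1]
    return a, b, c

def mixed_windows(true1, true2, lam1, lam2):
    """Computed covariance matrices when the regime windows are misspecified."""
    w_r1 = lam1 * true1 + (1.0 - lam1) * true2
    w_r2 = (1.0 - lam2) * true1 + lam2 * true2
    return w_r1, w_r2

true1 = reduced_form_cov(var_eta=1.0, var_eps=1.0)
true2 = reduced_form_cov(var_eta=9.0, var_eps=1.0)

# Moderately misspecified windows: the roots are unchanged.
a, b, c = quadratic_coeffs(*mixed_windows(true1, true2, 0.8, 0.7))
roots_mixed = np.sort(np.real(np.roots([a, -b, c])))
print(roots_mixed)

# Degenerate windows (lam1 = 1 - lam2): the leading coefficient vanishes.
a_degenerate = quadratic_coeffs(*mixed_windows(true1, true2, 0.6, 0.4))[0]
print(a_degenerate)
```

The misspecification only rescales all three coefficients of the quadratic by the common factor Φ, so the roots do not move until that factor reaches zero.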

6.4.2 Under-specified number of regimes.

Assume the system is described by equations (5.9) and (5.10), and that the data exhibit heteroskedasticity
with S ∗ regimes, where there are no restrictions to the form of the heteroskedasticity. For simplicity denote
the variances of the structural shocks in each regime as follows:
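\[
\sigma^2_{\eta,s} = (1 + \delta_{\eta,s})\,\sigma^2_{\eta,0},
\qquad
\sigma^2_{\varepsilon,s} = (1 + \delta_{\varepsilon,s})\,\sigma^2_{\varepsilon,0},
\qquad \forall s \neq 0,
\]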

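where $\sigma^2_{\eta,s}$ and $\sigma^2_{\varepsilon,s}$ represent the variances of the idiosyncratic shocks in regime $s$, and $\delta_{\eta,s}$ and $\delta_{\varepsilon,s}$ are the
changes of those variances relative to the variances from regime $s = 0$.
Assume that only two regimes are used in the estimation. Without loss of generality assume that the
first window corresponds to the first set of $\hat{s} < S^*$ regimes and that the second window corresponds to the
second set of $S^* - \hat{s}$ regimes. The covariance matrices of each of the misspecified periods are given by:
\[
\Omega_{r1} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2 \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\eta,s} + \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\varepsilon,s}
& \alpha \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\eta,s} + \beta \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\varepsilon,s} \\
\cdot
& \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\eta,s} + \beta^2 \frac{1}{\hat{s}}\sum_{s<\hat{s}} \sigma^2_{\varepsilon,s}
\end{bmatrix}
\]
for the first window, and
\[
\Omega_{r2} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
\alpha^2 \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\eta,s} + \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\varepsilon,s}
& \alpha \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\eta,s} + \beta \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\varepsilon,s} \\
\cdot
& \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\eta,s} + \beta^2 \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \sigma^2_{\varepsilon,s}
\end{bmatrix}
\]
for the second one. The two matrices can be rewritten as
\[
\Omega_{r1} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
(1+\delta_{\eta,r1})\,\alpha^2\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r1})\,\sigma^2_{\varepsilon,0}
& (1+\delta_{\eta,r1})\,\alpha\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r1})\,\beta\sigma^2_{\varepsilon,0} \\
\cdot
& (1+\delta_{\eta,r1})\,\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r1})\,\beta^2\sigma^2_{\varepsilon,0}
\end{bmatrix}
\]
\[
\Omega_{r2} = \frac{1}{(1-\alpha\beta)^2}
\begin{bmatrix}
(1+\delta_{\eta,r2})\,\alpha^2\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r2})\,\sigma^2_{\varepsilon,0}
& (1+\delta_{\eta,r2})\,\alpha\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r2})\,\beta\sigma^2_{\varepsilon,0} \\
\cdot
& (1+\delta_{\eta,r2})\,\sigma^2_{\eta,0} + (1+\delta_{\varepsilon,r2})\,\beta^2\sigma^2_{\varepsilon,0}
\end{bmatrix},
\]
where
\[
\delta_{\eta,r1} = \frac{1}{\hat{s}}\sum_{s<\hat{s}} \delta_{\eta,s}
\quad\text{and}\quad
\delta_{\eta,r2} = \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \delta_{\eta,s} \tag{6.17}
\]
\[
\delta_{\varepsilon,r1} = \frac{1}{\hat{s}}\sum_{s<\hat{s}} \delta_{\varepsilon,s}
\quad\text{and}\quad
\delta_{\varepsilon,r2} = \frac{1}{S^*-\hat{s}}\sum_{s>\hat{s}} \delta_{\varepsilon,s}. \tag{6.18}
\]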

Proposition 22 Assume the true heteroskedasticity is described by $S^*$ regimes and that those covariance
matrices satisfy the rank condition (6.4). Assume that only two regimes have been used in the estimation;
then, if the following conditions are satisfied, the system is identified and its estimates are consistent.

1. The misspecified covariance matrices have to exhibit heteroskedasticity: $\Omega_{r1} \neq \Omega_{r2}$.

2. The misspecified covariance matrices satisfy the rank condition (6.4).

Proof. The first assumption in the proposition is to guarantee that the original system can be identified
if the heteroskedasticity is well specified. In the ill-specified model, identification is achieved if the relative
volatilities change. This is equivalent to
\[
\delta_{\eta,r1} \neq \delta_{\eta,r2} \quad\text{or}\quad \delta_{\varepsilon,r1} \neq \delta_{\varepsilon,r2}. \tag{6.19}
\]
Equation (6.19) indeed guarantees that the two estimated covariance matrices are different. In other words,
it guarantees that the order condition will be satisfied; there is heteroskedasticity in the estimated model.
The next question is, as before, what are the conditions for consistency. Substituting into equation (6.3)
for the computed covariance matrices ($\Omega_{r1}$ and $\Omega_{r2}$), the estimated $\hat{\beta}$ satisfies
\[
\frac{\sigma^2_{\eta,0}\,\sigma^2_{\varepsilon,0}\,\Phi\,\alpha}{(1-\alpha\beta)^3}
\left[\hat{\beta}^2 - \left(\frac{1}{\alpha} + \beta\right)\hat{\beta} + \frac{\beta}{\alpha}\right] = 0, \tag{6.20}
\]
where
\[
\Phi = (1 + \delta_{\varepsilon,r1})(1 + \delta_{\eta,r2}) - (1 + \delta_{\varepsilon,r2})(1 + \delta_{\eta,r1}).
\]
Note that if $\Phi$ is different from zero, then $\hat{\beta}$ solves the same quadratic equation as the original model. $\Phi$ is
different from zero if condition (6.19) is satisfied, and
\[
\frac{\delta_{\eta,r1}}{\delta_{\eta,r2}} \neq \frac{\delta_{\varepsilon,r1}}{\delta_{\varepsilon,r2}}. \tag{6.21}
\]
Condition (6.21) indicates that the change in the variances across the misspecified regimes cannot be
proportional. In other words, this is equivalent to the rank condition discussed before. Again, the two roots
solving equation (6.20) are $\beta$ and $1/\alpha$.
In summary, even though the assumed form of the heteroskedasticity implies a smaller number of regimes
than those exhibited in the data, the system is identified and its estimates are consistent if and only if the
order and rank conditions are satisfied by the misspecified matrices.
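
A small numerical check of the under-specified case follows. It is a sketch under stated assumptions: four true regimes of equal length, the bivariate structure of (6.1), and hypothetical parameter values and function names. The two estimation windows simply average the true regime covariance matrices, as in the expressions for the two misspecified windows above.

```python
import numpy as np

ALPHA, BETA = 0.5, -0.8  # illustrative structural parameters (assumed)

def reduced_form_cov(var_eta, var_eps):
    """Population covariance matrix with the structure of equation (6.1)."""
    scale = 1.0 / (1.0 - ALPHA * BETA) ** 2
    w12 = ALPHA * var_eta + BETA * var_eps
    return scale * np.array([[ALPHA**2 * var_eta + var_eps, w12],
                             [w12, var_eta + BETA**2 * var_eps]])

def window_cov(regimes):
    """Average the true regime covariances inside one estimation window."""
    return np.mean([reduced_form_cov(v_eta, v_eps) for v_eta, v_eps in regimes], axis=0)

def quadratic_coeffs(w1, w2):
    """Coefficients (a, b, c) of a*beta^2 - b*beta + c = 0 in equation (6.3)."""
    a = w1[0, 0] * w2[0, 1] - w1[0, 1] * w2[0, 0]
    b = w1[0, 0] * w2[1, 1] - w1[1, 1] * w2[0, 0]
    c = w1[0, 1] * w2[1, 1] - w1[1, 1] * w2[0, 1]
    return a, b, c

# Four true regimes, each given as (var_eta, var_eps); two windows of two regimes.
true_regimes = [(1.0, 1.0), (2.0, 1.0), (6.0, 1.5), (9.0, 2.0)]
a, b, c = quadratic_coeffs(window_cov(true_regimes[:2]), window_cov(true_regimes[2:]))
roots = np.sort(np.real(np.roots([a, -b, c])))
print(roots)

# Regimes whose differences cancel inside each window: the averaged matrices
# coincide, so there is no heteroskedasticity left across the two windows.
offsetting = [(1.0, 3.0), (3.0, 1.0), (3.0, 1.0), (1.0, 3.0)]
a_bad = quadratic_coeffs(window_cov(offsetting[:2]), window_cov(offsetting[2:]))[0]
print(a_bad)
```

In the first case the roots are again β and 1/α even though only two of the four regimes are modeled; in the second case the first condition of Proposition 22 fails and the quadratic degenerates.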

It is important to mention that if the number of true regimes is smaller than the number of regimes used
in the estimation, then the system of equations does not satisfy the rank condition. In other words, there
are not enough independent equations to identify the system. It should be clear that in those cases the
estimates are inconsistent, and the confidence intervals are infinitely large.
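
The failure of the rank condition in this case can be illustrated with a small simulation. This is a sketch under stated assumptions, not the estimator used in the text: it assumes the same bivariate system p = βq + ε, q = αp + η with a single homoskedastic regime, splits the sample into two artificial windows, and forms the quadratic that identification through heteroskedasticity would solve. Parameter values and function names are hypothetical. Because both windows estimate the same covariance matrix, the coefficients of the quadratic are zero in population, and in a finite sample they reflect only sampling noise.

```python
import numpy as np

def quadratic_coeffs(om1, om2):
    """Coefficients (a, b, c) of a*x**2 - b*x + c = 0, obtained by equating
    (w12 - x*w22) / (w11 - x*w12) across the two covariance matrices."""
    a = om1[1, 1] * om2[0, 1] - om2[1, 1] * om1[0, 1]
    b = om1[1, 1] * om2[0, 0] - om2[1, 1] * om1[0, 0]
    c = om1[0, 1] * om2[0, 0] - om2[0, 1] * om1[0, 0]
    return a, b, c

def simulate(alpha, beta, sd_eps, sd_eta, n, rng):
    """Draw n observations of (p, q) from the assumed structural system."""
    eps = rng.normal(0.0, sd_eps, n)
    eta = rng.normal(0.0, sd_eta, n)
    denom = 1.0 - alpha * beta
    p = (beta * eta + eps) / denom
    q = (eta + alpha * eps) / denom
    return np.vstack([p, q])  # shape (2, n): one row per variable

rng = np.random.default_rng(0)
results = {}
for n in (1_000, 100_000):
    data = simulate(-0.5, 0.8, 1.0, 1.0, n, rng)
    om1 = np.cov(data[:, : n // 2])   # first assumed "regime"
    om2 = np.cov(data[:, n // 2 :])   # second assumed "regime"
    results[n] = quadratic_coeffs(om1, om2)
    print(n, np.round(results[n], 4))

# With identical population matrices every coefficient is exactly zero.
same = np.array([[2.0, 0.3], [0.3, 1.0]])
print(quadratic_coeffs(same, same))
```

The printed coefficients shrink toward zero as the sample grows, so the roots of the sample quadratic are ratios of noise terms and do not settle on β.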
The two cases analyzed in this section are probably the most common forms of misspecification. However,
they are not exhaustive. Depending on the particular application in which the identification is used, and the
possible misspecification problems that could be encountered, the consistency of the methodology should be
explored further.

Figure 6.1: Identification Problem.


Bibliography

Caporale, G. M., A. Cipollini, and N. Spagnolo (2002): “Testing for Contagion: A Conditional
Correlation Analysis,” CEMFE Mimeo.

Chen, S., and S. Khan (1999): “$\sqrt{n}$-Consistent Estimation of Heteroskedastic Sample Selection Models,”
University of Rochester, Mimeo.

Dungey, M., and V. L. Martin (2001): “Contagion Across Financial Markets: An Empirical Assessment,”
Australian National University Mimeo.
Fisher, F. M. (1976): The Identification Problem in Econometrics. Robert E. Krieger Publishing Co., New
York, second edn.

Haavelmo, T. (1947): “Methods of Measuring the Marginal Propensity to Consume,” Journal of the
American Statistical Association, 42, 105–122.
Hogan, V., and R. Rigobon (2003): “Using Unobserved Supply Shocks to Estimate the Returns to
Education,” NBER working paper 9145.
King, M., E. Sentana, and S. Wadhwani (1994): “Volatility and Links Between National Stock Markets,”
Econometrica, 62, 901–33.
Klein, R., and F. Vella (2000a): “Employing Heteroskedasticity to Identify and Estimate Triangular
Semiparametric Models,” Rutgers mimeo.
Klein, R., and F. Vella (2000b): “Identification and Estimation of the Binary Treatment Model Under Heteroskedasticity,”
Rutgers mimeo.
Koopmans, T., H. Rubin, and R. Leipnik (1950): “Measuring the Equation Systems of Dynamic Economics,”
in Statistical Inference in Dynamic Economic Models, Cowles Commission for Research in
Economics, chap. II, pp. 53–237. John Wiley and Sons, New York.
Lee, H. Y., L. Ricci, and R. Rigobon (2004): “Once Again, Is Openness Good for Growth?,”
Journal of Development Economics, 75(2), 451–472.
Rigobon, R. (2000): “A Simple Test for the Stability of Linear Models under Heteroskedasticity, Omitted
Variable, and Endogenous Variable Problems,” MIT Mimeo: http://web.mit.edu/rigobon/www/.
Rigobon, R. (2002): “The Curse of Non-Investment Grade Countries,” Journal of Development Economics,
69(2), 423–449.

Rigobon, R. (2003): “On the Measurement of the International Propagation of Shocks: Is the Transmission
Stable?,” Journal of International Economics, 61, 261–283.


Rigobon, R., and D. Rodrik (2005): “Rule of Law, Democracy, Openness, and Income: Estimating the
Interrelationships,” The Economics of Transition, 13(3), 533–64.
Rigobon, R., and B. Sack (2003): “Measuring the Reaction of Monetary Policy to the Stock Market,”
Quarterly Journal of Economics, 118, 639–669.

Rigobon, R., and B. Sack (2004): “The Impact of Monetary Policy on Asset Prices,” Journal of Monetary Economics, 51,
1553–75.
Sentana, E. (1992): “Identification of Multivariate Conditionally Heteroskedastic Factor Models,” LSE,
FMG Discussion Paper, 139.

Sentana, E., and G. Fiorentini (2001): “Identification, Estimation and Testing of Conditional
Heteroskedastic Factor Models,” Journal of Econometrics, 102(2), 143–164.
Wright, P. G. (1928): The Tariff on Animal and Vegetable Oils, The Institute of Economics. The Macmillan
Company, New York.
