An Introduction To Modern Bayesian Econometrics

Tony Lancaster

May 26, 2003
Contents

Introduction

1 The Bayesian Algorithm
   1.1 Econometric Analysis
   1.2 Statistical Analysis
   1.3 Bayes Theorem
       1.3.1 Parameters and Data
       1.3.2 The Bayesian Algorithm
   1.4 The Components of Bayes Theorem
       1.4.1 The Likelihood p(y|θ)
       1.4.2 The Prior p(θ)
       1.4.3 The Posterior p(θ|y)
   1.5 Conclusion and Summary
   1.6 Exercises and Complements
   1.7 Appendix to Chapter 1: Some Probability Distributions
   1.8 Bibliographic Notes

2 Prediction and Model Criticism
   2.1 Methods of Model Checking
   2.2 Informal Model Checks
       2.2.1 Residual QQ Plots
   2.3 Uncheckable Beliefs?
   2.4 Formal Model Checks
       2.4.1 Predictive Distributions
       2.4.2 The Prior Predictive Distribution
       2.4.3 Using the Prior Predictive Distribution to Check your Model
       2.4.4 Improper Prior Predictive Distributions
       2.4.5 Prediction from Training Samples
   2.5 Posterior Prediction
       2.5.1 Posterior Model Checking
       2.5.2 Sampling the Predictive Distribution
   2.6 Posterior Odds and Model Choice
       2.6.1 Two Approximations to Bayes Factors
   2.7 Enlarging the Model
   2.8 Summary
   2.9 Exercises and Further Examples
   2.10 Bibliographic Notes

3 Linear Regression Models
   3.1 Introduction
   3.2 Economists and Regression Models
       3.2.1 Mean Independence
   3.3 Linear Regression Models
       3.3.1 Independent, Normal, Homoscedastic Errors
       3.3.2 Vague Prior Beliefs about β and τ
       3.3.3 The Two Marginals under a Vague Prior
       3.3.4 Highest Posterior Density Intervals and Regions
       3.3.5 The Least Squares Line
       3.3.6 Informative Prior Beliefs
       3.3.7 Sampling the posterior density of β
       3.3.8 An Approximate Joint Posterior Distribution
   3.4 A Multinomial Approach to Linear Regression
       3.4.1 Comments on the Multinomial Approach
   3.5 Checking and Extending the Normal Linear Model
       3.5.1 Checking
   3.6 Extending the Normal Linear Model
       3.6.1 Criticizing the Gasoline Model
       3.6.2 Generalizing the Error Distribution
       3.6.3 Model Choice
   3.7 Conclusion and Summary of the Argument
   3.8 Appendix: Analytical Results in the Normal Linear Model
   3.9 Appendix: Simulating Dirichlet Variates
   3.10 Appendix: Some Probability Distributions
   3.11 Exercises and Complements
   3.12 Bibliographic Notes

4 Bayesian Calculations
   4.1 Normal Approximations
   4.2 Exact Sampling in One Step
       4.2.1 Rejection Sampling
       4.2.2 Inverting the Distribution Function
   4.3 Markov Chain Monte Carlo
       4.3.1 Markov Chains and Transition Kernels
       4.3.2 The State Distribution, p_t(x)
       4.3.3 Stationary Distributions
       4.3.4 Finding the Stationary Distribution Given a Kernel
       4.3.5 Finite Discrete Chains
       4.3.6 More General Chains
       4.3.7 Convergence
       4.3.8 Ergodicity
       4.3.9 Speed
       4.3.10 Finding Kernels With A Given Stationary Distribution
   4.4 Two General Methods of Constructing Kernels
       4.4.1 The Gibbs Sampler
       4.4.2 The Metropolis Method
       4.4.3 Metropolis-Hastings
       4.4.4 Practical Convergence
       4.4.5 Using Samples from the Posterior
       4.4.6 Calculating the Prior Predictive Density
       4.4.7 Implementing Markov Chain Monte Carlo
   4.5 Exercises and Complements
   4.6 Bibliographic Notes

5 Nonlinear Regression Models
   5.1 Estimation of Production Functions
       5.1.1 Criticisms of this Model
   5.2 Binary Choice
       5.2.1 Probit Likelihoods
       5.2.2 Criticisms of the Probit Model
       5.2.3 Other Models for Binary Choice
   5.3 Ordered Multinomial Choice
       5.3.1 Data Augmentation
       5.3.2 Parameters of Interest
   5.4 Multinomial Choice
   5.5 Tobit Models
       5.5.1 Censored Linear Models
       5.5.2 Censoring and Truncation
       5.5.3 Selection Models
   5.6 Count Data
       5.6.1 Unmeasured Heterogeneity in Nonlinear Regression
       5.6.2 Time Series of Counts
   5.7 Duration Data
       5.7.1 Exponential Durations
       5.7.2 Weibull Durations
       5.7.3 Piecewise Constant Hazards
       5.7.4 Heterogeneous Duration Models
       5.7.5 Concluding Remarks
   5.8 Exercises
   5.9 Appendix to Chapter 5: Some Distributions
       5.9.1 The Lognormal Family
       5.9.2 Truncated Normal Distributions
       5.9.3 The Poisson Family
       5.9.4 The Heterogeneous Poisson or Negative Binomial Family
       5.9.5 The Weibull Family
   5.10 Bibliographic Notes

6 Randomized, Controlled and Observational Data
   6.1 Introduction
   6.2 Designed Experiments
       6.2.1 Randomization
       6.2.2 Controlled Experimentation
       6.2.3 Randomization and Control in Economics
       6.2.4 Exogeneity and Endogeneity in Economics
   6.3 Simpson's Paradox
   6.4 Conclusions
   6.5 Appendix: Koopmans' Views on Exogeneity
   6.6 Bibliographic Notes

7 Models for Panel Data
   7.1 Panel Data
   7.2 How Do Panels Help?
   7.3 Linear Models on Panel Data
       7.3.1 Likelihood
       7.3.2 A Uniform Prior on the Individual Effects
       7.3.3 Exact Sampling
       7.3.4 A Hierarchical Prior
       7.3.5 BUGS Program
       7.3.6 Shrinkage
   7.4 Panel Counts
       7.4.1 A Uniform Prior on the Individual Effects
       7.4.2 A Gamma Prior for the Individual Effects
       7.4.3 Calculation in the Panel Count Model
   7.5 Panel Duration Data
   7.6 Panel Binary Data
       7.6.1 Parameters of Interest
       7.6.2 Choices of Prior
       7.6.3 Orthogonal Reparametrizations
       7.6.4 Implementation of the Model
   7.7 Concluding Remarks
   7.8 Exercises
   7.9 Bibliographic Notes

8 Instrumental Variables
   8.1 Introduction
   8.2 Randomizers and Instruments
   8.3 Models and Instrumental Variables
   8.4 The Structure of a Recursive Equations Model
       8.4.1 Identification
   8.5 Inference in a Recursive System
   8.6 A Numerical Study of Inference with Instrumental Variables
       8.6.1 Generating data for a simulation study
       8.6.2 A BUGS model statement
       8.6.3 Simulation Results
   8.7 An Application of IV Methods to Wages and Education
       8.7.1 Is Education Endogenous?
   8.8 Simultaneous Equations
       8.8.1 Likelihood Identification
   8.9 Bibliographic Notes
   8.10 Instruments via Equilibrium

9 Some Time Series Models
   9.1 First Order Autoregression
       9.1.1 Likelihoods and Priors
       9.1.2 BUGS Implementations
       9.1.3 Prediction
   9.2 Extensions
   9.3 Stochastic Volatility
   9.4 A Second Order Autoregression
   9.5 Exercises
   9.6 Bibliographic Notes

Appendix 1: A Conversion Manual
   The Frequentist Approach
   The Bayesian Contrast

Appendix 2: Programming
   S
   WinBUGS
   Formulating the Model and Inputting Data
   Special Likelihoods and the Ones Trick
   Running the Sampler
   Computing References

Appendix 3: BUGS Code
   Heteroscedastic Regression
   Regression with Autocorrelated Errors
   CES Production Function
   Probit Model
   Tobit Model
   Truncated Normal
   Ordered Probit
   Poisson Regression
   Heterogeneous Poisson Regression
   Right Censored Exponential Data (using the ones trick)
   Right Censored Weibull Data
   A Censored Heterogeneous Weibull Model
   A Simultaneous Equations Model
   A Panel Data Linear Model
   A Second Order Autoregression
   Stochastic Volatility
Introduction
This book is an introduction to the Bayesian approach to econometrics. It is written for students and researchers in applied economics. The book has developed out of teaching econometrics at Brown University, where the typical member of the class is a graduate student in his second year or higher. If he is an economics student he has taken in his first year a semester course on probability and random variables followed by a semester dealing with the elements of inference about linear models from a classical point of view. It is desirable that the reader is familiar with the laws of probability, the ideas of scalar and vector random variables, the notions of marginal, joint and conditional probability distributions, and the simpler limit theorems. The book could, therefore, be studied by upper level undergraduates, particularly in Europe and other countries with European style undergraduate programs.

The mathematics used in the book rarely extends beyond introductory calculus and the rudiments of matrix algebra, and I have tried to limit even this to situations where mathematical analysis clearly seems to give additional insight into a problem. Some facility with computer software for doing statistical calculations would be an advantage, because the book contains many examples and exercises that ask the reader to simulate data and to calculate and plot the probability distributions that are at the heart of Bayesian inference. For simple cases these sums can be done in, for example, Matlab or one of the several variants of the S language. I supply code written in S for many of the examples. More complicated calculations rely on purpose-built Bayesian software, specifically a package with the unlikely name of BUGS, and to make full use of this book it is necessary to obtain and learn to use this package.

Whether it is useful to have previous knowledge of econometrics is debatable. On the one hand it is helpful to have some understanding of the method of least squares and of regression, and of fundamental econometric notions such as endogeneity and structure. On the other hand this book deals exclusively with Bayesian econometrics, and this is a radically different approach to our subject than that used in all existing introductory texts (947, as of January 2003). Because Bayesian inference is different from what is customary it is, in my experience, extraordinarily difficult for ordinary mortals to change their way of thinking from the traditional way to the Bayesian way or vice versa. At least it is for me, and I notice that most of my students face the same problem. This means that someone whose training has been confined to the conventional approach may find this immersion to be a barrier to understanding the Bayesian method. This book is about the Bayesian approach to inference; it is not a book about comparative methods and it contains little about traditional approaches, which are covered in many textbooks.

My aim has been to answer two rather simple questions. The first is "What is Bayesian Econometrics?" and the second is "How do I do it?" In the first chapter I explain that Bayesian Econometrics is nothing more than the systematic application of a single theorem, Bayes theorem. I also provide a brief answer to the second question, namely that to apply this theorem in an econometric investigation the best method, in general, is to use our new computer power to sample from the probability distributions that the theorem requires us to calculate. This is the meaning of the word "Modern" in the book's title. In 1989 the methods described here were scarcely known; in 1995 they would have been difficult for a beginner to apply; in 2003 application of these computer-intensive methods is little, if any, more difficult than application of the methods traditionally used in applied econometrics.

The remainder of the book essentially provides applications of Bayes theorem and illustrations of the method of calculation, using mostly the simplest models; extensions to more complex structures will in many cases be fairly obvious. These illustrations are not comprehensive; indeed, for an (imaginary) reader who gets the point of the opening chapters, they are unnecessary! Bayesian analysis of important economic models has been going on since the 1960s and significant progress has been made with a number of applications. I do not even deal with all those cases in which the method has been applied, but rather confine my examples to cases that I feel comfortable explaining. My hope is that just a few examples will be sufficient to enable the reader to tackle his own problem using what I shall later call The Bayesian Algorithm.

The book could be used as the basis for a one-semester course at graduate or advanced undergraduate level. I have used it as such on several occasions with a teaching style that emphasizes calculation and the practicality of Bayesian methods, demonstrates sampling algorithms, including Markov chain Monte Carlo procedures, in class, and requires students to solve problems numerically. One way to read the book is to get the gist of the Bayesian method from chapters one and two, without necessarily going into the more detailed discussion in these chapters, and then to read chapter four to get a broad understanding of Markov chain Monte Carlo methods. The reader could then choose among the remaining chapters, which are illustrations of the use of Bayesian methods in particular areas of application, according to his or her interests.
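To give a concrete, if deliberately simple, sense of what sampling from a posterior distribution looks like in practice, here is a short sketch in the S language. It is not an example from the book: it assumes hypothetical binary data and a uniform Beta(1,1) prior on a success probability theta, a case in which Bayes theorem, p(theta|y) proportional to p(y|theta)p(theta), yields a Beta posterior that can be sampled directly.

# A minimal sketch (not from the book): a Bernoulli probability theta with a
# uniform Beta(1,1) prior. With s successes in n trials the posterior is
# Beta(1 + s, 1 + n - s); we represent that posterior by a large sample of draws.

set.seed(1)                                   # reproducible simulation
n <- 50                                       # hypothetical number of trials
y <- rbinom(n, size = 1, prob = 0.3)          # simulate binary data
s <- sum(y)                                   # observed number of successes

theta <- rbeta(10000, 1 + s, 1 + n - s)       # 10,000 draws from the posterior

mean(theta)                                   # posterior mean
quantile(theta, c(0.025, 0.975))              # a 95% posterior interval
hist(theta, breaks = 50,
     main = "Posterior draws for theta")      # plot the posterior

In models where no such closed form is available, the call to rbeta is replaced by Markov chain Monte Carlo sampling, typically via BUGS, but the resulting posterior sample is summarized in exactly the same way.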