Econometrics: Specification Errors

1. The document discusses specification errors that can occur when choosing a regression model.
2. It identifies three main types of specification errors: misspecifying the set of regressors (omitting relevant variables or including irrelevant ones), adopting the wrong functional form, and errors of measurement.
3. Omitting relevant variables biases the regression results, while including irrelevant variables leaves the estimates unbiased but inefficient. The correct specification is needed to obtain best linear unbiased estimates.

Uploaded by

Carlos Abeli

Econometrics
Specification Errors

■ Until now we have been assuming that the regression
model chosen for the empirical analysis is “correctly”
specified.

■ Under this assumption our main concern has been
estimating the parameters of the model and testing
hypotheses about them.

■ Before we proceed to the analysis of the various
specification errors, the important question is: Which
criteria should be used when choosing a model?

Econometrics Patrícia Cruz 7-2


• Parsimony: A model can never be a completely accurate
description of reality. One should introduce in the model a
few key variables that capture the essence of the
phenomenon under study and relegate all minor and random
influences to the error term, $e_i$.
• Goodness of fit: The basic drive of regression analysis
is to explain as much of the variation in the dependent
variable as possible by the explanatory variables included in
the model. So a high $R^2$, coefficients with the a priori expected
signs or values, and statistically significant variables are all welcome.
• Theoretical consistency: A model may not be good, despite
a high $R^2$, if one or more of the estimated coefficients have
the wrong signs.

• Predictive power: One of the most important tests of the
validity of a model is its predictive power outside the
sample period (the $R^2$ measures the predictive power of
a model within the sample).
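The distinction between in-sample fit and predictive power outside the sample can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data-generating process, the 15 unrelated regressors, the sample split and the helper `r_squared` are all hypothetical and are not taken from the slides.

```python
import numpy as np

def r_squared(y, y_hat):
    """R^2 = 1 - SSR/SST, with SST taken around the mean of y."""
    ssr = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ssr / sst

rng = np.random.default_rng(0)
n = 60
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # hypothetical data-generating process
noise_vars = rng.normal(size=(n, 15))    # regressors unrelated to y

X_small = np.column_stack([np.ones(n), x])
X_large = np.column_stack([X_small, noise_vars])

fit, hold = slice(0, 30), slice(30, n)   # estimation sample, hold-out sample

r2 = {}
for name, X in [("small", X_small), ("large", X_large)]:
    # OLS coefficients estimated on the first half of the sample only
    b, *_ = np.linalg.lstsq(X[fit], y[fit], rcond=None)
    r2[name] = (r_squared(y[fit], X[fit] @ b), r_squared(y[hold], X[hold] @ b))
    print(f"{name}: in-sample R2 = {r2[name][0]:.3f}, "
          f"out-of-sample R2 = {r2[name][1]:.3f}")
```

Because the larger model nests the smaller one, its in-sample $R^2$ can never be lower. The hold-out $R^2$ carries no such guarantee, which is why it is the more demanding check.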

■ Types of Specification Errors:


1) Misspecifying the set of regressors
1.1) Omitting a relevant variable
1.2) Inclusion of an irrelevant variable
2) Adopting the wrong functional form
3) Errors of measurement



Misspecifying the Set of Regressors
1) Misspecifying the set of regressors
■ Economic theory usually provides a basis for choosing
the variables to include in the regressors matrix, X.
However, economic theory often provides only a
general guideline, so there is uncertainty about
which variables to include or exclude.
■ We now analyze the consequences of misspecifying
the set of regressors relevant to explain the
behavior of the dependent variable. We can have two
types of errors: omitting a relevant variable or including
an irrelevant variable.

■ To motivate the problem of specifying the set of
regressors, consider as an example the following
economic model for a sample of n individuals
$$c_i = \gamma_1 + \gamma_2 r_i + \delta_1 g_i + \delta_2 s_i \qquad \text{(Model A)}$$
where $c$ is beer expenditure, $r$ is income, $g$ is gender and
$s$ is years of schooling.
■ Economic theory suggests that income (r) should be
included in this model as an explanatory variable. The
other variables represent characteristics of the
individuals that may or may not have a systematic effect
on beer consumption.

■ So, alternatively, we may have decided a priori that
neither gender nor years of schooling have a systematic
effect on beer expenditure and propose the model
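$$c_i = \gamma_1 + \gamma_2 r_i \qquad \text{(Model B)}$$
■ Given the economic models A and B, consider the two
competing statistical models:
Model 1: $y = X_1\beta_1 + X_2\beta_2 + e_1$
Model 2: $y = X_1\beta_1 + e_2$
where
$$
y = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}, \quad
X_1 = \begin{bmatrix} 1 & r_1 \\ 1 & r_2 \\ \vdots & \vdots \\ 1 & r_n \end{bmatrix}, \quad
X_2 = \begin{bmatrix} g_1 & s_1 \\ g_2 & s_2 \\ \vdots & \vdots \\ g_n & s_n \end{bmatrix}, \quad
\beta_1 = \begin{bmatrix} \gamma_1 \\ \gamma_2 \end{bmatrix}, \quad
\beta_2 = \begin{bmatrix} \delta_1 \\ \delta_2 \end{bmatrix}
$$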

■ We also have that
$$e_1 \sim N(0, \sigma_1^2 I_n), \qquad e_2 \sim N(0, \sigma_2^2 I_n)$$
■ Model 2 is a restricted version of model 1 that results
from assuming that $\delta_1 = \delta_2 = 0$.
■ We now assume in turn that each model is correct and
analyze the bias and sampling variability of the
corresponding OLS estimators of Model 1 and Model 2.

i. If the variables in $X_2$ do have a systematic impact on $y$
($\beta_2 \neq 0$), model 1 is the correct model. The OLS
estimator in this model has all the desirable properties
(it is the best unbiased estimator, BUE).
ii. If model 1 is the correct model ($\beta_2 \neq 0$) and model 2 is
estimated by OLS, the OLS estimator in this model is
biased unless the variables in $X_1$ and $X_2$ are “orthogonal”,
that is, have zero cross-products ($X_1'X_2 = 0$). However, this OLS estimator
of $\beta_1$ has smaller sampling variability than the OLS
estimator of $\beta_1$ in the correctly specified model.
In fact, in this case $e_2 = X_2\beta_2 + e_1$, that is, the effects of
the omitted variables are captured by the
error term. So,
$$E(e_2 \mid X) = E(X_2\beta_2 + e_1 \mid X) = X_2\beta_2 \neq 0$$
which is a violation of one of the standard model
assumptions.
The OLS estimator of $\beta_1$ in model 2 is
$$
\begin{aligned}
\hat\beta_1 &= (X_1'X_1)^{-1}X_1'y = (X_1'X_1)^{-1}X_1'(X_1\beta_1 + X_2\beta_2 + e_1) \\
&= \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}X_1'e_1
\end{aligned}
$$
Since $\beta_2 \neq 0$, unless $X_1'X_2 = 0$, the expected value of $\hat\beta_1$ is
$$E(\hat\beta_1 \mid X) = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 \neq \beta_1$$
and so $\hat\beta_1$ is a biased estimator of $\beta_1$ when model 2 is estimated.
Notice that this bias does not decrease as the sample size
increases (this OLS estimator is not consistent for $\beta_1$).
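The bias term $(X_1'X_1)^{-1}X_1'X_2\beta_2$ can be checked numerically. The Monte Carlo sketch below assumes a fixed design in the spirit of the beer expenditure example. All parameter values, and the way schooling is made to correlate with income, are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 200, 2000
beta1 = np.array([1.0, 0.5])    # (gamma_1, gamma_2), hypothetical values
beta2 = np.array([-0.8, 0.3])   # (delta_1, delta_2), hypothetical values

# Fixed design: schooling s is built to be correlated with income r,
# so X1'X2 is far from zero.
r = rng.normal(10.0, 2.0, size=n)
g = (rng.random(n) < 0.5).astype(float)
s = 0.5 * r + rng.normal(0.0, 1.0, size=n)
X1 = np.column_stack([np.ones(n), r])
X2 = np.column_stack([g, s])

# Bias predicted by the formula (X1'X1)^{-1} X1'X2 beta2
predicted_bias = np.linalg.solve(X1.T @ X1, X1.T @ X2 @ beta2)

# Model 1 generates the data, model 2 (X2 omitted) is estimated by OLS
estimates = np.empty((reps, 2))
for k in range(reps):
    y = X1 @ beta1 + X2 @ beta2 + rng.normal(size=n)
    estimates[k] = np.linalg.solve(X1.T @ X1, X1.T @ y)

simulated_bias = estimates.mean(axis=0) - beta1
print("predicted bias:", predicted_bias)
print("simulated bias:", simulated_bias)
```

Under these assumptions the average estimation error across replications should line up with the formula, and the slope on income absorbs part of the effect of the omitted schooling variable.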
iii. If the variables in $X_2$ do not have a systematic impact on $y$
($\beta_2 = 0$), model 2 is the correct model. The OLS estimator
in this model has all the desirable properties (BUE).

iv. If $\beta_2 = 0$ and we estimate model 1, then the OLS estimator
in this model is unbiased but it is inefficient relative to the
OLS estimator from the correctly specified model 2.
In fact, in this case, the OLS estimator of $\beta$ in model 1 is
$$
\hat\beta = (X'X)^{-1}X'y
= \left( \begin{bmatrix} X_1' \\ X_2' \end{bmatrix} \begin{bmatrix} X_1 & X_2 \end{bmatrix} \right)^{-1}
\begin{bmatrix} X_1' \\ X_2' \end{bmatrix} y
$$
with $X = [X_1 \;\; X_2]$.

This estimator is still unbiased because
$$
\begin{aligned}
\hat\beta &= (X'X)^{-1}X'y = (X'X)^{-1}X'(X_1\beta_1 + \underbrace{X_2\beta_2}_{=0} + e_1) \\
&= (X'X)^{-1}X'(X\beta + e_1), \quad \text{with } \beta = \begin{bmatrix} \beta_1 \\ 0 \end{bmatrix} \\
&= \beta + (X'X)^{-1}X'e_1
\end{aligned}
$$
so
$$E(\hat\beta \mid X) = \beta = \begin{bmatrix} \beta_1 \\ 0 \end{bmatrix}$$
The inefficiency of this estimator follows because
the OLS estimator for the correctly specified model 2
is known to be the minimum variance estimator.
The inefficiency arises because there is information about
$\beta_2$ that has not been employed, namely that $\beta_2 = 0$.
Summary
• If a relevant variable is omitted from a model, the OLS
estimator is biased (and is not consistent) but has smaller
sampling variability than the unbiased OLS estimator in
the correct model.
• If an irrelevant variable is included in the model, then
the OLS estimator is unbiased but it is not efficient.
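The second point of the summary can be illustrated by simulation. In the sketch below model 2 is the true model ($\beta_2 = 0$), and the income coefficient is estimated both with and without the irrelevant regressors. The design and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 200, 2000
beta1 = np.array([1.0, 0.5])    # (gamma_1, gamma_2), hypothetical values

# Fixed design; gender g and schooling s are irrelevant here (beta2 = 0),
# but s is correlated with income r.
r = rng.normal(10.0, 2.0, size=n)
g = (rng.random(n) < 0.5).astype(float)
s = 0.5 * r + rng.normal(0.0, 1.0, size=n)
X1 = np.column_stack([np.ones(n), r])
X = np.column_stack([X1, g, s])

slope_m1 = np.empty(reps)   # income coefficient from model 1 (over-specified)
slope_m2 = np.empty(reps)   # income coefficient from model 2 (correct)
for k in range(reps):
    y = X1 @ beta1 + rng.normal(size=n)
    slope_m1[k] = np.linalg.solve(X.T @ X, X.T @ y)[1]
    slope_m2[k] = np.linalg.solve(X1.T @ X1, X1.T @ y)[1]

print("mean, model 1:", slope_m1.mean(), " mean, model 2:", slope_m2.mean())
print("var,  model 1:", slope_m1.var(), " var,  model 2:", slope_m2.var())
```

Both averages should sit close to the true value of 0.5, while the variance from model 1 should come out larger. The more strongly the irrelevant regressors correlate with income, the larger the efficiency loss.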

Incorrect Functional Form


2) Incorrect Functional Form
■ A question related to variable selection is that of
choosing the functional form appropriate for a
particular economic relation.
■ Economic theory, while useful in helping to identify the
variables that may be relevant in a particular problem,
gives very little guidance in choosing the adequate
functional form.
■ Having this in mind, we should opt for a functional
form whose characteristics reflect the economic nature
of the relationship.
■ When the economic logic is not sufficient to prescribe
a particular kind of functional form, it might be
desirable to see which functional form is most
supported by the data.

■ We already saw how to do this when the several
models are nested or at least have the same dependent
variable. The other situations (such as choosing
between a log-linear and a linear-log model) are not in
the scope of the course.

Errors of Measurement
3) Errors of Measurement
■ So far we have been assuming that both the dependent
variable and the explanatory variables are measured
without any errors.
■ That is, we are assuming that the data on these variables
are accurate: they are not guess estimates, extrapolated,
interpolated or rounded off in any systematic manner.
■ Unfortunately, this ideal is not met in practice for
several reasons, such as nonresponse errors, reporting
errors and computing errors.
■ So, measurement error is a potential problem and
constitutes another example of specification bias.
• Errors of measurement in the dependent variable
When the errors of measurement are in the dependent
variable, the OLS estimators of the $\beta$'s are still
unbiased and consistent, but their variances
are now larger (they are less efficient) than in the case where
there are no such errors of measurement.

• Errors of measurement in the explanatory variables
When the errors of measurement are in the explanatory
variables, the OLS estimators of the $\beta$'s are not only
biased but also not consistent, that is, they remain
biased even when the sample size $n$ tends to infinity.
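The first case can be sketched with a simulation. It assumes a measurement error $v$ in the dependent variable that has mean zero and is unrelated to both the regressor and the model error. The symbols `y_true`, `y_obs` and `v`, and all numerical values, are hypothetical and not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 3000
b1, b2 = 2.0, 0.8               # hypothetical true coefficients
x = rng.normal(5.0, 1.5, size=n)
X = np.column_stack([np.ones(n), x])

slope_clean = np.empty(reps)    # y measured without error
slope_noisy = np.empty(reps)    # y measured with error v
for k in range(reps):
    y_true = b1 + b2 * x + rng.normal(size=n)
    v = rng.normal(0.0, 1.0, size=n)       # measurement error in y
    y_obs = y_true + v
    slope_clean[k] = np.linalg.solve(X.T @ X, X.T @ y_true)[1]
    slope_noisy[k] = np.linalg.solve(X.T @ X, X.T @ y_obs)[1]

print("mean slope:", slope_clean.mean(), slope_noisy.mean())
print("var  slope:", slope_clean.var(), slope_noisy.var())
```

Under these assumptions the measurement error simply adds to the disturbance, so both averages should stay near 0.8 while the variance of the slope estimator increases.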

■ Example: The permanent income hypothesis states
that consumption at a point in time is determined not
only by the individual’s current income but also by their
expected income in future years, that is, their "permanent
income". Changes in permanent income, rather than
changes in temporary income, are what have a significant
impact on the individual's consumption pattern.

Accordingly, consider the model
$$y_i = \beta_1 + \beta_2 x_i^* + e_i$$
where $y$ is current consumption expenditure and $x^*$ is
permanent income.
■ Since $x^*$ is not directly measurable, suppose we use an
observable income variable, $x$, such that
$$x_i = x_i^* + w_i$$
where $w$ represents the errors of measurement in $x^*$.
Therefore, we estimate the model
$$
\begin{aligned}
y_i &= \beta_1 + \beta_2 (x_i - w_i) + e_i \\
&= \beta_1 + \beta_2 x_i + \underbrace{(e_i - \beta_2 w_i)}_{u_i}
\end{aligned}
$$
■ Assuming that $E(w_i) = 0$ and that $w$ and $e$ are unrelated,
$$
\begin{aligned}
\operatorname{cov}(u_i, x_i) &= E[(u_i - \underbrace{E(u_i)}_{=0})\underbrace{(x_i - E(x_i))}_{w_i}] = E[(e_i - \beta_2 w_i)\, w_i] \\
&= -\beta_2 E(w_i^2) = -\beta_2 \sigma_w^2 \neq 0
\end{aligned}
$$

■ So, in the example above, the regressor is not strictly
exogenous and the OLS estimators of the regression
coefficients are biased and also not consistent.
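The lack of consistency can be seen in a simulation of the permanent income example. The sketch departs from the derivation above in one respect: $x^*$ is drawn at random, so that the standard errors-in-variables result $\operatorname{plim} \hat\beta_2 = \beta_2\,\sigma_{x^*}^2 / (\sigma_{x^*}^2 + \sigma_w^2)$ applies. All numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(11)
b1, b2 = 1.0, 0.8                 # hypothetical true coefficients
var_xstar, var_w = 4.0, 1.0       # variances of permanent income and of w

# Probability limit of the OLS slope under classical measurement error
attenuated = b2 * var_xstar / (var_xstar + var_w)

slopes = {}
for n in (100, 10_000, 200_000):
    x_star = rng.normal(0.0, np.sqrt(var_xstar), size=n)   # permanent income
    w = rng.normal(0.0, np.sqrt(var_w), size=n)            # measurement error
    e = rng.normal(size=n)
    y = b1 + b2 * x_star + e
    x = x_star + w                                         # observed income
    # OLS slope of y on x: sample cov(x, y) / sample var(x)
    slopes[n] = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    print(f"n = {n:>7}: OLS slope = {slopes[n]:.4f}")

print("true beta_2 =", b2, " attenuated limit =", attenuated)
```

As $n$ grows the slope should settle near the attenuated limit rather than near the true $\beta_2$, which is what a lack of consistency means here.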

■ Even if the errors of measurement are detected or
suspected, the remedies are often not easy. Thus, it is
very important that the researcher is careful in stating
the sources of his/her data, how they were collected,
what definitions were used, etc.