Tutorial Session 12 - Model Selection Solution
1. Explain, in your own words, what is meant by each of the following types of specification errors:
2. What is omitted variable bias? Why is it a problem? How do you try to prevent it? Explain.
Omitted variable bias occurs when a relevant independent variable, one that affects the
dependent variable and is correlated with one or more of the included independent variables,
is omitted from the multiple linear regression analysis.
Omitted variable bias is a serious problem because it results in biased coefficient estimates
for the variables that remain in the model.
You can attempt to control for potential omitted variable bias by using economic theory to guide
your choice of independent variables included in your multiple linear regression analysis, or by
using more advanced econometric techniques, e.g. introducing control variables.
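The bias can be illustrated with a small simulation. The sketch below is a hypothetical example, not part of the tutorial data: the variable names, the true coefficients (2 on x1, 3 on x2) and the 0.8 link between x1 and x2 are all assumptions chosen for illustration. It compares the slope on x1 when x2 is omitted with the slope when x2 is included.

```python
import random

def center(values):
    """Subtract the sample mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    xc, yc = center(x), center(y)
    return sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)

def residuals(x, y):
    """Residuals from regressing y on x (with an intercept)."""
    xc, yc = center(x), center(y)
    b = slope(x, y)
    return [yi - b * xi for xi, yi in zip(xc, yc)]

rng = random.Random(12)  # fixed seed so the run is repeatable
n = 5000

# Assumed data-generating process: x2 affects y AND is correlated with x1.
x2 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [0.8 * v + rng.gauss(0, 1) for v in x2]
y = [1 + 2 * a + 3 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# Short regression: y on x1 only (x2 omitted).
b_short = slope(x1, y)

# Long regression: coefficient on x1 when x2 is also included, obtained by
# partialling x2 out of both y and x1 (Frisch-Waugh-Lovell).
b_long = slope(residuals(x2, x1), residuals(x2, y))

print(f"true coefficient on x1:        2.000")
print(f"estimate with x2 omitted:      {b_short:.3f}")
print(f"estimate with x2 included:     {b_long:.3f}")
```

Under these assumptions the short-regression slope converges to 2 + 3 × 0.8 / 1.64 ≈ 3.46 rather than 2, so the omitted variable's effect is wrongly attributed to x1.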
3. What is the inclusion of an irrelevant variable? Why is it a problem? How do you try to prevent
it? Explain.
The inclusion of an irrelevant variable occurs when an independent variable that has no actual
effect on the dependent variable is included in the multiple linear regression analysis.
It is a problem because, although the coefficient estimates remain unbiased, their variances are
larger than necessary, so the estimates are less precise and hypothesis tests are less powerful.
You can attempt to prevent the inclusion of an irrelevant variable by using economic theory to
guide your choice of independent variables included in your multiple linear regression analysis.
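A standard result is that an irrelevant regressor leaves the other coefficient estimates unbiased but makes them less precise when it is correlated with the included variables. The Monte Carlo sketch below is a hypothetical illustration of that result: the sample size, the true coefficient of 2 on x1, and the correlation of 0.9 between x1 and the irrelevant variable z are assumptions, not values from the tutorial.

```python
import math
import random
from statistics import mean, pvariance

def center(values):
    """Subtract the sample mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    xc, yc = center(x), center(y)
    return sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)

def residuals(x, y):
    """Residuals from regressing y on x (with an intercept)."""
    xc, yc = center(x), center(y)
    b = slope(x, y)
    return [yi - b * xi for xi, yi in zip(xc, yc)]

rng = random.Random(12)  # fixed seed so the run is repeatable
n, reps = 50, 500
rho = 0.9                # assumed correlation between x1 and z
est_correct, est_padded = [], []

for _ in range(reps):
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    # z is correlated with x1 but does NOT enter the equation for y.
    z = [rho * v + math.sqrt(1 - rho**2) * rng.gauss(0, 1) for v in x1]
    y = [1 + 2 * v + rng.gauss(0, 1) for v in x1]

    # Correct model: y on x1 only.
    est_correct.append(slope(x1, y))
    # Over-specified model: y on x1 and z; coefficient on x1 obtained by
    # partialling z out of both y and x1 (Frisch-Waugh-Lovell).
    est_padded.append(slope(residuals(z, x1), residuals(z, y)))

print(f"mean estimate, correct model:  {mean(est_correct):.3f}")
print(f"mean estimate, with z added:   {mean(est_padded):.3f}")
print(f"variance, correct model:       {pvariance(est_correct):.4f}")
print(f"variance, with z added:        {pvariance(est_padded):.4f}")
```

Both sets of estimates centre on the true value, but the spread of the estimates is visibly wider once the irrelevant variable is included.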
• omitted variables.
7. What is the basic difference between the traditional approach to model selection and Hendry’s
approach?
The traditional approach moves from the simple to the general (the bottom-up approach),
whereas Hendry’s approach goes from the general to the specific (the top-down approach).
8. A major coffee importer is interested in knowing what the sensitivity of the demand for coffee
is to its own price, and whether other drinks such as tea and cola are substitutes. It can supply
you with a set of annual data giving the demand for coffee, the prices of coffee, tea and cola,
for the period 1960-1999. You estimate the following model:
------------------------------------------------------------------------------
     ccoffee |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        ptea |   11.69218   15.90781     0.73   0.467    -20.57035     43.9547
       pcola |   8.514633   8.677161     0.98   0.333    -9.083465    26.11273
     pcoffee |  -10.19486   10.07888    -1.01   0.319    -30.63578    10.24606
       _cons |   160.0731   35.40204     4.52   0.000     88.27439    231.8717
------------------------------------------------------------------------------
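As a reading aid, the columns of this output are linked by simple arithmetic: each t ratio is the coefficient divided by its standard error, and the half-width of each confidence interval divided by the standard error gives the critical t value that was used. The sketch below only re-derives numbers already printed in the table; the dictionary and variable names are illustrative.

```python
# (coefficient, std. error, reported t, CI lower, CI upper), copied from the table.
model1 = {
    "ptea":    (11.69218, 15.90781,  0.73, -20.57035,  43.9547),
    "pcola":   (8.514633, 8.677161,  0.98, -9.083465, 26.11273),
    "pcoffee": (-10.19486, 10.07888, -1.01, -30.63578, 10.24606),
    "_cons":   (160.0731, 35.40204,  4.52,  88.27439, 231.8717),
}

t_ratios = {}
critical = {}
for name, (coef, se, t_reported, lower, upper) in model1.items():
    # t ratio for H0: coefficient = 0.
    t_ratios[name] = coef / se
    # The interval is coef +/- t_crit * se, so its half-width over se is t_crit.
    critical[name] = (upper - lower) / (2 * se)
    print(f"{name:>8}: t = {t_ratios[name]:6.2f} (reported {t_reported:5.2f}), "
          f"implied critical value = {critical[name]:.3f}")
```

The implied critical value is about 2.03 for every row, which would be consistent with 36 degrees of freedom (40 annual observations less 4 estimated coefficients). None of the three price t ratios exceeds it in absolute value.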
Comment on each of the following:
8.4 Do you think that you have omitted any important variables from your model? If so,
which ones?
Possibly income, the prices of complements, size of population, etc.
You then add an additional variable, per capita income (pcy), to your model, obtaining the
following results:
------------------------------------------------------------------------------
     ccoffee |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        ptea |    18.9075   8.399881     2.25   0.031     1.854837    35.96017
       pcola |   5.942891   4.571652     1.30   0.202    -3.338056    15.22384
     pcoffee |  -17.78683   5.358168    -3.32   0.002    -28.66448   -6.909167
         pcy |   .4572321   .0468804     9.75   0.000       .36206    .5524043
       _cons |   -312.566   51.91447    -6.02   0.000    -417.9579    -207.174
------------------------------------------------------------------------------
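To make the comparison between the two sets of results concrete, the sketch below places the price coefficients, standard errors and t ratios from the model without pcy next to those from the model with pcy. It uses only numbers printed in the two tables; the dictionary and variable names are illustrative.

```python
# (coefficient, std. error) for the price variables, copied from the two tables.
without_pcy = {
    "ptea":    (11.69218, 15.90781),
    "pcola":   (8.514633, 8.677161),
    "pcoffee": (-10.19486, 10.07888),
}
with_pcy = {
    "ptea":    (18.9075, 8.399881),
    "pcola":   (5.942891, 4.571652),
    "pcoffee": (-17.78683, 5.358168),
}

se_ratio = {}
t_before = {}
t_after = {}
for name in without_pcy:
    b0, s0 = without_pcy[name]
    b1, s1 = with_pcy[name]
    se_ratio[name] = s1 / s0   # below 1 means the estimate became more precise
    t_before[name] = b0 / s0
    t_after[name] = b1 / s1
    print(f"{name:>8}: coef {b0:9.3f} -> {b1:9.3f} | "
          f"s.e. ratio {se_ratio[name]:.2f} | "
          f"t {t_before[name]:6.2f} -> {t_after[name]:6.2f}")
```

On these figures the standard errors of all three price coefficients roughly halve once pcy is included, and the t ratios on ptea and pcoffee move beyond 2 in absolute value while the one on pcola does not.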
Comment on how the addition of the new variable has impacted on each of the following: