Problem Set 6
1. The critical value in the F-distribution depends on the degrees of freedom in the
numerator and denominator. How do you find the degrees of freedom in the
numerator?
(a) It is the number of observations minus the number of coefficients estimated
(N − K)
(b) It is the number of hypotheses being tested simultaneously (J)
(c) It is the number of coefficients being estimated (K)
(d) It is the number of observations minus the number of hypotheses tested (N − J)
2. The critical value in the F-distribution depends on the degrees of freedom in the
numerator and denominator. How do you find the degrees of freedom in the
denominator?
(a) It is the number of observations minus the number of coefficients estimated
(N − K)
(b) It is the number of hypotheses being tested simultaneously (J)
(c) It is the number of coefficients being estimated (K)
(d) It is the number of observations minus the number of hypotheses tested (N − J)
3. When performing an F-test, if the null hypothesis is H0 : β1 = β2 = 0, what is
the alternative hypothesis?
(a) β1 ≠ 0 and β2 ≠ 0
(b) β1 ≠ 0 or β2 ≠ 0
(c) (β1 ≠ 0 and β2 = 0) or (β1 = 0 and β2 ≠ 0)
(d) β1 = β2 ≠ 0
4. How does omitting a relevant variable from a regression model affect the estimated
coefficient of other variables in the model?
(a) they are biased downward and have smaller standard errors
(b) they are biased upward and have larger standard errors
(c) they are biased and the bias can be negative or positive
(d) they are unbiased but have larger standard errors
5. How does including an irrelevant variable in a regression model affect the estimated
coefficient of other variables in the model?
(a) they are biased downward and have smaller standard errors
(b) they are biased upward and have larger standard errors
(c) they are biased and the bias can be negative or positive
(d) they are unbiased but have larger standard errors
6. Which of the following measures is NOT used to evaluate model specification?
(a) The adjusted R²
(b) Akaike Information Criterion
(c) Bayesian Information Criterion
(d) Jarque-Bera test
7. When are the R² and adjusted R² equal?
(a) When the model is correctly specified
(b) When K = 1
(c) When the error terms are normally distributed
(d) When an unrestricted model is estimated
8. When highly collinear variables are included in an econometric model, coefficient
estimates are
(a) biased downward and have smaller standard errors
(b) biased upward and have larger standard errors
(c) biased and the bias can be negative or positive
(d) unbiased but have larger standard errors
9. When a set of variables with perfect collinearity is included in an econometric
model, coefficient estimates are
(a) undefined
(b) unbiased
(c) biased upward
(d) biased, but the direction is unclear
10. If your regression results show a high R², adjusted R², and a significant F-test, but low
t-values for the coefficients, what is the most likely cause?
(a) omitted relevant variables
(b) irrelevant variables have been included
(c) multicollinearity
(d) heteroskedasticity
Analytical Questions
of costs on the level of output, the square of output (output_sq), and the cube of
output (output_cub). Some of the regression output has been hidden.
Model 1
reg costs output
      Source |       SS       df       MS              Number of obs =
-------------+------------------------------           F(  1,    58) =  662.73
       Model |  733.336303     1  733.336303           Prob > F      =  0.0000
    Residual |  97.3749935    58  1.10653402           R-squared     =  0.8828
-------------+------------------------------           Adj R-squared =  0.8814
       Total |  830.711297    59  9.33383479           Root MSE      =  1.0519
------------------------------------------------------------------------------
       costs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      output |   .5000000   .0250000             0.000
       _cons |   .6501553   .1677777     3.88    0.000     .3167323    .9835782
------------------------------------------------------------------------------
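The hidden entries in the Model 1 output can be recovered from the entries that are shown. The sketch below is one possible approach, not necessarily the intended solution. It relies on three standard relationships: the t statistic is the coefficient divided by its standard error, the number of observations is the total degrees of freedom plus one, and the reported interval is the coefficient plus or minus a critical value times the standard error. The critical value is backed out of the fully reported _cons row rather than looked up in a table. The variable names are illustrative.

```python
# Values copied from the Model 1 output above.
coef_output, se_output = 0.5000000, 0.0250000
coef_cons, se_cons = 0.6501553, 0.1677777
cons_upper = 0.9835782
total_df = 59

# t statistic for H0: the coefficient on output is zero.
t_output = coef_output / se_output

# Number of observations = total degrees of freedom + 1.
n_obs = total_df + 1

# Back out the critical value used for the reported _cons interval:
# upper bound = coefficient + t_crit * standard error.
t_crit = (cons_upper - coef_cons) / se_cons

# Apply the same critical value to the output coefficient.
ci_output = (coef_output - t_crit * se_output,
             coef_output + t_crit * se_output)

print("t =", round(t_output, 2))
print("N =", n_obs)
print("95% CI for output:", tuple(round(b, 4) for b in ci_output))
```

Backing the critical value out of a complete row avoids a table lookup and keeps the reconstructed interval consistent with the rest of the printed output.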
Model 2
reg log_cost log_output
Model 3
reg costs output output_sq output_cub
12. You have time series data for the period 1935-2000. You are given an estimate
of the effects of income (measured in £billion) and interest rates, (measured in
percentage points) on aggregate consumption expenditure (measured in £billion).
Ĉons = 10.00 + 0.90 Income − 6.00 IntRate        TSS = 70    ESS = 10
      (1.00)  (0.45)          (2.00)
You then split the data into two periods and run two separate regressions.
Test the hypothesis that the data could be pooled across both time periods and
estimated as a single equation.
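A pooling hypothesis of this kind is commonly tested with a Chow test. The sketch below shows the mechanics under stated assumptions: ESS = 10 is read as the residual sum of squares of the pooled regression, the data are annual so that 1935-2000 gives 66 observations, and K = 3 coefficients are estimated in each equation. The function name chow_f and the two sub-period residual sums of squares passed to it are hypothetical placeholders, not values from the problem.

```python
def chow_f(rss_pooled, rss_1, rss_2, n_obs, k):
    """Chow F statistic for equal coefficients across two sub-samples.

    rss_pooled: residual sum of squares of the single pooled regression
    rss_1, rss_2: residual sums of squares of the two sub-period regressions
    n_obs: total number of observations; k: coefficients per equation
    """
    rss_unrestricted = rss_1 + rss_2
    numerator = (rss_pooled - rss_unrestricted) / k
    denominator = rss_unrestricted / (n_obs - 2 * k)
    return numerator / denominator


n_obs = 2000 - 1935 + 1   # 66, assuming annual data with both end years included
k = 3                     # intercept, Income, IntRate

# rss_1 and rss_2 below are HYPOTHETICAL placeholders for illustration only.
f_stat = chow_f(rss_pooled=10.0, rss_1=4.0, rss_2=4.0, n_obs=n_obs, k=k)
print("F =", round(f_stat, 2), "with", k, "and", n_obs - 2 * k, "degrees of freedom")
```

The statistic is compared with the F critical value with K and N − 2K degrees of freedom (roughly 2.76 at the 5% level for 3 and 60 degrees of freedom). Pooling is rejected when the statistic exceeds it.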
13. Consider the model Y = β0 + β1 X + u
(a) What is the formula for the Ordinary Least Squares estimate of β1 ?
(b) Under what conditions will Ordinary Least Squares produce an unbiased and
efficient estimate of β1 ?
(c) Prove that the Ordinary Least Squares estimate of β1 is unbiased.
14. A researcher is interested in how the proportion of the household budget spent on
transportation (WTRANS) depends on total household expenditure (measured in logs,
LOGEXP), the age of the household head (AGE) and the number of children
in the household (NUMKIDS). The researcher produces the following table of
estimates:
                    WTRANS
Log expenditure     0.0414
                   (0.0071)
Age of HH head     -0.0001
                   (0.0004)
No. of children    -0.0130
                   (0.0055)
Constant           -0.0315
                   (0.0322)
R²                  0.0247
N                   1,519
Standard errors reported in parentheses
(a) What was the theoretical model the researcher took to the data?
(b) Write down the estimated model
(c) Interpret the estimates
(d) Are there any variables you would exclude from the model? Why, or why
not?
(e) Predict the proportion of a budget that will be spent on transportation for a
one-child household when total expenditure and age are set at their sample
means (98.7 and 36, respectively).
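The prediction in part (e) amounts to plugging values into the estimated equation. The following is a minimal sketch, assuming that LOGEXP is the natural logarithm of total expenditure and that 98.7 is the sample mean of expenditure in levels, so it must be logged before use. The function name predict_wtrans is illustrative.

```python
import math

# Estimates copied from the table above.
b_const, b_logexp, b_age, b_kids = -0.0315, 0.0414, -0.0001, -0.0130


def predict_wtrans(total_exp, age, num_kids):
    """Predicted transportation budget share.

    Assumes LOGEXP is the natural log of total expenditure.
    """
    return (b_const
            + b_logexp * math.log(total_exp)
            + b_age * age
            + b_kids * num_kids)


share = predict_wtrans(total_exp=98.7, age=36, num_kids=1)
print("Predicted WTRANS:", round(share, 4))
```

Under these assumptions the predicted share is roughly 0.14, or about 14% of the household budget. If 98.7 were instead the mean of LOGEXP itself, the log transformation would have to be dropped.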
Practical Questions
15. When estimating wage equations we expect that young, inexperienced workers will
have relatively low wages and that with additional experience their wages will rise,
but then begin to decline after middle age, as the worker nears retirement. This
lifecycle pattern of wages can be captured by introducing experience and the square
of experience to explain the level of wages.
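With a quadratic in experience, predicted wages peak where the marginal effect of experience is zero, that is at experience = −b1 / (2 × b2), where b1 is the coefficient on experience and b2 the coefficient on its square. The sketch below illustrates this. The function name turning_point and the coefficient values are hypothetical and are not estimates from any data set used in this problem set.

```python
def turning_point(b_exper, b_exper_sq):
    """Experience level at which predicted wages peak in a quadratic profile.

    The marginal effect b_exper + 2 * b_exper_sq * exper is set to zero.
    A peak (rather than a trough) requires b_exper_sq < 0.
    """
    if b_exper_sq >= 0:
        raise ValueError("a peak requires a negative coefficient on experience squared")
    return -b_exper / (2 * b_exper_sq)


# HYPOTHETICAL coefficients, for illustration only.
peak = turning_point(b_exper=0.40, b_exper_sq=-0.008)
print("Wages peak at about", round(peak, 1), "years of experience")
```

A positive coefficient on experience together with a negative coefficient on its square produces the rise-then-decline pattern described above.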
Consider the theoretical model
16. The file cocaine.dta available on Moodle contains 56 observations on variables
related to sales of cocaine powder in northeastern California over the period 1984-
1991. The data are a subset of those used in the study
Caulkins, J.P. and R. Padman (1993) “Quantity Discounts and Quality Premia
for Illicit Drugs” Journal of the American Statistical Association, 88, 748-757
PRICE = β0 + β1 QUANT + β2 QUAL + β3 TREND + u
(a) What signs would you expect for the coefficients β1, β2 and β3? Explain.
(b) Estimate the model in STATA and interpret the coefficient estimates. Do the
signs of the coefficients conform to your expectations?
(c) What proportion of the variation in cocaine prices is explained jointly by
variation in quantity, quality and time?
17. Use the data cpseduc.dta to estimate the following wage equation: