Basic Econometrics Chapter 3 Solutions
CHAPTER 3:
TWO-VARIABLE REGRESSION MODEL:
THE PROBLEM OF ESTIMATION
3.1
(1) $Y_i = \beta_1 + \beta_2 X_i + u_i$. Therefore,
$E(Y_i \mid X_i) = E[(\beta_1 + \beta_2 X_i + u_i) \mid X_i]$
$= \beta_1 + \beta_2 X_i + E(u_i \mid X_i)$, since the $\beta$'s are constants and $X$
is nonstochastic,
$= \beta_1 + \beta_2 X_i$, since $E(u_i \mid X_i)$ is zero by assumption.
(2) Given $\text{cov}(u_i, u_j) = 0$ for all $i, j$ ($i \neq j$), then
$\text{cov}(Y_i, Y_j) = E\{[Y_i - E(Y_i)][Y_j - E(Y_j)]\}$
$= E(u_i u_j)$, from the results in (1),
$= E(u_i)E(u_j)$, because the error terms are not
correlated by assumption,
$= 0$, since each $u_i$ has zero mean by assumption.
(3) Given $\text{var}(u_i \mid X_i) = \sigma^2$, $\text{var}(Y_i \mid X_i) = E[Y_i - E(Y_i)]^2 = E(u_i^2) =
\text{var}(u_i \mid X_i) = \sigma^2$, by assumption.
3.2
       $Y_i$   $X_i$   $y_i$   $x_i$   $x_i y_i$   $x_i^2$
         4      1      -3      -3        9         9
         5      4      -2       0        0         0
         7      5       0       1        0         1
        12      6       5       2       10         4
------------------------------------------------------------
sum     28     16       0       0       19        14
------------------------------------------------------------
Note: $\bar{Y} = 7$ and $\bar{X} = 4$
Therefore, $\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{19}{14} = 1.357$; $\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} = 1.571$
3.3
3.4
Imposing the first restriction, we obtain:
$\sum \hat{u}_i = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) = 0$
Simplifying this yields the first normal equation.
Imposing the second restriction, we obtain:
$\sum \hat{u}_i X_i = \sum [(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i) X_i] = 0$
Simplifying this yields the second normal equation.
The first restriction corresponds to the assumption that $E(u_i \mid X_i) = 0$.
The second restriction corresponds to the assumption that the
population error term is uncorrelated with the explanatory variable
$X_i$, i.e., $\text{cov}(u_i, X_i) = 0$.
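As a numerical check on the two restrictions, the sketch below fits the line to the four observations tabulated in Exercise 3.2 and confirms that the residuals sum to zero and are orthogonal to X. The helper name `ols_two_variable` is hypothetical and introduced only for this illustration.

```python
def ols_two_variable(X, Y):
    """Return (b1, b2) for Y = b1 + b2*X using the deviation-form formulas."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2

# Data from Exercise 3.2
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]

b1, b2 = ols_two_variable(X, Y)
residuals = [yi - b1 - b2 * xi for xi, yi in zip(X, Y)]

# First restriction: the residuals sum to zero.
sum_u = sum(residuals)
# Second restriction: the residuals are orthogonal to X.
sum_uX = sum(u * xi for u, xi in zip(residuals, X))

print(round(b2, 3), round(b1, 3))
print(abs(sum_u) < 1e-9, abs(sum_uX) < 1e-9)
```

Both sums are zero only up to floating-point rounding, which is why the check uses a small tolerance rather than exact equality.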
3.5
Now $r^2 = \frac{(\sum x_i y_i)^2}{\sum x_i^2 \sum y_i^2} \leq 1$, by analogy with the Cauchy-Schwarz
inequality. This also holds true of $\rho^2$, the squared population
correlation coefficient.
3.6
Note that:
$\hat{\beta}_{yx} = \frac{\sum x_i y_i}{\sum x_i^2}$ and $\hat{\beta}_{xy} = \frac{\sum x_i y_i}{\sum y_i^2}$
Multiplying the two, we obtain the expression for $r^2$, the squared
sample correlation coefficient.
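A quick numerical illustration, assuming the four observations from Exercise 3.2 as sample data: the product of the slope from regressing Y on X and the slope from regressing X on Y reproduces the squared sample correlation, which is also no larger than one. The variable names are illustrative.

```python
import math

# Data from Exercise 3.2
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
n = len(X)

# Deviations from the sample means
x = [xi - sum(X) / n for xi in X]
y = [yi - sum(Y) / n for yi in Y]

sxy = sum(a * b for a, b in zip(x, y))
sxx = sum(a * a for a in x)
syy = sum(b * b for b in y)

b_yx = sxy / sxx                  # slope of Y on X
b_xy = sxy / syy                  # slope of X on Y
r = sxy / math.sqrt(sxx * syy)    # sample correlation coefficient

print(round(b_yx * b_xy, 4), round(r ** 2, 4))
```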
3.7
3.8
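$\bar{Y} = \bar{X} = \frac{n+1}{2}$ and the correlation between the two rankings is
$r = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2 \sum y_i^2}}$    (1)
where small letters as usual denote deviation from the mean values.
Since the rankings are permutations of the first n natural numbers,
$\sum x_i^2 = \sum X_i^2 - \frac{(\sum X_i)^2}{n} = \frac{n(n+1)(2n+1)}{6} - \frac{n(n+1)^2}{4} = \frac{n(n^2-1)}{12}$
and similarly, $\sum y_i^2 = \frac{n(n^2-1)}{12}$. Then
$\sum d_i^2 = \sum (X_i - Y_i)^2 = \sum (X_i^2 + Y_i^2 - 2X_iY_i) = \frac{2n(n+1)(2n+1)}{6} - 2\sum X_iY_i$
Therefore,
$\sum X_iY_i = \frac{n(n+1)(2n+1)}{6} - \frac{\sum d_i^2}{2}$    (2)
Since $\sum x_i y_i = \sum X_iY_i - \frac{\sum X_i \sum Y_i}{n}$, using (2) we obtain
$\sum x_i y_i = \frac{n(n+1)(2n+1)}{6} - \frac{\sum d_i^2}{2} - \frac{n(n+1)^2}{4} = \frac{n(n^2-1)}{12} - \frac{\sum d_i^2}{2}$    (3)
Now substituting the preceding equations in (1), you will get the
answer.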
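For completeness, the final substitution can be written out as a worked step. Here $d_i = X_i - Y_i$ denotes the difference between the two ranks of observation $i$ and $r_s$ the rank correlation; the sums of squares of the rank deviations both equal $n(n^2-1)/12$ because each ranking is a permutation of $1, \dots, n$.

```latex
r_s = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2 \sum y_i^2}}
    = \frac{\dfrac{n(n^2-1)}{12} - \dfrac{\sum d_i^2}{2}}{\dfrac{n(n^2-1)}{12}}
    = 1 - \frac{6 \sum d_i^2}{n(n^2-1)}
```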
3.9
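(a) $\text{var}(\hat{\beta}_1) = \frac{\sum X_i^2}{n \sum x_i^2}\,\sigma^2$ and $\text{var}(\hat{\alpha}_1) = \frac{\sum x_i^2}{n \sum x_i^2}\,\sigma^2 = \frac{\sigma^2}{n}$, noting that $\sum x_i = 0$.
(b) $\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2}$ and $\hat{\alpha}_2 = \frac{\sum x_i y_i}{\sum x_i^2}$, since $x_i = (X_i - \bar{X})$.
That is, the estimates and variances of the two slope estimators are
the same.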
(c) Model II may be easier to use with large X numbers, although
with high speed computers this is no longer a problem.
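A small sketch of the comparison, assuming the Exercise 3.2 observations as sample data and taking Model II to be the regression of Y on X measured as deviations from its mean. The names `b1, b2` (Model I) and `a1, a2` (Model II) are labels chosen for this illustration.

```python
def fit(X, Y):
    """OLS intercept and slope for Y = intercept + slope*X."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Data from Exercise 3.2
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]

x_bar = sum(X) / len(X)
X_centered = [xi - x_bar for xi in X]   # regressor in deviation form

b1, b2 = fit(X, Y)            # Model I: regressor in levels
a1, a2 = fit(X_centered, Y)   # Model II: regressor in deviations

print(round(b1, 3), round(b2, 3))
print(round(a1, 3), round(a2, 3))
```

The slopes agree, while the Model II intercept equals the sample mean of Y because the centered regressor has mean zero.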
3.10
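$\hat{\beta}_2 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{\sum x_i y_i}{\sum x_i^2}$,
since the deviations $x_i$ and $y_i$ have zero means ($\bar{x} = \bar{y} = 0$).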
3.11
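$r_2 = \frac{\sum z_i w_i}{\sqrt{\sum z_i^2 \sum w_i^2}} = \frac{ac \sum x_i y_i}{ac \sqrt{\sum x_i^2 \sum y_i^2}} = r_1$ in Eq. (3.5.13),
where $z_i = a x_i$ and $w_i = c y_i$ in deviation form.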
3.12 (a) True. Let a and c equal -1 and b and d equal 0 in Question 3.11.
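The invariance can be illustrated numerically. The sketch below assumes the Exercise 3.2 observations as sample data; the constants a = 2, b = 3, c = 0.5, d = -1 are arbitrary illustrative choices, and the last case sets a = c = -1 and b = d = 0.

```python
import math

def corr(U, V):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(U)
    u_bar, v_bar = sum(U) / n, sum(V) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(U, V))
    suu = sum((a - u_bar) ** 2 for a in U)
    svv = sum((b - v_bar) ** 2 for b in V)
    return suv / math.sqrt(suu * svv)

# Data from Exercise 3.2
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
r_xy = corr(X, Y)

# Change of origin and scale with illustrative constants
Z = [2 * xi + 3 for xi in X]
W = [0.5 * yi - 1 for yi in Y]
r_zw = corr(Z, W)

# a = c = -1 and b = d = 0: correlation between -X and -Y
r_neg = corr([-xi for xi in X], [-yi for yi in Y])

print(round(r_xy, 4), round(r_zw, 4), round(r_neg, 4))
```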
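3.13
$r_{zw} = \frac{\sum z_i w_i}{\sqrt{\sum z_i^2 \sum w_i^2}} = \frac{\sum (x_1 + x_2)(x_2 + x_3)}{\sqrt{\sum (x_1 + x_2)^2 \sum (x_2 + x_3)^2}} = \frac{\sum x_2^2}{\sqrt{(\sum x_1^2 + \sum x_2^2)(\sum x_2^2 + \sum x_3^2)}}$
when the cross-product sums of the $x$'s are zero. If the $x$'s also have equal sums of squares, $r_{zw} = \frac{1}{2}$.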
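The correlation between the two sums $z = x_1 + x_2$ and $w = x_2 + x_3$ can be checked with a small constructed example. The three series below are not from the text: they are chosen so that, in deviation form, they are mutually orthogonal and have equal sums of squares, which mimics uncorrelated variables with a common variance.

```python
import math

# Illustrative deviation-form series: zero means, mutually orthogonal,
# and each with sum of squares equal to 4.
x1 = [1, -1, 1, -1]
x2 = [1, 1, -1, -1]
x3 = [1, -1, -1, 1]

z = [a + b for a, b in zip(x1, x2)]   # z = x1 + x2
w = [b + c for b, c in zip(x2, x3)]   # w = x2 + x3

szw = sum(p * q for p, q in zip(z, w))
szz = sum(p * p for p in z)
sww = sum(q * q for q in w)

r_zw = szw / math.sqrt(szz * sww)
print(r_zw)  # 0.5
```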
3.14 The residuals and fitted values of Y will not change. Let
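$Y_i = \beta_1 + \beta_2 X_i + u_i$ and $Y_i = \alpha_1 + \alpha_2 Z_i + u_i$, where $Z = 2X$.
Using the deviation form, we know that
$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2}$
$\hat{\alpha}_2 = \frac{\sum z_i y_i}{\sum z_i^2} = \frac{2 \sum x_i y_i}{4 \sum x_i^2} = \frac{1}{2} \hat{\beta}_2$
$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$; $\hat{\alpha}_1 = \bar{Y} - \hat{\alpha}_2 \bar{Z} = \hat{\beta}_1$ (Note: $\bar{Z} = 2\bar{X}$)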
That is, the intercept term remains unaffected. As a result, the fitted
Y values and the residuals remain the same even if Xi is multiplied
by 2. The analysis is analogous if a constant is added to Xi.
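The argument can be illustrated with a short sketch, assuming the Exercise 3.2 observations as sample data. It compares the fit on X with the fit on Z = 2X and, for the additive case, on X shifted by a constant (the value 10 is an arbitrary illustrative choice). The helper `fit` is hypothetical.

```python
def fit(X, Y):
    """OLS intercept and slope for Y = intercept + slope*X."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Data from Exercise 3.2
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
Z = [2 * xi for xi in X]     # regressor multiplied by 2
S = [xi + 10 for xi in X]    # regressor shifted by an illustrative constant

b1, b2 = fit(X, Y)
a1, a2 = fit(Z, Y)
c1, c2 = fit(S, Y)

fitted_X = [b1 + b2 * xi for xi in X]
fitted_Z = [a1 + a2 * zi for zi in Z]
fitted_S = [c1 + c2 * si for si in S]

print([round(v, 3) for v in fitted_X])
print([round(v, 3) for v in fitted_Z])
print([round(v, 3) for v in fitted_S])
```

With scaling, the slope halves and the intercept is unchanged; with the shift, the slope is unchanged and the intercept absorbs the constant. In both cases the fitted values, and hence the residuals, coincide with those of the original regression.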
3.15 By definition,
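$r_{y\hat{y}}^2 = \frac{(\sum y_i \hat{y}_i)^2}{(\sum y_i^2)(\sum \hat{y}_i^2)} = \frac{[\sum (\hat{y}_i + \hat{u}_i)(\hat{y}_i)]^2}{(\sum y_i^2)(\sum \hat{y}_i^2)} = \frac{(\sum \hat{y}_i^2)^2}{(\sum y_i^2)(\sum \hat{y}_i^2)}$, since $\sum \hat{y}_i \hat{u}_i = 0$,
$= \frac{\sum \hat{y}_i^2}{\sum y_i^2} = \frac{\hat{\beta}_2^2 \sum x_i^2}{\sum y_i^2} = r^2$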
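As a numerical illustration of this identity, the sketch below uses the Exercise 3.2 observations (an assumption made for the example) and compares the squared correlation between Y and its fitted values with the squared correlation between Y and X. The helper `corr` is hypothetical.

```python
import math

def corr(U, V):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(U)
    u_bar, v_bar = sum(U) / n, sum(V) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(U, V))
    suu = sum((a - u_bar) ** 2 for a in U)
    svv = sum((b - v_bar) ** 2 for b in V)
    return suv / math.sqrt(suu * svv)

# Data from Exercise 3.2
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n

# OLS estimates from the deviation-form formulas
b2 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
      / sum((xi - x_bar) ** 2 for xi in X))
b1 = y_bar - b2 * x_bar
Y_hat = [b1 + b2 * xi for xi in X]

r2_xy = corr(X, Y) ** 2          # usual r^2 of the regression
r2_y_yhat = corr(Y, Y_hat) ** 2  # squared correlation of actual and fitted Y

print(round(r2_xy, 4), round(r2_y_yhat, 4))
```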
3.16 (a) False. The covariance can assume any value; its value depends
on the units of measurement. The correlation coefficient, on the
other hand, is unitless, that is, it is a pure number.
(b) False. See Fig. 3.11h. Remember that the correlation coefficient
is a measure of linear relationship between two variables. Hence,
as Fig.3.11h shows, there is a perfect relationship between Y and
X, but that relationship is nonlinear.
(c) True. In deviation form, we have
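$y_i = \hat{y}_i + \hat{u}_i$
Therefore, it is obvious that if we regress $y_i$ on $\hat{y}_i$, the slope
coefficient will be one and the intercept zero. But a formal proof can
proceed as follows:
If we regress $y_i$ on $\hat{y}_i$, we obtain the slope coefficient, say, $\alpha$, as:
$\alpha = \frac{\sum y_i \hat{y}_i}{\sum \hat{y}_i^2} = \frac{\hat{\beta}_2 \sum x_i y_i}{\hat{\beta}_2^2 \sum x_i^2} = 1$, because $\hat{y}_i = \hat{\beta}_2 x_i$ and $\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2}$.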
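The result in (c) can be checked numerically. This sketch assumes the Exercise 3.2 observations as sample data, computes the fitted values from the regression of Y on X, and then regresses the actual Y on those fitted values. The helper `fit` is hypothetical.

```python
def fit(X, Y):
    """OLS intercept and slope for Y = intercept + slope*X."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Data from Exercise 3.2
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]

b1, b2 = fit(X, Y)
Y_hat = [b1 + b2 * xi for xi in X]

# Regress actual Y on fitted Y
intercept, slope = fit(Y_hat, Y)
print(round(intercept, 6), round(slope, 6))
```

Up to floating-point rounding, the slope is one and the intercept is zero.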
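3.17 Differentiate the residual sum of squares $\sum \hat{u}_i^2 = \sum (Y_i - \hat{\beta}_1)^2$
with respect to the only unknown parameter and set the resulting expression to
zero, to obtain:
$\frac{d(\sum \hat{u}_i^2)}{d\hat{\beta}_1} = 2 \sum (Y_i - \hat{\beta}_1)(-1) = 0$
which gives $\hat{\beta}_1 = \bar{Y}$, the sample mean, with variance $\frac{\sigma_y^2}{n}$, where n is the
sample size, and $\sigma_y^2$ is the variance of Y. The RSS is
$\sum (Y_i - \bar{Y})^2 = \sum y_i^2$ and $\hat{\sigma}^2 = \frac{RSS}{n-1} = \frac{\sum y_i^2}{n-1}$. It is worth adding the
X variable to the model if it reduces $\hat{\sigma}^2$ significantly, which it will if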
X has any influence on Y. In short, in regression models we hope
that the explanatory variable(s) will better predict Y than simply its
mean value. As a matter of fact, this can be looked at formally.
Recall that for the two-variable model we obtain from (3.5.2),
RSS = TSS - ESS
$= \sum y_i^2 - \sum \hat{y}_i^2$
$= \sum y_i^2 - \hat{\beta}_2^2 \sum x_i^2$
Therefore, if $\hat{\beta}_2$ is different from zero, the RSS of the model that
contains at least one regressor will be smaller than that of the model with no
regressor. Of course, if there are more regressors in the model and
their slope coefficients are different from zero, the RSS will be much
smaller than the no-regressor model.
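A brief numerical comparison, assuming the Exercise 3.2 observations as sample data: the RSS of the intercept-only model equals the sum of squared deviations of Y, and adding X lowers it by the squared slope times the sum of squared deviations of X. The variable names are illustrative.

```python
# Data from Exercise 3.2
X = [1, 4, 5, 6]
Y = [4, 5, 7, 12]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n

# Model with no regressor: the fitted value is the sample mean of Y
rss_mean_only = sum((yi - y_bar) ** 2 for yi in Y)

# Two-variable model: OLS estimates from the deviation-form formulas
sxx = sum((xi - x_bar) ** 2 for xi in X)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
b2 = sxy / sxx
b1 = y_bar - b2 * x_bar
rss_with_x = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(X, Y))

# The reduction in RSS equals b2^2 * sum of squared x deviations (the ESS)
reduction = rss_mean_only - rss_with_x
print(round(rss_mean_only, 3), round(rss_with_x, 3), round(reduction, 3))
```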
Empirical Exercises
3.18 Taking the difference between the two ranks, we obtain:
d:    -2   1  -1   3   0  -1  -1  -2   1   2
d²:    4   1   1   9   0   1   1   4   1   4 ;  $\sum d^2 = 26$
$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(26)}{10(10^2 - 1)} = 0.842$
Thus there is a high degree of correlation between the students'
midterm and final ranks. The higher the rank on the midterm, the
higher the rank on the final.
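The arithmetic can be reproduced with a few lines, using the ten rank differences listed above. The variable names are illustrative.

```python
# Rank differences d between the two sets of ranks (from the solution above)
d = [-2, 1, -1, 3, 0, -1, -1, -2, 1, 2]
n = len(d)

sum_d2 = sum(di ** 2 for di in d)

# Spearman rank correlation: r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1))
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2, round(r_s, 3))  # 26 0.842
```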
3.19 (a) The slope value of 2.250 suggests that over the period 1985-2005,
for every unit increase in the ratio of the US to Canadian CPI, on
average, the Canadian to US dollar exchange rate ratio increased by
about 2.250 units. That is, as the US dollar strengthened against the
Canadian dollar, one could get more Canadian dollars for each US
dollar. Literally interpreted, the intercept value of -0.912 means that
if the relative price ratio were zero, a US dollar would exchange for
-0.912 Canadian dollars (would lose money). Of course, this
interpretation is not economically meaningful. With a fairly low to
moderate $r^2$ of 0.440, we should realize that there is a lot of
variability in this result.
(b) The positive value of the slope coefficient makes economic sense
because if U.S. prices go up faster than Canadian prices, domestic
consumers will switch to Canadian goods because they can buy more,
thus increasing the demand for the Canadian dollar, which will lead to
appreciation of the Canadian dollar. This is the essence of the theory of
purchasing power parity (PPP), or the law of one price.
(c) In this case the slope coefficient is expected to be negative, for the
higher the Canadian CPI relative to the U.S. CPI, the lower the
relative inflation rate in Canada which will lead to depreciation of the
U.S. dollar. Again, this is in the spirit of the PPP.
[Figures: two scatterplots; one axis is labeled Output per Hour, and one panel is titled Nonfarm Business.]
3.21
[Table: column headings $\sum Y_i$, $\sum X_i$, $\sum X_i Y_i$, $\sum X_i^2$, $\sum Y_i^2$]
3.22 (a)
[Figure: Gold Prices, CPI, and the NYSE Index Over Time. Series: Gold Price, NYSE, CPI. Horizontal axis: years 1974 to 2006.]
If you plot these variables against time, you will see that there is
considerable price volatility for gold, but the NYSE and CPI seem
relatively stable.
(b) If the hypothesis were true, we would expect $\beta_2 \geq 1$.
Gold Price$_t$ = 215.286 + 1.038 CPI$_t$
se = (54.469)  (0.404)      $r^2 = 0.1758$
NYSE$_t$ = -3444.992 + 50.297 CPI$_t$
se = (533.966)  (3.958)     $r^2 = 0.8389$
It seems the stock market is a better hedge against inflation than gold.
3.23 (a) The plot is as follows, where NGDP and RGDP are nominal and real
GDP.
[Figure: NGDP and RGDP Over Time. Series: NGDP, RGDP. Horizontal axis: years 1959 to 2005.]
(b)
$r^2 = 0.926$
$r^2 = 0.972$
(c) The slope here gives the rate of change of GDP per year.
(d) The difference between the two represents inflation over time. As
the figure and regression results indicate, nominal GDP has been
growing at a faster rate than real GDP suggesting that inflation has
been rising over time.
$\hat{Y}_t = 31.76 + 1.0485 X_t$
se = (47.80)  (0.0937)
$r^2 = 0.786$
where Y = female reading score and X = male reading score.
(c) As pointed out in the text, a statistical relationship, however
strong, does not establish causality, which must be established
a priori. In this case, there is no reason to suspect a causal relationship
between the two variables.
$\hat{Y}_t = 257.02 + 1.416 X_t$
se = (29.35)  (0.0559)
$r^2 = 0.950$
3.28
[Figure: Cell Phone Subscribers vs PC Ownership. Horizontal axis: PC Ownership.]