
Sample Answers to Analytical Exercises of Chapter 3

Exercise 1 Suppose $y = x'\beta + u$ for some $\beta \in \mathbb{R}^k$, and $u$ need not satisfy $E[xu] = 0$ (that is, $\beta$ need not be the true coefficient $\beta^0$ in the linear projection). Show that if $E\left[\Vert x\Vert^2\right] < \infty$ and $E[u^2] < \infty$, then $E[y^2] < \infty$.

Solution:
$$E[y^2] = \beta' E[xx']\beta + 2\beta' E[xu] + E[u^2] \le E\left[\Vert x\Vert^2\right]\Vert\beta\Vert^2 + 2\Vert\beta\Vert\sqrt{E\left[\Vert x\Vert^2\right]E[u^2]} + E[u^2] < \infty,$$
where the first inequality is due to the Cauchy-Schwarz inequality. This is why we assume only that the second moments of $x$ and $u$ (instead of $y$) are finite: the finiteness of the second moment of $y$ is implied.

Exercise 2 In Exercise 11 of Chapter 2, is the linear regression model homoskedastic? (Hint: If $P(y = j) = \frac{e^{-\lambda}\lambda^j}{j!}$, then $Var(y) = \lambda$.)

Solution: Since
$$Var(u_i \mid x_i) = Var(y_i - x_i'\beta \mid x_i) = Var(y_i \mid x_i) = x_i'\beta,$$
which varies with $x_i$, the model is heteroskedastic.

Exercise 3 Consider the OLS regression of the $n \times 1$ vector $y$ on the $n \times k$ matrix $X$. Consider an alternative set of regressors $Z = XC$, where $C$ is a $k \times k$ non-singular matrix. Thus, each column of $Z$ is a mixture of some of the columns of $X$. Compare the OLS estimates and residuals from the regression of $y$ on $X$ to the OLS estimates from the regression of $y$ on $Z$.

Solution: Let the OLS estimate and residuals from the original regression be $\hat\beta$ and $\hat u$; then those from the other regression are:
(a) $\tilde\beta = (Z'Z)^{-1}Z'y = (C'X'XC)^{-1}C'X'y = C^{-1}(X'X)^{-1}(C')^{-1}C'X'y = C^{-1}(X'X)^{-1}X'y = C^{-1}\hat\beta$;
(b) $\tilde u = y - Z\tilde\beta = y - XC\tilde\beta = y - XCC^{-1}\hat\beta = y - X\hat\beta = \hat u$.
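The two identities in (a) and (b) are easy to confirm numerically. Below is a small Python (numpy) sketch; the simulated design and the matrix $C$ are arbitrary and purely illustrative.

import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)
C = rng.normal(size=(k, k))                  # any non-singular k x k matrix
Z = X @ C

beta_X = np.linalg.solve(X.T @ X, X.T @ y)   # OLS of y on X
beta_Z = np.linalg.solve(Z.T @ Z, Z.T @ y)   # OLS of y on Z

print(np.allclose(beta_Z, np.linalg.solve(C, beta_X)))  # beta_Z = C^{-1} beta_X
print(np.allclose(y - X @ beta_X, y - Z @ beta_Z))      # identical residuals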

Exercise 4 (i) Explain why the df of SSE is $k - 1$. (Hint: $\hat y_i - \bar{\hat y} = \tilde x_i'\hat\beta_*$, where $\tilde x_i$ is the demeaned nonconstant covariate vector and $\hat\beta_*$ is the associated coefficient vector.) (ii) Show that $SSE = \hat\beta_*'\tilde X'\tilde X\hat\beta_* = \hat\beta_*'\tilde X'y$.

Solution: (i) $\hat y_i - \bar{\hat y} = x_i'\hat\beta - \bar x'\hat\beta = \tilde x_i'\hat\beta = \tilde x_i'\hat\beta_*$, where $\tilde x_i = x_i - \bar x$ and the last equality is because the first element of $\tilde x_i$ is zero (the first element of $x_i$ is 1). So $\widetilde{\hat y} = \tilde X\hat\beta_*$, that is, $\widetilde{\hat y}$ must stay in $\mathrm{span}(\tilde X)$, which is $(k-1)$-dimensional. (ii) $SSE = \widetilde{\hat y}'\widetilde{\hat y} = \hat\beta_*'\tilde X'\tilde X\hat\beta_*$, so we need only show that $\tilde X'\tilde X\hat\beta_* = \tilde X'y$. Since $y = X\hat\beta + \hat u = \tilde X\hat\beta_* + \mathbf{1}\bar x'\hat\beta + \hat u$ and $\tilde X = X - \mathbf{1}\bar x'$,
$$\tilde X'y = \left(X - \mathbf{1}\bar x'\right)'\left(X\hat\beta + \hat u\right) = \tilde X'X\hat\beta = \tilde X'\left(\tilde X + \mathbf{1}\bar x'\right)\hat\beta = \tilde X'\tilde X\hat\beta = \tilde X'\tilde X\hat\beta_*,$$
where the second equality uses $X'\hat u = 0$, the fourth equality uses $\tilde X'\mathbf{1} = 0$ since $\tilde X$ is demeaned, and the last equality is because the first column of $\tilde X$ is zero.

Exercise 5 (i) Show that $R^2$ remains the same if we demean $y_i$ as $y_i - \bar y$. (ii) Prove that $R^2$ is the square of the simple correlation between $y$ and $\hat y$. (iii) Show that $R^2 = \widehat{Corr}(y, x)'\widehat{Corr}(x, x)^{-1}\widehat{Corr}(y, x)$, where $\widehat{Corr}(y, x)$ is the sample correlation vector of $y$ with each element of $x$, and $\widehat{Corr}(x, x)$ is the sample correlation matrix between every two elements of $x$. When $\widehat{Corr}(x, x) = I_{k-1}$, $R^2 = \sum_{l=1}^{k-1}\widehat{Corr}(y, x_l)^2$, so $R$, the square root of $R^2$, is also called the multiple correlation coefficient in statistics.

Solution: (i) If we demean $y_i$ as $y_i - \bar y$, $\hat\beta_*$ remains the same and the intercept $\hat\beta_1$ decreases by $\bar y$. Nevertheless, $\hat u$ remains the same. From $R^2 = 1 - \hat u'\hat u/\tilde y'\tilde y$ with $\tilde y = y - \mathbf{1}\bar y$, it is obvious that $R^2$ remains the same.
(ii) This result is right only if there is a column of 1's in $X$. In this case, $X'\hat u = 0 \Rightarrow \mathbf{1}'\hat u = 0 \Rightarrow \bar{\hat u} = 0$, so $\bar y = \bar{\hat y} + \bar{\hat u} = \bar{\hat y}$. Now,
$$\begin{aligned}
\widehat{Corr}(y, \hat y)^2 &= \frac{\left[(y - \mathbf{1}\bar y)'(\hat y - \mathbf{1}\bar y)\right]^2}{\left[(y - \mathbf{1}\bar y)'(y - \mathbf{1}\bar y)\right]\left[(\hat y - \mathbf{1}\bar y)'(\hat y - \mathbf{1}\bar y)\right]} \\
&= \frac{\left[(\hat y + \hat u - \mathbf{1}\bar y)'(\hat y - \mathbf{1}\bar y)\right]^2}{\left[(y - \mathbf{1}\bar y)'(y - \mathbf{1}\bar y)\right]\left[(\hat y - \mathbf{1}\bar y)'(\hat y - \mathbf{1}\bar y)\right]} \\
&= \frac{\left[(\hat y - \mathbf{1}\bar y)'(\hat y - \mathbf{1}\bar y)\right]^2}{\left[(y - \mathbf{1}\bar y)'(y - \mathbf{1}\bar y)\right]\left[(\hat y - \mathbf{1}\bar y)'(\hat y - \mathbf{1}\bar y)\right]} \\
&= \frac{(\hat y - \mathbf{1}\bar y)'(\hat y - \mathbf{1}\bar y)}{(y - \mathbf{1}\bar y)'(y - \mathbf{1}\bar y)} = R^2,
\end{aligned}$$
where the third equality is because $\hat u'(\hat y - \mathbf{1}\bar y) = 0$. (iii) From the FWL theorem, $\hat\beta_* = \left(\tilde X'\tilde X\right)^{-1}\tilde X'y$. From the last exercise,
$$\begin{aligned}
R^2 &= \frac{\hat\beta_*'\tilde X'\tilde X\hat\beta_*}{n\widehat{Var}(y)} = \frac{y'\tilde X\left(\tilde X'\tilde X\right)^{-1}\tilde X'y}{n\widehat{Var}(y)} = \frac{\left(n^{-1}y'\tilde X\right)\left(n^{-1}\tilde X'\tilde X\right)^{-1}\left(n^{-1}\tilde X'y\right)}{\widehat{Var}(y)} \\
&= n^{-1}y'\tilde X\,\mathrm{diag}\left(n^{-1}\tilde X'\tilde X\right)^{-1/2}\widehat{Var}(y)^{-1/2}\left[\mathrm{diag}\left(n^{-1}\tilde X'\tilde X\right)^{-1/2}\left(n^{-1}\tilde X'\tilde X\right)\mathrm{diag}\left(n^{-1}\tilde X'\tilde X\right)^{-1/2}\right]^{-1} \\
&\qquad\times\mathrm{diag}\left(n^{-1}\tilde X'\tilde X\right)^{-1/2}\widehat{Var}(y)^{-1/2}\,n^{-1}\tilde X'y \\
&= \widehat{Corr}(y, x)'\widehat{Corr}(x, x)^{-1}\widehat{Corr}(y, x).
\end{aligned}$$
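Part (iii) can be checked numerically. A Python sketch with arbitrary simulated data (the design is illustrative only); sample correlations are scale-free, so the normalization used by np.corrcoef does not matter.

import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 4                                # k includes the intercept
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ rng.normal(size=k) + rng.normal(size=n)

beta = np.linalg.solve(X.T @ X, X.T @ y)
u = y - X @ beta
R2 = 1 - u @ u / np.sum((y - y.mean())**2)

# correlation-based expression: Corr(y,x)' Corr(x,x)^{-1} Corr(y,x)
Xs = X[:, 1:]                                # nonconstant covariates
ryx = np.array([np.corrcoef(y, Xs[:, j])[0, 1] for j in range(k - 1)])
Rxx = np.corrcoef(Xs, rowvar=False)
R2_corr = ryx @ np.linalg.solve(Rxx, ryx)

print(np.allclose(R2, R2_corr))              # True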

Exercise 6 Show that $1 - R^2 = \left(1 + \frac{\bar y^2}{\widehat{Var}(y)}\right)\left(1 - R_u^2\right)$, where $\widehat{Var}(y) = \frac{SST}{n}$.

Solution:
$$1 + \frac{\bar y^2}{\widehat{Var}(y)} = 1 + \frac{n\bar y^2}{\sum_{i=1}^n(y_i - \bar y)^2} = \frac{\sum_{i=1}^n(y_i - \bar y)^2 + n\bar y^2}{\sum_{i=1}^n(y_i - \bar y)^2} = \frac{\sum_{i=1}^n y_i^2}{\sum_{i=1}^n(y_i - \bar y)^2},$$
so
$$\left(1 + \frac{\bar y^2}{\widehat{Var}(y)}\right)\left(1 - R_u^2\right) = \frac{\sum_{i=1}^n y_i^2}{\sum_{i=1}^n(y_i - \bar y)^2}\cdot\frac{\sum_{i=1}^n\hat u_i^2}{\sum_{i=1}^n y_i^2} = \frac{\sum_{i=1}^n\hat u_i^2}{\sum_{i=1}^n(y_i - \bar y)^2} = 1 - R^2.$$
Obviously, if $\bar y = 0$, $R^2 = R_u^2$.

Exercise 7 In the linear regression model under the restrictions $R'\beta = c$, let $\tilde u$ be the vector of RLS residuals. Show that under $R'\beta = c$,

(a) $R'\hat\beta - c = R'(X'X)^{-1}X'u$

(b) $\hat\beta_R - \beta = (X'X)^{-1}X'u - (X'X)^{-1}R\left[R'(X'X)^{-1}R\right]^{-1}R'(X'X)^{-1}X'u$

(c) $\tilde u = (I - P + A)u$ for $P = X(X'X)^{-1}X'$ and some matrix $A$ (find this matrix $A$).

(d) Show that $A$ is symmetric and idempotent, $tr(A) = q$, and $PA = A$.

Solution: (a) $R'\hat\beta - c = R'\left(\beta + (X'X)^{-1}X'u\right) - c = R'\beta - c + R'(X'X)^{-1}X'u = R'(X'X)^{-1}X'u$ if $R'\beta = c$.
(b) From Exercise 26(c) of Chapter 2, $\hat\beta_R = \hat\beta - (X'X)^{-1}R\left[R'(X'X)^{-1}R\right]^{-1}\left(R'\hat\beta - c\right)$, so
$$\hat\beta_R - \beta = (X'X)^{-1}X'u - (X'X)^{-1}R\left[R'(X'X)^{-1}R\right]^{-1}R'(X'X)^{-1}X'u$$
from (a).
(c) $\tilde u = y - X\hat\beta_R = X\beta + u - X\hat\beta_R = u - X\left(\hat\beta_R - \beta\right) = u - X\left\{(X'X)^{-1}X'u - (X'X)^{-1}R\left[R'(X'X)^{-1}R\right]^{-1}R'(X'X)^{-1}X'u\right\} = u - Pu + Au$, where $A = X(X'X)^{-1}R\left[R'(X'X)^{-1}R\right]^{-1}R'(X'X)^{-1}X'$.
(d) It is easy to check that $A$ is symmetric and idempotent.
$$PA = X(X'X)^{-1}X'X(X'X)^{-1}R\left[R'(X'X)^{-1}R\right]^{-1}R'(X'X)^{-1}X' = X(X'X)^{-1}R\left[R'(X'X)^{-1}R\right]^{-1}R'(X'X)^{-1}X' = A.$$
$$tr(A) = tr\left(\left[R'(X'X)^{-1}R\right]^{-1}R'(X'X)^{-1}X'X(X'X)^{-1}R\right) = tr(I_q) = q.$$
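These algebraic facts can be checked by simulation. Below is a Python sketch (numpy only); the restriction $R'\beta = c$ is imposed in the data-generating process so that the identities in (a)-(d) apply, and all other choices are arbitrary.

import numpy as np

rng = np.random.default_rng(2)
n, k, q = 50, 4, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
R = rng.normal(size=(k, q))
beta = rng.normal(size=k)
c = R.T @ beta                     # restriction R'beta = c holds by construction
u = rng.normal(size=n)
y = X @ beta + u

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                                                     # unrestricted OLS
b_R = b - XtX_inv @ R @ np.linalg.solve(R.T @ XtX_inv @ R, R.T @ b - c)   # RLS

P = X @ XtX_inv @ X.T
A = X @ XtX_inv @ R @ np.linalg.solve(R.T @ XtX_inv @ R, R.T @ XtX_inv @ X.T)

print(np.allclose(y - X @ b_R, (np.eye(n) - P + A) @ u))    # (c)
print(np.allclose(A @ A, A), np.isclose(np.trace(A), q))    # (d): idempotent, tr(A) = q
print(np.allclose(P @ A, A))                                # (d): PA = A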
Exercise 8 Show that $Var\left(\hat\beta_j \mid X\right) = \sum_{i=1}^n w_{ij}\sigma_i^2 / SSR_j$, $j = 1, \ldots, k$, where $w_{ij} > 0$, $\sum_{i=1}^n w_{ij} = 1$, and $SSR_j$ is the SSR in the regression of $x_j$ on all other regressors. (Hint: Use the FWL theorem.)

Solution: From the FWL theorem,
$$\hat\beta_j = \left(X_j'M_{-j}X_j\right)^{-1}X_j'M_{-j}y = \left(X_j'M_{-j}X_j\right)^{-1}X_j'M_{-j}(X\beta + u) = \frac{\sum_{i=1}^n\hat u_{ji}\left(\hat u_{ji}\beta_j + u_i\right)}{\sum_{i=1}^n\hat u_{ji}^2} = \beta_j + \frac{\sum_{i=1}^n\hat u_{ji}u_i}{SSR_j},$$
where $\hat u_{ji}$ is the $i$th residual in the regression of $x_j$ on all other regressors, $M_{-j}$ is the associated annihilator, and $M_{-j}X\beta = M_{-j}\left(X_{-j}\beta_{-j} + X_j\beta_j\right) = \hat u_j\beta_j$ because all regressors except $x_j$ are annihilated by $M_{-j}$. Now,
$$Var\left(\hat\beta_j \mid X\right) = \frac{\sum_{i=1}^n\hat u_{ji}^2\,Var(u_i \mid X)}{SSR_j^2} = \sum_{i=1}^n w_{ij}\sigma_i^2 / SSR_j,$$
with $w_{ij} = \hat u_{ji}^2 / SSR_j$, where $\hat u_{ji}$ is a function of $X$ so can be taken out of the conditional expectation, $w_{ij} > 0$, and $\sum_{i=1}^n w_{ij} = 1$.
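The decomposition can be verified by comparing the $j$th diagonal element of the exact sandwich formula $Var(\hat\beta \mid X) = (X'X)^{-1}X'\Sigma X(X'X)^{-1}$, $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$, with $\sum_i\hat u_{ji}^2\sigma_i^2/SSR_j^2$. A Python sketch (the heteroskedasticity pattern and design are arbitrary):

import numpy as np

rng = np.random.default_rng(3)
n, k, j = 80, 4, 2                         # check the coefficient with index j
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
sig2 = 0.5 + rng.uniform(size=n)           # arbitrary conditional variances

XtX_inv = np.linalg.inv(X.T @ X)
V = XtX_inv @ X.T @ np.diag(sig2) @ X @ XtX_inv      # exact Var(beta_hat | X)

# FWL: residuals from regressing x_j on the other regressors
X_other = np.delete(X, j, axis=1)
uj = X[:, j] - X_other @ np.linalg.lstsq(X_other, X[:, j], rcond=None)[0]
SSRj = uj @ uj
print(np.isclose(V[j, j], np.sum(uj**2 * sig2) / SSRj**2))   # True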

Exercise 9 (i) In the homoskedastic linear regression model under the restrictions $R'\beta = c$, find an unbiased estimator of $\sigma^2$. (Hint: $\frac{1}{n-k+q}\sum_{i=1}^n\tilde u_i^2$, where $\tilde u_i$ is defined in Exercise 7.) (ii) In the heteroskedastic linear regression model with $Var(u \mid X) = \sigma^2\Omega$, find an unbiased estimator of $\sigma^2$. (Hint: $\frac{1}{n-k}\tilde u'\Omega^{-1}\tilde u$, where $\tilde u = y - X\tilde\beta$, and $\tilde\beta$ is the GLS estimator with the weight matrix $W = \Omega^{-1}$.) (iii) In the homoskedastic linear regression model, provide an unbiased estimator of $Var\left(\hat\beta \mid X\right)$ in the form of $(X'X)^{-1}X'\hat DX(X'X)^{-1}$ for some $\hat D$. (Hint: Calculate $E\left[\hat u_i^2 \mid X\right]$.)

Solution: (i)
$$\begin{aligned}
E\left[\tilde u'\tilde u \mid X\right] &= E\left[u'(I - P + A)(I - P + A)u \mid X\right] \\
&= tr\left(E\left[(I - P + A)(I - P + A)uu' \mid X\right]\right) \\
&= tr\left((I - P + A)E\left[uu' \mid X\right]\right) \\
&= \sigma^2\,tr(I - P + A) \\
&= \sigma^2(n - k + q),
\end{aligned}$$
where the third equality is because $I - P + A$ is idempotent. So $\frac{1}{n-k+q}\sum_{i=1}^n\tilde u_i^2$ is an unbiased estimator of $\sigma^2$.
(ii)
$$\tilde u = y - X\left(X'\Omega^{-1}X\right)^{-1}X'\Omega^{-1}y = \left[I - X\left(X'\Omega^{-1}X\right)^{-1}X'\Omega^{-1}\right]u,$$
so
$$\begin{aligned}
E\left[\tilde u'\Omega^{-1}\tilde u \mid X\right] &= E\left[u'\left[I - \Omega^{-1}X\left(X'\Omega^{-1}X\right)^{-1}X'\right]\Omega^{-1}\left[I - X\left(X'\Omega^{-1}X\right)^{-1}X'\Omega^{-1}\right]u \mid X\right] \\
&= tr\left(\left[\Omega^{-1} - \Omega^{-1}X\left(X'\Omega^{-1}X\right)^{-1}X'\Omega^{-1}\right]E\left[uu' \mid X\right]\right) \\
&= \sigma^2\,tr\left(\left[\Omega^{-1} - \Omega^{-1}X\left(X'\Omega^{-1}X\right)^{-1}X'\Omega^{-1}\right]\Omega\right) \\
&= \sigma^2\left[n - tr\left(\Omega^{-1}X\left(X'\Omega^{-1}X\right)^{-1}X'\right)\right] = \sigma^2(n - k).
\end{aligned}$$
In other words, $\frac{1}{n-k}\tilde u'\Omega^{-1}\tilde u$ is an unbiased estimator of $\sigma^2$.
(iii) Because $\hat u = Mu$, letting $e_i$ be the $i$th basis vector of $\mathbb{R}^n$, we have
$$E\left[\hat u_i^2 \mid X\right] = E\left[e_i'Muu'Me_i \mid X\right] = \sigma^2 e_i'MMe_i = \sigma^2 e_i'(I - P)e_i = \sigma^2(1 - h_i),$$
where $h_i = x_i'(X'X)^{-1}x_i$. So the unbiased estimator of $Var\left(\hat\beta \mid X\right)$ can be $(X'X)^{-1}X'\hat DX(X'X)^{-1}$ with $\hat D = \mathrm{diag}\left(\frac{\hat u_1^2}{1 - h_1}, \ldots, \frac{\hat u_n^2}{1 - h_n}\right)$.
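The estimator in (iii) is the one usually labeled HC2. A small Monte Carlo sketch in Python (with $X$ held fixed across replications; the design, $\sigma^2$ and number of replications are arbitrary) illustrates that $(X'X)^{-1}X'\hat DX(X'X)^{-1}$ averages to $\sigma^2(X'X)^{-1}$:

import numpy as np

rng = np.random.default_rng(4)
n, k, sigma2, reps = 40, 3, 2.0, 5000
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
XtX_inv = np.linalg.inv(X.T @ X)
h = np.einsum('ij,jk,ik->i', X, XtX_inv, X)         # leverage scores h_i
M = np.eye(n) - X @ XtX_inv @ X.T

V_sum = np.zeros((k, k))
for _ in range(reps):
    u = np.sqrt(sigma2) * rng.normal(size=n)
    uhat = M @ u                                     # OLS residuals
    D = np.diag(uhat**2 / (1 - h))
    V_sum += XtX_inv @ X.T @ D @ X @ XtX_inv

# maximum deviation from sigma^2 (X'X)^{-1}; small up to Monte Carlo error
print(np.max(np.abs(V_sum / reps - sigma2 * XtX_inv)))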

Exercise 10 Show that in the linear regression $y = X\beta + u$, if $E[u \mid X] = 0$ and $Var(u \mid X) = \sigma^2\Omega$, then the BLUE is the GLS estimator $\hat\beta_{GLS} = \left(X'\Omega^{-1}X\right)^{-1}X'\Omega^{-1}y$.

Solution: By a parallel analysis as in the proof of the Gauss-Markov Theorem, for a linear estimator $\tilde\beta = A'y$, unbiasedness implies $A'X = I_k$ and $Var\left(\tilde\beta \mid X\right) = \sigma^2 A'\Omega A$. Since $Var\left(\hat\beta_{GLS} \mid X\right) = \sigma^2\left(X'\Omega^{-1}X\right)^{-1}$, it is sufficient to show that $A'\Omega A - \left(X'\Omega^{-1}X\right)^{-1} \ge 0$. Set $C = \Omega^{1/2}A - \Omega^{-1/2}X\left(X'\Omega^{-1}X\right)^{-1}$. Note that $X'\Omega^{-1/2}C = 0$. Then we calculate that
$$\begin{aligned}
A'\Omega A - \left(X'\Omega^{-1}X\right)^{-1} &= \left[C + \Omega^{-1/2}X\left(X'\Omega^{-1}X\right)^{-1}\right]'\left[C + \Omega^{-1/2}X\left(X'\Omega^{-1}X\right)^{-1}\right] - \left(X'\Omega^{-1}X\right)^{-1} \\
&= C'C + C'\Omega^{-1/2}X\left(X'\Omega^{-1}X\right)^{-1} + \left(X'\Omega^{-1}X\right)^{-1}X'\Omega^{-1/2}C + \left(X'\Omega^{-1}X\right)^{-1} - \left(X'\Omega^{-1}X\right)^{-1} \\
&= C'C \ge 0.
\end{aligned}$$
Intuitively, $y = X\beta + u$ implies $\Omega^{-1/2}y = \Omega^{-1/2}X\beta + \Omega^{-1/2}u$, where $Var\left(\Omega^{-1/2}u \mid X\right) = \sigma^2 I$, so the Gauss-Markov Theorem can be applied to the transformed data $\Omega^{-1/2}y$ and $\Omega^{-1/2}X$. Because $\Omega^{-1/2}$ is invertible, any linear estimator in the original model can be expressed as a linear estimator in the transformed model. As a result, the LSE using the transformed data, $\left(X'\Omega^{-1/2}\Omega^{-1/2}X\right)^{-1}X'\Omega^{-1/2}\Omega^{-1/2}y = \left(X'\Omega^{-1}X\right)^{-1}X'\Omega^{-1}y$, is the BLUE in the original model.
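A numerical illustration of the variance ranking: take OLS as one particular linear unbiased estimator, i.e. $A' = (X'X)^{-1}X'$, and check that $A'\Omega A - \left(X'\Omega^{-1}X\right)^{-1}$ is positive semi-definite. A Python sketch with an arbitrary positive-definite $\Omega$:

import numpy as np

rng = np.random.default_rng(5)
n, k = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
B = rng.normal(size=(n, n))
Omega = B @ B.T + n * np.eye(n)             # arbitrary positive-definite Omega
Omega_inv = np.linalg.inv(Omega)

A = X @ np.linalg.inv(X.T @ X)              # OLS weights: A'y = (X'X)^{-1}X'y, A'X = I_k
V_ols = A.T @ Omega @ A                     # Var of OLS (up to sigma^2)
V_gls = np.linalg.inv(X.T @ Omega_inv @ X)  # Var of GLS (up to sigma^2)

eigs = np.linalg.eigvalsh(V_ols - V_gls)
print(np.all(eigs >= -1e-10))               # difference is positive semi-definite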

Exercise 11 Show that $E\left[\left(\tilde\theta - \theta\right)\left(\tilde\theta - \theta\right)'\right] = Var\left(\tilde\theta\right) + \left(E\left[\tilde\theta\right] - \theta\right)\left(E\left[\tilde\theta\right] - \theta\right)'$. When $\dim(\theta) = 1$, how to interpret this decomposition?

Solution:
$$\begin{aligned}
E\left[\left(\tilde\theta - \theta\right)\left(\tilde\theta - \theta\right)'\right] &= E\left[\left(\tilde\theta - E\left[\tilde\theta\right] + E\left[\tilde\theta\right] - \theta\right)\left(\tilde\theta - E\left[\tilde\theta\right] + E\left[\tilde\theta\right] - \theta\right)'\right] \\
&= E\left[\left(\tilde\theta - E\left[\tilde\theta\right]\right)\left(\tilde\theta - E\left[\tilde\theta\right]\right)'\right] + \left(E\left[\tilde\theta\right] - \theta\right)\left(E\left[\tilde\theta\right] - \theta\right)' \\
&\quad + E\left[\tilde\theta - E\left[\tilde\theta\right]\right]\left(E\left[\tilde\theta\right] - \theta\right)' + \left(E\left[\tilde\theta\right] - \theta\right)E\left[\tilde\theta - E\left[\tilde\theta\right]\right]' \\
&= Var\left(\tilde\theta\right) + \left(E\left[\tilde\theta\right] - \theta\right)\left(E\left[\tilde\theta\right] - \theta\right)',
\end{aligned}$$
where $E\left[\tilde\theta - E\left[\tilde\theta\right]\right]\left(E\left[\tilde\theta\right] - \theta\right)' = 0 = \left(E\left[\tilde\theta\right] - \theta\right)E\left[\tilde\theta - E\left[\tilde\theta\right]\right]'$.

When $\dim(\theta) = 1$, this means that the MSE of $\tilde\theta$ equals its bias squared plus its variance.

Exercise 12 In the two-dimensional case, show that $V_y$ is an ellipse tangent to a box with dimensions that coincide with the lengths (standard deviations) of the random variables. When $Corr(y_1, y_2) = 1$, what does $V_y$ look like?

Solution: In the two-dimensional case, write $\rho = Corr(y_1, y_2)$, $\sigma_1^2 = Var(y_1)$ and $\sigma_2^2 = Var(y_2)$, so that $\Sigma = \begin{pmatrix}\sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2\end{pmatrix}$. Then
$$\begin{aligned}
V_y &= \left\{w \in \mathbb{R}^2 \mid w'\Sigma^{-1}w \le 1\right\} \\
&= \left\{w \in \mathbb{R}^2 \,\middle|\, (w_1, w_2)\begin{pmatrix}\sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2\end{pmatrix}^{-1}\begin{pmatrix}w_1 \\ w_2\end{pmatrix} \le 1\right\} \\
&= \left\{w \in \mathbb{R}^2 \,\middle|\, \left(\frac{w_1}{\sigma_1}\right)^2 - 2\rho\frac{w_1}{\sigma_1}\frac{w_2}{\sigma_2} + \left(\frac{w_2}{\sigma_2}\right)^2 \le 1 - \rho^2\right\}.
\end{aligned}$$
The largest value of $w_1$ on the boundary occurs where $dw_1/dw_2 = 0$; implicit differentiation of the boundary equation with respect to $w_2$ then gives
$$-\frac{2\rho}{\sigma_1\sigma_2}w_1 + \frac{2}{\sigma_2^2}w_2 = 0,$$
or $w_2 = \rho\sigma_2 w_1/\sigma_1$. Solving for the intersection of this line with the boundary, we find that
$$\left(\frac{w_1}{\sigma_1}\right)^2 - 2\rho\frac{w_1}{\sigma_1}\frac{\rho w_1}{\sigma_1} + \left(\frac{\rho w_1}{\sigma_1}\right)^2 = 1 - \rho^2,$$
which simplifies to $w_1^2 = \sigma_1^2$, i.e., $|w_1| = \sigma_1$. (The same argument applied to $w_2$ gives $|w_2| = \sigma_2$, so the ellipse is tangent to the box $[-\sigma_1, \sigma_1] \times [-\sigma_2, \sigma_2]$.)
When $\rho = 1$,
$$V_y = \left\{w \in \mathbb{R}^2 \,\middle|\, \left(\frac{w_1}{\sigma_1}\right)^2 - 2\frac{w_1}{\sigma_1}\frac{w_2}{\sigma_2} + \left(\frac{w_2}{\sigma_2}\right)^2 \le 0\right\},$$
or $w_2 = \sigma_2 w_1/\sigma_1$ with $w_1 \in [-\sigma_1, \sigma_1]$, that is, $V_y$ reduces to a line segment.

Exercise 13 What is $V_{\hat\beta}$ in the homoskedastic linear regression model? Is it a sphere?

Solution: Recall that $Var\left(\hat\beta\right) = \sigma^2(X'X)^{-1}$ in the homoskedastic linear regression model, so
$$V_{\hat\beta} = \left\{w \in \mathbb{R}^k \mid w'X'Xw \le \sigma^2\right\}.$$
Generally, $V_{\hat\beta}$ is an ellipsoid rather than a sphere unless $X'X$ is diagonal with all diagonal elements the same, i.e., the columns of $X$ are orthogonal and of equal length.

Exercise 14 (i) Show that
$$\hat\beta_{(-i)} = \hat\beta - (1 - h_i)^{-1}(X'X)^{-1}x_i\hat u_i, \tag{1}$$
where
$$h_i = x_i'(X'X)^{-1}x_i$$
is the $i$th diagonal element of the projection matrix $X(X'X)^{-1}X'$, called the leverage score for $x_i$. Hint: If $T = A + CBD$, then
$$T^{-1} = A^{-1} - A^{-1}CB\left(B + BDA^{-1}CB\right)^{-1}BDA^{-1}. \tag{2}$$
(ii) Show that $0 \le h_i \le 1$ and $\sum_{i=1}^n h_i = k$. (iii) Denote $\hat y_i = x_i'\hat\beta$ and $\hat y_{-i} = x_i'\hat\beta_{(-i)}$. Show that $\hat y_i = h_i y_i + (1 - h_i)\hat y_{-i}$.

Solution: (i) In this problem, let $A = X'X$, $C = x_i$, $B = -1$ and $D = x_i'$ in (2). So
$$\left(X'X - x_ix_i'\right)^{-1} = \left(X'X\right)^{-1} + (1 - h_i)^{-1}\left(X'X\right)^{-1}x_ix_i'\left(X'X\right)^{-1}$$
and thus
$$\begin{aligned}
\hat\beta_{(-i)} &= \left(X'X - x_ix_i'\right)^{-1}\left(X'y - x_iy_i\right) \\
&= \left(X'X\right)^{-1}X'y - \left(X'X\right)^{-1}x_iy_i + (1 - h_i)^{-1}\left(X'X\right)^{-1}x_ix_i'\left(X'X\right)^{-1}\left(X'y - x_iy_i\right) \\
&= \hat\beta - \left(X'X\right)^{-1}x_iy_i + (1 - h_i)^{-1}\left(X'X\right)^{-1}x_i\left(x_i'\hat\beta - h_iy_i\right) \\
&= \hat\beta - (1 - h_i)^{-1}\left(X'X\right)^{-1}x_i\left[(1 - h_i)y_i - x_i'\hat\beta + h_iy_i\right] \\
&= \hat\beta - (1 - h_i)^{-1}\left(X'X\right)^{-1}x_i\hat u_i.
\end{aligned}$$
(ii) $h_i \ge 0$ is obvious. $1 - h_i$ is the $i$th diagonal element of the annihilator $M_X$. Given that $M_X \ge 0$, its $i$th diagonal element is nonnegative. As a result, $1 - h_i \ge 0$ or $h_i \le 1$. Finally, $\sum_{i=1}^n h_i = tr\left(X(X'X)^{-1}X'\right) = k$.
(iii) From (1), $\hat y_{-i} = \hat y_i - (1 - h_i)^{-1}h_i\left(y_i - \hat y_i\right)$. Solving for $\hat y_i$, we get $\hat y_i = h_iy_i + (1 - h_i)\hat y_{-i}$.
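Formula (1) can be verified against brute-force deletion of observation $i$. A Python sketch (numpy; the simulated data and the choice of $i$ are arbitrary):

import numpy as np

rng = np.random.default_rng(6)
n, k, i = 60, 3, 7                                 # drop observation i
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ rng.normal(size=k) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
h = np.einsum('ij,jk,ik->i', X, XtX_inv, X)        # leverage scores
uhat = y - X @ beta

# leave-one-out estimate via formula (1)
beta_loo = beta - XtX_inv @ X[i] * uhat[i] / (1 - h[i])
# brute force: re-estimate without observation i
X_d, y_d = np.delete(X, i, axis=0), np.delete(y, i)
beta_drop = np.linalg.solve(X_d.T @ X_d, X_d.T @ y_d)

print(np.allclose(beta_loo, beta_drop))            # part (i)
print(np.isclose(X[i] @ beta, h[i]*y[i] + (1 - h[i])*(X[i] @ beta_drop)))  # part (iii)
print(np.isclose(h.sum(), k))                      # part (ii)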

Exercise 15 Let $\hat\beta_n = \left(X_n'X_n\right)^{-1}X_n'y_n$ denote the OLS estimate when $y_n$ is $n \times 1$ and $X_n$ is $n \times k$. A new observation $(y_{n+1}, x_{n+1})$ becomes available. Prove that the OLS estimate computed using this additional observation is
$$\hat\beta_{n+1} = \hat\beta_n + \frac{1}{1 + x_{n+1}'\left(X_n'X_n\right)^{-1}x_{n+1}}\left(X_n'X_n\right)^{-1}x_{n+1}\left(y_{n+1} - x_{n+1}'\hat\beta_n\right).$$

Solution: In this problem, we let $A = X_n'X_n$, $C = x_{n+1}$, $B = 1$, $D = x_{n+1}'$ in (2). Then
$$\left(X_{n+1}'X_{n+1}\right)^{-1} = \left(X_n'X_n\right)^{-1} - \frac{\left(X_n'X_n\right)^{-1}x_{n+1}x_{n+1}'\left(X_n'X_n\right)^{-1}}{1 + x_{n+1}'\left(X_n'X_n\right)^{-1}x_{n+1}}.$$
So
$$\begin{aligned}
\hat\beta_{n+1} - \hat\beta_n &= \left(X_{n+1}'X_{n+1}\right)^{-1}X_{n+1}'y_{n+1} - \left(X_n'X_n\right)^{-1}X_n'y_n \\
&= \left(X_{n+1}'X_{n+1}\right)^{-1}\left(X_n'y_n + x_{n+1}y_{n+1}\right) - \left(X_n'X_n\right)^{-1}X_n'y_n \\
&= \left(X_n'X_n\right)^{-1}x_{n+1}y_{n+1} - \frac{\left(X_n'X_n\right)^{-1}x_{n+1}x_{n+1}'\left(X_n'X_n\right)^{-1}}{1 + x_{n+1}'\left(X_n'X_n\right)^{-1}x_{n+1}}\left(X_n'y_n + x_{n+1}y_{n+1}\right) \\
&= \frac{\left(X_n'X_n\right)^{-1}x_{n+1}}{1 + x_{n+1}'\left(X_n'X_n\right)^{-1}x_{n+1}}\left[\left(1 + x_{n+1}'\left(X_n'X_n\right)^{-1}x_{n+1}\right)y_{n+1} - x_{n+1}'\hat\beta_n - x_{n+1}'\left(X_n'X_n\right)^{-1}x_{n+1}y_{n+1}\right] \\
&= \frac{1}{1 + x_{n+1}'\left(X_n'X_n\right)^{-1}x_{n+1}}\left(X_n'X_n\right)^{-1}x_{n+1}\left(y_{n+1} - x_{n+1}'\hat\beta_n\right).
\end{aligned}$$
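This is the standard recursive least squares update. A short Python check that the recursion reproduces the full-sample OLS estimate (arbitrary simulated data):

import numpy as np

rng = np.random.default_rng(7)
n, k = 50, 3
X = np.column_stack([np.ones(n + 1), rng.normal(size=(n + 1, k - 1))])
y = X @ rng.normal(size=k) + rng.normal(size=n + 1)

Xn, yn = X[:n], y[:n]
xnew, ynew = X[n], y[n]
An_inv = np.linalg.inv(Xn.T @ Xn)
beta_n = An_inv @ Xn.T @ yn

# recursive update with the new observation
gain = An_inv @ xnew / (1 + xnew @ An_inv @ xnew)
beta_rec = beta_n + gain * (ynew - xnew @ beta_n)

beta_full = np.linalg.solve(X.T @ X, X.T @ y)      # OLS on all n+1 observations
print(np.allclose(beta_rec, beta_full))            # True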

Exercise 16 For any predictor $g(x_i)$ for $y_i$, the mean absolute error (MAE) is $E\left[|y_i - g(x_i)|\right]$. Show that the function $g(x)$ which minimizes the MAE is the conditional median $m(x) = \mathrm{Med}(y_i \mid x_i = x)$. Based on your answer, can you provide any intuition for why (??) will pick out the $\tau$th conditional quantile of $y_i$?

Solution: For now, we assume the following condition: there is a version of the conditional density of $y_i$ (given $x_i = x$), $f(y|x)$, which is positive for any $y$. By the LIE, we have
$$E\left[|y_i - g(x_i)|\right] = E\left[E\left[|y_i - g(x_i)| \mid x_i = x\right]\right].$$
We show that for any given $x_i = x$, the median of the distribution of $y_i$ (conditional on $x_i = x$) is $\arg\min_a E\left[|y_i - a| \mid x_i = x\right]$. It holds that
$$E\left[|y_i - a| \mid x_i = x\right] = \int_{-\infty}^{\infty}|y - a|f(y|x)\,dy = \int_{-\infty}^{a}(a - y)f(y|x)\,dy + \int_{a}^{\infty}(y - a)f(y|x)\,dy.$$
By Leibniz's rule,$^2$ we have
$$\frac{d}{da}E\left[|y_i - a| \mid x_i = x\right] = \int_{-\infty}^{a}f(y|x)\,dy - \int_{a}^{\infty}f(y|x)\,dy = F(a|x) - [1 - F(a|x)] = 2F(a|x) - 1.$$
Thus, the first order condition is $F(a^*|x) = \frac{1}{2}$. Since the second order condition is satisfied, $\frac{d^2}{da^2}E\left[|y_i - a| \mid x_i = x\right] = 2f(a|x) > 0$, $a^*$ is a solution to our minimization problem, which is the median of the conditional distribution of $y_i$. This is true for all $x \in \mathbb{R}^k$, and so we have now shown that the minimizer of the MAE is the conditional median function.
Suppressing the conditioning on $x$, we want to show that $\arg\min_{\theta}E\left[\rho_{\tau}(y - \theta)\right]$ is the $\tau$th quantile of $y$. Differentiating $E\left[\rho_{\tau}(y - \theta)\right]$ with respect to $\theta$, we have
$$\frac{\partial E\left[\rho_{\tau}(y - \theta)\right]}{\partial\theta} = -E\left[\tau 1(y > \theta) - (1 - \tau)1(y \le \theta)\right] = -\tau P(y > \theta) + (1 - \tau)P(y \le \theta) = 0,$$
or
$$\frac{P(y \le \theta)}{P(y > \theta)} = \frac{\tau}{1 - \tau},$$
which implies that $\theta$ is the $\tau$th quantile of $y$.

$^2$ Leibniz's rule: let $\alpha(\cdot)$ and $\beta(\cdot)$ be differentiable functions and $f(\cdot, \cdot)$ be differentiable with $(\partial/\partial t)f(\cdot, t)$ continuous. If we define
$$\Phi(t) := \int_{\alpha(t)}^{\beta(t)}f(x, t)\,dx,$$
then
$$\frac{d}{dt}\Phi(t) = f(\beta(t), t)\,\beta'(t) - f(\alpha(t), t)\,\alpha'(t) + \int_{\alpha(t)}^{\beta(t)}f_t(x, t)\,dx.$$
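Both results can be illustrated numerically: over a grid of candidate values, the sample MAE is minimized near the sample median, and the check-function loss is minimized near the $\tau$th sample quantile. A Python sketch, assuming the check function $\rho_\tau(v) = v\left(\tau - 1\{v < 0\}\right)$ (the standard quantile-regression loss); the simulated distribution and grid are illustrative only.

import numpy as np

rng = np.random.default_rng(8)
y = rng.gamma(shape=2.0, scale=1.5, size=5000)     # a skewed sample
grid = np.linspace(y.min(), y.max(), 2000)

mae = np.array([np.mean(np.abs(y - a)) for a in grid])
print(grid[mae.argmin()], np.median(y))            # close to each other

tau = 0.75
def check_loss(v, tau):
    return v * (tau - (v < 0))                     # rho_tau(v)

qloss = np.array([np.mean(check_loss(y - a, tau)) for a in grid])
print(grid[qloss.argmin()], np.quantile(y, tau))   # close to each other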

Exercise 17 (i) Find the MLE by writing out the average concentrated log-likelihood function $\ell_n^c(\sigma^2)$, where $\beta$ is concentrated out first. (ii) Find the MLE by writing out the average concentrated log-likelihood function $\ell_n^c(\beta)$, where $\sigma^2$ is concentrated out first. (iii) Find the maximized value of the average log-likelihood function.

Solution: (i) It is not hard to see that $\hat\beta(\sigma^2) = \hat\beta_{OLS}$ does not depend on $\sigma^2$. Now,
$$\ell_n^c(\sigma^2) = -\frac{1}{2}\log(2\pi) - \frac{1}{2}\log\sigma^2 - \frac{1}{n}\sum_{i=1}^n\frac{\hat u_i^2}{2\sigma^2},$$
which has a unique interior maximum in $\sigma^2$, so $\hat\sigma^2$ is the solution to the FOC:
$$-\frac{1}{2\sigma^2} + \frac{1}{n}\sum_{i=1}^n\frac{\hat u_i^2}{2\sigma^4} = 0 \Longrightarrow \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n\hat u_i^2.$$
(ii) If we concentrate out $\sigma^2$, then from the FOC above, it is not hard to see that $\hat\sigma^2(\beta) = \frac{1}{n}\sum_{i=1}^n\left(y_i - x_i'\beta\right)^2 = \frac{1}{n}SSR(\beta)$. Now,
$$\ell_n^c(\beta) = -\frac{1}{2}\log(2\pi) - \frac{1}{2}\log\left(\frac{1}{n}SSR(\beta)\right) - \frac{1}{n}\sum_{i=1}^n\frac{\left(y_i - x_i'\beta\right)^2}{\frac{2}{n}SSR(\beta)} = -\frac{1}{2}\log(2\pi) + \frac{\log n}{2} - \frac{1}{2}\log SSR(\beta) - \frac{1}{2}.$$
Maximizing $\ell_n^c(\beta)$ is equivalent to minimizing $SSR(\beta)$, so $\hat\beta = \hat\beta_{OLS}$; as a result, $\hat\sigma^2 = \hat\sigma^2\left(\hat\beta\right) = \frac{1}{n}\sum_{i=1}^n\hat u_i^2$.
(iii) From (i) and (ii),
$$\ell_n^c\left(\hat\beta\right) = \ell_n^c\left(\hat\sigma^2\right) = -\frac{1}{2}\log(2\pi) + \frac{\log n}{2} - \frac{1}{2}\log SSR - \frac{1}{2},$$
where recall that $SSR \equiv SSR\left(\hat\beta\right)$.

Exercise 18 Show that $\frac{SSE}{\sigma^2} \sim \chi_{k-1}^2$ if $\beta_* = 0$, where $\beta_*$ is $\beta$ excluding the intercept.

Solution: From Exercise 4, $SSE = \hat\beta_*'\tilde X'\tilde X\hat\beta_*$. By the FWL theorem, $\hat\beta_* = \left(\tilde X'\tilde X\right)^{-1}\tilde X'y = \beta_* + \left(\tilde X'\tilde X\right)^{-1}\tilde X'u = \left(\tilde X'\tilde X\right)^{-1}\tilde X'u$, where the last equality is from $\beta_* = 0$. Since $\tilde X'u \sim N\left(0, \sigma^2\tilde X'\tilde X\right)$, we have
$$\frac{SSE}{\sigma^2} = \frac{u'\tilde X\left(\tilde X'\tilde X\right)^{-1}\tilde X'\tilde X\left(\tilde X'\tilde X\right)^{-1}\tilde X'u}{\sigma^2} = \frac{u'\tilde X\left(\tilde X'\tilde X\right)^{-1}\tilde X'u}{\sigma^2} \sim \chi_{k-1}^2.$$

Exercise 19 Suppose $y = X\beta + u$, where $u \sim N(0, D)$ with $D$ being a known diagonal matrix. Show that $WSSR_U \equiv \left(y - X\hat\beta\right)'D^{-1}\left(y - X\hat\beta\right) \sim \chi_{n-k}^2$, where $\hat\beta = \left(X'D^{-1}X\right)^{-1}X'D^{-1}y$ is the GLS estimator.

Solution: First,
$$\hat\beta - \beta = \left(X'D^{-1}X\right)^{-1}X'D^{-1}u,$$
so
$$y - X\hat\beta = u - X\left(\hat\beta - \beta\right) = \left[I - X\left(X'D^{-1}X\right)^{-1}X'D^{-1}\right]u.$$
As a result,
$$\begin{aligned}
WSSR_U &= u'\left[I - D^{-1}X\left(X'D^{-1}X\right)^{-1}X'\right]D^{-1}\left[I - X\left(X'D^{-1}X\right)^{-1}X'D^{-1}\right]u \\
&= u'D^{-1/2}\left[I - D^{-1/2}X\left(X'D^{-1}X\right)^{-1}X'D^{-1/2}\right]D^{-1/2}u,
\end{aligned}$$
where $D^{-1/2}u \sim N(0, I)$. Since $I - D^{-1/2}X\left(X'D^{-1}X\right)^{-1}X'D^{-1/2}$ is idempotent with rank $n - k$ (which is also its trace), there exists an orthogonal matrix $H$ such that
$$I - D^{-1/2}X\left(X'D^{-1}X\right)^{-1}X'D^{-1/2} = H\begin{pmatrix}I_{n-k} & 0 \\ 0 & 0\end{pmatrix}H',$$
so
$$WSSR_U = v'\begin{pmatrix}I_{n-k} & 0 \\ 0 & 0\end{pmatrix}v \sim \chi_{n-k}^2,$$
where $v = H'D^{-1/2}u \sim N(0, I)$.
Exercise 20 Show that
$$\begin{aligned}
SSR_R - SSR_U &= \left(\hat\beta - \tilde\beta\right)'X'X\left(\hat\beta - \tilde\beta\right) = \left(R'\hat\beta - c\right)'\left[R'(X'X)^{-1}R\right]^{-1}\left(R'\hat\beta - c\right) \\
&= \hat\lambda'R'(X'X)^{-1}R\hat\lambda = \left(y - X\tilde\beta\right)'P\left(y - X\tilde\beta\right).
\end{aligned}$$
(Hint: Use the results in Exercise 26 of Chapter 2.)

Solution: Recall that
$$\tilde\beta = \hat\beta - (X'X)^{-1}R\left[R'(X'X)^{-1}R\right]^{-1}\left(R'\hat\beta - c\right), \tag{3}$$
and
$$\hat\lambda = \left[R'(X'X)^{-1}R\right]^{-1}\left(R'\hat\beta - c\right). \tag{4}$$
Define $\tilde u = y - X\tilde\beta$, which can be expressed as $y - X\hat\beta + X\left(\hat\beta - \tilde\beta\right) = \hat u + X\left(\hat\beta - \tilde\beta\right)$. Now,
$$\begin{aligned}
SSR_R - SSR_U &= \left[\hat u + X\left(\hat\beta - \tilde\beta\right)\right]'\left[\hat u + X\left(\hat\beta - \tilde\beta\right)\right] - \hat u'\hat u \\
&= \hat u'\hat u + \left(\hat\beta - \tilde\beta\right)'X'X\left(\hat\beta - \tilde\beta\right) - \hat u'\hat u \\
&= \left(\hat\beta - \tilde\beta\right)'X'X\left(\hat\beta - \tilde\beta\right) \\
&= \left(R'\hat\beta - c\right)'\left[R'(X'X)^{-1}R\right]^{-1}\left(R'\hat\beta - c\right) \\
&= \hat\lambda'R'(X'X)^{-1}R\hat\lambda \\
&= \tilde u'P\tilde u,
\end{aligned}$$
where the second equality follows from $X'\hat u = 0$, the fourth equality is from substituting the formula of $\tilde\beta$ in (3), the fifth equality is from substituting the formula of $\hat\lambda$ in (4), and the last equality is from the FOC for $\tilde\beta$, $X'\tilde u = R\hat\lambda$.
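All four expressions can be computed side by side. A Python sketch (numpy; $R$, $c$ and the data are arbitrary, and since the equalities are purely algebraic, $c$ need not equal $R'\beta$):

import numpy as np

rng = np.random.default_rng(9)
n, k, q = 60, 4, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ rng.normal(size=k) + rng.normal(size=n)
R, c = rng.normal(size=(k, q)), rng.normal(size=q)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                                       # unrestricted OLS
RXR_inv = np.linalg.inv(R.T @ XtX_inv @ R)
lam = RXR_inv @ (R.T @ b - c)                               # eq. (4)
b_t = b - XtX_inv @ R @ lam                                 # eq. (3), RLS

u_hat, u_til = y - X @ b, y - X @ b_t
P = X @ XtX_inv @ X.T
vals = [u_til @ u_til - u_hat @ u_hat,
        (b - b_t) @ X.T @ X @ (b - b_t),
        (R.T @ b - c) @ RXR_inv @ (R.T @ b - c),
        lam @ R.T @ XtX_inv @ R @ lam,
        u_til @ P @ u_til]
print(np.allclose(vals, vals[0]))                           # all equal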

Exercise 21 In the setup of Exercise 19, show that $F$ with $R^2$ replaced by Buse (1973)'s $R^2$,
$$R^2 = \frac{WSSR_R - WSSR_U}{WSSR_R},$$
follows $F_{k-1,n-k}$ under the null (ii) above, where $WSSR_R = y'D^{-1}y - \frac{\left(y'D^{-1}\mathbf{1}\right)^2}{\mathbf{1}'D^{-1}\mathbf{1}}$.

Solution: Note that
$$F = \frac{R^2/(k-1)}{(1 - R^2)/(n-k)} = \frac{(WSSR_R - WSSR_U)/(k-1)}{WSSR_U/(n-k)},$$
so as long as we can show that $WSSR_R - WSSR_U \sim \chi_{k-1}^2$ and is independent of $WSSR_U$, we are done by combining the results in Exercise 19. First, under the null, $y = \mathbf{1}\beta_1 + u$, so $\hat\beta_{1R} = \frac{\mathbf{1}'D^{-1}y}{\mathbf{1}'D^{-1}\mathbf{1}}$, and
$$\begin{aligned}
WSSR_R &= \left(y - \mathbf{1}\hat\beta_{1R}\right)'D^{-1}\left(y - \mathbf{1}\hat\beta_{1R}\right) \\
&= y'D^{-1}y - \frac{\left(y'D^{-1}\mathbf{1}\right)^2}{\mathbf{1}'D^{-1}\mathbf{1}} \\
&= \beta_1^2\mathbf{1}'D^{-1}\mathbf{1} + 2\beta_1\mathbf{1}'D^{-1}u + u'D^{-1}u - \beta_1^2\mathbf{1}'D^{-1}\mathbf{1} - 2\beta_1\mathbf{1}'D^{-1}u - \frac{\left(u'D^{-1}\mathbf{1}\right)^2}{\mathbf{1}'D^{-1}\mathbf{1}} \\
&= u'D^{-1}u - \frac{\left(u'D^{-1}\mathbf{1}\right)^2}{\mathbf{1}'D^{-1}\mathbf{1}}.
\end{aligned}$$
Now,
$$\begin{aligned}
WSSR_R - WSSR_U &= u'D^{-1/2}\left[D^{-1/2}X\left(X'D^{-1}X\right)^{-1}X'D^{-1/2}\right]D^{-1/2}u - \frac{\left(u'D^{-1}\mathbf{1}\right)^2}{\mathbf{1}'D^{-1}\mathbf{1}} \\
&= u'D^{-1/2}\left[D^{-1/2}X\left(X'D^{-1}X\right)^{-1}X'D^{-1/2} - D^{-1/2}\mathbf{1}\left(\mathbf{1}'D^{-1}\mathbf{1}\right)^{-1}\mathbf{1}'D^{-1/2}\right]D^{-1/2}u.
\end{aligned}$$
Check that the matrix in the square bracket is idempotent with rank $k - 1$ and
$$\left[D^{-1/2}X\left(X'D^{-1}X\right)^{-1}X'D^{-1/2} - D^{-1/2}\mathbf{1}\left(\mathbf{1}'D^{-1}\mathbf{1}\right)^{-1}\mathbf{1}'D^{-1/2}\right]\left[I - D^{-1/2}X\left(X'D^{-1}X\right)^{-1}X'D^{-1/2}\right] = 0$$
(the key step is to show that
$$\left[D^{-1/2}X\left(X'D^{-1}X\right)^{-1}X'D^{-1/2}\right]\left[D^{-1/2}\mathbf{1}\left(\mathbf{1}'D^{-1}\mathbf{1}\right)^{-1}\mathbf{1}'D^{-1/2}\right] = D^{-1/2}\mathbf{1}\left(\mathbf{1}'D^{-1}\mathbf{1}\right)^{-1}\mathbf{1}'D^{-1/2},$$
which follows by noticing that $\left(X'D^{-1}X\right)^{-1}X'D^{-1}\mathbf{1} = e_1$, the first standard basis vector of $\mathbb{R}^k$), so $WSSR_R - WSSR_U \sim \chi_{k-1}^2$ and is independent of $WSSR_U$.
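The two matrix facts invoked above (idempotency with rank $k-1$, and the product of the two weighted projections) are easy to confirm numerically. A Python sketch, assuming the first column of $X$ is the constant (as the $e_1$ argument requires); $D$ is an arbitrary known diagonal matrix.

import numpy as np

rng = np.random.default_rng(10)
n, k = 40, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
ones = np.ones((n, 1))
D = np.diag(0.5 + rng.uniform(size=n))              # known diagonal variance matrix
Dm12 = np.diag(1 / np.sqrt(np.diag(D)))             # D^{-1/2}
D_inv = Dm12 @ Dm12

PX = Dm12 @ X @ np.linalg.solve(X.T @ D_inv @ X, X.T) @ Dm12     # weighted projection on X
P1 = Dm12 @ ones @ ones.T @ Dm12 / (ones.T @ D_inv @ ones)       # weighted projection on 1
B = PX - P1

print(np.allclose(B @ B, B), np.isclose(np.trace(B), k - 1))     # idempotent, rank k-1
print(np.allclose(PX @ P1, P1))                                  # key step
print(np.allclose(B @ (np.eye(n) - PX), np.zeros((n, n))))       # independence condition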

Exercise 22 Consider a special case of $F$ where $R = I_k$, i.e., we want to test whether all coefficients are zero. Show that (i) $kF = \frac{y'y - \hat u'\hat u}{s^2}$; (ii) if $\sigma^2(X'X)^{-1} = I_k$ is known, $kF = \hat\beta'\hat\beta$; (iii) if $s^2\left(\frac{1}{n}X'X\right)^{-1} = I_k$, then $\hat\beta'\hat\beta = \frac{n-k}{n}\frac{R_U^2}{1 - R_U^2}$.

Solution: (i) This is from the likelihood-ratio principle, where $SSR_R = y'y$ and $SSR_U = \hat u'\hat u$. (ii) This is from the Wald principle, where $s^2R'(X'X)^{-1}R$ is replaced by $I_k$. (iii) This is from
$$\hat\beta'\left[s^2\left(\tfrac{1}{n}X'X\right)^{-1}\right]^{-1}\hat\beta\Big/k = \frac{F}{n} = \frac{n-k}{nk}\frac{R_U^2}{1 - R_U^2}.$$
In the population, if $\sigma^2 E[xx']^{-1} = I_k$, then $\beta'\beta = \frac{R^2}{1 - R^2}$, where $R^2 = 1 - \sigma^2/\sigma_y^2$ with $\sigma_y^2 = E[y^2]$ is the probability limit of $R_U^2$.

Exercise 23 If $E[\beta] = \beta_0$ and $E[y] = y_0$ in the above theorem, what is the minimum-variance estimator of $\beta$?

Solution: We try to find an estimator $\hat\beta$ which minimizes $E\left[\Vert\hat\beta - \beta\Vert^2\right]$ such that $E\left[\hat\beta - \beta\right] = 0$. Note that
$$E\left[\Vert\hat\beta - \beta\Vert^2\right] = E\left[\Vert(\hat\beta - \beta_0) - (\beta - \beta_0)\Vert^2\right],$$
so if we substitute $\hat\beta - \beta_0$, $\beta - \beta_0$, $y - y_0$ for $\hat\beta$, $\beta$, $y$ in the original problem, all the conditions in the original problem hold. Thus the solution should be
$$\hat\beta - \beta_0 = E\left[(\beta - \beta_0)(y - y_0)'\right]E\left[(y - y_0)(y - y_0)'\right]^{-1}(y - y_0),$$
and $\hat\beta$ can be expressed in the form of $Ky + b$.

Exercise 24 Suppose $\beta \in \mathbb{R}^2$, $E[\beta] = 0$, and $E[\beta\beta'] = I_2$. If we observe a single perfect measurement $y$ of the first component $\beta_1$ of $\beta$, what is the minimum-variance estimator of $\beta$?

Solution: In the setup of the theorem,
$$y = (1, 0)\beta + 0,$$
i.e., $X = (1, 0)$ and the measurement noise is zero. So
$$\hat\beta = E[\beta\beta']X'\left[XE[\beta\beta']X' + 0\right]^{-1}y = \begin{pmatrix}1 \\ 0\end{pmatrix}\left[(1, 0)\begin{pmatrix}1 \\ 0\end{pmatrix}\right]^{-1}y = \begin{pmatrix}y \\ 0\end{pmatrix},$$
i.e., the first coordinate of $\hat\beta$ is the perfect measurement of $\beta_1$ and the second coordinate is 0.

Exercise 25 Show that the minimum-variance linear estimator of $T\beta$ is $T\hat\beta$, where $T \in \mathbb{R}^{p\times k}$.

Solution: If $\Lambda y$ is the optimal estimate of $T\beta$, we must have
$$E\left[(T\beta - \Lambda y)\,y'\right] = 0,$$
and thus the normal equations for $\Lambda$ are
$$\Lambda E\left[yy'\right] = TE\left[\beta y'\right],$$
so that
$$\Lambda = TE\left[\beta y'\right]E\left[yy'\right]^{-1},$$
which, by comparison with the minimum-variance estimator of $\beta$, yields the desired result.

Exercise 26 Show that if $\hat\beta$ is the minimum-variance linear estimator of $\beta$, then it also minimizes $E\left[\left(\beta - \hat\beta\right)'P\left(\beta - \hat\beta\right)\right]$ for any $P > 0$.

Solution: Let $P^{1/2}$ be the unique positive-semidefinite square root of $P$. According to Exercise 25, $P^{1/2}\hat\beta$ is the minimum-variance estimator of $P^{1/2}\beta$, and thus $\hat\beta$ minimizes
$$E\left[\Vert P^{1/2}\beta - P^{1/2}\hat\beta\Vert^2\right] = E\left[\left(\beta - \hat\beta\right)'P\left(\beta - \hat\beta\right)\right].$$

Exercise 27 Show that the ARMA(p, q) model can be expressed as an $r(= \max(p, q+1))$-dimensional dynamic model of a random process.

Solution: The ARMA(p, q) model is defined as
$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + u_t + \theta_1 u_{t-1} + \cdots + \theta_q u_{t-q}.$$
Its state-space representation is (i) the state equation,
$$\mathbf{y}(t+1) = \begin{pmatrix}\phi_1 & \phi_2 & \cdots & \phi_{r-1} & \phi_r \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0\end{pmatrix}\mathbf{y}(t) + \begin{pmatrix}u_{t+1} \\ 0 \\ 0 \\ \vdots \\ 0\end{pmatrix},$$
where $r = \max(p, q+1)$ and $\phi_j = 0$ for $j > p$; (ii) the observation equation,
$$y(t) = (1, \theta_1, \ldots, \theta_{r-1})\,\mathbf{y}(t),$$
where $\theta_j = 0$ for $j > q$. To verify that these two equations indeed describe the ARMA(p, q) process, let $y_j(t)$ denote the $j$th element of $\mathbf{y}(t)$. The second row of the state equation asserts that
$$y_2(t+1) = y_1(t),$$
the third row asserts that
$$y_3(t+1) = y_2(t) = y_1(t-1),$$
and in general, the $j$th row implies that
$$y_j(t+1) = L^{j-1}y_1(t+1),$$
where $L$ is the lag operator. Thus the first row of the state equation implies that
$$y_1(t+1) = \left(\phi_1 + \phi_2 L + \cdots + \phi_r L^{r-1}\right)y_1(t) + u_{t+1}, \tag{5}$$
or
$$\left(1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_r L^r\right)y_1(t+1) = u_{t+1}.$$
The observation equation states that
$$y(t) = \left(1 + \theta_1 L + \cdots + \theta_{r-1}L^{r-1}\right)y_1(t). \tag{6}$$
Multiplying (6) by $1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_r L^r$ and using (5) gives
$$\left(1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_r L^r\right)y(t) = \left(1 + \theta_1 L + \cdots + \theta_{r-1}L^{r-1}\right)u_t,$$
which is exactly the ARMA(p, q) model.
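The equivalence can be checked by pushing the same innovations through both representations. A Python sketch for an ARMA(2,1), so $r = \max(p, q+1) = 2$; the coefficient values are arbitrary, and because the two recursions start from slightly different initial conditions, only the tails (after the transient has died out under the stable AR part) are compared.

import numpy as np

rng = np.random.default_rng(11)
phi, theta = np.array([0.5, -0.3]), np.array([0.4])   # ARMA(2,1)
p, q = len(phi), len(theta)
r = max(p, q + 1)
T = 200
u = rng.normal(size=T + r)

# direct ARMA recursion
y = np.zeros(T + r)
for t in range(max(p, q), T + r):
    y[t] = phi @ y[t - p:t][::-1] + u[t] + theta @ u[t - q:t][::-1]

# state-space (companion-form) recursion
F = np.zeros((r, r))
F[0, :p] = phi
F[1:, :-1] = np.eye(r - 1)
obs = np.concatenate(([1.0], theta, np.zeros(r - 1 - q)))
state = np.zeros(r)
y_ss = np.zeros(T + r)
for t in range(T + r):
    y_ss[t] = obs @ state                              # observation equation
    if t + 1 < T + r:                                  # state equation
        state = F @ state + np.r_[u[t + 1], np.zeros(r - 1)]

# initialization differences decay geometrically, so the tails coincide
print(np.allclose(y[-50:], y_ss[-50:], atol=1e-6))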

Exercise 28 Show that
$$\hat{\mathbf{y}}(t \mid \tau) = \Phi(t-1)\Phi(t-2)\cdots\Phi(\tau+1)\,\hat{\mathbf{y}}(\tau+1 \mid \tau)$$
for $t > \tau + 1$.

Solution: Recursively applying $\mathbf{y}(t+1) = \Phi(t)\mathbf{y}(t) + \mathbf{u}(t)$, we have
$$\begin{aligned}
\mathbf{y}(t) &= \Phi(t-1)\mathbf{y}(t-1) + \mathbf{u}(t-1) \\
&= \Phi(t-1)\Phi(t-2)\mathbf{y}(t-2) + \Phi(t-1)\mathbf{u}(t-2) + \mathbf{u}(t-1) \\
&= \cdots \\
&= \Phi(t-1)\Phi(t-2)\cdots\Phi(\tau+1)\mathbf{y}(\tau+1) \\
&\quad + \Phi(t-1)\cdots\Phi(\tau+2)\mathbf{u}(\tau+1) + \cdots + \Phi(t-1)\mathbf{u}(t-2) + \mathbf{u}(t-1).
\end{aligned}$$
By Exercise ??, the optimal estimate of $\Phi(t-1)\cdots\Phi(\tau+1)\mathbf{y}(\tau+1)$ is $\Phi(t-1)\cdots\Phi(\tau+1)\hat{\mathbf{y}}(\tau+1 \mid \tau)$, and since $\{\mathbf{u}(s)\}_{s=\tau+1}^{t-1}$ is orthogonal to $Y(\tau)$, the optimal estimate of $\mathbf{y}(t)$ is
$$\hat{\mathbf{y}}(t \mid \tau) = \Phi(t-1)\Phi(t-2)\cdots\Phi(\tau+1)\,\hat{\mathbf{y}}(\tau+1 \mid \tau).$$

Exercise 29 (i) Show that (??) can be re-written as
$$\hat{\mathbf{y}}(t \mid t) = \Phi(t-1)\hat{\mathbf{y}}(t-1 \mid t-1) + \Sigma(t \mid t)M'(t)Q(t)^{-1}\left[y(t) - M(t)\Phi(t-1)\hat{\mathbf{y}}(t-1 \mid t-1)\right].$$
(ii) If there is a control vector $x_t$ such that
$$\mathbf{y}(t+1) = \Phi(t)\mathbf{y}(t) + H(t)x(t) + \mathbf{u}(t), \quad t = 0, 1, 2, \ldots,$$
show that
$$\begin{aligned}
\hat{\mathbf{y}}(t \mid t) = \ &\Phi(t-1)\hat{\mathbf{y}}(t-1 \mid t-1) + H(t-1)x(t-1) + \Sigma(t \mid t)M'(t)Q(t)^{-1} \\
&\times\left[y(t) - M(t)\Phi(t-1)\hat{\mathbf{y}}(t-1 \mid t-1) - M(t)H(t-1)x(t-1)\right],
\end{aligned}$$
where $\hat{\mathbf{y}}(t \mid t)$ is the minimum-variance estimator of $\mathbf{y}(t)$ given $Y(t)$ and $\{x(s)\}_{s=0}^{t-1}$.

Solution: (i) If we can show
$$\Sigma(t \mid t)M'(t)Q(t)^{-1} = \Sigma(t)M'(t)\left[M(t)\Sigma(t)M(t)' + Q(t)\right]^{-1},$$
then we are done. By the expression of $\Sigma(t \mid t)$, we have
$$\begin{aligned}
\Sigma(t \mid t)M'(t)Q(t)^{-1} &= \left\{\Sigma(t) - \Sigma(t)M'(t)\left[M(t)\Sigma(t)M(t)' + Q(t)\right]^{-1}M(t)\Sigma(t)\right\}M'(t)Q(t)^{-1} \\
&= \Sigma(t)M'(t)\left\{Q(t)^{-1} - \left[M(t)\Sigma(t)M(t)' + Q(t)\right]^{-1}M(t)\Sigma(t)M'(t)Q(t)^{-1}\right\} \\
&= \Sigma(t)M'(t)\left[M(t)\Sigma(t)M(t)' + Q(t)\right]^{-1}\left\{\left[M(t)\Sigma(t)M(t)' + Q(t)\right]Q(t)^{-1} - M(t)\Sigma(t)M'(t)Q(t)^{-1}\right\} \\
&= \Sigma(t)M'(t)\left[M(t)\Sigma(t)M(t)' + Q(t)\right]^{-1}.
\end{aligned}$$
(ii) When the control vector is present, the minimum-variance estimator of $\mathbf{y}(t)$ given $Y(t-1)$ and $\{x(s)\}_{s=0}^{t-1}$ is $\Phi(t-1)\hat{\mathbf{y}}(t-1 \mid t-1) + H(t-1)x(t-1)$, and then applying Example ??, we have the formula for $\hat{\mathbf{y}}(t \mid t)$, where note that the recursive formulae for $\Sigma(t \mid t)$ and $\Sigma(t)$ do not change.
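The matrix identity in part (i) is the usual equivalence between the two forms of the Kalman gain: with $\Sigma(t|t) = \Sigma(t) - \Sigma(t)M'\left[M\Sigma(t)M' + Q\right]^{-1}M\Sigma(t)$, one has $\Sigma(t|t)M'Q^{-1} = \Sigma(t)M'\left[M\Sigma(t)M' + Q\right]^{-1}$. A Python sketch with arbitrary positive-definite $\Sigma(t)$ and $Q$ (here $M$ plays the role of the observation matrix and $Q$ the observation-noise covariance):

import numpy as np

rng = np.random.default_rng(12)
r, m = 4, 2                                   # state and observation dimensions
A = rng.normal(size=(r, r))
Sigma = A @ A.T + np.eye(r)                   # Sigma(t), prior state covariance
B = rng.normal(size=(m, m))
Q = B @ B.T + np.eye(m)                       # observation noise covariance
M = rng.normal(size=(m, r))                   # observation matrix

S = M @ Sigma @ M.T + Q                       # innovation covariance
Sigma_tt = Sigma - Sigma @ M.T @ np.linalg.solve(S, M @ Sigma)   # posterior covariance

gain1 = Sigma_tt @ M.T @ np.linalg.inv(Q)
gain2 = Sigma @ M.T @ np.linalg.inv(S)
print(np.allclose(gain1, gain2))              # True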
