Reading 5 A
$$y_i = \beta_0 + \beta_1 x_i + e_i, \quad i = 1, \dots, n \tag{5}$$
$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x$$
In this lecture we are concerned with the fitted values, the residuals, and
the hat matrix, their properties and usage.
Fitted values I
$$\hat y_i = \sum_{k=1}^{n} \left[ \frac{1}{n} + \frac{(x_i - \bar x)(x_k - \bar x)}{S_{xx}} \right] y_k = \sum_{k=1}^{n} h_{ik}\, y_k \tag{6}$$
Show (6)
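One possible route to (6) is sketched below. It assumes the usual least-squares slope, $\hat\beta_1 = \sum_k (x_k - \bar x) y_k / S_{xx}$, which is not restated in this section, together with $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$ from above.

```latex
\begin{align*}
\hat y_i &= \hat\beta_0 + \hat\beta_1 x_i
          = \bar y + \hat\beta_1 (x_i - \bar x) \\
         &= \sum_{k=1}^{n} \frac{1}{n}\, y_k
            + (x_i - \bar x) \sum_{k=1}^{n} \frac{(x_k - \bar x)}{S_{xx}}\, y_k \\
         &= \sum_{k=1}^{n} \left[ \frac{1}{n}
            + \frac{(x_i - \bar x)(x_k - \bar x)}{S_{xx}} \right] y_k
          = \sum_{k=1}^{n} h_{ik}\, y_k .
\end{align*}
```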
The distribution of ŷi is described by the following:
3 For any i, j,
$$\operatorname{cov}(\hat y_i, \hat y_j) = \sigma^2 \left[ \frac{1}{n} + \frac{(x_i - \bar x)(x_j - \bar x)}{S_{xx}} \right] = \sigma^2 h_{ij}$$
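4 If the errors are normally distributed, i.e. $e_i \overset{iid}{\sim} N(0, \sigma^2)$, then each $\hat y_i$ is normally distributed as well, being a linear combination of the normal $y_k$.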
Define, for i, j = 1, ..., n,
$$h_{ij} = \frac{1}{n} + \frac{(x_i - \bar x)(x_j - \bar x)}{S_{xx}}$$
Since $\hat y_i = \sum_{k=1}^{n} h_{ik} y_k$ for all i = 1, ..., n, we have, in vector notation,
$$\hat y = H y$$
where H is the n × n matrix with entries $h_{ij}$.
H is called the hat matrix, since “it puts the hat on y”, or the projection
matrix, for geometric reasons to be discussed later.
Hat Matrix II
Example 18
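In the SLR model, take n = 4 and $x^\top = (x_1, x_2, x_3, x_4) = (1, -1, 2, 2)$.
The model matrix is
$$X = \begin{pmatrix} \mathbf{1} & x \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 2 \\ 1 & 2 \end{pmatrix}$$
We have $\bar x = 1$, $S_{xx} = 6$.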
Example 20
Consider the example above, with n = 4 and $x^\top = (1, -1, 2, 2)$.
  (Intercept) xx
1           1  1
2           1 -1
3           1  2
4           1  2
attr(,"assign")
[1] 0 1
Hat Matrix IV
1 2 3 4
1 0.25 0.2500 0.2500 0.2500
2 0.25 0.9167 -0.0833 -0.0833
3 0.25 -0.0833 0.4167 0.4167
4 0.25 -0.0833 0.4167 0.4167
> H %*% H
1 2 3 4
1 0.25 0.2500 0.2500 0.2500
2 0.25 0.9167 -0.0833 -0.0833
3 0.25 -0.0833 0.4167 0.4167
4 0.25 -0.0833 0.4167 0.4167
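The matrix printed above can be reproduced from the entry formula $h_{ij} = 1/n + (x_i - \bar x)(x_j - \bar x)/S_{xx}$. The following is a minimal sketch in base R; the variable names are illustrative, and the comparison with $X (X^\top X)^{-1} X^\top$ relies on the standard matrix form of the hat matrix, which this section does not state.

```r
# Design points from the example: n = 4, x = (1, -1, 2, 2)
x <- c(1, -1, 2, 2)
n <- length(x)
xbar <- mean(x)               # 1, as in the example
Sxx <- sum((x - xbar)^2)      # 6, as in the example

# Entry formula: h_ij = 1/n + (x_i - xbar)(x_j - xbar) / Sxx
H <- 1 / n + outer(x - xbar, x - xbar) / Sxx

# Standard matrix form X (X'X)^{-1} X' (assumption: not given in this section)
X <- cbind(1, x)
H_mat <- X %*% solve(t(X) %*% X) %*% t(X)

print(round(H, 4))
# Both constructions should give the same matrix
print(all.equal(H, H_mat, check.attributes = FALSE))
# Multiplying H by itself should return H, as the output above shows
print(all.equal(H %*% H, H))
```

Because H %*% H returns H, applying the hat matrix twice changes nothing: the fitted values of the fitted values are the fitted values themselves.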
$$\hat y \sim \mathrm{MVN}(X\beta, \sigma^2 H)$$
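A short sketch of where the mean and covariance come from. It assumes the matrix form of the model, $y = X\beta + e$ with $\operatorname{cov}(e) = \sigma^2 I$, and the properties $HX = X$, $H^\top = H$ and $HH = H$; the last is what the numerical example illustrates, while the first two are standard facts not derived in this section.

```latex
\begin{align*}
E(\hat y) &= E(Hy) = H X \beta = X \beta, \\
\operatorname{cov}(\hat y) &= H \operatorname{cov}(y) H^\top
  = \sigma^2 H H^\top = \sigma^2 H H = \sigma^2 H .
\end{align*}
```

Normality of $\hat y$ then follows because $\hat y = Hy$ is a linear transformation of the normal vector $y$.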
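$$E(y_i) = \beta_0 + \beta_1 \cdot 15.11179$$
We estimate it by
$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$$
So, for MA, $\hat y_i = 215.5162 + 25.2530 \cdot 15.11179 = 597.1342$. The variance is
$$\operatorname{var}(\hat y_i) = \sigma^2 \left[ \frac{1}{n} + \frac{(x_i - \bar x)^2}{S_{xx}} \right] = \sigma^2 h_{ii}$$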
A 95% confidence interval for the mean at $x = x_i$, if σ is known, is
$$\hat y_i \pm 1.96\, \sigma \sqrt{h_{ii}} = \hat y_i \pm 1.96\, \sigma \sqrt{\frac{1}{n} + \frac{(x_i - \bar x)^2}{S_{xx}}}$$
Hua Liang (GWU) 2118-M 252 /
Estimating the mean response II
where $q_{.025}$ is the quantile of the $t_{n-2}$ distribution that leaves probability .025 in the upper tail.
Data for state AA are not available in the dataset. Suppose we learn that
the purchase rate in state AA is $x^* = 17.2$; what is the expected Fuel for
state AA, based on our analysis?
$$E(y \mid x = x^*) = \beta_0 + \beta_1 x^*$$
estimated by
$$\hat\mu^* = \hat\beta_0 + \hat\beta_1 x^*$$
For state AA, $\hat\mu^* = 215.5162 + 25.2530 \times 17.2 = 649.8678$, with 95% CI
649.8678 ± 32.232.
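The interval calculation can be sketched in R. The fuel data are not reproduced in this section, so the example below uses simulated data (hypothetical values chosen only for illustration). It also assumes the interval has the usual form $\hat\mu^* \pm q_{.025}\, \hat\sigma \sqrt{1/n + (x^* - \bar x)^2 / S_{xx}}$, with $\hat\sigma$ the residual standard error on n − 2 degrees of freedom, and compares the hand computation with predict().

```r
# Simulated data (hypothetical; NOT the fuel data from the lecture)
set.seed(1)
n <- 51
x <- runif(n, 12, 18)
y <- 200 + 25 * x + rnorm(n, sd = 80)

fit <- lm(y ~ x)
b <- coef(fit)
xstar <- 17.2

# Point estimate of the mean response at x*
mu_hat <- unname(b[1] + b[2] * xstar)

# Half-width: q * sigma_hat * sqrt(1/n + (x* - xbar)^2 / Sxx)
sigma_hat <- summary(fit)$sigma      # residual standard error, n - 2 df
hstar <- 1 / n + (xstar - mean(x))^2 / sum((x - mean(x))^2)
q <- qt(0.975, df = n - 2)           # leaves .025 in the upper tail
ci_manual <- mu_hat + c(-1, 1) * q * sigma_hat * sqrt(hstar)

# The same interval from predict()
ci_predict <- predict(fit, newdata = data.frame(x = xstar),
                      interval = "confidence", level = 0.95)
print(ci_manual)
print(ci_predict[1, c("lwr", "upr")])
```

Evaluating the same expression over a grid of x* values gives the confidence band.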
We can draw this 95% CI for each value of x ∗ and get a confidence band
(see figure).
[Figure: 95% confidence band for the mean response. Vertical axis: Fuel, 300 to 800; horizontal axis: 12 to 18.]
Prediction I
How can we predict the true value $y^*$? Is this the same question as
estimating the expected value $E(y \mid x = x^*)$?
We know that
$$y^* = \beta_0 + \beta_1 x^* + e^*$$
where $e^* \sim N(0, \sigma^2)$ is a new error, independent of the observed data.
We estimate $\beta_0, \beta_1$ by $\hat\beta_0, \hat\beta_1$, and therefore estimate $y^*$ by
$\hat y^* = \hat\beta_0 + \hat\beta_1 x^*$. (Note that $\hat y^* = \hat\mu^*$.)
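What is the error of this prediction? The prediction error is
$$y^* - \hat y^* = (\beta_0 - \hat\beta_0) + (\beta_1 - \hat\beta_1) x^* + e^*$$
Since $\hat\beta_0$ and $\hat\beta_1$ are unbiased and $E(e^*) = 0$, its expectation is zero,
which is good!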
Prediction II
Show that the variance of the prediction error is
$$\operatorname{var}(y^* - \hat y^*) = \dots = \sigma^2 \left[ 1 + \frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}} \right]$$
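One way to fill in the dots is sketched below. It uses the independence of the new error $e^*$ from the observed data, so that $y^*$ and $\hat y^*$ are uncorrelated, and the variance of the estimated mean response with $x_i$ replaced by $x^*$.

```latex
\begin{align*}
\operatorname{var}(y^* - \hat y^*)
  &= \operatorname{var}(y^*) + \operatorname{var}(\hat y^*)
     - 2 \operatorname{cov}(y^*, \hat y^*) \\
  &= \sigma^2 + \sigma^2 \left[ \frac{1}{n}
     + \frac{(x^* - \bar x)^2}{S_{xx}} \right] - 0 \\
  &= \sigma^2 \left[ 1 + \frac{1}{n}
     + \frac{(x^* - \bar x)^2}{S_{xx}} \right].
\end{align*}
```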
As with the 95% CI and confidence band for the mean, we can compute a
95% prediction interval (and prediction band):
$$\hat y^* \pm 1.96\, \sigma \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}}}$$
For state AA, the 95% prediction interval is 649.8678 ± 166.9234.
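A minimal R sketch of the prediction interval, again on simulated data (hypothetical values, not the fuel data). Unlike the known-σ formula above, it assumes σ is estimated by the residual standard error and replaces 1.96 by the $t_{n-2}$ quantile, which is what predict() does; it also checks that the prediction interval is wider than the confidence interval for the mean at the same x*.

```r
# Simulated data (hypothetical; NOT the fuel data from the lecture)
set.seed(2)
n <- 51
x <- runif(n, 12, 18)
y <- 200 + 25 * x + rnorm(n, sd = 80)

fit <- lm(y ~ x)
xstar <- 17.2
new <- data.frame(x = xstar)

y_hat <- unname(predict(fit, newdata = new))   # same value as mu_hat*
sigma_hat <- summary(fit)$sigma                # estimate of sigma, n - 2 df
hstar <- 1 / n + (xstar - mean(x))^2 / sum((x - mean(x))^2)
q <- qt(0.975, df = n - 2)

half_ci <- q * sigma_hat * sqrt(hstar)         # mean response
half_pi <- q * sigma_hat * sqrt(1 + hstar)     # new observation

pi_manual <- y_hat + c(-1, 1) * half_pi
pi_predict <- predict(fit, newdata = new, interval = "prediction")
print(pi_manual)
print(pi_predict[1, c("lwr", "upr")])
print(half_pi > half_ci)
```

The extra 1 under the square root is the variance of the new error e*, which is why the prediction band does not shrink to zero width as n grows.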
Residuals I
The residuals are
$$\hat e_i = y_i - \hat y_i$$
so
$$y_i = \hat y_i + \hat e_i$$
In vector notation, with $\hat e = (\hat e_1, \hat e_2, \dots, \hat e_n)^\top$,
$$y = \hat y + \hat e$$
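The decomposition can be checked numerically. The sketch below reuses the design x = (1, −1, 2, 2) from the hat matrix example with made-up responses (hypothetical values, for illustration only), computes the fitted values as Hy and the residuals as y − ŷ, and compares both with what lm() reports.

```r
# Design from the hat matrix example; responses are made-up values
x <- c(1, -1, 2, 2)
y <- c(2.1, -0.4, 3.2, 2.9)
n <- length(x)

# Hat matrix from the entry formula h_ij = 1/n + (x_i - xbar)(x_j - xbar)/Sxx
H <- 1 / n + outer(x - mean(x), x - mean(x)) / sum((x - mean(x))^2)

y_hat <- drop(H %*% y)    # fitted values: y_hat = H y
e_hat <- y - y_hat        # residuals: e_hat = y - y_hat

fit <- lm(y ~ x)
print(all.equal(y_hat, unname(fitted(fit))))   # matches lm() fitted values
print(all.equal(e_hat, unname(resid(fit))))    # matches lm() residuals
print(all.equal(y, y_hat + e_hat))             # y = y_hat + e_hat
```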
Residuals III
Residuals V