
SLR-Residuals and Fitted Values I

We continue to consider the Simple Linear Model

yi = β0 + β1 xi + ei ,   i = 1, . . . , n    (5)

where the errors e1 , . . . , en are uncorrelated, E (ei ) = 0, and var(ei ) = σ².


The least squares estimators are

  β̂1 = Σⁿᵢ₌₁ (xi − x̄)(yi − ȳ) / Σⁿᵢ₌₁ (xi − x̄)² = Sxy / Sxx

  β̂0 = ȳ − β̂1 x̄
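As a quick numerical check, the estimators can be computed directly from Sxy and Sxx. The following is a Python/numpy sketch (the lecture's own code is in R), on made-up data that is not the course's Fuel dataset:

```python
import numpy as np

# Made-up illustrative data (not the course's Fuel dataset)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

Sxy = np.sum((x - x.mean()) * (y - y.mean()))
Sxx = np.sum((x - x.mean()) ** 2)

beta1_hat = Sxy / Sxx                        # slope: Sxy / Sxx
beta0_hat = y.mean() - beta1_hat * x.mean()  # intercept: ybar - b1 * xbar

# np.polyfit solves the same least squares problem
slope, intercept = np.polyfit(x, y, 1)
```

Here Sxy = 19.6 and Sxx = 10, so β̂1 = 1.96 and β̂0 = 0.14, matching the closed-form solution.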

In this lecture we are concerned with the fitted values, the residuals, and
the hat matrix, their properties and usage.

Hua Liang (GWU) 2118-M 243 /

Fitted values I

The fitted values, or predicted values, are

ŷi = β̂0 + β̂1 xi

i = 1, 2, . . . , n. These correspond to the y-value on the fitted regression line at x = xi .
Two alternative forms are:

ŷi = ȳ + β̂1 (xi − x̄ )

  ŷi = Σⁿₖ₌₁ [ 1/n + (xi − x̄)(xk − x̄)/Sxx ] yk = Σⁿₖ₌₁ hik yk    (6)
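Equation (6) says each fitted value is a fixed linear combination of all the responses, with weights hik that depend only on the x-values. A numpy sketch (made-up data; the lecture itself uses R) verifying that both forms of ŷi agree:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # made-up illustrative data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)
Sxx = np.sum((x - x.mean()) ** 2)

# Weight matrix from (6): h_ik = 1/n + (x_i - xbar)(x_k - xbar)/Sxx
H = 1.0 / n + np.outer(x - x.mean(), x - x.mean()) / Sxx
yhat_weights = H @ y                      # yhat_i = sum_k h_ik * y_k

# Direct form: yhat_i = b0 + b1 * x_i
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
yhat_direct = b0 + b1 * x
```

Note also that each row of the weight matrix sums to 1, since Σₖ (xk − x̄) = 0.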



Fitted values II

The vector of fitted values is


 
  ŷ = (ŷ1 , . . . , ŷn)ᵀ

Exercise: show (6).
The distribution of ŷi is described by the following:


Fitted values III


Theorem 17 (Distribution of Fitted Values)
1 For i = 1, . . . , n, E (ŷi ) = E (yi ) = β0 + β1 xi ; i.e., ŷi is unbiased for
β0 + β1 xi , the mean response at xi .
2 For i = 1, . . . , n,

    var(ŷi ) = σ² [ 1/n + (xi − x̄)²/Sxx ] = σ² hii

3 For any i , j ,

    cov(ŷi , ŷj ) = σ² [ 1/n + (xi − x̄)(xj − x̄)/Sxx ] = σ² hij

4 If the errors are normally distributed, i.e. ei iid ∼ N (0, σ²), then

    ŷi ∼ N (β0 + β1 xi , σ² hii )
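Theorem 17 can be checked by simulation: over repeated samples from the model, the empirical mean and variance of ŷi should match β0 + β1 xi and σ² hii . A numpy sketch with assumed parameter values (β0 = 1, β1 = 2, σ = 1, design x chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # fixed design (assumed)
beta0, beta1, sigma = 1.0, 2.0, 1.0      # assumed true values
n, reps = len(x), 10000

Sxx = np.sum((x - x.mean()) ** 2)
h = 1.0 / n + (x - x.mean()) ** 2 / Sxx  # leverages h_ii

yhats = np.empty((reps, n))
for r in range(reps):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, n)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b0 = y.mean() - b1 * x.mean()
    yhats[r] = b0 + b1 * x

emp_mean = yhats.mean(axis=0)  # should be near beta0 + beta1 * x
emp_var = yhats.var(axis=0)    # should be near sigma^2 * h_ii
```

The variance is largest at the ends of the design (largest hii ) and smallest near x̄, as the leverage formula predicts.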


Hat Matrix I

The coefficients of yk in (6) define the (n × n) matrix H = (hij )1≤i,j≤n ,

  hij = 1/n + (xi − x̄)(xj − x̄)/Sxx

Since ŷi = Σⁿₖ₌₁ hik yk for all i = 1, . . . , n, we have, in vector notation

ŷ = H y

H is called the hat matrix, since “it puts the hat on y”, or the projection
matrix, for geometric reasons to be discussed later.
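The entrywise formula for hij and the matrix description of H given in Theorem 19 can be checked against each other numerically, using the x from Example 18. A numpy sketch (the lecture's own computation uses R):

```python
import numpy as np

x = np.array([1.0, -1.0, 2.0, 2.0])   # the x from Example 18
n = len(x)
X = np.column_stack([np.ones(n), x])  # model matrix (1  x)

# Entrywise: h_ij = 1/n + (x_i - xbar)(x_j - xbar)/Sxx
Sxx = np.sum((x - x.mean()) ** 2)
H_entry = 1.0 / n + np.outer(x - x.mean(), x - x.mean()) / Sxx

# Matrix form: H = X (X'X)^{-1} X'
H_mat = X @ np.linalg.inv(X.T @ X) @ X.T
```

Both give h11 = 0.25, since x1 = x̄ = 1 makes the second term vanish; the matrix is also symmetric and idempotent, as Theorem 19 states.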


Hat Matrix II

Example 18
In the SLR model, take n = 4 and xᵀ = (x1 , x2 , x3 , x4 ) = (1, −1, 2, 2).
The model matrix is

                ⎡ 1   1 ⎤
  X = (1  x) =  ⎢ 1  −1 ⎥
                ⎢ 1   2 ⎥
                ⎣ 1   2 ⎦

We have x̄ = 1, Sxx = 6.

H plays a central role in regression, almost as important as the model matrix X . It has several important properties:



Hat Matrix III

Theorem 19 (The Hat Matrix H )

  H = X (X ᵀ X )⁻¹ X ᵀ
  H ᵀ = H (H is symmetric)
  H ² = H (H is idempotent)
  H ᵀ H = H , and H H ᵀ = H

Example 20
Consider the example above, with n = 4 and xᵀ = (1, −1, 2, 2).

> x <- c(1, -1, 2, 2)
> X <- model.matrix(~ x)
> X
  (Intercept)  x
1           1  1
2           1 -1
3           1  2
4           1  2
attr(,"assign")
[1] 0 1


Hat Matrix IV

> H <- X %*% solve(t(X) %*% X) %*% t(X)
> H
     1       2       3       4
1 0.25  0.2500  0.2500  0.2500
2 0.25  0.9167 -0.0833 -0.0833
3 0.25 -0.0833  0.4167  0.4167
4 0.25 -0.0833  0.4167  0.4167

> H %*% H
     1       2       3       4
1 0.25  0.2500  0.2500  0.2500
2 0.25  0.9167 -0.0833 -0.0833
3 0.25 -0.0833  0.4167  0.4167
4 0.25 -0.0833  0.4167  0.4167



Hat Matrix V

Theorem 21 (Distribution of the fitted values)


The vector of fitted values, ŷ = (ŷ1 , ŷ2 , . . . , ŷn )ᵀ , has the following properties:
1 ŷ = H y
2 E (ŷ) = X β = E (y)
3 var(ŷ) = σ² H
4 If e1 , · · · , en are iid N (0, σ²), then

    ŷ ∼ MVN(X β, σ² H )


Estimating the mean response I


For Massachusetts, the observed Fuel is yi = 543.2321. Is this less than
expected for its xi = 15.11179?
The expected value is

  E (yi ) = β0 + β1 · 15.11179

We estimate it by
  ŷi = β̂0 + β̂1 xi
So, for MA, ŷi = 215.5162 + 25.2530 · 15.11179 = 597.1342. The variance is

  var(ŷi ) = σ² [ 1/n + (xi − x̄)²/Sxx ] = σ² hii

A 95% confidence interval for the mean, at x = xi , if σ is known, is

  ŷi ± 1.96 · σ √hii = ŷi ± 1.96 · σ √( 1/n + (xi − x̄)²/Sxx )
Estimating the mean response II

Usually, σ is unknown, estimated by σ̂. The 95% CI is then


  ŷi ± q.025 · σ̂ √hii = ŷi ± q.025 · σ̂ √( 1/n + (xi − x̄)²/Sxx )

where q.025 is the quantile of the tn−2 distribution which leaves .025
probability in the upper tail.
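This interval is easy to package as a small function. The following is a Python/numpy sketch (the function name and data are illustrative assumptions, not from the lecture); the quantile q is passed in, e.g. from scipy.stats.t.ppf(0.975, n − 2), and hard-coding q = 1.96 recovers the known-σ normal interval:

```python
import numpy as np

def mean_response_ci(x, y, x_star, q):
    """CI for the mean response at x_star: muhat* +/- q * sigma_hat * sqrt(lev)."""
    n = len(x)
    Sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)
    sigma_hat = np.sqrt(np.sum(resid ** 2) / (n - 2))  # usual estimate of sigma
    mu_hat = b0 + b1 * x_star
    half = q * sigma_hat * np.sqrt(1.0 / n + (x_star - x.mean()) ** 2 / Sxx)
    return mu_hat - half, mu_hat + half

# Example with made-up data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
lo, hi = mean_response_ci(x, y, 3.0, 2.0)
lo5, hi5 = mean_response_ci(x, y, 5.0, 2.0)
```

The interval is narrowest at x∗ = x̄ and widens as x∗ moves away from x̄, which is what produces the curved confidence band.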
Data for state AA is not available in the dataset. Suppose we learn that
the purchase rate in state AA is x ∗ = 17.2; what is the expected Fuel for
state AA, based on our analysis?

E (y|x = x ∗ ) = β0 + β1 x ∗

estimated by
µ̂∗ = β̂0 + β̂1 x ∗


Estimating the mean response III


The mean and variance of this estimator are given by E (µ̂∗ ) = β0 + β1 x ∗ ,
and

  var(µ̂∗ ) = σ² [ 1/n + (x ∗ − x̄)²/Sxx ]

with a 95% CI for the mean

  µ̂∗ ± q.025 · σ̂ √( 1/n + (x ∗ − x̄)²/Sxx )

For state AA, µ̂∗ = 215.5162 + 25.2530 · 17.2 = 649.8678, with a 95% CI

649.8678 ± 32.232.

We can draw this 95% CI for each value of x ∗ and get a confidence band
(see figure).



Estimating the mean response IV

[Figure: Fuel versus Log(Mile), showing the fitted regression line with its 95% confidence band.]

Prediction I

How can we predict the true value y ∗ ? Is this the same question as
estimating the expected value E (y|x = x ∗ )?
We know that
y ∗ = β0 + β1 x ∗ + e ∗
where e ∗ ∼ N (0, σ²) is a new error, independent of the observed data.
We estimate β0 , β1 by β̂0 , β̂1 , and therefore estimate y ∗ by
ŷ ∗ = β̂0 + β̂1 x ∗ . (Note that ŷ ∗ = µ̂∗ .)
What is the error of this prediction?

Prediction error = y ∗ − ŷ ∗ = (β0 + β1 x ∗ + e ∗ ) − (β̂0 + β̂1 x ∗ )

The mean prediction error is

E (y ∗ − ŷ ∗ ) = E (β0 + β1 x ∗ + e ∗ ) − E (β̂0 + β̂1 x ∗ ) = 0,

which is good!
Prediction II
Show that the variance of the prediction error is

  var(y ∗ − ŷ ∗ ) = · · · = σ² [ 1 + 1/n + (x ∗ − x̄)²/Sxx ]

As with the 95% CI and confidence band for the mean, we can compute a
95% prediction interval (and prediction band):
  ŷ ∗ ± 1.96 · σ √( 1 + 1/n + (x ∗ − x̄)²/Sxx )

For state AA, the 95% prediction interval is

649.8678 ± 166.9234
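The prediction interval differs from the mean CI only by the extra "1 +" under the square root, which accounts for the variance of the new error e ∗, so it is always wider. A numpy sketch with made-up data and a generic quantile q (an illustration, not the lecturer's code):

```python
import numpy as np

def half_widths(x, y, x_star, q):
    """Half-widths of the mean CI and of the prediction interval at x_star."""
    n = len(x)
    Sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)
    sigma_hat = np.sqrt(np.sum(resid ** 2) / (n - 2))
    lev = 1.0 / n + (x_star - x.mean()) ** 2 / Sxx
    w_mean = q * sigma_hat * np.sqrt(lev)        # mean CI half-width
    w_pred = q * sigma_hat * np.sqrt(1.0 + lev)  # prediction half-width
    return w_mean, w_pred

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # made-up data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
w_mean, w_pred = half_widths(x, y, 3.0, 2.0)
```

This mirrors the state AA example above, where the prediction interval (±166.9) is far wider than the mean CI (±32.2).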


Residuals I
The residuals are

êi = yi − ŷi = yi − (β̂0 + β̂1 xi )

so
yi = ŷi + êi
In vector notation: ê = (ê1 , ê2 , . . . , ên )ᵀ ,

y = ŷ + ê



Residuals II


Residuals III

Theorem 22 (Distribution of Residuals)


1 êi = yi − Σⁿₖ₌₁ hik yk
2 E (êi ) = 0, i = 1, . . . , n
3
    var(êi ) = σ² [ 1 − 1/n − (xi − x̄)²/Sxx ] = (1 − hii )σ²

4 for i ≠ j , cov(êi , êj ) = −σ² [ 1/n + (xi − x̄)(xj − x̄)/Sxx ] = −σ² hij
5 If ei iid ∼ N (0, σ²), then êi ∼ N (0, (1 − hii )σ²)



Residuals IV

Theorem 23 (Distribution of Residuals)


1 ê = (I − H )y
2 E (ê) = 0
3 var(ê) = σ² (I − H )
4 If e ∼ MVN(0, σ² I), then ê ∼ MVN(0, σ² (I − H ))

Properties of the residuals and fitted values


  yi = ŷi + êi
  Σ êi = 0
  Σ xi êi = 0
  Σ yi² = Σ ŷi² + Σ êi²
  E (yi ) = E (ŷi ) + E (êi )
  cov(ŷi , êi ) = 0!


Residuals V

var(yi ) = var(ŷi ) + var(êi )


cov(yi , yj ) = cov(ŷi , ŷj ) + cov(êi , êj )
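The residual identities on the last two slides are easy to verify numerically. A numpy sketch on made-up data (the lecture's own code is in R):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # made-up data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x
ehat = y - yhat                   # residuals

sum_e = np.sum(ehat)              # identity: sum of residuals is 0 (up to rounding)
sum_xe = np.sum(x * ehat)         # identity: sum of x_i * ehat_i is 0 (up to rounding)
ss_y = np.sum(y ** 2)
ss_parts = np.sum(yhat ** 2) + np.sum(ehat ** 2)  # identity: equals ss_y
```

The sum-of-squares split works because ŷ lies in the column space of X while ê is orthogonal to it, which is also why cov(ŷi , êi ) = 0.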

