CE 207 Lecture 10 - Linear Regression
Lecture #10
Linear Regression
(Ref: Chapter 9 of Sheldon M. Ross)
The simple linear regression model assumes

E(Y|x) = \beta_0 + \beta_1 x

The least squares criterion is

L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2

Minimizing L with respect to \beta_0 and \beta_1 gives the least squares estimates

\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - \frac{(\sum_{i=1}^{n} x_i)(\sum_{i=1}^{n} y_i)}{n}}{\sum_{i=1}^{n} x_i^2 - \frac{(\sum_{i=1}^{n} x_i)^2}{n}} = \frac{S_{xy}}{S_{xx}}

\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}

The fitted regression line is

\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x

and each observation can be written as

y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + e_i, \quad i = 1, 2, \ldots, n
where e_i = y_i - \hat{y}_i is called the residual. The residual describes the error in the fit of the model to the i-th observation y_i.
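As a quick illustration (not part of the text), a minimal Python sketch of these least squares formulas, assuming the paired observations are given as two equal-length lists x and y:

# Minimal sketch of the least squares formulas above (illustrative).
def least_squares_fit(x, y):
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sxx = sum(xi**2 for xi in x) - sum_x**2 / n                  # S_xx
    sxy = sum(xi*yi for xi, yi in zip(x, y)) - sum_x*sum_y / n   # S_xy
    b1 = sxy / sxx                                               # slope estimate
    b0 = sum_y / n - b1 * sum_x / n                              # intercept estimate
    residuals = [yi - (b0 + b1*xi) for xi, yi in zip(x, y)]      # e_i = y_i - y_hat_i
    return b0, b1, residuals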
Variance of the Error, \sigma^2

Estimating \sigma^2: the residual (error) sum of squares is

SS_E = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left( y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i) \right)^2

and an unbiased estimator of \sigma^2 is

\hat{\sigma}^2 = \frac{SS_E}{n - 2}

with n - 2 degrees of freedom.
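A small sketch of this estimator (illustrative; y holds the observed responses and y_hat the fitted values from the regression line):

# Error-variance estimate from the residuals.
def sigma2_hat(y, y_hat):
    n = len(y)
    sse = sum((yi - yhi)**2 for yi, yhi in zip(y, y_hat))   # SS_E
    return sse / (n - 2)                                    # n - 2 degrees of freedom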
Estimators

• Slope properties:

E(\hat{\beta}_1) = \beta_1, \qquad V(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}}

• Intercept properties:

E(\hat{\beta}_0) = \beta_0, \qquad V(\hat{\beta}_0) = \sigma^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right]
The estimated standard errors are

se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}}

se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right]}
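A short sketch of these standard-error formulas (illustrative; the argument names are chosen here, not taken from the text):

import math

# Estimated standard errors of the slope and intercept.
def standard_errors(sigma2, sxx, n, x_bar):
    se_b1 = math.sqrt(sigma2 / sxx)                      # se(beta1_hat)
    se_b0 = math.sqrt(sigma2 * (1/n + x_bar**2 / sxx))   # se(beta0_hat)
    return se_b0, se_b1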
Regression: Test of Slope

H_0: \beta_1 = \beta_{1,0}
H_1: \beta_1 \neq \beta_{1,0}

An appropriate test statistic is

T_0 = \frac{\hat{\beta}_1 - \beta_{1,0}}{\sqrt{\hat{\sigma}^2 / S_{xx}}} = \frac{\hat{\beta}_1 - \beta_{1,0}}{se(\hat{\beta}_1)}

We reject the null hypothesis if |t_0| > t_{\alpha/2, n-2}.
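A sketch of the slope test (illustrative; it assumes scipy is available for the t distribution):

from scipy import stats

# Two-sided t-test on the slope with n - 2 degrees of freedom.
def test_slope(b1, se_b1, n, beta1_0=0.0, alpha=0.05):
    t0 = (b1 - beta1_0) / se_b1
    t_crit = stats.t.ppf(1 - alpha/2, n - 2)    # reject H0 if abs(t0) > t_crit
    p_value = 2 * stats.t.sf(abs(t0), n - 2)
    return t0, t_crit, p_value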
Regression: Test of Intercept

H_0: \beta_0 = \beta_{0,0}
H_1: \beta_0 \neq \beta_{0,0}

An appropriate test statistic is

T_0 = \frac{\hat{\beta}_0 - \beta_{0,0}}{\sqrt{\hat{\sigma}^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right]}} = \frac{\hat{\beta}_0 - \beta_{0,0}}{se(\hat{\beta}_0)}

As before, we reject the null hypothesis if |t_0| > t_{\alpha/2, n-2}.
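The intercept test follows the same pattern; a brief sketch (illustrative, same assumptions as above):

import math
from scipy import stats

# Two-sided t-test on the intercept with n - 2 degrees of freedom.
def test_intercept(b0, sigma2, sxx, n, x_bar, beta0_0=0.0, alpha=0.05):
    se_b0 = math.sqrt(sigma2 * (1/n + x_bar**2 / sxx))
    t0 = (b0 - beta0_0) / se_b0
    p_value = 2 * stats.t.sf(abs(t0), n - 2)
    return t0, stats.t.ppf(1 - alpha/2, n - 2), p_value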
Solution:

n = 7, \quad \sum x_i = 420, \quad \sum y_i = 156.4
\bar{x} = 60, \quad \bar{y} = 22.34
\sum x_i^2 = 25900, \quad \sum x_i y_i = 9265

S_{xx} = \sum x_i^2 - \frac{(\sum x_i)^2}{n} = 700

S_{xy} = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n} = -119

\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{-119}{700} = -0.17

\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 22.34 - (-0.17)(60) = 32.54

\hat{y} = 32.54 - 0.17x
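The hand calculation can be reproduced directly from the given sums; a quick check in Python (illustrative):

# Reproducing the worked example from the summary sums.
n = 7
sum_x, sum_y = 420, 156.4
sum_x2, sum_xy = 25900, 9265

sxx = sum_x2 - sum_x**2 / n          # 700.0
sxy = sum_xy - sum_x * sum_y / n     # -119.0
b1 = sxy / sxx                       # -0.17
b0 = sum_y / n - b1 * sum_x / n      # 32.54
print(f"y_hat = {b0:.2f} + ({b1:.2f}) x")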
Solution (continued):

SS_E = \sum e_i^2 = 1.527

\hat{\sigma}^2 = \frac{SS_E}{n-2} = \frac{1.527}{5} = 0.305

se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = 0.0208, \qquad T_0 = \frac{\hat{\beta}_1}{se(\hat{\beta}_1)} = -8.14

se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right]} = 1.27, \qquad T_0 = \frac{\hat{\beta}_0}{se(\hat{\beta}_0)} = 25.61
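The same numbers follow from the reported SS_E; a quick check (illustrative):

import math

# Standard errors and t statistics from the reported SSE = 1.527.
n, sxx, x_bar = 7, 700, 60
b0, b1 = 32.54, -0.17
sse = 1.527

sigma2 = sse / (n - 2)                               # 0.305
se_b1 = math.sqrt(sigma2 / sxx)                      # 0.0208
se_b0 = math.sqrt(sigma2 * (1/n + x_bar**2 / sxx))   # 1.27
print(b1 / se_b1, b0 / se_b0)                        # about -8.14 and 25.6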
Regression output for the example:

Regression Statistics
  Multiple R           0.964
  R Square             0.930
  Adjusted R Square    0.916
  Standard Error       0.553
  Observations         7

ANOVA
               df       SS       MS        F        p-value
  Regression    1     20.23    20.23    66.23     0.00045476
  Residual      5      1.53     0.31
  Total         6     21.76

               Coefficients   Standard Error   t Stat    P-value       Lower 95%   Upper 95%
  Intercept       32.54            1.27         25.61    1.6943E-06      29.28        35.81
  Speed           -0.17            0.02         -8.14    0.00045476      -0.22        -0.12
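The ANOVA entries can be checked from the hand calculation, since SS_R = \hat{\beta}_1 S_{xy} and SS_T = SS_R + SS_E; a quick check (illustrative, assuming scipy for the F p-value):

from scipy import stats

# Rebuilding the ANOVA table entries from the example.
n = 7
sse = 1.53                   # residual SS (from the table)
ssr = (-0.17) * (-119)       # regression SS = beta1_hat * S_xy = 20.23
sst = ssr + sse              # total SS = 21.76
r2 = ssr / sst               # R Square, about 0.93
f0 = (ssr / 1) / (sse / (n - 2))     # F = MSR / MSE, about 66
p_value = stats.f.sf(f0, 1, n - 2)   # about 0.00046
print(r2, f0, p_value)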
Residual Analysis

• Fitting a regression model requires several assumptions:
  ▪ Errors are uncorrelated random variables with mean zero;
  ▪ Errors have constant variance; and
  ▪ Errors are normally distributed.
• Analysis of the residuals (e_i = y_i - \hat{y}_i) is frequently helpful in checking the assumption that the errors are approximately normally distributed with constant variance, and in determining whether additional terms in the model would be useful.
Residual Analysis

[Figure: patterns for residual plots: (a) satisfactory, (b) funnel, (c) double bow, (d) nonlinear.]
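A minimal matplotlib sketch of a residuals-versus-fitted-values plot for checking these assumptions (illustrative; y_hat and residuals are assumed to have been computed already):

import matplotlib.pyplot as plt

# Residuals plotted against fitted values; a random scatter about zero is satisfactory.
def residual_plot(y_hat, residuals):
    plt.scatter(y_hat, residuals)
    plt.axhline(0.0, linestyle="--")
    plt.xlabel("Fitted value")
    plt.ylabel("Residual e_i = y_i - y_hat_i")
    plt.show()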
The analysis of variance identity partitions the total sum of squares:

\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

(SS_T = SS_R + SS_E)
• Chapter 9: 5, 11, 12, 18