
CE 207: Applied Mathematics for Engineers

Lecture #10
Linear Regression
(Ref: Chapter 9 of Sheldon M. Ross)

Dr. Sheikh Mokhlesur Rahman


Associate Professor, Dept. of CE
Contact: [email protected]
Simple Linear Regression

January 2023 Semester - SMR CE 207_Linear Regression


Simple Linear Regression

➢ Simple linear regression considers a single regressor or predictor x and a dependent or response variable Y.
➢ The expected value of Y at each level of x is a random variable:

$E(Y \mid x) = \beta_0 + \beta_1 x$

➢ We assume that each observation Y can be described by the model

$Y = \beta_0 + \beta_1 x + \varepsilon$

Least Squares Estimates

The objective in estimating the parameters β0 and β1 is to minimize the sum of the squares of the vertical deviations. This is known as the least-squares method.

For the n observations in the sample, $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.

The sum of the squares of the deviations is

$L = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$

Least Squares Estimates

The least-squares normal equations, obtained by setting the partial derivatives of L with respect to β0 and β1 to zero, are

$n\hat\beta_0 + \hat\beta_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$

$\hat\beta_0 \sum_{i=1}^{n} x_i + \hat\beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$


Least Squares Estimates

The least-squares estimates of the intercept and slope in the simple linear regression model are

$\hat\beta_0 = \bar y - \hat\beta_1 \bar x$

$\hat\beta_1 = \dfrac{\sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}$

where $\bar y = (1/n) \sum_{i=1}^{n} y_i$ and $\bar x = (1/n) \sum_{i=1}^{n} x_i$.
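These closed-form estimates can be computed directly from the data sums. A minimal sketch in Python (the sample data are made up for illustration):

```python
def least_squares_fit(x, y):
    """Return (b0, b1) minimizing the sum of squared vertical deviations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope: (sum x_i y_i - (sum x_i)(sum y_i)/n) / (sum x_i^2 - (sum x_i)^2/n)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: b0 = ybar - b1 * xbar
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

# Hypothetical data lying exactly on y = 2x, so the fit should recover
# slope 2 and intercept 0
b0, b1 = least_squares_fit([1, 2, 3], [2, 4, 6])
```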

Simple Linear Regression
The fitted or estimated regression line is therefore

$\hat y = \hat\beta_0 + \hat\beta_1 x$

Note that each pair of observations satisfies the relationship

$y_i = \hat\beta_0 + \hat\beta_1 x_i + e_i, \quad i = 1, 2, \ldots, n$

where $e_i = y_i - \hat y_i$ is called the residual. The residual describes the error in the fit of the model to the ith observation $y_i$.

Simple Linear Regression
Notation
$S_{xx} = \sum_{i=1}^{n} (x_i - \bar x)^2 = \sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$

$S_{xy} = \sum_{i=1}^{n} (y_i - \bar y)(x_i - \bar x) = \sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$

So, $\hat\beta_1 = \dfrac{S_{xy}}{S_{xx}}$.

Variance of the Error, σ²
Estimating σ²
Residual error / residual: $e_i = y_i - \hat y_i$

The error sum of squares is

$SS_E = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat y_i)^2$

The estimate of σ² is

$\hat\sigma^2 = \dfrac{SS_E}{n-2}$

with n − 2 degrees of freedom.
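The residuals and the σ̂² estimate can be sketched in a few lines of Python (the observations below are hypothetical):

```python
# Hypothetical observations
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.8]
n = len(x)

# Fit by least squares: slope = Sxy/Sxx, intercept = ybar - b1*xbar
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# Residuals e_i = y_i - yhat_i, SSE = sum of e_i^2, sigma^2_hat = SSE/(n-2)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
sigma2_hat = sse / (n - 2)
```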
Variance of the Error, σ²
Estimating σ²

$SS_E = \sum_{i=1}^{n} (y_i - \hat y_i)^2 = \sum_{i=1}^{n} \left(y_i - (\hat\beta_0 + \hat\beta_1 x_i)\right)^2$

Expanding,

$SS_E = \sum_{i=1}^{n} y_i^2 - \hat\beta_0 \sum_{i=1}^{n} y_i - \hat\beta_1 \sum_{i=1}^{n} x_i y_i$

Simplifying, $SS_E = SS_T - \hat\beta_1 S_{xy}$

where $SS_T = \sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} y_i^2 - n\bar y^2$

Properties of the Least Squares Estimators

• Slope properties

$E(\hat\beta_1) = \beta_1 \qquad V(\hat\beta_1) = \dfrac{\sigma^2}{S_{xx}}$

• Intercept properties

$E(\hat\beta_0) = \beta_0 \quad \text{and} \quad V(\hat\beta_0) = \sigma^2\left(\dfrac{1}{n} + \dfrac{\bar x^2}{S_{xx}}\right)$

Estimated Standard Error

Standard error of the slope:

$se(\hat\beta_1) = \sqrt{\dfrac{\hat\sigma^2}{S_{xx}}}$

Standard error of the intercept:

$se(\hat\beta_0) = \sqrt{\hat\sigma^2\left(\dfrac{1}{n} + \dfrac{\bar x^2}{S_{xx}}\right)}$
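These standard-error formulas translate directly into code. A short sketch in Python; the inputs (σ̂² = 0.305, Sxx = 700, n = 7, x̄ = 60) are taken from the fuel-consumption example worked later in this lecture:

```python
import math

# Values from the fuel-consumption example in this lecture
sigma2_hat = 0.305   # estimated error variance
sxx = 700.0          # S_xx
n = 7
xbar = 60.0

# se(b1) = sqrt(sigma^2 / Sxx)
se_b1 = math.sqrt(sigma2_hat / sxx)
# se(b0) = sqrt(sigma^2 * (1/n + xbar^2 / Sxx))
se_b0 = math.sqrt(sigma2_hat * (1 / n + xbar ** 2 / sxx))
```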

Hypothesis Tests in Simple Linear Regression

Test of Slope
H0: β1 = β1,0
H1: β1 ≠ β1,0
An appropriate test statistic would be

$T_0 = \dfrac{\hat\beta_1 - \beta_{1,0}}{\sqrt{\hat\sigma^2 / S_{xx}}} = \dfrac{\hat\beta_1 - \beta_{1,0}}{se(\hat\beta_1)}$

We would reject the null hypothesis if $|t_0| > t_{\alpha/2,\, n-2}$.

Hypothesis Tests in Simple Linear Regression

Test of Intercept
H0: β0 = β0,0
H1: β0 ≠ β0,0
An appropriate test statistic would be

$T_0 = \dfrac{\hat\beta_0 - \beta_{0,0}}{\sqrt{\hat\sigma^2\left(\dfrac{1}{n} + \dfrac{\bar x^2}{S_{xx}}\right)}} = \dfrac{\hat\beta_0 - \beta_{0,0}}{se(\hat\beta_0)}$

We would reject the null hypothesis if $|t_0| > t_{\alpha/2,\, n-2}$.

Significance of Regression

Are the slopes equal to 0?

Significance of Regression
An important special case of the hypotheses is
H0: β1 = 0
H1: β1 ≠ 0
These hypotheses relate to the significance of regression.

Failure to reject H0 is equivalent to concluding that there is no linear relationship between x and Y.

Example Problem 1
Problem: The miles per gallon attained by a car at various speeds were determined. Find the relationship between fuel consumption and speed.

Solution:

$n = 7, \quad \sum x_i = 420, \quad \sum y_i = 156.4$

$\bar x = 60, \quad \bar y = 22.34$

$\sum x_i^2 = 25900, \quad \sum x_i y_i = 9265$

Example Problem 1
Problem: The miles per gallon attained by a car at various speeds were determined. Find the relationship between fuel consumption and speed.

$S_{xy} = \sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n} = -119$

$S_{xx} = \sum x_i^2 - \dfrac{(\sum x_i)^2}{n} = 700$

$\hat\beta_1 = \dfrac{S_{xy}}{S_{xx}} = -\dfrac{119}{700} = -0.17$

$\hat\beta_0 = \bar y - \hat\beta_1 \bar x = 22.34 - (-0.17)(60) = 32.54$

$\hat y = 32.54 - 0.17x$
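The arithmetic above can be checked with a few lines of Python using the given sums:

```python
n = 7
sum_x, sum_y = 420.0, 156.4
sum_x2, sum_xy = 25900.0, 9265.0

sxy = sum_xy - sum_x * sum_y / n          # -119
sxx = sum_x2 - sum_x ** 2 / n             # 700
b1 = sxy / sxx                            # -0.17
b0 = sum_y / n - b1 * (sum_x / n)         # 32.54
```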

Example Problem 2
Problem: Test the hypothesis that there is no regression of fuel consumption on speed (Does the speed of the car affect the fuel consumption?).

Solution:

$SS_E = \sum e_i^2 = 1.527$

$\hat\sigma^2 = \dfrac{SS_E}{n-2} = 0.305$

$se(\hat\beta_1) = \sqrt{\dfrac{\hat\sigma^2}{S_{xx}}} = 0.0208$

$se(\hat\beta_0) = \sqrt{\hat\sigma^2\left(\dfrac{1}{n} + \dfrac{\bar x^2}{S_{xx}}\right)} = 1.27$

$T = \dfrac{\hat\beta_1}{se(\hat\beta_1)} = -8.14 \qquad T = \dfrac{\hat\beta_0}{se(\hat\beta_0)} = 25.61$
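These test statistics can be reproduced from the quantities already computed (SSE = 1.527, Sxx = 700, n = 7, x̄ = 60, and the estimates from Example Problem 1):

```python
import math

n, sxx, xbar = 7, 700.0, 60.0
sse = 1.527
b1, b0 = -0.17, 32.542857  # estimates from Example Problem 1, b0 at higher precision

sigma2_hat = sse / (n - 2)                                  # ~0.305
se_b1 = math.sqrt(sigma2_hat / sxx)                         # ~0.0209
se_b0 = math.sqrt(sigma2_hat * (1 / n + xbar ** 2 / sxx))   # ~1.27

# Test statistics for H0: beta1 = 0 and H0: beta0 = 0
t_slope = b1 / se_b1
t_intercept = b0 / se_b0
```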

Table for t-Distribution

Regression Output from Software
Solution:

Regression Statistics
Multiple R 0.964
R Square 0.930
Adjusted R Square 0.916
Standard Error 0.553
Observations 7

ANOVA
              df       SS      MS       F      p-value
Regression     1    20.23   20.23   66.23   0.00045476
Residual       5     1.53    0.31
Total          6    21.76

            Coefficients  Standard Error   t Stat     P-value   Lower 95%  Upper 95%
Intercept          32.54            1.27    25.61  1.6943E-06       29.28      35.81
Speed              -0.17            0.02    -8.14  0.00045476       -0.22      -0.12

Adequacy of the Regression Model: Residual Analysis
• Fitting a regression model requires several
assumptions.
▪ Errors are uncorrelated random variables with mean
zero;
▪ Errors have constant variance; and,
▪ Errors are normally distributed.
• Analysis of the residuals (𝑒𝑖 = 𝑦𝑖 − 𝑦ො𝑖 ) is frequently
helpful in checking the assumption that the errors are
approximately normally distributed with constant
variance, and in determining whether additional terms
in the model would be useful.

Adequacy of the Regression Model: Residual Analysis
Patterns for residual plots.
(a) Satisfactory.
(b) Funnel.
(c) Double bow.
(d) Nonlinear.

Adequacy of the Regression Model: Coefficient of Determination (R²)


• Ratio of sums of squares:

$R^2 = \dfrac{SS_R}{SS_T} = 1 - \dfrac{SS_E}{SS_T}, \qquad 0 \le R^2 \le 1$
• Called the coefficient of determination and is often
used to judge the adequacy of a regression model.
• We often refer (loosely) to R2 as the amount of
variability in the data explained or accounted for by
the regression model.
$\sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} (\hat y_i - \bar y)^2 + \sum_{i=1}^{n} (y_i - \hat y_i)^2$

$SS_T = SS_R + SS_E$
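Using the fuel-consumption example's sums of squares (SST = 21.76 from the ANOVA table, and SSR = β̂₁Sxy), R² can be checked in a couple of lines:

```python
# Values from the fuel-consumption example: SSR = b1 * Sxy, SST from the ANOVA table
b1, sxy = -0.17, -119.0
sst = 21.76

ssr = b1 * sxy               # 20.23
sse = sst - ssr              # 1.53
r2 = ssr / sst               # ~0.930, matching "R Square" in the software output
```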


Regression on Transformed Variable

Multiple Linear Regression
➢ Many applications of regression analysis involve situations in which there is more than one independent/regressor variable.
➢ A regression model that contains more than one independent variable is called a multiple regression model.

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_m X_m + \varepsilon$

➢ Suppose that the effective life of a cutting tool depends on the cutting speed and the tool angle. A possible multiple regression model could be

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$

where Y: tool life, X1: cutting speed, and X2: tool angle.
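For more than one regressor, the least-squares estimates solve the normal equations (XᵀX)β = XᵀY. A minimal pure-Python sketch, using made-up data generated exactly from a known model (Y = 2 + 3X₁ − X₂) so the recovered coefficients can be checked:

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def multiple_regression(x_rows, y):
    """Least-squares fit of Y = b0 + b1*X1 + ... + bm*Xm via the normal equations."""
    rows = [[1.0] + list(r) for r in x_rows]          # prepend intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical data generated exactly from Y = 2 + 3*X1 - 1*X2 (no noise)
x_rows = [(1, 1), (2, 1), (1, 2), (3, 2), (2, 3)]
y = [4.0, 7.0, 3.0, 9.0, 5.0]
beta = multiple_regression(x_rows, y)  # ~ [2.0, 3.0, -1.0]
```

With noisy real data the recovered coefficients only approximate the true ones, and dedicated numerical libraries handle the linear algebra more robustly; this sketch just makes the normal-equations idea concrete.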
Practice Problems

• Chapter 9:
5, 11, 12, 18
