CE 207: Applied Mathematics for Engineers
Lecture #10
Linear Regression
(Ref: Chapter 9 of Sheldon M. Ross)
Dr. Sheikh Mokhlesur Rahman
Associate Professor, Dept. of CE
Contact: smrahman@[Link]
Simple Linear Regression
January 2023 Semester - SMR CE 207_Linear Regression
Simple Linear Regression
➢ Simple linear regression considers a single regressor or predictor x and a dependent or response variable Y.
➢ The expected value of Y at each level of x is

E(Y|x) = β0 + β1x

➢ We assume that each observation Y can be described by the model

Y = β0 + β1x + ε
Least Squares Estimates
The objective in estimating the parameters β0 and β1 is to minimize the sum of the squares of the vertical deviations. This is known as the least squares method.
For the n observations in the sample, yᵢ = β0 + β1xᵢ + εᵢ, and the sum of the squares of the deviations is

L = ∑ⁿᵢ₌₁ εᵢ² = ∑ⁿᵢ₌₁ (yᵢ − β0 − β1xᵢ)²
Least Squares Estimates
Setting the partial derivatives of L with respect to β0 and β1 equal to zero yields the least squares normal equations:

n β̂0 + β̂1 ∑ⁿᵢ₌₁ xᵢ = ∑ⁿᵢ₌₁ yᵢ
β̂0 ∑ⁿᵢ₌₁ xᵢ + β̂1 ∑ⁿᵢ₌₁ xᵢ² = ∑ⁿᵢ₌₁ xᵢyᵢ
Least Squares Estimates
The least-squares estimates of the intercept and slope in the simple
linear regression model are
β̂0 = ȳ − β̂1 x̄

β̂1 = [∑ⁿᵢ₌₁ xᵢyᵢ − (∑ⁿᵢ₌₁ xᵢ)(∑ⁿᵢ₌₁ yᵢ)/n] / [∑ⁿᵢ₌₁ xᵢ² − (∑ⁿᵢ₌₁ xᵢ)²/n]

where ȳ = (1/n) ∑ⁿᵢ₌₁ yᵢ and x̄ = (1/n) ∑ⁿᵢ₌₁ xᵢ.
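These closed-form estimates can be sketched in a few lines of plain Python (the function and variable names are illustrative, not from the text):

```python
def least_squares_fit(x, y):
    """Return (beta0_hat, beta1_hat) for the simple linear regression of y on x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope: (sum x_i y_i - (sum x_i)(sum y_i)/n) / (sum x_i^2 - (sum x_i)^2/n)
    beta1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: y_bar - beta1_hat * x_bar
    beta0 = sum_y / n - beta1 * sum_x / n
    return beta0, beta1
```

For data lying exactly on a line (e.g., y = 2 + 3x with no noise), the estimates recover the line exactly.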
Simple Linear Regression
The fitted or estimated regression line is therefore
ŷ = β̂0 + β̂1 x

Note that each pair of observations satisfies the relationship

yᵢ = β̂0 + β̂1 xᵢ + eᵢ,  i = 1, 2, …, n

where eᵢ = yᵢ − ŷᵢ is called the residual. The residual describes the error in the fit of the model to the ith observation yᵢ.
Simple Linear Regression
Notation
Sxx = ∑ⁿᵢ₌₁ (xᵢ − x̄)² = ∑ⁿᵢ₌₁ xᵢ² − (∑ⁿᵢ₌₁ xᵢ)²/n

Sxy = ∑ⁿᵢ₌₁ (yᵢ − ȳ)(xᵢ − x̄) = ∑ⁿᵢ₌₁ xᵢyᵢ − (∑ⁿᵢ₌₁ xᵢ)(∑ⁿᵢ₌₁ yᵢ)/n

So, β̂1 = Sxy / Sxx
Variance of the Error, σ²
Estimating σ²
Residual: eᵢ = yᵢ − ŷᵢ

The error sum of squares is

SSE = ∑ⁿᵢ₌₁ eᵢ² = ∑ⁿᵢ₌₁ (yᵢ − ŷᵢ)²

An unbiased estimator of σ² is

σ̂² = SSE/(n − 2)

with n − 2 degrees of freedom (two are lost to estimating β0 and β1).
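A minimal sketch of this estimator, assuming the fitted coefficients are already available (names illustrative):

```python
def estimate_error_variance(x, y, beta0, beta1):
    """Unbiased estimate of sigma^2: SSE / (n - 2)."""
    # Residuals e_i = y_i - y_hat_i
    residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in residuals)  # error sum of squares
    # Two degrees of freedom are lost to the two estimated coefficients
    return sse / (len(x) - 2)
```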
Variance of the Error, σ²
Estimating σ²
SSE = ∑ⁿᵢ₌₁ (yᵢ − ŷᵢ)² = ∑ⁿᵢ₌₁ [yᵢ − (β̂0 + β̂1 xᵢ)]²

Expanding, SSE = ∑ⁿᵢ₌₁ yᵢ² − β̂0 ∑ⁿᵢ₌₁ yᵢ − β̂1 ∑ⁿᵢ₌₁ xᵢyᵢ

Simplifying, SSE = SST − β̂1 Sxy

where SST = ∑ⁿᵢ₌₁ (yᵢ − ȳ)² = ∑ⁿᵢ₌₁ yᵢ² − n ȳ²
Properties of the Least Squares Estimators

• Slope properties

E(β̂1) = β1,  V(β̂1) = σ²/Sxx

• Intercept properties

E(β̂0) = β0  and  V(β̂0) = σ² [1/n + x̄²/Sxx]
Estimated Standard Error
Standard error of the slope
se(β̂1) = √(σ̂²/Sxx)

Standard error of the intercept

se(β̂0) = √(σ̂² [1/n + x̄²/Sxx])
Hypothesis Tests in Simple Linear Regression

Test of the slope

H0: β1 = β1,0
H1: β1 ≠ β1,0

An appropriate test statistic is

T0 = (β̂1 − β1,0)/√(σ̂²/Sxx) = (β̂1 − β1,0)/se(β̂1)

We reject the null hypothesis if |t0| > t(α/2, n−2).
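The slope test statistic can be sketched as follows (a hypothetical helper, not from the text; compare the resulting |t0| against t(α/2, n−2) from a t-table):

```python
import math

def slope_t_statistic(x, beta1_hat, sigma2_hat, beta1_null=0.0):
    """T0 = (beta1_hat - beta1_0) / sqrt(sigma2_hat / Sxx)."""
    x_bar = sum(x) / len(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    se_beta1 = math.sqrt(sigma2_hat / sxx)  # estimated standard error of the slope
    return (beta1_hat - beta1_null) / se_beta1
```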
Hypothesis Tests in Simple Linear Regression

Test of the intercept

H0: β0 = β0,0
H1: β0 ≠ β0,0

An appropriate test statistic is

T0 = (β̂0 − β0,0)/√(σ̂² [1/n + x̄²/Sxx]) = (β̂0 − β0,0)/se(β̂0)

We reject the null hypothesis if |t0| > t(α/2, n−2).
Significance of Regression
Are the slopes equal to 0?
Significance of Regression
An important special case of these hypotheses is

H0: β1 = 0
H1: β1 ≠ 0

These hypotheses relate to the significance of regression. Failure to reject H0 is equivalent to concluding that there is no linear relationship between x and Y.
Example Problem 1
Problem: Find the relationship between fuel consumption and the speed of a car. The miles per gallon attained at various speeds were determined.

Solution:

n = 7,  ∑xᵢ = 420,  ∑yᵢ = 156.4
x̄ = 60,  ȳ = 22.34
∑xᵢ² = 25900,  ∑xᵢyᵢ = 9265
Example Problem 1
Problem: Find the relationship between fuel consumption and the speed of a car, using the summary statistics above.

Sxy = ∑xᵢyᵢ − (∑xᵢ)(∑yᵢ)/n = −119
Sxx = ∑xᵢ² − (∑xᵢ)²/n = 700

β̂1 = Sxy/Sxx = −119/700 = −0.17
β̂0 = ȳ − β̂1 x̄ = 22.34 − (−0.17)(60) = 32.54

The fitted regression line is ŷ = 32.54 − 0.17x
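The arithmetic can be reproduced directly from the summary statistics:

```python
# Summary statistics from the fuel-consumption example
n = 7
sum_x, sum_y = 420, 156.4
sum_x2, sum_xy = 25900, 9265

sxy = sum_xy - sum_x * sum_y / n        # -119.0
sxx = sum_x2 - sum_x ** 2 / n           # 700.0
beta1 = sxy / sxx                       # -0.17
beta0 = sum_y / n - beta1 * sum_x / n   # ~32.54
```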
Example Problem 2
Problem: Test the hypothesis that there is no regression of fuel consumption on speed (does the speed of the car affect fuel consumption?).

Solution:

SSE = ∑εᵢ² = 1.527
σ̂² = SSE/(n − 2) = 1.527/5 = 0.305

se(β̂1) = √(σ̂²/Sxx) = 0.0208
se(β̂0) = √(σ̂² [1/n + x̄²/Sxx]) = 1.27

T = β̂1/se(β̂1) = −8.14
T = β̂0/se(β̂0) = 25.61

Since |t| > t(0.025, 5) = 2.571 in both cases, both coefficients are significantly different from zero.
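These figures can be checked in a few lines, using the quantities from Examples 1 and 2:

```python
import math

# Quantities carried over from Examples 1 and 2
n, sxx, x_bar = 7, 700.0, 60.0
beta0, beta1 = 32.54, -0.17
sse = 1.527

sigma2 = sse / (n - 2)                                   # 0.3054
se_b1 = math.sqrt(sigma2 / sxx)                          # ~0.0209
se_b0 = math.sqrt(sigma2 * (1 / n + x_bar ** 2 / sxx))   # ~1.27
t_slope = beta1 / se_b1                                  # ~ -8.14
t_intercept = beta0 / se_b0                              # ~25.6
```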
Table for t-Distribution
Regression Output from Software
Solution:
Regression Statistics
  Multiple R           0.964
  R Square             0.930
  Adjusted R Square    0.916
  Standard Error       0.553
  Observations         7

ANOVA
              df     SS      MS      F       p-value
  Regression   1    20.23   20.23   66.23   0.00045
  Residual     5     1.53    0.31
  Total        6    21.76

              Coefficients  Std Error  t Stat   P-value    Lower 95%  Upper 95%
  Intercept      32.54        1.27      25.61   1.69E-06     29.28      35.81
  Speed          -0.17        0.02      -8.14   0.00045      -0.22      -0.12
Adequacy of the Regression Model: Residual Analysis
• Fitting a regression model requires several assumptions:
  ▪ Errors are uncorrelated random variables with mean zero;
  ▪ Errors have constant variance; and
  ▪ Errors are normally distributed.
• Analysis of the residuals (eᵢ = yᵢ − ŷᵢ) is frequently helpful in checking the assumption that the errors are approximately normally distributed with constant variance, and in determining whether additional terms in the model would be useful.
Adequacy of the Regression Model: Residual Analysis
Patterns for residual plots: (a) satisfactory; (b) funnel; (c) double bow; (d) nonlinear.
Adequacy of the Regression Model: Coefficient of Determination (R²)
• Ratio of sums of squares:

R² = SSR/SST = 1 − SSE/SST,  0 ≤ R² ≤ 1

• Called the coefficient of determination; often used to judge the adequacy of a regression model.
• We often refer (loosely) to R² as the amount of variability in the data explained or accounted for by the regression model.
• The underlying decomposition is

∑ⁿᵢ₌₁ (yᵢ − ȳ)² = ∑ⁿᵢ₌₁ (ŷᵢ − ȳ)² + ∑ⁿᵢ₌₁ (yᵢ − ŷᵢ)²

SST = SSR + SSE
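A direct implementation of this definition (names illustrative):

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSE/SST, the fraction of variability explained by the fit."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares
    return 1 - sse / sst
```

With SSE = 1.53 and SST = 21.76 from the software output above, R² = 1 − 1.53/21.76 ≈ 0.930, matching the reported R Square.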
Regression on Transformed Variable
When a scatter plot suggests a nonlinear relationship, a transformation of x or Y (e.g., logarithm, square root, or reciprocal) can often make the model linear in the parameters, after which the least squares method applies as before.
Multiple Linear Regression
➢ Many applications of regression analysis involve situations in which there is more than one independent/regressor variable.
➢ A regression model that contains more than one independent variable is called a multiple regression model:

Y = β0 + β1X1 + β2X2 + ⋯ + βmXm + ε

➢ Suppose that the effective life of a cutting tool depends on the cutting speed and the tool angle. A possible multiple regression model is

Y = β0 + β1X1 + β2X2 + ε

where Y: tool life, X1: cutting speed, and X2: tool angle.
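Such a model can be fitted with NumPy's least-squares solver. The data below are made up for illustration (generated exactly from the plane y = 60 − 0.2·x1 + 0.5·x2, so the fit recovers those coefficients):

```python
import numpy as np

# Hypothetical tool-life data: X1 = cutting speed, X2 = tool angle
x1 = np.array([100.0, 120.0, 140.0, 160.0, 180.0])
x2 = np.array([20.0, 22.0, 21.0, 24.0, 23.0])
y = 60.0 - 0.2 * x1 + 0.5 * x2  # noise-free responses for the demo

# Design matrix with a leading column of ones for the intercept beta0
X = np.column_stack([np.ones_like(x1), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # beta = [b0_hat, b1_hat, b2_hat]
```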
Practice Problems
• Chapter 9: Problems 5, 11, 12, 18