
CHAPTER 3 Multiple Linear Regression

H0: β1 = β2 = 0 (no relationship between the independent variables and the dependent variable). H1: At least one βi ≠ 0 (there is a relationship between at least one independent variable and the dependent variable). Test statistic: F = 572.17. Critical value: F0.05,2,22 = 3.44. Conclusion: Since F = 572.17 > 3.44, we reject the null hypothesis and conclude that the multiple linear regression model is statistically significant in explaining the relationship between pull strength, length and height.

FEM 2063 - Data Analytics

Chapter 3
Multiple Linear Regression

Overview

➢ 3.1 Background
➢ 3.2 Multiple Linear Regression (MLR)
➢ 3.3 Software Output
➢ 3.4 ANOVA
➢ 3.5 Model Evaluation
➢ 3.6 Application/Examples

3.1 Background
Simple regression considers the relationship between a single independent variable and a dependent variable.

Multiple regression simultaneously considers the influence of multiple independent variables on a dependent variable Y.
◼ A simple regression model fits a regression line in 2-dimensional space.

◼ A multiple regression model with two independent variables fits a regression plane in 3-dimensional space.

3.2 Multiple Linear Regression
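Regression coefficients are estimated by minimizing SSE to derive the simple regression model:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

Again, estimates for the multiple slope coefficients are derived by minimizing SSE to derive the multiple regression model:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k$$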
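For reference, the quantity being minimized can be written out explicitly. This is the standard least-squares definition, stated here as an assumption because the slide's own formula did not survive extraction; the index $i$ runs over the $n$ observations and $x_{ji}$ is the value of the $j$-th independent variable for observation $i$.

```latex
\mathrm{SSE}
  = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
  = \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{1i} - \dots - \hat{\beta}_k x_{ki} \right)^2
```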

❑ An extension of a simple linear regression model.
❑ Allows the dependent variable y to be modeled as a linear function of more than one independent variable $x_i$.

❑ Consider the following data consisting of n sets of values:

$$
\begin{aligned}
&(y_1, x_{11}, x_{21}, \dots, x_{k1}) \\
&(y_2, x_{12}, x_{22}, \dots, x_{k2}) \\
&\qquad\vdots \\
&(y_n, x_{1n}, x_{2n}, \dots, x_{kn})
\end{aligned}
$$

❑ The value of the dependent variable $y_i$ is modeled as

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \varepsilon_i$$

❑ The dependent variable is related to k independent variables.

❑ As in SLR, the parameters of MLR ($\beta_0, \beta_1, \dots, \beta_k$) are also estimated using the method of least squares.

❑ However, it would be tedious to find these values by hand, so we use the computer to handle the computations.
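To make "let the computer handle it" concrete, here is a minimal sketch of a least-squares fit that solves the normal equations $(X^\top X)\,b = X^\top y$ using only the Python standard library. The function names `fit_mlr` and `sse` and the small data set are made up for illustration; the chapter itself relies on Excel output rather than code.

```python
def fit_mlr(X, y):
    """Return [b0, b1, ..., bk] minimizing SSE, via the normal equations (X'X) b = X'y."""
    rows = [[1.0] + [float(v) for v in row] for row in X]  # leading 1 for the intercept
    p = len(rows[0])                                        # p = k + 1 parameters
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def sse(X, y, b):
    """Sum of squared errors of the fitted model with coefficients b."""
    return sum(
        (yi - (b[0] + sum(bj * xj for bj, xj in zip(b[1:], row)))) ** 2
        for row, yi in zip(X, y)
    )


# Made-up data lying exactly on y = 1 + 2*x1 + 3*x2, so the fit should recover (1, 2, 3).
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [9, 8, 19, 18, 29]
b = fit_mlr(X, y)
print([round(v, 6) for v in b], round(sse(X, y, b), 6))
```

Because the toy data lie exactly on a plane, the SSE at the fitted coefficients is essentially zero; with real data it would be positive, and any other choice of coefficients would give a larger value.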
3.3 Software Output
The software (Excel) output has three parts:

Part 1. Regression analysis
Part 2. ANOVA
Part 3. Regression statistics

3.4 ANOVA
| Source of variation | Sum of squares | Degrees of freedom (df) | Mean square (sum of squares / df) | Computed F |
|---|---|---|---|---|
| Regression | SSR | k | MSR = SSR / k | F = MSR / MSE |
| Error | SSE | n – (k+1) | MSE = SSE / (n – (k+1)) | |
| Total | SST | n – 1 | | |
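The table can be turned into a few lines of arithmetic. The sketch below is illustrative only: the function name `anova_table` and the input numbers are made up, and it assumes the usual decomposition SST = SSR + SSE.

```python
def anova_table(ssr, sse, n, k):
    """ANOVA quantities for a regression with k predictors and n observations."""
    df_reg, df_err = k, n - (k + 1)
    msr = ssr / df_reg            # mean square for regression
    mse = sse / df_err            # mean square error
    return {
        "df": (df_reg, df_err, n - 1),
        "SST": ssr + sse,         # assumes SST = SSR + SSE
        "MSR": msr,
        "MSE": mse,
        "F": msr / mse,
    }


# Made-up sums of squares, chosen only so the arithmetic is easy to follow.
table = anova_table(ssr=80.0, sse=20.0, n=13, k=2)
print(table)
```

With these inputs the degrees of freedom are (2, 10, 12), MSR = 40, MSE = 2 and F = 20.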


3.5 Model Evaluation - (i) Standard error of estimate (s)
➢ Compute the standard error of estimate from

$$\hat{\sigma}^2 = \frac{SSE}{n - k - 1}, \qquad s = \sqrt{\hat{\sigma}^2}$$

➢ $\hat{\sigma}^2$ is an unbiased estimator of the population variance $\sigma^2$.

➢ The smaller the SSE, the more successful the multiple linear regression model is in explaining y.

3.5 Model Evaluation – (ii) Coefficient of Determination
❑ Coefficient of determination:

$$R^2 = \frac{SST - SSE}{SST} = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

❑ $R^2$ is the proportion of variability in the observed dependent variable that is explained by the MLR model.

❑ The coefficient of determination, denoted by $R^2$, measures the strength of that linear relationship.

❑ The greater $R^2$, the more successful the MLR model.
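The equivalent forms of $R^2$ follow from the decomposition of the total variation. The definitions below are the standard ones for a least-squares fit with an intercept; they are stated as an assumption because the slides do not spell them out, and $\bar{y}$ denotes the sample mean of $y$.

```latex
\begin{aligned}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad
SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \\
SST &= SSR + SSE
\;\;\Longrightarrow\;\;
R^2 = \frac{SSR}{SST} = \frac{SST - SSE}{SST} = 1 - \frac{SSE}{SST}.
\end{aligned}
```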

3.5 Model Evaluation – (iii) The hypothesis test of the slope (t-test)
▪ The t-test is used for inference on an individual regression coefficient: it addresses whether an adequate relationship between $x_i$ and y exists.
▪ Test the hypothesis
H0: $\beta_i = 0$ (No relationship between $x_i$ and y)
H1: $\beta_i \neq 0$ (There is a relationship between $x_i$ and y)
▪ Test statistic (t-distribution with n−k−1 degrees of freedom):

$$T = \frac{\hat{\beta}_i - \beta_i}{se(\hat{\beta}_i)}$$

where $\beta_i = 0$ under H0. In simple linear regression, $se(\hat{\beta}_1) = \sqrt{\hat{\sigma}^2 / SS_{xx}}$; in MLR, $se(\hat{\beta}_i)$ is read from the Standard Error column of the software output.
▪ Critical region: $|T| > t_{\alpha/2,\, n-k-1}$.

3.5 Model Evaluation – (iv) Testing the significance of regression (F-test)
❑ The F-test is used for inference on the multiple linear regression model as a whole.

Hypotheses:

$$H_0: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_k = 0$$
$$H_1: \text{at least one } \beta_j \neq 0$$

Test statistic:

$$F_0 = \frac{MSR}{MSE}, \quad \text{where } MSR = \frac{SSR}{k}, \quad MSE = \frac{SSE}{n - k - 1}$$

Rejection criterion:

$$F_0 = \frac{MSR}{MSE} > f_{\alpha,\, k,\, n-k-1}$$

3.6 Application/Examples
Wire Bond Pull Strength Data

I. Estimate the Multiple linear regression (MLR) equation

II. Find the standard error of estimate of this MLR.

III. Determine the coefficient of determination of this MLR.

IV. Test for significance of Slopes at 5% significance level.

V. Test for significance of MLR at 5% significance level.

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.990523843 |
| R Square | 0.981137483 |
| Adjusted R Square | 0.979422709 |
| Standard Error | 2.288046833 |
| Observations | 25 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2 | 5990.771221 | 2995.385611 | 572.1672 | 1.07546E-19 |
| Residual | 22 | 115.1734828 | 5.235158308 | | |
| Total | 24 | 6105.944704 | | | |

Coefficients

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 2.263791434 | 1.060066238 | 2.135518851 | 0.044099 | 0.065348613 | 4.462234256 | 0.06534861 | 4.462234 |
| X Variable 1 | 2.744269643 | 0.093523844 | 29.34299438 | 3.91E-19 | 2.550313061 | 2.938226226 | 2.55031306 | 2.938226 |
| X Variable 2 | 0.012527811 | 0.002798419 | 4.476746229 | 0.000188 | 0.006724246 | 0.018331377 | 0.00672425 | 0.018331 |

The estimated multiple linear regression equation is

Strength = 2.26 + 2.74 × Length + 0.0125 × Height


◼ Standard error of estimate (s) = 2.288

◼ Coefficient of determination (R2) = 98.1%
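Both values can be checked by hand from the ANOVA block of the Excel output. The sketch below copies the sums of squares from that block; the adjusted $R^2$ line uses the standard formula, which these slides report but do not derive, so treat it as an added assumption.

```python
import math

# Values copied from the ANOVA block of the Excel output.
n, k = 25, 2
sse, sst = 115.1734828, 6105.944704

mse = sse / (n - k - 1)                          # estimate of sigma^2
s = math.sqrt(mse)                               # standard error of estimate
r2 = 1 - sse / sst                               # coefficient of determination
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)    # adjusted R^2 (standard formula, an assumption here)

print(f"s = {s:.3f}, R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}")
```

The results should agree with the Standard Error, R Square and Adjusted R Square rows of the Regression Statistics block (about 2.288, 0.9811 and 0.9794).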

H0: $\beta_i = 0$ (No relationship between $x_i$ and y)
H1: $\beta_i \neq 0$ (There is a relationship between $x_i$ and y)

Test statistic: $T_1 = 29.34$ and $T_2 = 4.48$ (from the t Stat column of the coefficient table)

Critical value: $t_{\alpha/2,\, n-k-1} = t_{0.025,\, 22} = 2.074$ (from statistical table)

Conclusion
Since $T_1 = 29.34 > 2.074$ and $T_2 = 4.48 > 2.074$, we reject H0 for both slopes and conclude that pull strength is linearly related to wire length and die height.
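The t statistics are simply each coefficient divided by its standard error. A minimal sketch, assuming X Variable 1 is wire length and X Variable 2 is die height (as the fitted equation suggests) and using the table value 2.074 rather than computing the critical value:

```python
# Coefficient estimates and standard errors copied from the Excel output.
coefficients = {
    "Length (X Variable 1)": (2.744269643, 0.093523844),
    "Height (X Variable 2)": (0.012527811, 0.002798419),
}
t_critical = 2.074  # t_{0.025, 22}, the table value quoted in the example

t_stats = {}
for name, (estimate, std_error) in coefficients.items():
    t = estimate / std_error  # (estimate - 0) / se under H0: beta_i = 0
    t_stats[name] = t
    decision = "reject H0" if abs(t) > t_critical else "fail to reject H0"
    print(f"{name}: T = {t:.2f} -> {decision}")
```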


Hypotheses:

$$H_0: \beta_1 = \beta_2 = 0$$
$$H_1: \text{at least one } \beta_j \neq 0$$

Test statistic:

$$F_0 = \frac{MSR}{MSE} = \frac{2995.386}{5.23516} = 572.17$$

Rejection criterion:

$$F_0 = \frac{MSR}{MSE} > f_{\alpha,\, k,\, n-k-1}$$

Let α = 0.05. Since k = 2 and n − k − 1 = 22, we need to find F(0.05, 2, 22).

From the table we find that F(0.05, 2, 22) = 3.44.

Conclusion

Since 572.17 > 3.44, we reject H0 and conclude that pull strength is linearly related to either wire length or die height or both.
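The F statistic can be reproduced from the sums of squares, and cross-checked against $R^2$ through the identity $F = \frac{R^2/k}{(1-R^2)/(n-k-1)}$. That identity is a standard result not stated in these slides, so it is included here only as a sanity check; the critical value 3.44 is the table value from the example, not computed.

```python
n, k = 25, 2
ssr, sse = 5990.771221, 115.1734828   # from the ANOVA block of the Excel output
sst = ssr + sse

f0 = (ssr / k) / (sse / (n - k - 1))  # MSR / MSE

# Cross-check through R^2 (standard identity, an assumption relative to the slides).
r2 = ssr / sst
f_from_r2 = (r2 / k) / ((1 - r2) / (n - k - 1))

f_critical = 3.44  # F(0.05, 2, 22), the table value quoted in the example
print(f"F0 = {f0:.2f}, via R^2 = {f_from_r2:.2f}, reject H0: {f0 > f_critical}")
```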

Example 2
A set of experimental runs was made to determine a way of predicting cooking time y at various levels of oven width x1 and temperature x2. The data were recorded as follows:

i. Estimate the multiple linear regression (MLR) equation.
ii. Find the standard error of estimate of this MLR.
iii. Determine the coefficient of determination of this MLR.
iv. Test for significance of slopes at the 1% significance level.
v. Test for significance of MLR at the 1% significance level.
Cooking time, oven width and
temperature

i. MLR equation

ii. Find the standard error of estimate of this MLR.

iii. Determine the coefficient of determination of this MLR.


iv. Test for significance of Slopes at 1% significance level.

v. Test for significance of MLR at 1% significance level.

