
INTRODUCTION TO STATISTICS & PROBABILITY

Chapter 11: Multiple Regression

Dr. Nahid Sultana
Copyright © Nahid Sultana 2017-2018


Chapter 11: Multiple Regression

11.1 Inference for Multiple Regression


11.1 Inference for Multiple Regression

• Population multiple regression equation
• Data for multiple regression
• Multiple linear regression model
• Confidence intervals and significance tests
• ANOVA
• Squared multiple correlation R²


Population Multiple Regression Equation

• The simple linear regression model relates the mean response μy to one
explanatory variable x:

    μy = β0 + β1 x

• In practical situations, more complex linear models are usually needed:
knowledge of more than one explanatory variable often gives a better
understanding, and better prediction, of a particular response.

• In multiple regression, the response variable y depends on p explanatory
variables x1, x2, …, xp:

    μy = β0 + β1 x1 + β2 x2 + ⋯ + βp xp
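As a quick illustration, here is a minimal Python sketch that evaluates the mean-response formula for given coefficients. All numbers are hypothetical (loosely echoing the tree example later in the chapter):

```python
# Illustrates mu_y = beta0 + beta1*x1 + ... + betap*xp for one case.
beta0 = -57.99
betas = [4.71, 0.34]   # hypothetical beta1 (diameter), beta2 (height)
x = [10.5, 72.0]       # one case's explanatory values

mu_y = beta0 + sum(b * xi for b, xi in zip(betas, x))
print(mu_y)            # mean response at these x's
```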


Data for Multiple Regression

The data for a simple linear regression problem consist of n observations
(xi, yi) on two variables.

Data for multiple linear regression consist of the values of a response
variable y and p explanatory variables (x1, x2, …, xp) on each of n cases.
We write the data, and enter them into software, in the form:

Case   x1    x2    …   xp    y
1      x11   x12   …   x1p   y1
2      x21   x22   …   x2p   y2
…      …     …     …   …     …
n      xn1   xn2   …   xnp   yn
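A minimal sketch of entering data in this case-by-variables layout with pandas; the column names and values are hypothetical:

```python
import pandas as pd

# Each row is one case; columns hold the p explanatory variables and y.
data = pd.DataFrame({
    "x1": [8.3, 8.6, 8.8, 10.5],     # hypothetical values
    "x2": [70.0, 65.0, 63.0, 72.0],
    "y":  [10.3, 10.3, 10.2, 16.4],  # response
})
print(data)
```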


Multiple Linear Regression Model

The statistical model for multiple linear regression is

    yi = β0 + β1 xi1 + ⋯ + βp xip + εi,   for i = 1, 2, …, n.

The mean response μy is a linear function of the explanatory variables:

    μy = β0 + β1 x1 + β2 x2 + ⋯ + βp xp

The deviations εi are independent and Normally distributed N(0, σ).

The parameters of the model are β0, β1, …, βp and σ.

The coefficient βi (i = 1, …, p) has the following interpretation: it
represents the average change in the response when the variable xi increases
by one unit and all other x variables are held constant.
Estimation of the Parameters

Select a random sample of n individuals on which p + 1 variables
(x1, …, xp, y) are measured. The least-squares regression method chooses
b0, b1, …, bp to minimize the sum of squared deviations Σ(yi − ŷi)², where

    ŷi = b0 + b1 xi1 + ⋯ + bp xip

• As with simple linear regression, the constant b0 is the y intercept.

• The regression coefficients (b1, …, bp) reflect the unique association of
each explanatory variable with the y variable. They are analogous to the
slope in simple regression.

• The parameter σ² measures the variability of the responses about the
population mean response. Its estimator is

    s² = Σ ei² / (n − p − 1) = Σ (yi − ŷi)² / (n − p − 1)
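A minimal numpy sketch of the least-squares computation and the estimator s²; the data are hypothetical:

```python
import numpy as np

# Hypothetical data: n = 6 cases, p = 2 explanatory variables.
X = np.array([[8.3, 70.0], [8.6, 65.0], [8.8, 63.0],
              [10.5, 72.0], [10.7, 81.0], [10.8, 83.0]])
y = np.array([10.3, 10.3, 10.2, 16.4, 18.8, 19.7])
n, p = X.shape

X1 = np.column_stack([np.ones(n), X])        # prepend a column of 1s for b0
b, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares (b0, b1, b2)

resid = y - X1 @ b                           # e_i = y_i - yhat_i
s2 = resid @ resid / (n - p - 1)             # estimator of sigma^2
print(b, s2)
```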
Confidence Interval for βj

Estimating the regression parameters β0, …, βj, …, βp is a case of one-sample
inference with an unknown population variance, so we rely on the t
distribution with n − p − 1 degrees of freedom.

A level C confidence interval for βj is

    bj ± t* SEbj

where SEbj is the standard error of bj and t* is the critical value of the
t(n − p − 1) distribution with area C between −t* and t*. (See the sketch
after the example below.)


Example

A multiple regression was performed on data obtained from 31 trees to
predict usable volume from the height (in feet) and the diameter (in inches)
of the tree. Partial computer results (reproduced on the later slides) give
a diameter coefficient of 4.7082 with standard error 0.2643; here
n − p − 1 = 31 − 2 − 1 = 28, so t* = 1.701 for 90% confidence.

Based on these results, a 90% confidence interval for the coefficient β1 of
diameter is
a. 4.708 ± 1.645 (0.2643).
b. 4.708 ± 1.701 (0.2643).   [t*(28) = 1.701]
c. 4.708 ± 2.048 (0.2643).
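A minimal scipy sketch verifying the critical value and interval for this example (choice b):

```python
from scipy.stats import t

b1, se_b1 = 4.7082, 0.2643
df = 31 - 2 - 1                 # n - p - 1 = 28
t_star = t.ppf(0.95, df)        # leaves area 0.90 between -t* and t*
print(round(t_star, 3))         # 1.701 -> choice b
print(b1 - t_star * se_b1, b1 + t_star * se_b1)
```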
Significance Test for βj

To test the hypothesis H0: βj = 0 versus a one- or two-sided alternative,
we calculate the t statistic

    t = bj / SEbj

which has the t(n − p − 1) distribution when H0 is true. The P-value of the
test is found in the usual way.

Note: Software typically provides two-sided P-values.
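A minimal sketch of this test in scipy, using the EASE row from the store output on the next slide:

```python
from scipy.stats import t

b_j, se_bj = 6.221037701, 2.991297216   # EASE coefficient and its SE
df = 50 - 3 - 1                         # n - p - 1 = 46
t_stat = b_j / se_bj
p_two_sided = 2 * t.sf(abs(t_stat), df)
print(round(t_stat, 6), round(p_two_sided, 6))   # 2.079712, 0.043152
```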


Example

A multiple regression was performed on data obtained from 50 stores to
predict sales from traffic, ease of access, and income. Output is given
below. Using 0.05 as the level of significance, which of the following is
true?

            Coefficients   Standard Error   t Stat     P-value
Intercept   284.2341216    165.341245       1.719076   0.092324
TRAFFIC     0.42858422     0.996331003      0.430162   0.669086
EASE        6.221037701    2.991297216      2.079712   0.043152
INCOME      6.408191928    5.184074702      1.23613    0.222685

a. Only ease of access is significant.
b. Only traffic is significant.
c. All three variables are significant.
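A small sketch of the decision rule applied to the P-values above:

```python
alpha = 0.05
p_values = {"TRAFFIC": 0.669086, "EASE": 0.043152, "INCOME": 0.222685}
print([name for name, p in p_values.items() if p < alpha])
# ['EASE'] -> only ease of access is significant (choice a)
```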
Significance Test for βj

Suppose we test H0: βj = 0 for each j and find that none of the p tests is
significant. Should we then conclude that none of the explanatory variables
is related to the response?

No, we should not! When we fail to reject H0: βj = 0, this means that we
probably do not need xj in the model with all the other variables.

So, failure to reject all such hypotheses merely means that it is safe to
throw away at least one of the variables. Further analysis must be done to
see which subset of variables provides the best model.


ANOVA F-test for Multiple Regression

In multiple regression, the ANOVA F statistic tests the hypotheses

    H0: β1 = β2 = ⋯ = βp = 0
    versus Ha: at least one βj ≠ 0

by computing the F statistic F = MSM / MSE.

When H0 is true, F follows the F(p, n − p − 1) distribution, and the
P-value is P(F > f).

A significant P-value does not mean that all p explanatory variables have a
significant influence on y, only that at least one does.
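A minimal scipy sketch of the F test, using the mean squares from the tree example later in the chapter:

```python
from scipy.stats import f

msm, mse = 3842.1, 15.07   # MSM = SSM/DFM, MSE = SSE/DFE
p, n = 2, 31
F = msm / mse              # approx. 255 (the printout's 254.99 uses unrounded MSE)
p_value = f.sf(F, dfn=p, dfd=n - p - 1)   # P(F > f) under F(2, 28)
print(F, p_value)
```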


ANOVA Table for Multiple Regression

Source   Sum of squares SS   df          Mean square MS   F         P-value
Model    Σ(ŷi − ȳ)²          p           MSM = SSM/DFM    MSM/MSE   Tail area above F
Error    Σ(yi − ŷi)²         n − p − 1   MSE = SSE/DFE
Total    Σ(yi − ȳ)²          n − 1

SSM = model sum of squares      SSE = error sum of squares
SST = total sum of squares      SST = SSM + SSE
DFM = p    DFE = n − p − 1    DFT = n − 1    DFT = DFM + DFE
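A numpy sketch computing these entries from a least-squares fit, continuing the hypothetical data used earlier:

```python
import numpy as np

X = np.array([[8.3, 70.0], [8.6, 65.0], [8.8, 63.0],
              [10.5, 72.0], [10.7, 81.0], [10.8, 83.0]])
y = np.array([10.3, 10.3, 10.2, 16.4, 18.8, 19.7])
n, p = X.shape

X1 = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
yhat = X1 @ b

ssm = np.sum((yhat - y.mean())**2)    # model SS, df = p
sse = np.sum((y - yhat)**2)           # error SS, df = n - p - 1
sst = np.sum((y - y.mean())**2)       # total SS; equals SSM + SSE
F = (ssm / p) / (sse / (n - p - 1))   # MSM / MSE
print(ssm, sse, sst, F)
```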
Example

A multiple regression was performed on data obtained from 31 trees to
predict usable volume from the height (in feet) and the diameter (in inches)
of the tree. Partial computer results are shown below.

Predictor   Coef      SE Coef   t
Constant    −57.988   8.638     −6.71
Diameter    4.7082    0.2643    17.82
Height      0.3393    0.1302    2.61

Analysis of Variance
Source          DF   SS       MS       F   p
Regression      2    7684.2   3842.1
Residual Error  28   421.9
Total           30   8106.1

The value of the MSR for this model is
a. 7684.2.
b. 3842.1.      [MSR = SSR/df = 7684.2/2 = 3842.1]
c. 15.07.
Example

A multiple regression was performed on data obtained from 31 trees to
predict usable volume from the height (in feet) and the diameter (in inches)
of the tree. Partial computer results are shown below.

Predictor   Coef      SE Coef   t
Constant    −57.988   8.638     −6.71
Diameter    4.7082    0.2643    17.82
Height      0.3393    0.1302    2.61

Analysis of Variance
Source          DF   SS       MS       F        p
Regression      2    7684.2   3842.1   254.99
Residual Error  28   421.9    15.07
Total           30   8106.1

The value of the F statistic for this model is
a. 18.21.
b. 254.99.      [F = MSR/MSE = 3842.1/15.07 = 254.99]
c. 14.22.
Squared Multiple Correlation R²

Just as with simple linear regression, R², the squared multiple correlation,
is the proportion of the variation in the response variable y that is
explained by the model:

    R² = Σ(ŷi − ȳ)² / Σ(yi − ȳ)² = SSM / SST

In multiple linear regression, the "model" consists of all p explanatory
variables taken together.

The square root of R², called the multiple correlation coefficient, is the
correlation between the observations yi and the predicted values ŷi.
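A small sketch of the computation; the helper function name is ours:

```python
import numpy as np

def r_squared(y, yhat):
    """Proportion of variation in y explained by the model: SSM / SST."""
    ssm = np.sum((yhat - np.mean(y))**2)
    sst = np.sum((y - np.mean(y))**2)
    return ssm / sst

# Or directly from an ANOVA table, as in the store example below:
print(2124768.5 / 3939546.4)   # approx. 0.539
```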


Example

A multiple regression was performed on data obtained from 50 stores to
predict sales from traffic, ease of access, and income. Output is given
below. The value of R² is

ANOVA
Source      df   SS          MS         F
Regression  3    2124768.5   708256.2   17.95249
Residual    46   1814777.9   39451.69
Total       49   3939546.4

a. 73.4%.
b. 53.9%.      [R² = SSRegression / SSTotal = 2124768.5/3939546.4 ≈ 0.539]
c. 46.1%.
