Chapter 11:
Multiple Regression
In simple linear regression, the mean response, μy, is related to a single
explanatory variable x:
𝜇𝑦 = 𝛽0 + 𝛽1 𝑥
Usually, more complex linear models are needed in practical situations.
There are many problems in which knowledge of more than one explanatory
variable is necessary in order to obtain a better understanding and better
prediction of a particular response.
                Variables
Case    x1     x2     …     xp      y
1       x11    x12    …     x1p     y1
2       x21    x22    …     x2p     y2
…       …      …      …     …       …
n       xn1    xn2    …     xnp     yn
𝜇𝑦 = 𝛽0 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + ⋯ + 𝛽𝑝 𝑥𝑝
The deviations 𝒆𝒊 are independent and Normally distributed N(0, 𝜎).
The parameters of the model are 𝜷𝟎, 𝜷𝟏, …, 𝜷𝒑 and 𝜎.
The coefficient 𝜷𝒊 (𝒊 = 𝟏, … , 𝒑) has the following interpretation: It
represents the average change in the response when the variable xi increases
by one unit and all other x variables are held constant.
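The interpretation of a coefficient can be checked numerically. The sketch below is illustrative only: the function name `mean_response` and the coefficient values are assumptions, not taken from the chapter. It evaluates μy for a hypothetical model with p = 2 and shows that raising x1 by one unit while holding x2 fixed changes the mean response by exactly β1.

```python
def mean_response(betas, xs):
    """Return mu_y = beta0 + beta1*x1 + ... + betap*xp.

    betas: [beta0, beta1, ..., betap]; xs: [x1, ..., xp].
    """
    if len(betas) != len(xs) + 1:
        raise ValueError("need one intercept plus one coefficient per x")
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


# Hypothetical coefficients (not from the chapter): beta0 = 10, beta1 = 2, beta2 = -3.
betas = [10.0, 2.0, -3.0]

base = mean_response(betas, [4.0, 1.0])    # x1 = 4, x2 = 1
bumped = mean_response(betas, [5.0, 1.0])  # x1 raised by one unit, x2 held constant

change = bumped - base  # equals beta1 because x2 did not move
print(base, bumped, change)
```

The same check with x2 raised by one unit and x1 held fixed would return β2 instead.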
Copyright© Nahid Sultana 2017-2018 1/29/2023
Estimation of the Parameters
$$s^2 = \frac{\sum e_i^2}{n - p - 1} = \frac{\sum (y_i - \hat{y}_i)^2}{n - p - 1}$$
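The estimate s² divides the sum of squared residuals by n − p − 1. As a minimal sketch, assuming the fitted values are already available from some regression fit, the computation looks like this. The function name `residual_variance` and the data are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def residual_variance(y, y_hat, p):
    """Estimate sigma^2 as SSE / (n - p - 1), where p is the number of x variables."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    n = len(y)
    df = n - p - 1  # degrees of freedom for error
    if df <= 0:
        raise ValueError("need n > p + 1 observations")
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return sse / df


# Hypothetical observed and fitted values (n = 6) from a model with p = 2.
y = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
y_hat = [11.0, 11.0, 16.0, 18.0, 25.0, 29.0]

s2 = residual_variance(y, y_hat, p=2)  # SSE = 6, df = 3
s = math.sqrt(s2)                      # estimate of sigma
print(s2, s)
```

Here every residual is ±1, so the sum of squares is 6 and s² = 6/3 = 2.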
Confidence Interval for βj
A level C confidence interval for βj is bj ± t* 𝑆𝐸𝑏𝑗, where 𝑆𝐸𝑏𝑗 is the
standard error of bj and t* is the t critical value for the t(n − p − 1)
distribution.
when H0 is true. The P-value of the test is found in the usual way.
Suppose we test H0: βj = 0 for each j and find that none of the p tests is
significant. Should we then conclude that none of the explanatory variables
is related to the response?

No, we should not! Failure to reject all such hypotheses merely means that it
is safe to throw away at least one of the variables. Further analysis must be
done to see which subset of variables provides the best model.
𝐻0: 𝛽1 = 𝛽2 = … = 𝛽𝑝 = 0
versus Ha: at least one 𝛽𝑗 ≠ 0
A significant P-value does not mean that all p explanatory variables have
a significant influence on y—only that at least one does.
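The test of this overall hypothesis uses the usual ANOVA F statistic, the ratio of the regression mean square to the residual mean square. The sketch below is a hedged illustration: the function name `f_statistic` is an assumption, and the sums of squares are those in the ANOVA table at the end of this section, with n = 50 and p = 3 inferred from its degrees of freedom. Finding the P-value would additionally need an F(p, n − p − 1) table or a statistics library, which is left out here.

```python
def f_statistic(ss_model, ss_error, n, p):
    """Return (F, df_model, df_error) for H0: beta1 = ... = betap = 0."""
    df_model = p
    df_error = n - p - 1
    if df_model <= 0 or df_error <= 0:
        raise ValueError("need p >= 1 and n > p + 1")
    ms_model = ss_model / df_model  # mean square for regression
    ms_error = ss_error / df_error  # mean square for residual (this is s^2)
    return ms_model / ms_error, df_model, df_error


# Sums of squares from the ANOVA table in this section (n = 50, p = 3).
f_value, df_model, df_error = f_statistic(2124768.5, 1814777.9, n=50, p=3)
print(round(f_value, 5), df_model, df_error)
```

The result agrees with the F column of that table (about 17.95 on 3 and 46 degrees of freedom).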
Total: SST = $\sum (y_i - \bar{y})^2$, with $n - 1$ degrees of freedom
Just as with simple linear regression, R², the squared multiple correlation, is
the proportion of the variation in the response variable y that is explained by
the model.
$$R^2 = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2} = \frac{SSM}{SST}$$
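As a small sketch of this ratio, the function below computes SSM and SST directly from observed and fitted values. The name `r_squared` and the four data points are hypothetical. The fitted values were chosen so that they behave like least-squares fits (the residuals sum to zero and are uncorrelated with the fitted values), which is what makes SSM/SST a proper proportion.

```python
def r_squared(y, y_hat):
    """Return R^2 = SSM / SST for observed values y and fitted values y_hat."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    y_bar = sum(y) / len(y)
    ssm = sum((fi - y_bar) ** 2 for fi in y_hat)  # variation explained by the model
    sst = sum((yi - y_bar) ** 2 for yi in y)      # total variation in y
    if sst == 0:
        raise ValueError("y has no variation, so R^2 is undefined")
    return ssm / sst


# Hypothetical data: y_bar = 2, SSM = 4, SST = 8.
y = [0.0, 2.0, 2.0, 4.0]
y_hat = [1.0, 1.0, 3.0, 3.0]

r2 = r_squared(y, y_hat)
print(r2)
```

In this toy case the model explains half of the variation in y.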
ANOVA
              df    SS           MS          F
Regression     3    2124768.5    708256.2    17.95249
Residual      46    1814777.9    39451.69
Total         49    3939546.4
a. 73.4%.
b. 53.9%.
c. 46.1%.
R² = SSReg/SSTotal = 2124768.5/3939546.4 ≈ 0.539, so the answer is b.