Stats101A - Chapter 2
• In summary, $e_i \sim N(0, \sigma^2)$.
Simple Linear Regression Model
• Note: When $\sigma^2$ is also unknown, we estimate it with
$$S^2 = \frac{1}{n-2}\sum_{i=1}^{n} \hat{e}_i^2$$
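• A minimal Python sketch of this estimate, with hypothetical data (not the course dataset):

```python
import numpy as np

# Hypothetical paired data; any numeric arrays of equal length work here.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

# Least-squares fit; np.polyfit returns (slope, intercept) for degree 1.
b1, b0 = np.polyfit(x, y, deg=1)

# Residuals e_i = y_i - (b0 + b1 * x_i).
resid = y - (b0 + b1 * x)

# S^2 = RSS / (n - 2): the estimate of sigma^2.
n = len(y)
S2 = np.sum(resid ** 2) / (n - 2)
S = np.sqrt(S2)  # residual standard error
print(S2, S)
```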
$$E(Y \mid X = x) = E(\beta_0 + \beta_1 X + e \mid X = x) = \beta_0 + \beta_1 x + E(e) = \beta_0 + \beta_1 x$$
[Scatter plots: Fathers' age vs. Mothers' age, with the candidate regression lines overlaid]
Line Color | Linear Equation            | Sum of Squared Residuals
Black      | $\hat{y} = 11.54 + 0.68x$  | 38905.1
Green      | $\hat{y} = 1 + x$          | 54545
Red        | $\hat{y} = 15 + 0.6x$      | 47887.32
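• A short Python sketch of how such a comparison is computed; the age data here are made up, so the RSS values will not match the table above:

```python
import numpy as np

# Hypothetical mothers'/fathers' age data (the course dataset is not reproduced here).
x = np.array([20, 23, 25, 28, 30, 33, 35, 40])  # mothers' age
y = np.array([24, 25, 28, 30, 33, 34, 38, 43])  # fathers' age

def rss(b0, b1):
    """Sum of squared residuals for the line yhat = b0 + b1 * x."""
    return float(np.sum((y - (b0 + b1 * x)) ** 2))

# Candidate lines as (intercept, slope); the least-squares line plays the role of "black".
candidates = {"black (least squares)": tuple(np.polyfit(x, y, deg=1)[::-1]),
              "green": (1.0, 1.0),
              "red": (15.0, 0.6)}

for name, (b0, b1) in candidates.items():
    print(f"{name}: RSS = {rss(b0, b1):.2f}")  # least-squares line has the smallest RSS
```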
Statistical inference
• Statistical Inference is the process of drawing
conclusions about populations or scientific truths from
data.
• It includes point and interval estimation as well as hypothesis testing for the parameters.
Inference about 𝛽0 and 𝛽1
• 1. Point estimation for $\beta_0$ and $\beta_1$
• Find $\hat{\beta}_0$ and $\hat{\beta}_1$ using the least squares method.
• Find the sampling distributions of $\hat{\beta}_0$ and $\hat{\beta}_1$.
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}
= \frac{\sum_{i=1}^{n}(x_i - \bar{x})y_i - \bar{y}\sum_{i=1}^{n}(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}
= \frac{\sum_{i=1}^{n}(x_i - \bar{x})y_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2}
= \sum_{i=1}^{n} c_i y_i$$
, where $c_i = \dfrac{x_i - \bar{x}}{SXX}$ (the third equality uses $\sum_{i=1}^{n}(x_i - \bar{x}) = 0$).
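• A quick Python check, with hypothetical data, that the covariance form and the linear-combination form $\sum c_i y_i$ give the same $\hat{\beta}_1$:

```python
import numpy as np

# Hypothetical paired data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 2.9, 5.1, 6.0, 7.2])

xbar, ybar = x.mean(), y.mean()
SXX = np.sum((x - xbar) ** 2)

# Covariance form: sum (x_i - xbar)(y_i - ybar) / SXX.
b1_cov = np.sum((x - xbar) * (y - ybar)) / SXX

# Linear-combination form: sum c_i * y_i with c_i = (x_i - xbar) / SXX.
c = (x - xbar) / SXX
b1_lin = np.sum(c * y)

print(b1_cov, b1_lin)  # identical up to floating-point error
```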
Inference about the slope
• (Sampling) Distribution of $\hat{\beta}_1$:
$$\hat{\beta}_1 \mid X \sim N\left(\beta_1,\ \frac{\sigma^2}{SXX}\right)$$
$$E(\hat{\beta}_1 \mid X) = \beta_1, \qquad Var(\hat{\beta}_1 \mid X) = \frac{\sigma^2}{SXX}$$
$$Z = \frac{\hat{\beta}_1 - \beta_1}{\sigma/\sqrt{SXX}} \sim N(0, 1)$$
$$T = \frac{\hat{\beta}_1 - \beta_1}{S/\sqrt{SXX}} = \frac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} \sim T_{n-2}$$
, where $se(\hat{\beta}_1) = S/\sqrt{SXX}$ is the estimated standard error (se) of $\hat{\beta}_1$.
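• A hedged Python sketch (made-up data) of $se(\hat{\beta}_1)$ and the $t$ statistic for $H_0: \beta_1 = 0$, checked against `scipy.stats.linregress`:

```python
import numpy as np
from scipy import stats

# Hypothetical data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.3, 2.8, 4.1, 4.9, 6.2, 6.8, 8.1, 8.7])
n = len(y)

xbar = x.mean()
SXX = np.sum((x - xbar) ** 2)
b1 = np.sum((x - xbar) * (y - y.mean())) / SXX
b0 = y.mean() - b1 * xbar

resid = y - (b0 + b1 * x)
S = np.sqrt(np.sum(resid ** 2) / (n - 2))   # residual standard error
se_b1 = S / np.sqrt(SXX)                    # se(beta1-hat)

t_stat = b1 / se_b1                         # testing H0: beta1 = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)

print(t_stat, p_value)
print(stats.linregress(x, y))  # slope, p-value, and stderr agree with the hand calculation
```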
Inference about the slope
• Hypothesis testing for the significance of $\beta_1$: "The slope ($\beta_1$) is significant" means that the two variables have a statistically significant linear association.
• The confidence interval tells us the plausible values of the true $\beta_1$.
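• In the notation above, the usual $100(1-\alpha)\%$ interval for the slope is
$$\hat{\beta}_1 \pm t_{\alpha/2,\, n-2} \cdot se(\hat{\beta}_1), \qquad se(\hat{\beta}_1) = \frac{S}{\sqrt{SXX}}.$$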
Inference about the intercept
• Using the least squares method, we find
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
• (Sampling) Distribution of $\hat{\beta}_0$:
$$\hat{\beta}_0 \mid X \sim N\left(\beta_0,\ \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{SXX}\right)\right)$$
, and $se(\hat{\beta}_0) = S\sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{SXX}}$ is the estimated standard error of $\hat{\beta}_0$.
Inference about the intercept
• Test statistic for $H_0: \beta_0 = \beta_0^*$:
$$T = \frac{\hat{\beta}_0 - \beta_0^*}{se(\hat{\beta}_0)} \sim T_{n-2}$$
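• A parallel Python sketch for the intercept, again with made-up data and the conventional null value $\beta_0^* = 0$:

```python
import numpy as np
from scipy import stats

# Hypothetical data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.3, 2.8, 4.1, 4.9, 6.2, 6.8, 8.1, 8.7])
n, xbar = len(y), x.mean()

SXX = np.sum((x - xbar) ** 2)
b1 = np.sum((x - xbar) * (y - y.mean())) / SXX
b0 = y.mean() - b1 * xbar

S = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
se_b0 = S * np.sqrt(1 / n + xbar ** 2 / SXX)   # se(beta0-hat)

beta0_star = 0.0                               # null value for the intercept
t_stat = (b0 - beta0_star) / se_b0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
print(b0, se_b0, t_stat, p_value)
```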
• 2. Single $Y$ at $X = x^*$:
$$Y^* = Y \mid (X = x^*) = \beta_0 + \beta_1 x^* + e^*$$
• 2. Single $Y$ at $X = x^*$: prediction interval
$$\hat{y}^* \pm t_{n-2} \cdot S\sqrt{\mathbf{1} + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}}$$
$$\hat{y}^* = 9.2076 + 0.9014\,x^* = 9.2076 + 0.9014 \times 100 = 99.35$$
Confidence vs. Prediction interval
• Find a 95% confidence interval for 𝐸(𝑌|𝑋) at 𝑥 ∗ = 100.
$$\hat{y}^* \pm t_{n-2} \cdot S\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}} = (96.15,\ 102.55),$$
with $\hat{y}^* = 99.35$, $t_{n-2} = 2.06$, and $S = 7.729$.
• Find a 95% prediction interval for a single $Y$ at $x^* = 100$.
$$\hat{y}^* \pm t_{n-2} \cdot S\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{SXX}} = (83.11,\ 115.59),$$
with the same $\hat{y}^* = 99.35$, $t_{n-2} = 2.06$, and $S = 7.729$.
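• Both intervals can be computed with a short Python function; the data below are placeholders, so the results will not reproduce the (96.15, 102.55) and (83.11, 115.59) intervals above:

```python
import numpy as np
from scipy import stats

def intervals(x, y, x_star, level=0.95):
    """CI for E(Y|X=x_star) and PI for a single new Y at x_star (simple linear regression)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, xbar = len(y), x.mean()
    SXX = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - y.mean())) / SXX
    b0 = y.mean() - b1 * xbar
    S = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
    t = stats.t.ppf(1 - (1 - level) / 2, df=n - 2)
    y_hat = b0 + b1 * x_star
    ci_half = t * S * np.sqrt(1 / n + (x_star - xbar) ** 2 / SXX)      # mean response
    pi_half = t * S * np.sqrt(1 + 1 / n + (x_star - xbar) ** 2 / SXX)  # single new Y
    return (y_hat - ci_half, y_hat + ci_half), (y_hat - pi_half, y_hat + pi_half)

# Placeholder data for illustration only.
x = np.array([85, 88, 90, 92, 95, 97, 99, 101, 103, 105])
y = np.array([84, 90, 89, 94, 95, 99, 98, 103, 102, 107])
print(intervals(x, y, x_star=100))  # the PI is always wider than the CI
```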
Confidence vs. Prediction interval
• Confidence interval for $E(Y|X)$ at $x^* = 100$ vs. prediction interval for a single $Y$ at $x^* = 100$.
Confidence vs. Prediction interval
• Confidence vs. Prediction band
Confidence vs. Prediction interval
• A prediction interval is designed to cover a “moving target”,
the random future value of 𝑦, while the confidence interval
is designed to cover the “fixed target”, the expected value
of 𝑦, 𝐸(𝑌|𝑋) for a given 𝑥 ∗ .
𝐻" : 𝛽# = 0 𝑣𝑠. 𝐻, : 𝛽# ≠ 0
𝐻" : 𝑌 = 𝛽" + 𝑒
vs. 𝐻, : 𝑌 = 𝛽" + 𝛽# 𝑥 + 𝑒
Testing between the two models
• Hypotheses in terms of the models
𝐻" : 𝑌 = 𝛽" + 𝑒
vs. 𝐻, : 𝑌 = 𝛽" + 𝛽# 𝑥 + 𝑒
• Two scenarios:
• If $H_0$ is plausible, then $X$ and $Y$ do not have a linear association; thus, the best estimate of $Y$ is $\bar{Y}$.
• If $H_A$ is plausible, then $X$ is a significant explanatory variable; thus, the best estimate of $Y$ is $\hat{Y}$.
Variations in the linear model
• Total deviation = Unexplained deviation + Explained deviation
$$Y_i - \bar{Y} = (Y_i - \hat{Y}_i) + (\hat{Y}_i - \bar{Y})$$
Variations in the linear model
• In terms of sum of squares of the terms:
$$\Sigma(Y_i - \bar{Y})^2 = \Sigma(Y_i - \hat{Y}_i)^2 + \Sigma(\hat{Y}_i - \bar{Y})^2$$
SST = RSS + SSreg
• 1. $R^2 = \dfrac{SSreg}{SST}$: the proportion of the variation in $Y$ that is explained by the linear model.
§ $R^2 = \dfrac{SSreg}{SST} = \dfrac{SST - RSS}{SST} = 1$ if $RSS = 0$; it indicates a perfect linear fit.
$$F = \frac{SSreg/df_1}{RSS/df_2} \sim F_{df_1,\, df_2}$$
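• A Python sketch (hypothetical data) of the decomposition, $R^2$, and the $F$ statistic; for simple linear regression, $df_1 = 1$ and $df_2 = n - 2$:

```python
import numpy as np
from scipy import stats

# Hypothetical data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.3, 2.8, 4.1, 4.9, 6.2, 6.8, 8.1, 8.7])
n = len(y)

b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x

SST = np.sum((y - y.mean()) ** 2)      # total sum of squares
RSS = np.sum((y - y_hat) ** 2)         # residual (unexplained) sum of squares
SSreg = np.sum((y_hat - y.mean()) ** 2)  # regression (explained) sum of squares

R2 = SSreg / SST                       # equals 1 - RSS/SST
F = (SSreg / 1) / (RSS / (n - 2))      # df1 = 1, df2 = n - 2
p_value = stats.f.sf(F, dfn=1, dfd=n - 2)

print(np.isclose(SST, RSS + SSreg))    # decomposition check: SST = RSS + SSreg
print(R2, F, p_value)
```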
• Data structure:
Recap with a special case
• Scatter plot and Boxplot for Y vs. X
• Also, we may refer to the ANOVA test for the linear model.
𝐻" : 𝑌 = 𝛽" + 𝑒
vs. 𝐻, : 𝑌 = 𝛽" + 𝛽# 𝑥 + 𝑒
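• If the special case is a single two-level (categorical) predictor, as the boxplot and ANOVA bullets suggest, the regression $t$/$F$ test reduces to the pooled two-sample $t$ test; a minimal Python sketch under that assumption:

```python
import numpy as np
from scipy import stats

# Assumed setup: X is a 0/1 group indicator, Y is a numeric response (made-up values).
x = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
y = np.array([4.1, 3.8, 5.0, 4.4, 4.7, 5.9, 6.3, 5.6, 6.1, 6.7])

# The slope equals the difference in group means, and its t test
# matches the pooled (equal-variance) two-sample t test.
res = stats.linregress(x, y)
t_two_sample = stats.ttest_ind(y[x == 1], y[x == 0], equal_var=True)

print(res.slope, y[x == 1].mean() - y[x == 0].mean())  # same quantity
print(res.pvalue, t_two_sample.pvalue)                 # same p-value
```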