Simple Linear Regression
Part I
Background Review
Regression Analysis
A statistical model is a mathematical description of the data
structure / data-generating mechanism
Parametric model
Easier to fit, interpret, infer
More powerful (statistically)
Model complexity is fixed
Nonparametric model
No distributional assumption
More flexible
Model complexity may grow
Semiparametric model
Regression Analysis
Example: exam scores
Parametric: approximate the class distribution by a normal
distribution with certain parameters (mean and variance)
(hence we can say mean +/- one standard deviation ~ 68%)
Nonparametric: use the histogram
Regression Analysis
Regression studies the relationship between
Response/outcome/dependent variables; and
Predictor/explanatory/independent variables
Types of Variables
Qualitative/Categorical
Nominal: no ordering in categories
• Marital Status
• Eye Color
Binary: only two categories
• Yes/No
• Male/Female
Ordinal: categories are naturally ordered
• Likert/rating scale
• Letter grades
Quantitative/Numerical
Discrete
• Number of children
• Defects per hour
Continuous
• Weight
• Voltage
Let’s look at the simplest case
To study the relationship between two numerical
variables, such as
Exam score vs. Time spent on doing revision
Apartment price vs. Gross floor area
Electricity consumption vs. Air temperature
Linear Correlation Analysis
Scatter plot
Linear Correlation Analysis Cont’d
(Sample) linear correlation coefficient, r
r = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ) / √[ Σᵢ₌₁ⁿ (Xᵢ − X̄)² · Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² ]
Dimensionless
−1 ≤ 𝑟 ≤ +1
“Sign” indicates the direction (positive / negative) of a linear
relationship
“Magnitude” measures the strength of a linear relationship
X̄ = Σᵢ₌₁ⁿ Xᵢ / n and Ȳ = Σᵢ₌₁ⁿ Yᵢ / n are the sample means
S_X² = Σᵢ₌₁ⁿ (Xᵢ − X̄)² / (n − 1) and S_Y² = Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² / (n − 1)
are the sample variances
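As a quick check, the formula above can be verified in R against the built-in cor() function. The data here are made up for illustration, not taken from the slides:

```r
# Hypothetical data: exam score (Y) vs. hours of revision (X)
X <- c(2, 4, 5, 7, 8, 10)
Y <- c(50, 55, 60, 70, 72, 85)

# r computed directly from the definition
r_manual <- sum((X - mean(X)) * (Y - mean(Y))) /
  sqrt(sum((X - mean(X))^2) * sum((Y - mean(Y))^2))

# r from the built-in function
r_builtin <- cor(X, Y)

all.equal(r_manual, r_builtin)  # TRUE
```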
Linear Correlation Analysis Cont’d
t-test for correlation coefficient
𝐻0 : 𝜌 = 0 (no linear correlation)
𝐻1 : 𝜌 ≠ 0 (linear correlation exists)
t-statistic t = (r − ρ) / √[(1 − r²)/(n − 2)]
p-value = 2𝑃(𝑡ₙ₋₂ ≥ |t|)
𝑡ₙ₋₂ denotes a 𝑡 distribution with (𝑛 − 2) degrees of freedom (d.f.)
Reject 𝐻0 if |t| > C.V. = t_{α/2,(n−2)} or p-value < 𝛼
Important!! Note the slight abuse of notation:
• upright t denotes the value of the statistic
• 𝑡ₙ₋₂ denotes the distribution itself
• t_{α/2,(n−2)} denotes its upper-tail quantile
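The test can be reproduced in R; cor.test() computes the same t-statistic and p-value as the formulas above (hypothetical data again):

```r
X <- c(2, 4, 5, 7, 8, 10)
Y <- c(50, 55, 60, 70, 72, 85)
n <- length(X)
r <- cor(X, Y)

# t-statistic under H0: rho = 0
t_stat <- (r - 0) / sqrt((1 - r^2) / (n - 2))
# two-sided p-value from the t distribution with n - 2 d.f.
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# cor.test() reproduces the same statistic and p-value
ct <- cor.test(X, Y)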
Example
Is residential apartment price related to its gross floor
area and age of the building?
Example
• R will not process code after #; use # for comments
#set working directory
setwd("C:/Users/chiwchu/Google Drive/Academic/CityU/MS3252/Lecture")
Example Cont’d
Example Cont’d
[R output: regression summary; p-value < 2×10⁻¹⁶]
Conditional Distribution
Probability/density -> Distribution
Conditional probability/density -> Conditional distribution
e.g. Let 𝑌 denote the random variable of whether it will
rain tomorrow (1=yes, 0=no)
If the probability of raining tomorrow is 0.4, the (marginal)
distribution of 𝑌 is Bernoulli(0.4), denoted by 𝑌 ∼ 𝐵𝑒𝑟𝑛(0.4)
But what if we know whether a typhoon is coming?
Let 𝑋 denote the random variable of whether a typhoon is
coming (1=yes, 0=no)
𝑋 can be random itself, but we can think of it as fixed
Conditional Distribution
Given the information of 𝑋, the probability of raining
tomorrow and hence the distribution of 𝑌 may change!
Say, conditional probability 𝑃 𝑌 = 1 𝑋 = 1 = 0.9, then
the conditional distribution of 𝑌|𝑋 = 1 is 𝐵𝑒𝑟𝑛(0.9)
Similarly, the conditional distribution of 𝑌|𝑋 = 0 could be
𝐵𝑒𝑟𝑛(0.3)
𝑌|𝑋 ∼ 𝐵𝑒𝑟𝑛(0.3 + 0.6𝑋)
The conditional distribution of 𝑌, particularly the
conditional mean, varies across different values of 𝑋
Regression is about the study of conditional distribution!
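A small simulation sketch of this setup, using the probabilities from the example above (P(rain | no typhoon) = 0.3, P(rain | typhoon) = 0.9):

```r
set.seed(1)
# typhoon indicator X, then rain indicator Y with P(Y = 1 | X) = 0.3 + 0.6 X
X <- rbinom(100000, size = 1, prob = 0.5)
Y <- rbinom(100000, size = 1, prob = 0.3 + 0.6 * X)

# conditional means of Y vary across values of X
mean(Y[X == 0])  # close to 0.3
mean(Y[X == 1])  # close to 0.9
```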
Part II
Formulation and Estimation
Overview of Regression Analysis
Input
Response / outcome / dependent variable, 𝑌
The variable we wish to explain or predict
Predictor / covariate / explanatory / independent variable, 𝑋
The variable used to explain the response variable
Output
A (linear) function that allows us to
Model association: Explain the variation of the response caused by the
predictor(s)
Provide prediction: Estimate the value of the response based on value(s)
of the predictor(s)
Simple Linear Regression - Formulation
A simple linear regression model consists of two
components
Regression line: A straight line that describes the
dependence of the average value (conditional mean) of the
𝑌-variable on one 𝑋-variable
Random error: The unexpected deviation of the observed
value from the expected value
[Annotated model: 𝑌𝑖 = 𝛽0 + 𝛽1 𝑋𝑖 + 𝜀𝑖 , where 𝛽0 is the population intercept, 𝛽1 the population slope coefficient, 𝑋𝑖 the predictor, and 𝛽0 + 𝛽1 𝑋𝑖 the regression line]
Simple Linear Regression - Formulation
(Linear) regression model 𝑌𝑖 = 𝛽0 + 𝛽1 𝑋𝑖 + 𝜀𝑖
Assumptions: E(𝑌𝑖 |𝑋𝑖 ) = 𝛽0 + 𝛽1 𝑋𝑖 ; the errors 𝜀𝑖 follow a
normal distribution and are independent
Simple Linear Regression - Formulation
Equivalently, the linear regression model can be written as
E(𝑌𝑖 |𝑋𝑖 ) = 𝛽0 + 𝛽1 𝑋𝑖 (mean function)
Var(𝑌𝑖 |𝑋𝑖 ) = σ²ε (variance function)
𝑌𝑖 |𝑋𝑖 are independent and normally distributed
In other words, 𝑌𝑖 |𝑋𝑖 are independent N(𝛽0 + 𝛽1 𝑋𝑖 , σ²ε)
N(μ, σ²) denotes a normal distribution with mean μ and
variance σ²
We also call it a mean regression model
Simple Linear Regression - Formulation
Framework: we have one response 𝑌 and 𝐾 predictors 𝑋
𝐾 = 1 here because we only have one 𝑋
We obtain a random sample of size 𝑛, containing the
values of 𝑌𝑖 and 𝑋𝑖 for each individual/subject/observation
𝑖, 𝑖 = 1, ⋯ , 𝑛
Our goal is to model/infer about the conditional mean of 𝑌
given 𝑋
As the conditional mean is characterized by 𝛽0 and 𝛽1 , that
means we need to estimate 𝛽0 and 𝛽1 from the data
Simple Linear Regression - Estimation
Goal: estimate 𝛽0 and 𝛽1
Let’s denote these estimates by 𝑏0 and 𝑏1
Our notation for parameters: Greek letters (𝛽0 , 𝛽1 )
represent the population/true versions; Roman letters
(𝑏0 , 𝑏1 ) represent the sample/estimated analogues.
Two methods (turn out to be equivalent for linear
regression):
Least Squares Estimator (LSE)/Ordinary Least Squares (OLS)
Maximum Likelihood Estimator (MLE)
Simple Linear Regression - Estimation
[Figure: scatter plot of Y against X with the fitted line Ŷᵢ = 𝑏0 + 𝑏1 𝑋𝑖 , showing the residual 𝑒𝑖 between 𝑌𝑖 and Ŷᵢ at 𝑋𝑖 ; we are assuming (conditional) normality of Y for every level of X]
𝑏0 represents the sample intercept
𝑏1 represents the sample slope coefficient
𝑒𝑖 represents the sample residual error
Simple Linear Regression - Estimation
𝑏0 and 𝑏1 are estimated using the least squares method,
which minimizes the sum of squared errors (SSE)
SSE = Σᵢ₌₁ⁿ 𝑒𝑖² = Σᵢ₌₁ⁿ (𝑌𝑖 − Ŷᵢ)² = Σᵢ₌₁ⁿ (𝑌𝑖 − 𝑏0 − 𝑏1 𝑋𝑖 )²
Simple Linear Regression - Estimation
The solutions for 𝑏0 and 𝑏1 can be obtained by
differentiating the SSE with respect to 𝑏0 and 𝑏1
That is, to solve
∂(Σᵢ₌₁ⁿ 𝑒𝑖²)/∂𝑏0 = −2 Σᵢ₌₁ⁿ (𝑌𝑖 − 𝑏0 − 𝑏1 𝑋𝑖 ) = 0
and
∂(Σᵢ₌₁ⁿ 𝑒𝑖²)/∂𝑏1 = −2 Σᵢ₌₁ⁿ 𝑋𝑖 (𝑌𝑖 − 𝑏0 − 𝑏1 𝑋𝑖 ) = 0
simultaneously
Simple Linear Regression - Estimation
The solutions are
𝑏1 = Σᵢ₌₁ⁿ (𝑋𝑖 − X̄)(𝑌𝑖 − Ȳ) / Σᵢ₌₁ⁿ (𝑋𝑖 − X̄)²
   = r √[ Σᵢ₌₁ⁿ (𝑌𝑖 − Ȳ)² / Σᵢ₌₁ⁿ (𝑋𝑖 − X̄)² ] = r (S_Y / S_X)
and
𝑏0 = Ȳ − 𝑏1 X̄
Also, the estimate for the error variance σ²ε is given by
S_e² = MSE = SSE / (n − K − 1)
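The closed-form solutions can be checked in R against lm(). The data below are simulated for illustration, not the apartment data from the example:

```r
set.seed(42)
X <- runif(50, 300, 1200)                      # e.g. gross floor area
Y <- 1.36 + 0.0048 * X + rnorm(50, sd = 0.5)   # e.g. price, with noise

# closed-form least squares estimates
b1 <- sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)
b0 <- mean(Y) - b1 * mean(X)

# lm() gives the same coefficients
m <- lm(Y ~ X)
coef(m)

# error variance estimate: MSE = SSE / (n - K - 1), with K = 1 here
SSE <- sum(residuals(m)^2)
MSE <- SSE / (length(Y) - 1 - 1)
all.equal(MSE, summary(m)$sigma^2)  # TRUE
```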
Simple Linear Regression - Estimation
Maximum likelihood estimation finds the parameters that
maximize the likelihood/probability of observing the sample
Recall that 𝑌𝑖 |𝑋𝑖 ∼ N(𝛽0 + 𝛽1 𝑋𝑖 , σ²ε)
The density function of N(μ, σ²) is (1/√(2πσ²)) exp(−(𝑦𝑖 − μ)² / (2σ²))
Assume σ²ε is known and equals 1 for simplicity…
The joint likelihood of observing these 𝑌𝑖 given these 𝑋𝑖 is
∏ᵢ₌₁ⁿ (1/√(2π)) exp(−(𝑌𝑖 − 𝛽0 − 𝛽1 𝑋𝑖 )² / 2)
Maximizing this likelihood function is equivalent to minimizing
Σᵢ₌₁ⁿ (𝑌𝑖 − 𝛽0 − 𝛽1 𝑋𝑖 )², which is exactly the SSE, so MLE = LSE!
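A numerical sketch of the MLE = LSE equivalence: minimizing the negative log-likelihood with optim() recovers (essentially) the lm() coefficients. Simulated data; σ²ε is fixed at 1 as above:

```r
set.seed(1)
X <- runif(40)
Y <- 2 + 3 * X + rnorm(40)

# negative log-likelihood with sigma^2 fixed at 1:
# minimizing it is the same as minimizing the SSE
negloglik <- function(beta) {
  sum((Y - beta[1] - beta[2] * X)^2) / 2 + length(Y) * log(2 * pi) / 2
}
mle <- optim(c(0, 0), negloglik)$par

ols <- coef(lm(Y ~ X))
round(mle, 3)
round(ols, 3)  # essentially identical
```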
Example Cont’d
[R summary output, annotated: 𝑏0 , 𝑏1 , S_e = √MSE, r² or R²]
Example – The Model &
Interpretation of Coefficients Cont’d
The estimated simple linear regression equation
Ŷ = 1.3584 + 0.0048𝑋
where Ŷ = Price in million HK$
𝑋 = Gross floor area in ft²
The estimated slope coefficient, 𝑏1
Measures the estimated change in the average value of 𝑌
as a result of a one-unit increase in 𝑋
𝑏1 = 0.0048 says that the price of an apartment increases
by 𝐻𝐾$4,800(= 0.0048 × 𝐻𝐾$1,000,000), on average, for
each square foot increase in gross floor area
Example – The Model &
Interpretation of Coefficients Cont’d
The estimated simple linear regression equation
Ŷ = 1.3584 + 0.0048𝑋
where Ŷ = Price in million HK$
𝑋 = Gross floor area in ft²
The estimated intercept coefficient, 𝑏0
Denotes the estimated average value of 𝑌 when 𝑋 is zero
𝑏0 = 1.3584 says that the price of an apartment is
𝐻𝐾$1,358,400(= 1.3584 × 𝐻𝐾$1,000,000), on average,
when the gross floor area is zero (any problem?)
Interpret with caution when the 𝑋-value is out of range!!
Example Cont’d
Regress Price against Age
Example Cont’d
The relationship between apartment price and age of the
building is
Ŷ = 6.1478 − 0.1078𝑍
where Ŷ = Price in million HK$
𝑍 = Age of building in years
If the building gets 1 year older, the average apartment
price decreases by 𝐻𝐾$107,800
Confidence Interval (CI)
Confidence interval estimate for slope coefficient
𝑏1 ± 𝑡𝛼Τ2,𝑛−𝐾−1 𝑆𝑏1
R program
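In R, confint() applies exactly this formula. The slide's R output is not reproduced here, so the sketch below refits a model named m1 on simulated data (the names GrossFA/Price mirror the example, the numbers do not):

```r
set.seed(3)
GrossFA <- runif(80, 300, 1200)
Price <- 1.36 + 0.0048 * GrossFA + rnorm(80, sd = 0.7)
m1 <- lm(Price ~ GrossFA)

# 95% CI for the coefficients
confint(m1, level = .95)

# reproduce the slope CI by hand: b1 +/- t_{alpha/2, n-K-1} * S_b1
s <- summary(m1)$coefficients
b1 <- s["GrossFA", "Estimate"]
se1 <- s["GrossFA", "Std. Error"]
b1 + c(-1, 1) * qt(.975, df = 80 - 1 - 1) * se1
```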
Special Case II: Two Groups
Now, the 𝑋𝑖 are either 0 or 1 indicating which group the
observation belongs to
The linear regression model assumes that
𝑌𝑖 |𝑋𝑖 = 0 are independent 𝑁 𝛽0 , 𝜎𝜀2
𝑌𝑖 |𝑋𝑖 = 1 are independent 𝑁 𝛽0 + 𝛽1 , 𝜎𝜀2
This is equivalent to fitting two normal distributions to the
two groups respectively!!
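This equivalence is easy to verify in R with a simulated two-group sample: the fitted intercept is the group-0 mean, and intercept plus slope is the group-1 mean.

```r
set.seed(2)
X <- rep(c(0, 1), each = 30)   # group indicator
Y <- 5 + 2 * X + rnorm(60)

m <- lm(Y ~ X)
# intercept = mean of group 0; intercept + slope = mean of group 1
all.equal(unname(coef(m)[1]), mean(Y[X == 0]))  # TRUE
all.equal(unname(sum(coef(m))), mean(Y[X == 1]))  # TRUE
```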
Part III
Goodness of Fit,
Parameter Inference,
and Model Significance
Goodness of Fit and Model Significance
We want to compare the fitted model with 𝑋 against the
null model without 𝑋
Fitted/Full model = the model you considered
Null model = special case I = a horizontal line at 𝑌ത
(Saturated model = data = the model with perfect fit)
Analysis of Variance (ANOVA) Cont’d
Coefficient of determination, R²
R² = SSR / SST, where SST = SSR + SSE
0 ≤ R² ≤ 1
Measures the proportion of variation of 𝑌𝑖 explained by
the regression equation with the predictor 𝑋
Measures the “goodness of fit” of the regression model
Remark!! R² = r² in simple linear regression, i.e. when
there is one 𝑋-variable
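The identities SST = SSR + SSE and R² = r² can be verified in R (simulated data for illustration):

```r
set.seed(4)
X <- runif(50)
Y <- 1 + 2 * X + rnorm(50, sd = 0.3)
m <- lm(Y ~ X)

SST <- sum((Y - mean(Y))^2)   # total variation
SSE <- sum(residuals(m)^2)    # unexplained variation
R2 <- 1 - SSE / SST           # = SSR / SST

all.equal(R2, summary(m)$r.squared)  # TRUE
all.equal(R2, cor(X, Y)^2)           # TRUE: R^2 = r^2 with one predictor
```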
Example
Which independent variable, GrossFA or Age, provides a
better explanation of the variation in apartment price?
[anova output, annotated: SSR, SSE, S_e² = MSE]
For SST, use either of
• sum(anova(m1)[,2])
• var(Price)*(length(Price)-1)
Inferences about the Parameters –
𝑋-Variable Significance
t-test for a slope coefficient
𝐻0 : 𝛽1 = 0 (no linear relationship)
𝐻1 : 𝛽1 ≠ 0 (linear relationship exists)
t-statistic t = (𝑏1 − 𝛽1 ) / S_b₁
where S_b₁ = standard error of the slope
Inferences about the Parameters –
𝑋-Variable Significance Cont’d
S_b₁ measures the variation in the slope of regression lines from
different possible samples (one color denotes one sample)
S_b₁² = S_e² / Σ(𝑋𝑖 − X̄)² = S_e² / ((n − 1) S_X²)
S_e² = SSE / (n − K − 1) = variation of the errors around the regression line
Inferences about the Parameters –
𝑋-Variable Significance Cont’d
Recall 𝑏1 = r (S_Y / S_X); we can show that
𝑏1 / S_b₁ = r / √[(1 − r²)/(n − 2)]!!
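This identity — the slope t-statistic equals the correlation t-statistic — can be confirmed numerically (simulated data):

```r
set.seed(5)
X <- runif(30)
Y <- 2 - X + rnorm(30)
n <- length(X)
r <- cor(X, Y)
m <- lm(Y ~ X)

t_slope <- summary(m)$coefficients["X", "t value"]  # b1 / S_b1
t_corr  <- r / sqrt((1 - r^2) / (n - 2))            # correlation t-statistic
all.equal(t_slope, t_corr)  # TRUE
```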
Example
Is GrossFA significantly affecting the apartment price?
[R summary output, annotated: 𝑏1 , S_b₁ , t, p-value; d.f. = n − K − 1]
Example Cont’d
Is GrossFA significantly affecting the apartment price?
𝐻0 : 𝛽GrossFA = 0
𝐻1 : 𝛽GrossFA ≠ 0
t = (0.0048 − 0) / 0.000448 = 10.812
At 𝛼 = 5%
d.f. = (80 − 1 − 1) = 78
C.V. = 1.9908
Reject 𝐻0 ; GrossFA significantly affects apartment price.
In R, use
• qt(.975,78) to obtain C.V.
• 2*(1-pt(10.81,78)) to obtain p-value
In exam,
• use t-table to obtain C.V.
• p-value is not computable by hand, but a range can be found at best
[anova output, annotated: F-statistic with d.f. = K, n − K − 1; p-value; SSR and MSR; SSE and MSE]
Example Cont’d
Is the model significant?
𝐻0 : 𝛽GrossFA = 0
𝐻1 : 𝛽GrossFA ≠ 0
F = 55.96517 / 0.47876 = 116.90
At 𝛼 = 5%
d.f. = 1, (80 − 1 − 1) = 1, 78
C.V. = 4.00
In R, use
• qf(.95,1,78) to obtain C.V.
• 1-pf(116.90,1,78) to obtain p-value
In exam,
• use F-table to obtain C.V.
• p-value is not computable by hand
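The F = t² relationship for a single predictor can be checked in R (simulated data):

```r
set.seed(6)
X <- runif(80)
Y <- 1 + 0.5 * X + rnorm(80)
m <- lm(Y ~ X)

t_stat <- summary(m)$coefficients["X", "t value"]
F_stat <- anova(m)["X", "F value"]
all.equal(F_stat, t_stat^2)  # TRUE: F = t^2 with a single predictor
```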
Part IV
Prediction and Diagnostics
Prediction of New Observations –
Point Prediction
Convert the given 𝑋-value into the same measurement
scale as the observed 𝑋-value
As the estimated slope coefficient is scale dependent
Ideally, only use the regression equation to predict the 𝑌-
value when the given 𝑋-value is inside the observed data
range
As we are not sure whether the linear relationship will go
beyond the range of observed 𝑋-value
Example Cont’d
What is the estimated price for an apartment with gross floor
area 764 ft²?
Prediction given by the simple linear regression equation
Ŷ = 1.3585 + 0.0049𝑋
  = 1.3585 + 0.0049 × 764 = 5.1021
where Ŷ = Price in million HK$
𝑋 = Gross floor area in ft²
The expected price for an apartment with gross floor area
764 ft² is 𝐻𝐾$5,102,100
What is the estimated mean price for apartments with gross
floor area 764 ft²? – same estimate, but any differences?
Prediction of New Observations Cont’d
The predictions given by regression models raised from
different possible samples will vary
[Figure: several fitted lines from different samples give different Ŷᵢ at the same 𝑋𝑖 – which prediction should we trust?]
Prediction of New Observations –
Interval Prediction Cont’d
Confidence interval estimate for the mean of the 𝑌-variable
given an 𝑋-value
Ŷ ± t_{α/2, n−K−1} S_m
where S_m² = S_e² [1/n + (𝑋 − X̄)² / Σ(𝑋𝑖 − X̄)²], and 𝑋 is the given 𝑋-value
R program
predict(m1,level=.95,interval="confidence")
Note that Σ(𝑋𝑖 − X̄)² = (n − 1) S_X², where S_X² is the
sample variance of 𝑋
Prediction of New Observations –
Interval Prediction Cont’d
Prediction interval estimate for an individual 𝑌-value
given an 𝑋-value
Ŷ ± t_{α/2, n−K−1} S_p
where S_p² = S_e² [1 + 1/n + (𝑋 − X̄)² / Σ(𝑋𝑖 − X̄)²] = S_e² + S_m²
R program
predict(m1,level=.95,interval="prediction")
It is still a type of confidence interval, although we use the
term prediction interval to differentiate them
This interval is wider because there is more uncertainty about
the prediction for a single 𝑌 compared to the average
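Both intervals come from predict() in R. The sketch below refits m1 on simulated data in place of the original apartment dataset, then compares the two interval widths at 𝑋 = 764:

```r
set.seed(7)
GrossFA <- runif(80, 300, 1200)
Price <- 1.36 + 0.0048 * GrossFA + rnorm(80, sd = 0.7)
m1 <- lm(Price ~ GrossFA)

new <- data.frame(GrossFA = 764)
ci <- predict(m1, newdata = new, level = .90, interval = "confidence")
pi <- predict(m1, newdata = new, level = .90, interval = "prediction")

# same point prediction, but the prediction interval is wider
(pi[, "upr"] - pi[, "lwr"]) > (ci[, "upr"] - ci[, "lwr"])  # TRUE
```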
Prediction of New Observations –
Interval Prediction Cont’d
[Figure: prediction interval for an individual 𝑌-value, wider than the confidence interval for the mean, plotted against 𝑋]
Example Cont’d
Determine a 90% confidence interval for the mean
apartment price for flats of 764 ft2 gross area
Also, construct a 90% prediction interval for the
apartment price for a flat of 764 ft2 gross area
Regression Assumptions
Linearity of regression equation
𝛽0 + 𝛽1 𝑋𝑖 is a linear function
Error normality
𝜀𝑖 has a normal distribution for all 𝑖
Constant variances of errors
Var 𝜀𝑖 |𝑋𝑖 = 𝜎𝜀2
Error independence
𝜀𝑖 are independent for all 𝑖
Residual Analysis
Check the regression assumptions by examining the
residuals
Residuals (or errors), 𝑒𝑖 = 𝑌𝑖 − Ŷᵢ
Plot
Residuals against the predictor for checking linearity and
constant variances
Residuals against index for checking error independence
Histogram of the residuals for examining error normality
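A minimal R sketch of these diagnostic plots, using base graphics on simulated data:

```r
set.seed(8)
X <- runif(60)
Y <- 1 + 2 * X + rnorm(60, sd = 0.2)
m <- lm(Y ~ X)
e <- residuals(m)

# residuals vs predictor: look for curvature / non-constant spread
plot(X, e); abline(h = 0)
# residuals vs index: look for time-related patterns
plot(seq_along(e), e, type = "b"); abline(h = 0)
# histogram of residuals: look for approximate normality
hist(e)
```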
Residual Analysis Cont’d
[Residual plots against 𝑋]
If the residuals have a systematic pattern, the 𝑌- and 𝑋-variables
do not have a linear relationship, but a curved one
Error variance increases with the 𝑋-value
Residual Analysis Cont’d
[Residual plots against index (time)]
Residuals displaying a random pattern vs. negative residuals
associated mainly with the early trials and positive residuals
with the later trials: the time the data were collected affects
the residuals and 𝑌-values
Residual Analysis Cont’d
[Two histograms of the residuals, e, in % frequency, for examining error normality]
Summary
Population version: response 𝑌𝑖 , predictor 𝑋𝑖 , correlation 𝜌, error 𝜀𝑖
Sample analogy: correlation 𝑟, error 𝑒𝑖
Variance of estimator (take square root to get standard error):
• Response: S_Y² = SST/(n − 1) = Σ(𝑌𝑖 − Ȳ)²/(n − 1)
• Predictor: S_X² = Σ(𝑋𝑖 − X̄)²/(n − 1)
• Correlation: (1 − r²)/(n − 2)
• Error: S_e² = SSE/(n − K − 1)
Summary
SST = SSR + SSE is the breakdown of variance / variation
R² = SSR/SST = 1 − SSE/SST = a single number in [0, 1] that quantifies
the model-explained variation / measures the goodness of fit
t-statistic t = (𝑏1 − 𝛽1 )/S_b₁ tests the significance of a single predictor,
i.e. whether 𝛽1 = 0
F-statistic F = MSR/MSE = (SSR/K) / (SSE/(n − K − 1)) tests the significance
of the entire model, i.e. whether 𝛽1 = ⋯ = 𝛽K = 0, where K is the number of
predictors
In this chapter with a single 𝑋, K = 1 and F = t²
Point prediction and confidence interval prediction