Course 10-Part 1
Session 10
Simple Linear
Regression
Part 1
Plan
Part 1:
• Context
• Simple Linear Regression Model
• Estimation of the regression line
Part 2:
• Confidence interval on the parameters of the regression line
• Hypothesis tests on the parameters of the regression line
• Pointwise prediction, interval prediction
• Coefficient of determination and correlation
Context
In statistics, many problems consist of characterizing the relationship
between two statistical variables:
…
In this kind of problem, the main questions we want to answer are:
To answer questions 1 to 4, we use a statistical tool called regression analysis.
To answer question 5, we use a statistical tool called correlation analysis.
Regression Analysis
Definition
Regression analysis is a statistical method that allows us to
study the type of relationship that may exist between a
certain variable whose values we want to explain and one or
more other variables that are used for this explanation.
Example
Rent (Y) of an apartment depends on its surface area,
its distance from the city centre, its number of rooms,
its level of soundproofing…
$Y = f(X_1, X_2, X_3, \ldots, X_n)$
where Y is the rent of an apartment, $X_1$ the surface area, $X_2$ the distance from the city centre, $X_3$ the number of rooms, and so on.
Simple Linear Regression
Scatter plot
Example
[Figure: plot of y illustrating a linear relationship]
Scatter plot
A scatter plot gives us an idea of the relationship that may exist
between X and Y, and of the quality of our estimate of this
relationship (if it exists).
Simple Linear Regression
Goal: $Y = f(X)$
The equation that describes how Y is related to X and an error
term is called the regression model.
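$Y = f(X) + \varepsilon = \beta_1 X + \beta_0 + \varepsilon$
Where:
Y = dependent variable
X = independent variable
$\beta_0$, $\beta_1$ = parameters of the model
$\varepsilon$ = error term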
Estimation of the Regression Line
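The population model $Y = \beta_0 + \beta_1 X + \varepsilon$ is unknown because the
parameters $\beta_0$ and $\beta_1$ and the error term $\varepsilon$ are unknown.
From a sample, we estimate it by the regression line:
$\hat{Y} = b_0 + b_1 X$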
Least Squares Method
[Figure: fitted line with the residuals $e_i$]
Least Squares Principle
The objective of the least squares method is to
determine the regression line that minimizes the sum of squared residuals:
$$\min \sum_{i=1}^{n} e_i^2 = \min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
where:
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation
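To make the criterion concrete, here is a minimal Python sketch that evaluates the sum of squared residuals for a candidate line. The function name `sum_squared_residuals` and the four data points are hypothetical, chosen only for illustration; they are not taken from the course examples.

```python
def sum_squared_residuals(b0, b1, xs, ys):
    """Return the sum of e_i^2, where e_i = y_i - (b0 + b1 * x_i)."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 6]

# Two candidate lines: the least squares criterion prefers the one
# with the smaller sum of squared residuals.
sse_a = sum_squared_residuals(0.5, 1.4, xs, ys)  # y-hat = 0.5 + 1.4 x
sse_b = sum_squared_residuals(1.0, 1.0, xs, ys)  # y-hat = 1.0 + 1.0 x
print(sse_a < sse_b)  # prints True
```

For these four points the first line is in fact the least squares line, so no other choice of intercept and slope gives a smaller sum.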
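The intercept and slope that achieve this minimum are:
$$b_0 = \bar{y} - b_1 \bar{x}$$
$$b_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2}$$
where $n$ is the sample size.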
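The two formulas translate directly into code. The following is a minimal sketch, not a reference implementation: the function name `least_squares_line` is hypothetical, and the test points are made up so that they lie exactly on the line y = 1 + 2x.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) from the closed-form least squares formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = sum_x2 - n * x_bar ** 2
    if denominator == 0:
        # All x values are identical, so no slope can be estimated.
        raise ValueError("all x values are equal; the slope is undefined")
    b1 = (sum_xy - n * x_bar * y_bar) / denominator
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical points lying exactly on y = 1 + 2x.
b0, b1 = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # prints 1.0 2.0
```

The slope is computed first because the intercept formula depends on it.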
Estimation of the Regression Line (b1, b0)
Studying-result example (continued):

  i    xi    yi   xi·yi    xi²
  1     4     5      20     16
  2     8     8      64     64
  3     5     7      35     25
  4     9     9      81     81
  5    12    10     120    144
  6     7     7      49     49
  7     3     4      12      9
  8     4     4      16     16
  9    10     8      80    100
 10     3     1       3      9
 11    10     9      90    100
 12     7     6      42     49
 13     9     8      72     81
  Σ    91    86     684    743

$\bar{x} = 91/13 = 7$, $\bar{y} = 86/13 = 6.6154$

$$b_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2} = \frac{684 - 13 \times 7 \times 6.6154}{743 - 13 \times 7^2} = 0.7736$$
$$b_0 = \bar{y} - b_1 \bar{x} = 6.6154 - 0.7736 \times 7 = 1.2$$

Estimated regression line: $\hat{y} = 1.2 + 0.7736\,x$
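The column totals and the two estimates can be reproduced with a short Python sketch. It uses the thirteen (x_i, y_i) pairs of the studying-result example; the variable names are illustrative choices, not part of the course material.

```python
# (x_i, y_i) pairs from the studying-result table.
data = [(4, 5), (8, 8), (5, 7), (9, 9), (12, 10), (7, 7), (3, 4),
        (4, 4), (10, 8), (3, 1), (10, 9), (7, 6), (9, 8)]

n = len(data)
sum_x = sum(x for x, _ in data)
sum_y = sum(y for _, y in data)
sum_xy = sum(x * y for x, y in data)   # column x_i * y_i
sum_x2 = sum(x * x for x, _ in data)   # column x_i^2

x_bar = sum_x / n
y_bar = sum_y / n
b1 = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
b0 = y_bar - b1 * x_bar

print(sum_x, sum_y, sum_xy, sum_x2)
print(round(b1, 4), round(b0, 1))
```

The totals match the last row of the table (91, 86, 684, 743), and the estimates round to 0.7736 and 1.2.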
Estimation of the Regression Line
Example 2:
A company wants to conduct a study on the relationship between
weekly advertising spending and the volume of sales that it
makes. The following data have been collected over the last 10
weeks:
Advertising cost (X) (M$)     4     2   2.5     2     3     5     1   5.5   3.5   4.5
Sales volumes (Y) (M$)     49.5    41    43    39    46    53    38    54  48.5  51.5
Scatter plot
[Figure: sales volumes (Y) plotted against advertising cost (X)]
Answer
xi = advertising cost (X) (M$), yi = sales volumes (Y) (M$)

          xi      yi     xi·yi      xi²
           4    49.5       198       16
           2      41        82        4
         2.5      43     107.5     6.25
           2      39        78        4
           3      46       138        9
           5      53       265       25
           1      38        38        1
         5.5      54       297    30.25
         3.5    48.5    169.75    12.25
         4.5    51.5    231.75    20.25
Sum                       1605      128
Mean     3.3   46.35
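$$b_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2} = \frac{1605 - 10 \times 3.3 \times 46.35}{128 - 10 \times (3.3)^2} = 3.95$$
$$b_0 = \bar{y} - b_1 \bar{x} \approx 46.35 - 3.9503 \times 3.3 \approx 33.31$$
$\hat{Y} = 33.31 + 3.95\,X$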
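The arithmetic of Example 2 can be checked with the sketch below. One assumption is worth stating: the first sales value is taken as 49.5, the figure used in the answer table, because it is the one consistent with the stated sum of 1605 and the stated mean of 46.35. The variable names are illustrative.

```python
advertising = [4, 2, 2.5, 2, 3, 5, 1, 5.5, 3.5, 4.5]    # X, in M$
sales = [49.5, 41, 43, 39, 46, 53, 38, 54, 48.5, 51.5]  # Y, in M$

n = len(advertising)
x_bar = sum(advertising) / n
y_bar = sum(sales) / n
sum_xy = sum(x * y for x, y in zip(advertising, sales))
sum_x2 = sum(x * x for x in advertising)

# Closed-form least squares estimates of slope and intercept.
b1 = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
b0 = y_bar - b1 * x_bar

print(sum_xy, sum_x2)
print(round(b1, 2), round(b0, 2))
```

Keeping `b1` unrounded when computing `b0` avoids carrying a rounding error from the slope into the intercept.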
Interpretation
$\hat{Y} = 33.31 + 3.95\,X$
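One common way to read this equation, offered as a hedged sketch rather than as the slide's own wording: the slope 3.95 is the estimated change in sales (M$) for each additional M$ of weekly advertising, and the intercept 33.31 is the estimated sales when advertising is zero. The helper name `predicted_sales` is hypothetical.

```python
B0 = 33.31  # intercept of the estimated line, in M$
B1 = 3.95   # slope of the estimated line


def predicted_sales(advertising_cost):
    """Evaluate Y-hat = 33.31 + 3.95 X for an advertising cost X in M$."""
    return B0 + B1 * advertising_cost


# Raising advertising from 4 to 5 M$ changes the estimate by the slope.
difference = predicted_sales(5) - predicted_sales(4)
print(round(difference, 2))  # prints 3.95
```

The observed advertising costs range from 1 to 5.5 M$, so the intercept describes a value outside the data and should be read with caution.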
Properties:
[Figure: estimated regression line drawn through the point $(\bar{x}, \bar{y}) = (3.3, 46.35)$]
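The point (3.3, 46.35) is the point of means $(\bar{x}, \bar{y})$ of Example 2. A standard property of the least squares line, assumed here to be what this slide illustrates, is that the line always passes through that point. It follows directly from $b_0 = \bar{y} - b_1 \bar{x}$:

$$\hat{y}(\bar{x}) = b_0 + b_1 \bar{x} = (\bar{y} - b_1 \bar{x}) + b_1 \bar{x} = \bar{y}$$

With the rounded coefficients of Example 2, $33.31 + 3.95 \times 3.3 = 46.345 \approx 46.35$; the small gap comes from rounding.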
See next presentation (Part 2):
• Confidence interval on the parameters of the regression line
• Hypothesis tests on the parameters of the regression line
• Pointwise prediction, interval prediction
• Coefficient of determination and correlation
Thank you