Lecture 6 - Linear Regression and Correlation
Wt. (kg)      67   69   85   83   74   81   97   92  114   85
SBP (mmHg)   120  125  140  160  130  180  150  140  200  130
[Figure: scatter diagram of SBP (mmHg) against Wt (kg)]
[Figure: height in cm plotted against age in weeks]
Negative relationship
[Figure: reliability plotted against age of car]
No relation
Correlation Coefficient
If r = 1 or r = −1, the correlation is perfect.
How to compute the simple correlation coefficient (r)
\[
r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}
\]
Example:
A sample of 6 children was selected, and their age in
years and weight in kilograms were recorded as shown in
the following table. It is required to find the
correlation between age and weight.
Serial no.   Age in years (x)   Weight in kg (y)    xy     x²     y²
1            7                  12                  84     49     144
2            6                  8                   48     36     64
3            8                  12                  96     64     144
4            5                  10                  50     25     100
5            6                  11                  66     36     121
6            9                  13                  117    81     169
Total        ∑x = 41            ∑y = 66             ∑xy = 461   ∑x² = 291   ∑y² = 742
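\[
r = \frac{461 - \dfrac{41 \times 66}{6}}{\sqrt{\left(291 - \dfrac{(41)^2}{6}\right)\left(742 - \dfrac{(66)^2}{6}\right)}} = \frac{10}{\sqrt{10.83 \times 16}} \approx 0.76
\]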
strong direct correlation
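The calculation above can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `pearson_r` and the variable names are illustrative choices, not part of the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Simple correlation coefficient from the computational (sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x**2 / n) * (sum_y2 - sum_y**2 / n))
    return numerator / denominator

# Age (years) and weight (kg) of the 6 children in the table
age = [7, 6, 8, 5, 6, 9]
weight = [12, 8, 12, 10, 11, 13]

r_age_weight = pearson_r(age, weight)
print(round(r_age_weight, 2))  # 0.76
```

Working from the raw observations rather than the column totals also guards against arithmetic slips in the table.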
EXAMPLE: Relationship between Anxiety and Test Scores

Anxiety (X)   Test score (Y)   X²          Y²          XY
10            2                100         4           20
8             3                64          9           24
2             9                4           81          18
1             7                1           49          7
5             6                25          36          30
6             5                36          25          30
∑X = 32       ∑Y = 32          ∑X² = 230   ∑Y² = 204   ∑XY = 129
Calculating Correlation Coefficient
r = - 0.94
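The slide gives only the result. As a worked check, substituting the column totals from the anxiety table (n = 6) into the same formula for r gives approximately the stated value; the intermediate figures are rounded to two decimals.

```latex
\[
r = \frac{129 - \dfrac{32 \times 32}{6}}{\sqrt{\left(230 - \dfrac{(32)^2}{6}\right)\left(204 - \dfrac{(32)^2}{6}\right)}}
  = \frac{-41.67}{\sqrt{59.33 \times 33.33}}
  = \frac{-41.67}{44.47} \approx -0.94
\]
```

The negative sign indicates an inverse relationship: higher anxiety goes with lower test scores.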
[Figure: scatter diagram with Wt (kg) on the horizontal axis]
By using the least squares method (a procedure
that minimizes the sum of the squared vertical
deviations of the plotted points from a straight line)
we can construct the best-fitting straight line through
the scatter diagram points and then formulate a
regression equation of the form:
ŷ = a + bX
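\[
\hat{y} = \bar{y} + b(x - \bar{x}), \qquad
b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}}
\]
Comparing this with ŷ = a + bX gives the intercept a = ȳ − b x̄.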
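As an illustration, the slope formula can be applied to the weight and SBP data listed at the start of the lecture. This is a sketch only: the function name `least_squares_line` is hypothetical, and the intercept is taken as a = ȳ − b x̄, which follows from writing the line as ŷ = ȳ + b(x − x̄).

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope from the computational formula
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x**2 / n)
    # Intercept: a = y-bar - b * x-bar (the line passes through the means)
    a = sum_y / n - b * sum_x / n
    return a, b

# Weight (kg) and SBP (mmHg) from the data table
weight_kg = [67, 69, 85, 83, 74, 81, 97, 92, 114, 85]
sbp_mmhg = [120, 125, 140, 160, 130, 180, 150, 140, 200, 130]

a, b = least_squares_line(weight_kg, sbp_mmhg)
print(f"SBP-hat = {a:.2f} + {b:.2f} * weight")
```

For these ten points the slope comes out at roughly 1.36 mmHg per kg, with an intercept near 32 mmHg.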
Regression Equation
The regression equation describes the regression line
mathematically, through its intercept and slope.
[Figure: regression line of SBP (mmHg) on Wt (kg), with the intercept and slope marked]
Linear Equations
ŷ = a + bX (equivalently, Y = bX + a)
b = slope = (change in Y) / (change in X)
a = Y-intercept
Hours studying and grades
Regressing grades on hours
[Figure: linear regression plot of grades on hours]
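Using the totals from the age and weight example (n = 6):
\[
b = \frac{461 - \dfrac{41 \times 66}{6}}{291 - \dfrac{(41)^2}{6}} = \frac{10}{10.83} \approx 0.92
\]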
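The slope can also be reproduced directly from the column totals. The sketch below additionally computes an intercept using a = ȳ − b x̄; that intercept value is an assumption derived from the formulas earlier in the lecture and is not quoted from the slide.

```python
# Column totals from the age and weight table (n = 6 children)
n = 6
sum_x, sum_y = 41, 66
sum_xy, sum_x2 = 461, 291

# Slope: b = (sum_xy - sum_x*sum_y/n) / (sum_x2 - sum_x^2/n)
slope = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x**2 / n)

# Intercept (assumed form): a = y-bar - b * x-bar
intercept = sum_y / n - slope * sum_x / n

print(f"b = {slope:.2f}")      # b = 0.92
print(f"a = {intercept:.2f}")  # a = 4.69
```

Under that assumption, the fitted line for this example would read weight ≈ 4.69 + 0.92 × age.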
Regression equation