Correlation Analysis
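Definition. The mean of the product of the deviation scores $(x_i - \bar{x})$ and $(y_i - \bar{y})$ is called the covariance of X and Y, i.e.
$$\operatorname{Cov}(X, Y) \ \text{or}\ C_{xy} = \frac{1}{N}\sum (x_i - \bar{x})(y_i - \bar{y}) \qquad \ldots(i)$$
It is easy to see that
$$\operatorname{Cov}(X, Y) = \frac{1}{N}\left(\sum xy - \frac{1}{N}\sum x \sum y\right) \qquad \ldots(ii)$$
If $x - \bar{x}$ and $y - \bar{y}$ are not convenient to compute, take assumed means A and B and use $u = x - A$, $v = y - B$; then
$$\operatorname{Cov}(X, Y) = \frac{1}{N}\left(\sum uv - \frac{1}{N}\sum u \sum v\right) \qquad \ldots(iii)$$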
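The three ways of computing covariance described above (deviations from the means, raw sums, and assumed means) can be checked against each other numerically. The following is a minimal Python sketch; the function names, the sample values and the assumed means A = 4, B = 7 are illustrative assumptions, not part of the text.

```python
def covariance(xs, ys):
    """Mean of the products of deviations from the actual means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n


def covariance_from_sums(xs, ys):
    """(1/N) * (sum of xy - (1/N) * sum of x * sum of y)."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / n


def covariance_assumed_means(xs, ys, a, b):
    """Same raw-sum formula applied to u = x - a and v = y - b."""
    us = [x - a for x in xs]
    vs = [y - b for y in ys]
    return covariance_from_sums(us, vs)


# Illustrative data; any assumed means a, b give the same covariance.
xs = [3, 4, 5, 6, 7]
ys = [8, 7, 6, 5, 4]
print(covariance(xs, ys))
print(covariance_from_sums(xs, ys))
print(covariance_assumed_means(xs, ys, 4, 7))
```

All three calls print the same value, which illustrates that shifting the origin does not change the covariance.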
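ILLUSTRATIVE EXAMPLES
Example 1. Find the covariance between x and y when $\sum x = 50$, $\sum y = -30$, $\sum xy = 115$ and n = 10.
Solution. Given $\sum x = 50$, $\sum y = -30$, $\sum xy = 115$ and n = 10.
$$\operatorname{Cov}(x, y) = \frac{1}{n}\left(\sum xy - \frac{1}{n}\sum x \sum y\right) = \frac{1}{10}\left(115 - \frac{1}{10}\times 50 \times (-30)\right) = \frac{1}{10}(115 + 150) = \frac{265}{10} = 26.5$$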
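Example 2. Find the covariance between X and Y if $\sum u_i v_i = 50$ and n = 15, where $u_i$ and $v_i$ are deviations of the X and Y series from their respective means.
Solution. $\sum u_i v_i = \sum (x_i - \bar{x})(y_i - \bar{y}) = 50$. Here, n = 15.
$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum (x_i - \bar{x})(y_i - \bar{y}) = \frac{50}{15} = \frac{10}{3} = 3.33 \ \text{(approx.)}$$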
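$$\operatorname{Cov}(x, y) = \frac{1}{n}\left(\sum xy - \frac{1}{n}\sum x \sum y\right) = \frac{1}{5}\left(148 - \frac{1}{5}\times 20 \times 45\right) = \frac{1}{5}(148 - 180) = -\frac{32}{5} = -6.4$$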
UNDERSTANDING ISC MATHEMATICS-XI
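Example 4. If $\sum x_i = 15$, $\sum y_i = 40$, $\sum (x_i - 2)(y_i - 3) = 20$ and n = 10, find Cov(x, y).
Solution. Expanding, $\sum (x_i - 2)(y_i - 3) = \sum x_i y_i - 3\sum x_i - 2\sum y_i + 6n = \sum x_i y_i - 45 - 80 + 60$, so $\sum x_i y_i = 20 + 65 = 85$.
$$\operatorname{Cov}(x, y) = \frac{1}{n}\left(\sum xy - \frac{1}{n}\sum x \sum y\right) = \frac{1}{10}\left(85 - \frac{1}{10}\times 15 \times 40\right) = \frac{1}{10}(85 - 60) = \frac{25}{10} = 2.5$$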
x      3    4    5    6    7    Total 25
y      8    7    6    5    4    Total 30
xy    24   28   30   30   28    Total 140
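Here $\bar{x} = \frac{180}{6} = 30$ and $\bar{y} = \frac{240}{6} = 40$.

x      x - x̄     y      y - ȳ     (x - x̄)(y - ȳ)
15     -15       44      4        -60
20     -10       43      3        -30
25      -5       45      5        -25
30       0       37     -3          0
40      10       34     -6        -60
50      20       37     -3        -60
Total                             -235

So
$$\operatorname{Cov}(X, Y) = \frac{1}{N}\sum (x - \bar{x})(y - \bar{y}) = \frac{1}{6}(-235) = -39.17$$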
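The deviation table for this example (x = 15, 20, 25, 30, 40, 50 against y = 44, 43, 45, 37, 34, 37) can be reproduced with a short Python sketch. The variable names are illustrative; the data are the six pairs used in the working.

```python
# Six (x, y) pairs from the worked example.
xs = [15, 20, 25, 30, 40, 50]
ys = [44, 43, 45, 37, 34, 37]

n = len(xs)
x_bar = sum(xs) / n  # mean of x
y_bar = sum(ys) / n  # mean of y

# Products of deviations from the means, one per row of the table.
products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
cov = sum(products) / n

print(x_bar, y_bar)
print(products)
print(round(cov, 2))
```

The list of products corresponds to the last column of the table, and its sum divided by N gives the covariance.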
Example 7. Calculate the covariance for the following bivariate data:
X   11  12  13  14  15  17  18  19  20  21
Y   14   8  12  21  19  19  23  22  17  25
Solution. Assume mean A = 16 for the X-variate and B = 19 for the Y-variate.

x      u = x - 16     y      v = y - 19     uv
11        -5          14        -5          25
12        -4           8       -11          44
13        -3          12        -7          21
14        -2          21         2          -4
15        -1          19         0           0
17         1          19         0           0
18         2          23         4           8
19         3          22         3           9
20         4          17        -2          -8
21         5          25         6          30
Total      0                   -10         125

$$\operatorname{Cov}(X, Y) = \frac{1}{N}\left(\sum uv - \frac{1}{N}\sum u \sum v\right) = \frac{1}{10}\left(125 - \frac{1}{10}\times 0 \times (-10)\right) = 12.5$$
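The assumed-mean working for this example can be checked with a minimal Python sketch. It uses the ten data pairs of the example with assumed means A = 16 and B = 19, and compares the result with the covariance computed directly from the raw values; the variable names are illustrative.

```python
xs = [11, 12, 13, 14, 15, 17, 18, 19, 20, 21]
ys = [14, 8, 12, 21, 19, 19, 23, 22, 17, 25]
A, B = 16, 19  # assumed means used in the solution
n = len(xs)

us = [x - A for x in xs]
vs = [y - B for y in ys]
sum_u, sum_v = sum(us), sum(vs)
sum_uv = sum(u * v for u, v in zip(us, vs))

# Covariance from the shifted values u, v.
cov_assumed = (sum_uv - sum_u * sum_v / n) / n

# Covariance from the raw values, for comparison.
sum_xy = sum(x * y for x, y in zip(xs, ys))
cov_raw = (sum_xy - sum(xs) * sum(ys) / n) / n

print(sum_u, sum_v, sum_uv)
print(cov_assumed, cov_raw)
```

Both routes give the same covariance, which is why the assumed means can be picked purely for arithmetic convenience.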
EXERCISE 2.1
1. Find Cov(X, Y) between x and y when $\sum x_i = 50$, $\sum y_i = 30$, $\sum x_i y_i = -115$ and n = 10.
2. Find Cov(X, Y) between x and y when $\sum x_i = 52$, $\sum y_i = 60$, $\sum x_i y_i = \ldots$ and n = 10.
3. Find Cov(X, Y) between X and Y if $\sum u_i v_i = 55$ and n = 11, where $u_i$ and $v_i$ are deviations from their respective X and Y means.
Find the covariance of the data given below:
(3, 9), (4, 11), (5, 10), (6, 9), (7, 8), (8, 7), (9, 6), (10, 5).
X   48  42  36  30  24  18  12
Y   78  72  66  60  54  ...
8. Calculate covariance for the following data:
X    1   3   4   6   7   8
Y   16   9   4   1   1   4   9  16
Show that covariance is independent of the choice of origin but depends on the scale: if $u = ax + b$, $v = cy + d$, show that $\operatorname{cov}(u, v) = a \cdot c \cdot \operatorname{cov}(x, y)$.
2.4 KARL PEARSON'S COEFFICIENT OF CORRELATION
Though covariance is independent of the choice of the origin, it depends on the scale of measurement. To standardise it further, we use the following formula for (Karl Pearson's) coefficient of correlation (sometimes called Product Moment Correlation).
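$$r(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var} x \cdot \operatorname{Var} y}}$$
It is easy to see that if $u = ax + b$ and $v = cy + d$, then
$$r(u, v) = \frac{\frac{1}{N}\sum a(x - \bar{x})\, c(y - \bar{y})}{\sqrt{\frac{1}{N}\sum a^2 (x - \bar{x})^2}\ \sqrt{\frac{1}{N}\sum c^2 (y - \bar{y})^2}} = \frac{ac}{|ac|} \cdot \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \pm\, r(x, y)$$
Therefore, the coefficient of correlation is independent of the choice of origin and, apart from a change of sign when a and c have opposite signs, of scale. Note that $-1 \le r \le 1$ (proof is beyond the scope of this book).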
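The effect of a change of origin and scale on r can be illustrated numerically. The sketch below is a minimal Python check under stated assumptions: the function name `pearson_r`, the sample data and the constants a, b, c, d are all hypothetical choices made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r_xy = pearson_r(xs, ys)

# u = a*x + b, v = c*y + d with a and c of the same sign: r is unchanged.
a, b, c, d = 2, 7, 3, -1
r_same = pearson_r([a * x + b for x in xs], [c * y + d for y in ys])

# Flipping the sign of c flips the sign of r but not its magnitude.
r_flip = pearson_r([a * x + b for x in xs], [-c * y + d for y in ys])

print(r_xy, r_same, r_flip)
```

The magnitude of r is the same in all three cases; only the sign changes when one of the scale factors is negative.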
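$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$
If x, y are small numbers, we use
$$r = \frac{\sum xy - \frac{1}{N}\sum x \sum y}{\sqrt{\sum x^2 - \frac{1}{N}\left(\sum x\right)^2}\ \sqrt{\sum y^2 - \frac{1}{N}\left(\sum y\right)^2}} \qquad \ldots(i)$$
Otherwise, we use assumed means A and B, and $u = x - A$, $v = y - B$,
$$r = \frac{\sum uv - \frac{1}{N}\sum u \sum v}{\sqrt{\sum u^2 - \frac{1}{N}\left(\sum u\right)^2}\ \sqrt{\sum v^2 - \frac{1}{N}\left(\sum v\right)^2}} \qquad \ldots(ii)$$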
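Both computing formulas have the same shape, so one helper that works from summary totals covers them. The following Python sketch is illustrative only: the function name `pearson_r_from_sums` is an assumption, and the two calls use totals of the kind quoted in worked examples (raw totals for n = 25, and totals of shifted values u, v for n = 10).

```python
import math


def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """r from totals; pass totals of u, v instead of x, y for assumed means."""
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt(
        (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
    )
    return numerator / denominator


# Raw totals: n = 25, sum x = 125, sum y = 100, sum x^2 = 650,
# sum y^2 = 464, sum xy = 508.
r_raw = pearson_r_from_sums(25, 125, 100, 650, 464, 508)

# Shifted totals: n = 10, sum u = 0, sum v = 0, sum u^2 = 180,
# sum v^2 = 215, sum uv = 60.
r_shifted = pearson_r_from_sums(10, 0, 0, 180, 215, 60)

print(r_raw)
print(round(r_shifted, 3))
```

Because r does not depend on the origin, the same function can be fed totals of x, y or of the shifted values u, v.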
… $\tfrac{1}{4} \le |r| < \tfrac{3}{4}$, and of low degree if $0 \le |r| < \tfrac{1}{4}$.
3. If X and Y are independent variables, then cov(X, Y) = 0 and the coefficient of correlation r = 0. Conversely, if r = 0, then X and Y have no linear relation. However, Y may still have a curved relation with X. For example, for the observations (-4, 16), (-3, 9), (-2, 4), (-1, 1), (1, 1), (2, 4), (3, 9), (4, 16), we find that r = 0 (Do it!). However, we also see that $Y = X^2$. Hence, though r = 0, we can still accurately predict the value of Y, given the value of X.
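The claim that r = 0 for the eight points lying on $Y = X^2$ can be verified with a few lines of Python. This is a minimal sketch; the variable names are illustrative.

```python
import math

# The eight observations quoted in the note; every point satisfies y = x**2.
points = [(-4, 16), (-3, 9), (-2, 4), (-1, 1), (1, 1), (2, 4), (3, 9), (4, 16)]
xs = [x for x, _ in points]
ys = [y for _, y in points]
n = len(points)

sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in points)
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

numerator = sum_xy - sum_x * sum_y / n
denominator = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
r = numerator / denominator

print(sum_x, sum_xy)  # both totals vanish by symmetry
print(r)
```

The totals of x and of xy both vanish because the points are symmetric about the y-axis, so the numerator of r is zero even though Y is completely determined by X.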
4. The correlation coefficient is highly abused by researchers and advertisers. It may or may not indicate a cause and effect relationship. For example, in any school, you will find a high positive correlation between children's shoe size and spelling ability. Does it mean that bigger feet lead to better brains, or that if you learn to spell better, your feet will get bigger? Maybe a third factor, that is, the age of the children, affects both these factors.
ILLUSTRATIVE EXAMPLES
Example 1. Find the coefficient of correlation between X and Y when Cov(X, Y) = -2.75, Var(X) = 6.25 and Var(Y) = 20.25.
Solution.
$$r(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}} = \frac{-2.75}{\sqrt{6.25 \times 20.25}} = \frac{-2.75}{2.5 \times 4.5} = -\frac{275}{100} \times \frac{10 \times 10}{25 \times 45} = -\frac{11}{45} = -0.2444 \ \text{(approx.)}$$
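Example 2. Find the correlation coefficient between X and Y when $\sum x = 125$, $\sum y = 100$, $\sum x^2 = 650$, $\sum y^2 = 464$, $\sum xy = 508$ and n = 25.
Solution.
$$r = \frac{\sum xy - \frac{1}{n}\sum x \sum y}{\sqrt{\sum x^2 - \frac{1}{n}\left(\sum x\right)^2}\ \sqrt{\sum y^2 - \frac{1}{n}\left(\sum y\right)^2}} = \frac{508 - \frac{1}{25}\times 125 \times 100}{\sqrt{650 - \frac{1}{25}(125)^2}\ \sqrt{464 - \frac{1}{25}(100)^2}}$$
$$= \frac{508 - 500}{\sqrt{650 - 625}\ \sqrt{464 - 400}} = \frac{8}{\sqrt{25 \times 64}} = \frac{8}{5 \times 8} = 0.2$$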
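Example 3. Calculate the correlation coefficient when $\sum x = 100$, $\sum y = 150$, $\sum (x - 10)^2 = 180$, $\sum (y - 15)^2 = 215$, $\sum (x - 10)(y - 15) = 60$ and n = 10.
Solution. Let $u = x - 10$ and $v = y - 15$.
$\sum u = \sum (x - 10) = \sum x - 10n = 100 - 10 \times 10 = 0$,
$\sum v = \sum (y - 15) = \sum y - 15n = 150 - 15 \times 10 = 0$.
Also $\sum u^2 = 180$, $\sum v^2 = 215$ and $\sum uv = 60$.
$$r = \frac{\sum uv - \frac{1}{n}\sum u \sum v}{\sqrt{\sum u^2 - \frac{1}{n}\left(\sum u\right)^2}\ \sqrt{\sum v^2 - \frac{1}{n}\left(\sum v\right)^2}} = \frac{60 - \frac{1}{10}\times 0 \times 0}{\sqrt{180 - \frac{1}{10}\times 0^2}\ \sqrt{215 - \frac{1}{10}\times 0^2}} = \frac{60}{\sqrt{180 \times 215}} = \frac{60}{\sqrt{38700}} = \frac{6}{\sqrt{387}} = 0.305 \ \text{(approx.)}$$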
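Example 4. Find Karl Pearson's coefficient of correlation between X and Y for the following data:
X   5   4   3   2   1
Y   4   2  10   8   6
Solution.

x       x²      y       y²      xy
5       25      4       16      20
4       16      2        4       8
3        9     10      100      30
2        4      8       64      16
1        1      6       36       6
Total  15  55  30  220  80  (x, x², y, y², xy)

$$r = \frac{\sum xy - \frac{1}{N}\sum x \sum y}{\sqrt{\sum x^2 - \frac{1}{N}\left(\sum x\right)^2}\ \sqrt{\sum y^2 - \frac{1}{N}\left(\sum y\right)^2}} = \frac{80 - \frac{1}{5}(15)(30)}{\sqrt{55 - \frac{1}{5}(15)^2}\ \sqrt{220 - \frac{1}{5}(30)^2}} = \frac{-10}{\sqrt{10}\sqrt{40}} = \frac{-10}{20} = -0.5$$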
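Example 5. Calculate the coefficient of correlation from the following data:
X   12  13  14  15  16  17  18
Y   14  17  18  19  20  24  28
Solution. Here $\bar{X} = \frac{\sum X}{N} = \frac{105}{7} = 15$ and $\bar{Y} = \frac{\sum Y}{N} = \frac{140}{7} = 20$. Let $u = x - 15$ and $v = y - 20$.

x      u      u²     y      v      v²     uv
12    -3      9     14     -6     36     18
13    -2      4     17     -3      9      6
14    -1      1     18     -2      4      2
15     0      0     19     -1      1      0
16     1      1     20      0      0      0
17     2      4     24      4     16      8
18     3      9     28      8     64     24
Total  0     28             0    130     58

$$r = \frac{\sum uv}{\sqrt{\sum u^2 \sum v^2}} = \frac{58}{\sqrt{28 \times 130}} = \frac{58}{\sqrt{3640}} = 0.961 \ \text{(approx.)}$$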
Example 6. Find Karl Pearson's coefficient of correlation between x and y for the following data:
x   16  18  21  20  22  26  27  15
y   22  25  24  26  25  30  33  14
Solution. Assume mean A = 20 for the x-variate and B = 25 for the y-variate; we shall use formula (ii).

x      u = x - 20     u²     y      v = y - 25     v²     uv
16        -4          16     22        -3           9     12
18        -2           4     25         0           0      0
21         1           1     24        -1           1     -1
20         0           0     26         1           1      0
22         2           4     25         0           0      0
26         6          36     30         5          25     30
27         7          49     33         8          64     56
15        -5          25     14       -11         121     55
Total      5         135               -1         221    152

Hence,
$$r(x, y) = \frac{\sum uv - \frac{1}{N}\sum u \sum v}{\sqrt{\sum u^2 - \frac{1}{N}\left(\sum u\right)^2}\ \sqrt{\sum v^2 - \frac{1}{N}\left(\sum v\right)^2}} = \frac{152 - \frac{1}{8}(5)(-1)}{\sqrt{135 - \frac{1}{8}(5)^2}\ \sqrt{221 - \frac{1}{8}(-1)^2}} = \frac{1221}{\sqrt{1055}\sqrt{1767}} = 0.894$$
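The working for this example (assumed means A = 20 and B = 25, eight pairs ending with (15, 14)) can be re-run as a minimal Python sketch. The variable names are illustrative, and the last y-value is taken as 14, the value that matches the totals in the working.

```python
import math

xs = [16, 18, 21, 20, 22, 26, 27, 15]
ys = [22, 25, 24, 26, 25, 30, 33, 14]
A, B = 20, 25  # assumed means
n = len(xs)

us = [x - A for x in xs]
vs = [y - B for y in ys]

sum_u, sum_v = sum(us), sum(vs)
sum_u2 = sum(u * u for u in us)
sum_v2 = sum(v * v for v in vs)
sum_uv = sum(u * v for u, v in zip(us, vs))

r = (sum_uv - sum_u * sum_v / n) / math.sqrt(
    (sum_u2 - sum_u ** 2 / n) * (sum_v2 - sum_v ** 2 / n)
)

print(sum_u, sum_u2, sum_v, sum_v2, sum_uv)
print(round(r, 3))
```

The printed totals correspond to the bottom row of the table, and the rounded value of r agrees with the result of the example.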
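Example 7. Find the correlation coefficient between the heights of husbands and wives based on the following data (given in inches) and interpret the result.
Couple              1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Height of husband  76  75  75  72  72  71  71  70  68  68  68  68  67  67  62
Height of wife     71  70  70  67  71  65  65  67  64  65  65  66  63  65  61
Solution. Take assumed means 70 for the husbands' heights x and 66 for the wives' heights y, with $u = x - 70$ and $v = y - 66$.

Couple   x     u     u²    y     v     v²    uv
1       76     6    36    71     5    25    30
2       75     5    25    70     4    16    20
3       75     5    25    70     4    16    20
4       72     2     4    67     1     1     2
5       72     2     4    71     5    25    10
6       71     1     1    65    -1     1    -1
7       71     1     1    65    -1     1    -1
8       70     0     0    67     1     1     0
9       68    -2     4    64    -2     4     4
10      68    -2     4    65    -1     1     2
11      68    -2     4    65    -1     1     2
12      68    -2     4    66     0     0     0
13      67    -3     9    63    -3     9     9
14      67    -3     9    65    -1     1     3
15      62    -8    64    61    -5    25    40
Total          0   194           5   127   140

$$r = \frac{\sum uv - \frac{1}{N}\sum u \sum v}{\sqrt{\sum u^2 - \frac{1}{N}\left(\sum u\right)^2}\ \sqrt{\sum v^2 - \frac{1}{N}\left(\sum v\right)^2}} = \frac{140 - \frac{1}{15}(0)(5)}{\sqrt{194 - \frac{1}{15}(0)^2}\ \sqrt{127 - \frac{1}{15}(5)^2}} = 0.898 \ \text{(approx.)}$$
There is a high degree of positive correlation: taller husbands tend to have taller wives.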
Fallen from mmber of storeys
15 18 15 12 3
3 9 12
Percentage killed
0 18 6 36
15 9 3
6 1 1
7 2 12 0 0
-3 9 -9
3 9
4 16 3 -9 81 -36
[Figure: percentage of cats getting killed plotted against the number of storeys fallen, 1 to 9.]
… when they fall from higher storeys, they get sufficient time to stretch their bodies.
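Example 9. From the following data, find the values of a, b and Karl Pearson's coefficient of correlation:
x   10  13  16   a  25  26  30
y    6   8  10  12   b  15  19
Solution. The working uses means $\bar{x} = 20$ and $\bar{y} = 12$, so $120 + a = 7 \times 20 = 140 \Rightarrow a = 20$ and $70 + b = 84 \Rightarrow b = 14$.
Let $u = x - 20$ and $v = y - 12$.

x      u      u²     y      v      v²     uv
10    -10    100     6     -6     36     60
13     -7     49     8     -4     16     28
16     -4     16    10     -2      4      8
20      0      0    12      0      0      0
25      5     25    14      2      4     10
26      6     36    15      3      9     18
30     10    100    19      7     49     70
Total

$\sum u = 0$, $\sum u^2 = 326$, $\sum v = 0$, $\sum v^2 = 118$, $\sum uv = 194$.
$$r = \frac{\sum uv - \frac{1}{n}\sum u \sum v}{\sqrt{\sum u^2 - \frac{1}{n}\left(\sum u\right)^2}\ \sqrt{\sum v^2 - \frac{1}{n}\left(\sum v\right)^2}} = \frac{194 - \frac{1}{7}\times 0 \times 0}{\sqrt{326 - \frac{1}{7}\times 0^2}\ \sqrt{118 - \frac{1}{7}\times 0^2}} = \frac{194}{\sqrt{326 \times 118}} = \frac{97}{98.066} = 0.989 \ \text{(approx.)}$$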
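The two steps of this example, recovering the missing values from the means and then computing r, can be sketched in Python. This is a hedged illustration: the means 20 and 12 are an assumption inferred from the deviations u = x - 20, v = y - 12 and the line 70 + b = 84 in the working, and the variable names are hypothetical.

```python
import math

x_known = [10, 13, 16, 25, 26, 30]   # x-values other than a
y_known = [6, 8, 10, 12, 15, 19]     # y-values other than b
n = 7
x_mean, y_mean = 20, 12              # assumed: inferred from the working

# A missing value equals n * mean minus the sum of the known values.
a = n * x_mean - sum(x_known)
b = n * y_mean - sum(y_known)

xs = sorted(x_known + [a])
ys = sorted(y_known + [b])

us = [x - x_mean for x in xs]
vs = [y - y_mean for y in ys]
sum_uv = sum(u * v for u, v in zip(us, vs))
sum_u2 = sum(u * u for u in us)
sum_v2 = sum(v * v for v in vs)

# Deviations are taken from the true means, so sum u = sum v = 0.
r = sum_uv / math.sqrt(sum_u2 * sum_v2)

print(a, b)
print(sum_u2, sum_v2, sum_uv)
print(round(r, 3))
```

Sorting is used here only because both series in this example are listed in increasing order, so it puts a and b back in their original positions.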
A student, while calculating the correlation coefficient between two variables x and y for 25 pairs of observations, obtained the following results:
On rechecking, it was found that he had wrongly copied two pairs as (6, 14) and (8, 6) whereas the correct values were (8, 12) and (6, 8). Calculate the correct correlation coefficient between x and y.
Correct $\sum x^2 = 650 - 6^2 - 8^2 + 8^2 + 6^2 = 650$
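Example 11. The coefficient of correlation between x and y for 50 observations is 0.3, $\bar{x} = 10$, $\bar{y} = 6$, $\sigma_x = 3$, $\sigma_y = 2$. One pair of values (10, 6) was inaccurate and hence weeded out. Calculate the coefficient of correlation for the remaining 49 pairs of values.
Solution.
$$r = \frac{\frac{1}{n}\sum xy - \bar{x}\bar{y}}{\sigma_x \sigma_y} \Rightarrow 0.3 = \frac{\frac{1}{50}\sum xy - 10 \times 6}{3 \times 2} \Rightarrow \frac{1}{50}\sum xy = 61.8 \Rightarrow \sum xy = 50 \times 61.8 = 3090.$$
$$\sigma_x^2 = \frac{1}{n}\sum x^2 - (\bar{x})^2 \Rightarrow 3^2 = \frac{1}{50}\sum x^2 - 10^2 \Rightarrow \sum x^2 = 109 \times 50 = 5450;$$
$$\sigma_y^2 = \frac{1}{n}\sum y^2 - (\bar{y})^2 \Rightarrow 2^2 = \frac{1}{50}\sum y^2 - 6^2 \Rightarrow \sum y^2 = 40 \times 50 = 2000.$$
Also $\sum x = 50 \times 10 = 500$ and $\sum y = 50 \times 6 = 300$.
After removing the pair (10, 6), n = 49 and
$\sum x = 490$, $\sum y = 294$, $\sum xy = 3090 - 60 = 3030$, $\sum x^2 = 5450 - 100 = 5350$, $\sum y^2 = 2000 - 36 = 1964$.
Using the formula, we get
$$r_{\text{new}} = \frac{3030 - \frac{1}{49}\times 490 \times 294}{\sqrt{5350 - \frac{1}{49}(490)^2}\ \sqrt{1964 - \frac{1}{49}(294)^2}} = \frac{3030 - 2940}{\sqrt{450}\sqrt{200}} = \frac{90}{\sqrt{90000}} = \frac{90}{300} = \frac{3}{10} = 0.3$$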
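The bookkeeping in this example (recover the totals from r, the means and the standard deviations, remove the pair (10, 6), recompute r) can be traced with a minimal Python sketch. The variable names are illustrative; the numbers are those of the example.

```python
import math

n, r = 50, 0.3
x_bar, y_bar = 10, 6
sd_x, sd_y = 3, 2

# Recover the totals for all 50 pairs.
sum_x = n * x_bar
sum_y = n * y_bar
sum_xy = n * (r * sd_x * sd_y + x_bar * y_bar)   # from r = (mean(xy) - x_bar*y_bar) / (sd_x*sd_y)
sum_x2 = n * (sd_x ** 2 + x_bar ** 2)            # from variance = mean(x^2) - x_bar^2
sum_y2 = n * (sd_y ** 2 + y_bar ** 2)

# Drop the inaccurate pair (10, 6).
px, py = 10, 6
m = n - 1
sum_x -= px
sum_y -= py
sum_xy -= px * py
sum_x2 -= px ** 2
sum_y2 -= py ** 2

r_new = (sum_xy - sum_x * sum_y / m) / math.sqrt(
    (sum_x2 - sum_x ** 2 / m) * (sum_y2 - sum_y ** 2 / m)
)

print(sum_x, sum_y, sum_x2, sum_y2)
print(round(sum_xy, 6), round(r_new, 6))
```

The coefficient is unchanged here because the removed pair coincides with the point of means $(\bar{x}, \bar{y})$, so it contributes nothing to the deviations.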
EXERCISE 2.2
1. Find $\rho(x, y)$ if cov(x, y) = -16.5, var(x) = 2.25 and var(y) = 144.
2. The coefficient of correlation between two variables X and Y is 0.64. Their covariance is
coefficient
y²= 580and between x and y when n = 10,
6 Fina of E xy
Xlulate the
) =138, (r-
coefficient
correlation
of
betweenx
and v,
)(y-V)=122 andwhen Er=
correlation
=305.
n = 15.
375, Ey | 270, (r- =
2 5 3 8 7
Compute Karl Pearson's coefficient of correlation of sales and expenditure of a firm for six months.
Sales (in lakh)         18  20  27  20  21  29
Expenditure (in lakh)   23  27  28  28  29  30
Calculate Karl Pearson's coefficient of correlation between x and y for the following data and interpret the result:
(1, 6), (2, 5), (3, 7), (4, 9), (5, 8), (6, 10), (7, 11), (8, 13), (9, 12).
Calculate Karl Pearson's coefficient of correlation between x and y for the following data:
X    6   2   4   9   1   3   5  ...
Y   13   8  12  15   9  10  11  16
Find Karl Pearson's coefficient of correlation between X and Y for the following data:
X   16  18  21  20  22  26  27  15
Y   22  25  24  26  25  30  33  14
11. The weights of sons and fathers (in kilograms) are given below:
Weight of father 65 66 67 67 68 69 70 72
Weight of son 67 68 65 68 72 72 69 71
Find the coefficient of correlation.
12. Calculate Karl Pearson's coefficient of correlation from the following data and interpret
the result :
13. Calculate Karl Pearson's coefficient of correlation between the marks in English and Mathematics obtained by 10 students:
English       20  13  18  21  11  12  17  14  19  15
Mathematics   17  12  23  25  14  19  21  22  19
21 24 26 29 32 43 25 30 35 37
120 123 125 128 131 142 124 129 134 136
15. From the following table, calculate Karl Pearson's coefficient of correlation:
X 6 2 10 4
11 7
Arithmetic means of X and Y series are 6 and 8 respectively.
… calculating the coefficient of correlation between two variables x and y for 12 pairs of
etc. Sometimes, though it may be possible to quantify the variable, we may choose to grade it in terms of ranks, by using the numbers 1, 2, ..., n. We assign rank 1 to the highest (or lowest) value, rank 2 to the next highest (or next lowest) value, and so on. If two corresponding sets of values x and y are ranked in such a manner, Edward Spearman's coefficient of rank correlation, denoted here by r, is given by
$$r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
where d = difference between the ranks of corresponding x and y,
n = number of pairs of values (x, y) in the data.
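As an example, let 5 students be ranked in Maths and Physics as
Student   A  B  C  D  E
Maths     1  2  3  4  5
Physics   1  2  3  4  5
Here d = 0 for every student, so $\sum d^2 = 0$ and r = 1. Next, let the same students be ranked in Maths and Sports as
Student   A  B  C  D  E
Maths     1  2  3  4  5
Sports    5  4  3  2  1
Here d = -4, -2, 0, 2, 4, so $\sum d^2 = 16 + 4 + 0 + 4 + 16 = 40$
$$\text{and}\quad r = 1 - \frac{6 \times 40}{5(25 - 1)} = 1 - \frac{240}{120} = 1 - 2 = -1.$$
Thus, we see that r = +1 when the ranks are in complete agreement and in the same direction. Also r = -1 when the ranks are in complete agreement but in the opposite direction. Otherwise, r varies between -1 and +1.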
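The rank correlation formula $r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$ is short enough to sketch directly in Python. The function name `spearman_from_ranks` is an illustrative assumption, and the sketch assumes the ranks contain no ties; the two calls use the rankings 1 to 5 against themselves and against their reverse.

```python
def spearman_from_ranks(ranks_x, ranks_y):
    """Rank correlation 1 - 6*sum(d^2) / (n*(n^2 - 1)); assumes no tied ranks."""
    n = len(ranks_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


maths = [1, 2, 3, 4, 5]
same_order = [1, 2, 3, 4, 5]      # complete agreement, same direction
reverse_order = [5, 4, 3, 2, 1]   # complete agreement, opposite direction

print(spearman_from_ranks(maths, same_order))
print(spearman_from_ranks(maths, reverse_order))
```

Identical rankings give +1 and exactly reversed rankings give -1; any other pair of rankings without ties falls between these two extremes.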
Now, let us assume that the five students are given marks as follows in Maths and English:
Student   A   B   C   D   E
Maths     90  80  70  60  50
English   90  70  80  60  50