TOPIC 9
CORRELATION
Correlation helps to establish the relationship between two or more variables.
There are two types of correlation:
Positive correlation
Negative correlation
POSITIVE CORRELATION
If one variable increases, the corresponding variable also increases. For example,
when the price of petrol goes up, bus fares also go up.
NEGATIVE CORRELATION
If one variable increases, the corresponding variable decreases. For example, when
the supply of maize in the market increases, the price of maize flour in the
supermarket falls.
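DEGREE OF CORRELATION
There are two extreme degrees of correlation:
(a) Perfect positive correlation
(b) Perfect negative correlation
PERFECT POSITIVE CORRELATION
When one variable is perfectly and positively correlated with another variable, an
increase or decrease in that variable is matched by a corresponding increase or
decrease in the other variable.
Example,
If the student who scores the highest marks in English also scores the highest marks
in Kiswahili, and every other student likewise holds the same position in both
subjects, then there is a perfect positive correlation.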
PERFECT NEGATIVE CORRELATION
When one variable is perfectly and negatively correlated with another variable, an
increase in that variable leads to a corresponding decrease in the other variable,
and a decrease leads to a corresponding increase.
Example,
If the student who scores the highest marks in English scores the lowest marks in
Kiswahili, and the order of all the other students is likewise reversed, there is a
perfect negative correlation.
Charles Spearman introduced the Rank Difference Method, which calculates the
correlation between two variables from the ranks of the scores.
Formula;
$$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$
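As a rough check of the formula, the sketch below applies it to two extreme cases: identical rankings and exactly reversed rankings. The function name `spearman_rho` and the five-student rankings are hypothetical, chosen only for illustration, and the function assumes that ranks have already been assigned with no ties.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho from two equal-length lists of ranks (illustrative helper)."""
    n = len(ranks_a)  # N = number of paired ranks
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))  # sum of D squared
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Hypothetical ranks for five students (1 = highest mark).
english = [1, 2, 3, 4, 5]
kiswahili_same = [1, 2, 3, 4, 5]      # same order in both subjects
kiswahili_reversed = [5, 4, 3, 2, 1]  # order exactly reversed

rho_positive = spearman_rho(english, kiswahili_same)
rho_negative = spearman_rho(english, kiswahili_reversed)
print(rho_positive)  # 1.0
print(rho_negative)  # -1.0
```

The two results, +1 and -1, correspond to perfect positive and perfect negative correlation.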
ILLUSTRATION;
The following are the scores of ten students in two examinations, X and Y.
X Y
78 84
36 54
98 36
25 60
75 36
80 54
25 92
62 36
36 62
44 68
Calculate the correlation coefficient for X and Y scores using Spearman’s rank
difference method.
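Procedure;
1. Have a column for the X scores as shown below.
2. Have a column for the Y scores as shown below.
3. Have a column (R1) for the ranks of the X scores, where rank 1 is the highest
score and tied scores share the average of the ranks they occupy.
4. Have a column (R2) for the ranks of the Y scores, ranked in the same way.
5. Have a column for the rank differences, D = R1 - R2.
6. Have a column for D².
7. Get the summation of D² (∑D²).
X     Y     R1     R2     D       D²
78    84    3      2      1       1
36    54    7.5    6.5    1       1
98    36    1      9      -8      64
25    60    9.5    5      4.5     20.25
75    36    4      9      -5      25
80    54    2      6.5    -4.5    20.25
25    92    9.5    1      8.5     72.25
62    36    5      9      -4      16
36    62    7.5    4      3.5     12.25
44    68    6      3      3       9
                          ∑D² =   241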
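Formula;
$$
\begin{aligned}
\rho &= 1 - \frac{6 \sum D^2}{N(N^2 - 1)} \\
&= 1 - \frac{6 \times 241}{10(100 - 1)} \\
&= 1 - \frac{1446}{10(99)} \\
&= 1 - \frac{1446}{990} \\
&= 1 - 1.4606 \\
\rho &= -0.46
\end{aligned}
$$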
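The ranking and the calculation above can be reproduced with a short script. This is a minimal sketch rather than a definitive implementation: the helper name `average_ranks` is hypothetical, and it assumes that rank 1 goes to the highest score and that tied scores share the average of the ranks they occupy, which is the assumption that yields ∑D² = 241 for these data.

```python
def average_ranks(scores):
    """Rank scores from highest (rank 1) to lowest; ties share the mean rank."""
    ordered = sorted(scores, reverse=True)
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1          # first position held by this score
        count = ordered.count(value)              # how many times the score occurs
        rank_of[value] = first + (count - 1) / 2  # mean of the positions occupied
    return [rank_of[s] for s in scores]

# Scores from the illustration.
x = [78, 36, 98, 25, 75, 80, 25, 62, 36, 44]
y = [84, 54, 36, 60, 36, 54, 92, 36, 62, 68]

r1 = average_ranks(x)                               # ranks of the X scores
r2 = average_ranks(y)                               # ranks of the Y scores
d_squared = [(a - b) ** 2 for a, b in zip(r1, r2)]  # D squared for each pair
sum_d2 = sum(d_squared)
n = len(x)
rho = 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

print(sum_d2)         # 241.0
print(round(rho, 2))  # -0.46
```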
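Pearson's product-moment correlation coefficient is given by:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{N \, S_x \, S_y}$$
r = correlation coefficient
∑ = summation
x − x̄ = difference between each score on the X test and the mean of the X scores
y − ȳ = difference between each score on the Y test and the mean of the Y scores
N = number of pairs of scores
Sx, Sy = standard deviations of the X and Y scores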
Illustration;
The following are scores of two tests, X and Y.
X Y
50 60
60 80
70 90
80 70
90 100
Calculate the correlation coefficient for the above tests.
Procedure;
1. Create a column for the X scores as shown below.
2. Calculate the mean for the X scores as shown below.
3. Create a column for the mean of the X scores as shown below.
4. Create a column for the difference between the scores on the X column and
the mean as shown below.
5. Create a column for the Y scores as shown below.
6. Calculate the mean for the Y scores as shown below.
7. Create a column for the mean of the Y scores as shown below.
8. Create a column for the difference between the scores on the Y column and
the mean as shown below.
9. Create a column for the products (x − x̄)(y − ȳ) and add them up, as shown below.
x     x̄     x − x̄    y      ȳ     y − ȳ    (x − x̄)(y − ȳ)
50    70    -20      60     80    -20      400
60    70    -10      80     80    0        0
70    70    0        90     80    10       0
80    70    10       70     80    -10      -100
90    70    20       100    80    20       400
                                  ∑ =      700
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{N \, S_x \, S_y} = \frac{700}{N \, S_x \, S_y}$$
Note that the standard deviations of the X and Y scores must be calculated.
SD for x: square each value in the (x − x̄) column, add the squares, divide the
total by N and take the square root.
(-20) x (-20) = 400
(-10) x (-10) = 100
0 x 0 = 0
10 x 10 = 100
20 x 20 = 400
Total = 1,000
$$S_x = \sqrt{\frac{1000}{5}} = \sqrt{200} = 14.14$$
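The standard deviation step can be checked with a few lines of code. The sketch below assumes, as the working above does, that the sum of squared deviations is divided by N (the number of scores) rather than N - 1; the variable names are illustrative.

```python
import math

x = [50, 60, 70, 80, 90]  # X scores from the illustration
n = len(x)
mean_x = sum(x) / n       # 70.0

# Square each deviation from the mean, as in the (x - mean) column.
squared_deviations = [(score - mean_x) ** 2 for score in x]

# Divide the total by N and take the square root.
sd_x = math.sqrt(sum(squared_deviations) / n)

print(sum(squared_deviations))  # 1000.0
print(round(sd_x, 2))           # 14.14
```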
Note that the standard deviation for the y scores should also be calculated. However,
in the current case the values in the y − ȳ column (-20, 0, 10, -10, 20) have the same
squares as the values in the x − x̄ column, so they also add up to 1,000. It
therefore means that the standard deviation for the y scores is the same as the
standard deviation for the x scores.
Sx = 14.14
Sy = 14.14
Substitute the various values in the formula given above.
Thus,
$$
\begin{aligned}
r &= \frac{700}{(5)(14.14)(14.14)} \\
&= \frac{700}{5 \times 199.9396} \\
&= \frac{700}{999.698} \\
r &= 0.70
\end{aligned}
$$
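The whole deviation-score procedure can be sketched as one function. The name `pearson_r` is hypothetical, and the sketch assumes standard deviations computed with N in the denominator, matching the working above.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviation scores (illustrative helper, divides by N)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of the products of the paired deviations from the means.
    sum_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Standard deviations of the X and Y scores.
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return sum_products / (n * sd_x * sd_y)

# Scores from the illustration.
x = [50, 60, 70, 80, 90]
y = [60, 80, 90, 70, 100]

r = pearson_r(x, y)
print(round(r, 2))  # 0.7
```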
Formula
$$r_{xy} = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{\left[n \sum x^2 - (\sum x)^2\right]\left[n \sum y^2 - (\sum y)^2\right]}}$$
∑xy = sum of the products of the paired x and y scores
Illustration;
The following are scores of two tests, X and Y;
X Y
11 11
13 17
14 15
15 23
17 19
Procedure;
1. Add up all the scores in the x column as shown below.
2. Add up all the scores in the y column as shown below.
3. Square each score in the x column and add up the squares as shown below.
4. Square each score in the y column and add up the squares as shown below.
5. Multiply each score in the x column by the corresponding score in the y column
and add up the products as shown below.
∑x = 11 + 13 + 14 + 15 + 17 = 70
∑y = 11 + 17 + 15 + 23 + 19 = 85
∑x² = 121 + 169 + 196 + 225 + 289 = 1000
∑y² = 121 + 289 + 225 + 529 + 361 = 1525
n = 5
∑xy = (11)(11) + (13)(17) + (14)(15) + (15)(23) + (17)(19)
= 121 + 221 + 210 + 345 + 323 = 1220
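$$
\begin{aligned}
r_{xy} &= \frac{5(1220) - (70)(85)}{\sqrt{\left[5(1000) - 70^2\right]\left[5(1525) - 85^2\right]}} \\
&= \frac{6100 - 5950}{\sqrt{(5000 - 4900)(7625 - 7225)}} \\
&= \frac{150}{\sqrt{(100)(400)}} \\
&= \frac{150}{200} \\
r_{xy} &= 0.75
\end{aligned}
$$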
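The raw-score calculation can be verified with the sketch below. The function name `pearson_r_raw` is hypothetical; the code assumes the denominator is the square root of the product of the two bracketed terms, which is what gives 200 in the working above.

```python
import math

def pearson_r_raw(x, y):
    """Pearson's r from raw-score sums (illustrative helper)."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_x2 = sum(a * a for a in x)              # sum of squared x scores
    sum_y2 = sum(b * b for b in y)              # sum of squared y scores
    sum_xy = sum(a * b for a, b in zip(x, y))   # sum of the paired products
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Scores from the illustration.
x = [11, 13, 14, 15, 17]
y = [11, 17, 15, 23, 19]

r_xy = pearson_r_raw(x, y)
print(r_xy)  # 0.75
```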
USES OF CORRELATION
1. Prediction
Correlation is used to predict the success a person is likely to achieve in further
studies or in a career.
Example
The marks obtained by a student in KCSE can be compared with the marks obtained in
the college examination to predict the student's success in completing a university
degree programme.
2. Reliability
Correlation is used to test reliability. The correlation coefficient tells us
immediately and precisely the extent to which a test gives the same results on two
successive applications to the same individuals.
3. Validity
The worth or value of a test can be established through correlation. When a test is
constructed, the question being asked is, "What does it test?"
This question is answered by the magnitude of the correlation coefficient between
the test and various criteria.
4. Test construction
Whenever a test is constructed, there is always the question of whether each
element of the test is related to the other elements or to the test as a whole, and
whether each element is related to the criterion chosen.
These relationships are all examined by means of the correlation coefficient.