Unit-7 Statistics in Psychology
COMPUTATION OF COEFFICIENT OF CORRELATION*
7.0 Objectives
7.1 Introduction
7.2 Pearson’s Product Moment Correlation
7.2.1 Assumptions of Pearson’s Product Moment Correlation
7.2.2 Uses of Pearson’s Product Moment Correlation
7.2.3 Computation of Pearson’s Product Moment Correlation
7.3 Spearman’s Rank Order Correlation
7.3.1 Assumptions for Spearman’s Rank Order Correlation
7.3.2 Uses of Spearman’s Rank Order Correlation
7.3.3 Computation of Spearman’s Rank Correlation
7.4 Let Us Sum Up
7.5 References
7.6 Answers to Check Your Progress
7.7 Unit End Questions
7.0 OBJECTIVES
After reading this unit, you will be able to:
compute the coefficient of correlation using Pearson’s product moment
correlation and Spearman’s rank order correlation.
7.1 INTRODUCTION
In the previous unit, we discussed the basics of correlation. We saw that
correlation indicates the relationship between two or more variables. This
correlation can be interpreted in terms of direction and magnitude. Thus, the
relationship between two given variables can be positive or negative, or there
may be no correlation. Further, the coefficient of correlation ranges from -1 to +1.
In the present unit, we will learn about the computation of correlation with the
help of Pearson’s product moment correlation and Spearman’s rank order
correlation.
One way to distinguish these two methods is that Pearson’s product moment
correlation is categorised under parametric statistics, whereas Spearman’s
rank order correlation falls under nonparametric statistics.
To distinguish between parametric and nonparametric statistics, the following
table (table 7.1) can be referred to:
In the next section, we will learn how to compute Pearson’s product moment
correlation.
7.2.2 Uses of Pearson’s Product Moment Correlation
1) It helps in determining the relationship between two variables
quantitatively. Once relationships are quantified, their strength can be
compared.
2) Based on ‘r’, a regression equation can be computed. Thus, after computing
‘r’, it is possible to carry out regression and determine whether one
variable can be predicted from another variable.
3) ‘r’ can be used in computing the reliability and validity of psychological
tests.
4) It is also used in factor analysis.
Participants  Data 1 (X)  Data 2 (Y)  x     y     xy        x²        y²
(1)           (2)         (3)         (4)   (5)   (6)       (7)       (8)
1             3           4           0     0     0         0         0
2             2           3           -1    -1    1         1         1
3             4           5           1     1     1         1         1
4             4           5           1     1     1         1         1
5             3           5           0     1     0         0         1
6             2           3           -1    -1    1         1         1
7             2           3           -1    -1    1         1         1
8             3           4           0     0     0         0         0
9             5           5           2     1     2         4         1
10            2           3           -1    -1    1         1         1
Total         ΣX = 30     ΣY = 40                 Σxy = 8   Σx² = 10  Σy² = 8
Step 1: First, the scores under X and Y are totalled separately. Thus, ΣX
and ΣY are obtained, as can be seen above in the second and third columns. N is
also noted; in this case it is 10.
Step 2: The mean is now computed for data 1 (X) and data 2 (Y) as follows:
Mean for scores on X = ΣX/N = 30/10 = 3
Mean for scores on Y = ΣY/N = 40/10 = 4
Step 3: In the third step, the deviation of each score of X from its mean
(3 in this example) is computed. In a similar manner, the deviation of each
score of Y from its mean (4 in this example) is computed. These are entered in
columns four and five above, under the headings ‘x’ and ‘y’ respectively.
Step 4: The values entered under ‘x’ and ‘y’ are multiplied and entered in
column six. They are also squared and entered in columns seven and eight,
under the headings x² and y². Further, the scores under each of these columns
are totalled to obtain Σxy, Σx² and Σy².
Step 5: Use the formula to compute ‘r’.
r = Σxy / √(Σx² × Σy²)
= 8 / √(10 × 8)
= 8 / √80
= 8 / 8.94
= 0.89
Thus, the coefficient of correlation obtained for the above data is 0.89,
denoting that there is a positive and high relationship between the two data sets
X and Y.
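The five steps of this method can also be checked with a short program. The following Python sketch is only an illustration: it uses the ten pairs of scores from the table above, and the function name pearson_r_deviation is an assumed name chosen for this example.

```python
import math

# Scores of the 10 participants from the worked example in the text.
X = [3, 2, 4, 4, 3, 2, 2, 3, 5, 2]
Y = [4, 3, 5, 5, 5, 3, 3, 4, 5, 3]


def pearson_r_deviation(xs, ys):
    """Pearson's r from deviation scores: r = Σxy / √(Σx² × Σy²)."""
    n = len(xs)
    # Steps 1 and 2: totals and means.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Step 3: deviation of each score from its mean (columns x and y).
    dev_x = [v - mean_x for v in xs]
    dev_y = [v - mean_y for v in ys]
    # Step 4: products and squares of the deviations, then their totals.
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    # Step 5: apply the formula.
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


r = pearson_r_deviation(X, Y)
print(round(r, 2))  # 0.89
```

The printed value agrees with the hand computation, since 8 / √80 is approximately 0.894.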
Method 2: The formula for the second method is given below,
r = [NΣXY - (ΣX)(ΣY)] / √{[NΣX² - (ΣX)²] [NΣY² - (ΣY)²]}
Where,
X and Y= the raw scores for X and Y
ΣXY = The total of the products of each X score multiplied with
its corresponding Y score
N= Total number of scores.
In this method, deviations from the mean are not computed; instead, the raw
scores are used directly to compute ‘r’.
Let us understand this method and the steps involved in it with the help of
the example used earlier for calculating ‘r’.
Step 1: The total scores for X and Y in columns two and three are computed and
denoted as ΣX and ΣY. In the present example, they are 30 and 40.
Step 2: In column four, XY is computed by multiplying the paired values under
X and Y. Thus, for participant 1, the X value 3 and the Y value 4 are
multiplied to obtain 12. Similarly, XY is computed for all the
participants, and then ΣXY is computed.
Step 3: In columns five and six, X² and Y² are computed. These are the squared
values of X and Y respectively. Further, ΣX² and ΣY² are computed, which are
the sums of X² and Y² respectively.
Step 4: Use the formula to compute ‘r’.
r = [NΣXY - (ΣX)(ΣY)] / √{[NΣX² - (ΣX)²] [NΣY² - (ΣY)²]}
= [10 × 128 - (30 × 40)] / √{[10 × 100 - (30)²] [10 × 168 - (40)²]}
= (1280 - 1200) / √{[1000 - 900] [1680 - 1600]}
= 80 / √(100 × 80)
= 80 / √8000
= 80 / 89.44
= 0.89
Thus, the ‘r’ obtained is 0.89, denoting a high positive correlation.
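The raw score method can be verified in the same way. The Python sketch below is an illustration under the assumption that the same ten pairs of scores from the earlier example are used; the function name pearson_r_raw is an assumed name, not part of the unit.

```python
import math

# Same scores as in the earlier example (10 participants).
X = [3, 2, 4, 4, 3, 2, 2, 3, 5, 2]
Y = [4, 3, 5, 5, 5, 3, 3, 4, 5, 3]


def pearson_r_raw(xs, ys):
    """Pearson's r from raw scores, without computing deviations."""
    n = len(xs)
    sum_x = sum(xs)                               # ΣX
    sum_y = sum(ys)                               # ΣY
    sum_xy = sum(a * b for a, b in zip(xs, ys))   # ΣXY
    sum_x2 = sum(a * a for a in xs)               # ΣX²
    sum_y2 = sum(b * b for b in ys)               # ΣY²
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r_raw = pearson_r_raw(X, Y)
print(round(r_raw, 2))  # 0.89
```

Both methods are algebraically equivalent, so the result matches the value obtained with deviation scores.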
Check Your Progress I
1) Pearson Product Moment Correlation is denoted as ............................... .
2) The variables used to compute ‘r’ are continuous in nature and the scales
of measurement are ............................... and ............................... .
3) The formula for the first method of computing Pearson’s product moment
correlation is ............................... .
4) It can be used to study the degree of relationship between two variables
that are monotonically related. A relationship is termed monotonic when, as
one variable increases, the other consistently moves in one direction only
(it either never decreases or never increases), though not necessarily at a
constant rate.
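The idea of a monotonic relationship can be illustrated with a small check. The Python sketch below is an illustration only: the function name is_monotonic and all three data sets are hypothetical, and the check assumes that no two X values are equal.

```python
def is_monotonic(xs, ys):
    """Return True if Y only rises, or only falls, as X increases.

    Assumes there are no tied X values (an assumption of this sketch).
    """
    # Order the Y values by increasing X.
    y_by_x = [y for _, y in sorted(zip(xs, ys))]
    # Change in Y between neighbouring X values.
    steps = [b - a for a, b in zip(y_by_x, y_by_x[1:])]
    never_decreases = all(s >= 0 for s in steps)
    never_increases = all(s <= 0 for s in steps)
    return never_decreases or never_increases


# Hypothetical data, not taken from the text.
x = [1, 2, 3, 4, 5]
curved_up = [1, 4, 9, 16, 25]    # rises throughout, but not in a straight line
inverted_u = [1, 4, 9, 4, 1]     # rises, then falls
falling = [10, 8, 7, 3, 1]       # falls throughout

print(is_monotonic(x, curved_up))   # True
print(is_monotonic(x, inverted_u))  # False
print(is_monotonic(x, falling))     # True
```

The first data set shows that a relationship can be monotonic without being linear, which is the kind of relationship rank order correlation is suited to.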
Step 1: Ranks are assigned separately to the scores under data 1 and those
under data 2. These ranks are entered in columns four and five respectively.
Ranks can be assigned in either descending or ascending order, provided the
same order is followed for both sets of data. In the present example, rank 1
is assigned to the lowest value and rank 10 to the highest value.
Step 2: The differences in ranks are calculated (R1 - R2 = d) and entered in
column six. Their signs can be ignored, because the differences are squared in
the next column. In the last column, that is, column seven, the squared
difference (d²) is computed, and the total of this column is denoted as ∑d².
In the present example, ∑d² is 18.
Step 3: The formula used to compute Rho (ρ) is
ρ = 1 - (6Σd²) / [N(N² - 1)]
= 1 - (6 × 18) / [10 (10² - 1)]
= 1 - 108 / [10 (100 - 1)]
= 1 - 108 / (10 × 99)
= 1 - 108 / 990
= 1 - 0.11
= 0.89
Thus, the coefficient of correlation (Rho) obtained for the above data is 0.89,
denoting a high positive correlation between data 1 and data 2.
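The formula for Rho can be sketched in Python as follows. This is an illustration rather than a definitive implementation: the function names are assumed names, the values Σd² = 18 and N = 10 come from the example above, and the five pairs of ranks in the second part are hypothetical.

```python
def sum_squared_rank_differences(ranks_1, ranks_2):
    """Σd²: the total of the squared differences between paired ranks."""
    return sum((r1 - r2) ** 2 for r1, r2 in zip(ranks_1, ranks_2))


def spearman_rho(sum_d2, n):
    """Spearman's rho from Σd² and N: 1 - 6Σd² / [N(N² - 1)]."""
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Values from the worked example in the text: Σd² = 18, N = 10.
print(round(spearman_rho(18, 10), 2))  # 0.89

# Hypothetical ranks for five participants (illustration only).
ranks_1 = [1, 2, 3, 4, 5]
ranks_2 = [2, 1, 3, 5, 4]
sum_d2 = sum_squared_rank_differences(ranks_1, ranks_2)  # 1 + 1 + 0 + 1 + 1 = 4
print(round(spearman_rho(sum_d2, len(ranks_1)), 2))  # 0.8
```

Because each difference is squared, the sign of d has no effect on Σd², which is why the signs can be ignored in Step 2.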
Method 2 (with tied ranks): The formula used for computing rho with tied
ranks is the same. We only need to understand how ranks are assigned when
two or more scores in a given set of data are equal.
Let us understand this method and steps involved in it, with the help of an
example,
A researcher wanted to study the relationship between data 1 (X) and data 2
(Y). The data obtained is given below:
As can be seen in the above table, some scores in data 1 are repeated: 23 was
obtained by participants 2 and 3, and 25 was obtained by participants 7
and 8. Similarly, in data 2, participants 5, 6 and 8 obtained a score of 60. In
such cases, ranks are assigned in a slightly different manner.
As can be seen in the above table, 21 is assigned rank 1, and then there are two
scores of 23 that must share ranks 2 and 3. Thus, (2 + 3)/2 = 5/2 = 2.5, and
the rank 2.5 is allotted to both these scores. The next score would be
allotted rank 4, but in the present example ranks 4 and 5 are shared equally by
the score 25, obtained by the 7th and 8th participants. Thus, (4 + 5)/2 = 9/2
= 4.5 (divided by 2 because there are two equal scores). The rank 4.5 is
allotted to these two scores, and the next score, that is, 32, is assigned rank 6.
In data 2, the score 60 shares ranks 8, 9 and 10 equally. Thus, (8 + 9 + 10)/3
= 27/3 = 9 (divided by 3 because there are three equal scores). Each score of
60 is therefore assigned rank 9.
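The rule for tied ranks described above can be sketched in Python. This is an illustrative sketch: the function name average_ranks is an assumed name, and the list contains only the six lowest scores of data 1 that are mentioned in the text (21, 23, 23, 25, 25, 32), not the full data set.

```python
def average_ranks(scores):
    """Rank scores in ascending order (lowest score gets rank 1).

    Equal scores share the mean of the ranks they would occupy.
    """
    # Indices of the scores, ordered from lowest to highest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Find the end of the run of scores equal to the one at pos.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        # The run occupies ranks pos + 1 to end + 1; all members share the mean.
        shared_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


# The six lowest scores of data 1 mentioned in the text (partial data only).
lowest_six = [21, 23, 23, 25, 25, 32]
print(average_ranks(lowest_six))  # [1.0, 2.5, 2.5, 4.5, 4.5, 6.0]
```

The same function handles three equal scores: if a score occupies ranks 8, 9 and 10, each of the three receives rank 9.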
Using the same formula, Rho (ρ) is computed as follows:
ρ = 1 - (6Σd²) / [N(N² - 1)]
= 1 - (6 × 92) / [10 (10² - 1)]
= 1 - 552 / [10 (100 - 1)]
= 1 - 552 / (10 × 99)
= 1 - 552 / 990
= 1 - 0.56
= 0.44
Thus, the coefficient of correlation (Rho) obtained for the above data is 0.44,
denoting a positive and moderate correlation between the two data sets.
Check Your Progress II
1) The variables in Spearman’s Rho are measured in terms of ...................
scale.
2) A relationship is termed as monotonic when
.........................................................................................................................
.........................................................................................................................
.........................................................................................................................
.........................................................................................................................
.........................................................................................................................
3) The formula for the first method of computing Spearman’s rho is
............................... .
7.5 REFERENCES
Mangal, S. K. (2002). Statistics in Psychology and Education. New Delhi: PHI
Learning Private Limited.
Mohanty, B. and Misra, S. (2016). Statistics for Behavioural and Social
Sciences. Delhi: Sage.
Veeraraghavan, V. and Shetgovekar, S. (2016). Textbook of Parametric and
Nonparametric Statistics. Delhi: Sage.