
Unit-7 Statistics in Psychology

This document discusses two methods for computing the coefficient of correlation: Pearson's product moment correlation and Spearman's rank order correlation. Pearson's correlation is used when the data meets the assumptions of parametric statistics, including a normal distribution and continuous measurement scales. It measures the linear relationship between two variables on an interval or ratio scale. Spearman's correlation is a nonparametric alternative that can be used on ordinal data and makes fewer assumptions about the distribution. The document then explains the steps for computing Pearson's correlation coefficient using two different formulas. It provides an example calculation to demonstrate how to find the coefficient from a set of bivariate data.

Uploaded by Rashmi N

UNIT 7 COMPUTATION OF COEFFICIENT OF CORRELATION*
7.0 Objectives
7.1 Introduction
7.2 Pearson’s Product Moment Correlation
7.2.1 Assumptions of Pearson’s Product Moment Correlation
7.2.2 Uses of Pearson’s Product Moment Correlation
7.2.3 Computation of Pearson’s Product Moment Correlation
7.3 Spearman’s Rank Order Correlation
7.3.1 Assumptions for Spearman’s Rank Order Correlation
7.3.2 Uses of Spearman’s Rank Order Correlation
7.3.3 Computation of Spearman’s Rank Correlation
7.4 Let Us Sum Up
7.5 References
7.6 Answers to Check Your Progress
7.7 Unit End Questions

7.0 OBJECTIVES
After reading this unit, you will be able to:
• compute the coefficient of correlation with the help of Pearson’s
product moment correlation and Spearman’s rank order correlation.

7.1 INTRODUCTION
In the previous unit, we discussed the basics of correlation. We noted that
correlation indicates the relationship between two or more variables and that
it can be interpreted in terms of direction and magnitude. Thus, the
relationship between two given variables can be positive or negative, or there
could be no correlation. Further, the coefficient of correlation ranges from
-1 to +1. In the present unit, we will learn about the computation of
correlation with the help of Pearson’s product moment correlation and
Spearman’s rank order correlation.
One way in which these two methods can be distinguished is that Pearson’s
product moment correlation is categorised under parametric statistics, whereas
Spearman’s rank order correlation falls under nonparametric statistics.
To distinguish between parametric and nonparametric statistics, the following
table (table 7.1) can be referred to:

* Prof. Suhas Shetgovekar, Faculty, Discipline of Psychology, School of Social Sciences, IGNOU, New Delhi
Table 7.1: Difference between Parametric and Nonparametric statistics

| Parametric | Non-parametric |
|---|---|
| The assumed distribution is normal. | The assumed distribution may not be normal. It can be any distribution. |
| The variance is homogeneous. | The variance could be heterogeneous or no assumption is made with regard to the variance. |
| The scales of measurement used are interval or ratio. | The scales of measurement used are nominal or ordinal. |
| The relationship between the data needs to be independent. | There is no assumption with regard to the independence of relationship between the data. |
| Mean is the measure of central tendency that is used here. | Median is the measure of central tendency that is used here. |
| It is more complex to compute when compared to the non-parametric techniques. | It is simple to calculate. |
| Can get affected by outliers. | Is comparatively less affected by outliers. |

In the next section, we will learn how to compute Pearson’s product moment
correlation.

7.2 PEARSON’S PRODUCT MOMENT CORRELATION
Pearson’s product moment correlation is one of the methods used to compute the
coefficient of correlation. It is mainly used when the assumptions of
parametric statistics are met. The method is named after Karl Pearson, who
developed it, and the coefficient is denoted by ‘r’.

7.2.1 Assumptions of Pearson’s Product Moment Correlation


The assumptions of Pearson’s product moment correlation are as follows:
1) The variables used to compute ‘r’ are continuous in nature and the scales of
measurement are interval and ratio.
2) The distribution of each variable is unimodal and close to symmetrical.
The distribution need not be normal.
3) The pairs of scores involved are independent in nature and are in no way
connected with other pairs.
4) There is a linear relationship between the two variables. A scattergram
drawn with the scores on the two variables will therefore approximate a
straight line.
5) ‘r’ is mainly used to ascertain the sign and size of the correlation, which
can be positive, negative or zero and ranges from -1 to +1.

7.2.2 Uses of Pearson’s Product Moment Correlation
1) It helps in determining the relationship between two variables
quantitatively. With quantification, it is possible for us to compare
relationships.
2) Based on ‘r’, a regression equation can be computed. Thus, after computing
‘r’, it is possible to compute regression and determine whether one
variable can be predicted based on another variable.
3) ‘r’ can be used in the computation of reliability and validity of
psychological tests.
4) It also assists in the computation of factor analysis.

7.2.3 Computation of Pearson’s Product Moment Correlation


There are two main methods that we will discuss for computing Pearson’s
product moment correlation. They are discussed as follows:
Method 1: The formula for the first method is given below,
$$r_{xy} = \frac{\sum xy}{N \sigma_x \sigma_y}$$
Where,
$r$ = Correlation
$x$ = Deviation of any score of X from the mean of X
$y$ = Deviation of any score of Y from the mean of Y
$\sum xy$ = The sum of all the products of deviations (that is, each x
deviation is multiplied by its corresponding y deviation)
$\sigma_x$ = Standard deviation of scores in X
$\sigma_y$ = Standard deviation of scores in Y
$N$ = Total number of participants (frequencies)
The formula can be simplified by noting that
$$\sigma_x = \sqrt{\frac{\sum x^2}{N}}, \qquad \sigma_y = \sqrt{\frac{\sum y^2}{N}}$$
Thus, by substituting the values for $\sigma_x$ and $\sigma_y$, the following is obtained:
$$r = \frac{\sum xy}{N \sqrt{\frac{\sum x^2}{N}} \sqrt{\frac{\sum y^2}{N}}} = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$$
Let us understand this method and steps involved in it, with the help of an
example,
A researcher wanted to study the relationship between data 1 (X) and data 2
(Y). The data is given below:

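| Participants (1) | Data 1 (X) (2) | Data 2 (Y) (3) | x (4) | y (5) | xy (6) | x² (7) | y² (8) |
|---|---|---|---|---|---|---|---|
| 1 | 3 | 4 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2 | 3 | -1 | -1 | 1 | 1 | 1 |
| 3 | 4 | 5 | 1 | 1 | 1 | 1 | 1 |
| 4 | 4 | 5 | 1 | 1 | 1 | 1 | 1 |
| 5 | 3 | 5 | 0 | 1 | 0 | 0 | 1 |
| 6 | 2 | 3 | -1 | -1 | 1 | 1 | 1 |
| 7 | 2 | 3 | -1 | -1 | 1 | 1 | 1 |
| 8 | 3 | 4 | 0 | 0 | 0 | 0 | 0 |
| 9 | 5 | 5 | 2 | 1 | 2 | 4 | 1 |
| 10 | 2 | 3 | -1 | -1 | 1 | 1 | 1 |
| | ΣX = 30 | ΣY = 40 | | | Σxy = 8 | Σx² = 10 | Σy² = 8 |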

Step 1: First the scores under X and Y are totalled separately. Thus, ΣX
and ΣY are obtained, as can be seen above in the second and third columns. N is
also noted and in this case it is 10.
Step 2: The mean is now computed for data 1 (X) and data 2 (Y) as follows:
Mean for scores on X = ΣX/N = 30/10 = 3
Mean for scores on Y = ΣY/N = 40/10 = 4
Step 3: In the third step, the deviation of each score of X from its mean,
that is, 3 in the case of this example, is computed. In a similar manner, the
deviation of each score of Y from its mean, that is, 4, is computed. These are
entered in columns four and five above under the headings ‘x’ and ‘y’
respectively.
Step 4: The values thus entered under ‘x’ and ‘y’ are multiplied and entered in
column six, and then they are also squared and entered under columns seven and
eight with the headings x² and y². Further, the scores under each of these
columns are totalled to obtain Σxy, Σx² and Σy².
Step 5: Use the formula to compute ‘r’.
$$\begin{aligned}
r &= \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} \\
  &= \frac{8}{\sqrt{10 \times 8}} \\
  &= \frac{8}{\sqrt{80}} \\
  &= \frac{8}{8.94} \\
  &= 0.89
\end{aligned}$$
Thus, the coefficient of correlation obtained for the above data is 0.89,
denoting that there is a positive and high relationship between the two data sets
X and Y.
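The steps of Method 1 can also be checked with a short script. The following is a minimal Python sketch that uses the data from the table above; the function name `pearson_from_deviations` is only an illustrative choice and is not part of the unit.

```python
import math

# Scores from the worked example above (X = data 1, Y = data 2).
X = [3, 2, 4, 4, 3, 2, 2, 3, 5, 2]
Y = [4, 3, 5, 5, 5, 3, 3, 4, 5, 3]


def pearson_from_deviations(xs, ys):
    """Pearson's r from deviation scores: r = Σxy / sqrt(Σx² · Σy²)."""
    n = len(xs)
    mean_x = sum(xs) / n                      # Step 2: means
    mean_y = sum(ys) / n
    dev_x = [v - mean_x for v in xs]          # Step 3: deviations x and y
    dev_y = [v - mean_y for v in ys]
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))   # Step 4: Σxy
    sum_x2 = sum(a * a for a in dev_x)                  # Σx²
    sum_y2 = sum(b * b for b in dev_y)                  # Σy²
    return sum_xy / math.sqrt(sum_x2 * sum_y2)          # Step 5


r = pearson_from_deviations(X, Y)
print(round(r, 2))  # 0.89
```

The intermediate sums inside the function correspond to the column totals in the table (Σxy = 8, Σx² = 10, Σy² = 8).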
Method 2: The formula for the second method is given below,
$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{[N\sum X^2 - (\sum X)^2]\,[N\sum Y^2 - (\sum Y)^2]}}$$
Where,
$X$ and $Y$ = The raw scores for X and Y
$\sum XY$ = The total of the products of each X score multiplied with
its corresponding Y score
$N$ = Total number of pairs of scores.
In this method, the deviations from the mean are not computed, instead raw
scores are used to compute ‘r’.
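The two methods are algebraically the same formula. As a brief sketch (the identities below are the standard expansions of sums of deviations and are not stated in the unit itself), writing $x = X - \bar{X}$ and $y = Y - \bar{Y}$ gives:

$$\sum xy = \sum XY - \frac{\sum X \sum Y}{N}, \qquad \sum x^2 = \sum X^2 - \frac{(\sum X)^2}{N}, \qquad \sum y^2 = \sum Y^2 - \frac{(\sum Y)^2}{N}$$

Substituting these into $r = \sum xy / \sqrt{\sum x^2 \sum y^2}$ and multiplying the numerator and the denominator by $N$ yields the raw-score formula, which is why both methods give the same value of ‘r’ for the same data.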
Let us understand this method and steps involved in it with the help of an
example used earlier for calculating ‘r’.

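| Participants (1) | Data 1 (X) (2) | Data 2 (Y) (3) | XY (4) | X² (5) | Y² (6) |
|---|---|---|---|---|---|
| 1 | 3 | 4 | 12 | 9 | 16 |
| 2 | 2 | 3 | 6 | 4 | 9 |
| 3 | 4 | 5 | 20 | 16 | 25 |
| 4 | 4 | 5 | 20 | 16 | 25 |
| 5 | 3 | 5 | 15 | 9 | 25 |
| 6 | 2 | 3 | 6 | 4 | 9 |
| 7 | 2 | 3 | 6 | 4 | 9 |
| 8 | 3 | 4 | 12 | 9 | 16 |
| 9 | 5 | 5 | 25 | 25 | 25 |
| 10 | 2 | 3 | 6 | 4 | 9 |
| | ΣX = 30 | ΣY = 40 | ΣXY = 128 | ΣX² = 100 | ΣY² = 168 |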

Step 1: Total scores for X and Y in columns two and three are computed and
denoted as ΣX and ΣY. In the case of the present example, they are obtained as
30 and 40.
Step 2: In column four, XY is computed, where the paired values under X and
Y are multiplied. Thus, for participant 1, X value 3 and Y value 4 are
multiplied and 12 is obtained. Similarly, XY is computed for all the
participants and then ΣXY is also computed.
Step 3: In columns five and six, X² and Y² are computed. These are the squared
values of X and Y respectively. Further, ΣX² and ΣY² are computed, which are
the summations of X² and Y² respectively.
Step 4: Use the formula to compute ‘r’.
$$\begin{aligned}
r &= \frac{N\sum XY - \sum X \sum Y}{\sqrt{[N\sum X^2 - (\sum X)^2]\,[N\sum Y^2 - (\sum Y)^2]}} \\
  &= \frac{10 \times 128 - (30 \times 40)}{\sqrt{[10 \times 100 - (30)^2]\,[10 \times 168 - (40)^2]}} \\
  &= \frac{1280 - 1200}{\sqrt{[1000 - 900]\,[1680 - 1600]}} \\
  &= \frac{80}{\sqrt{100 \times 80}} \\
  &= \frac{80}{\sqrt{8000}} \\
  &= \frac{80}{89.44} \\
  &= 0.89
\end{aligned}$$
Thus, ‘r’ obtained is 0.89, denoting a positive and high correlation.
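The raw-score computation can be reproduced in the same way. This is a minimal Python sketch using the same data; the function name `pearson_from_raw_scores` is an illustrative assumption, not something defined in the unit.

```python
import math

# Same scores as in the table above (X = data 1, Y = data 2).
X = [3, 2, 4, 4, 3, 2, 2, 3, 5, 2]
Y = [4, 3, 5, 5, 5, 3, 3, 4, 5, 3]


def pearson_from_raw_scores(xs, ys):
    """Pearson's r from raw scores, without computing deviations."""
    n = len(xs)
    sum_x = sum(xs)                                # ΣX
    sum_y = sum(ys)                                # ΣY
    sum_xy = sum(a * b for a, b in zip(xs, ys))    # ΣXY
    sum_x2 = sum(a * a for a in xs)                # ΣX²
    sum_y2 = sum(b * b for b in ys)                # ΣY²
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_from_raw_scores(X, Y)
print(round(r, 2))  # 0.89
```

Because only running totals are needed, this form is convenient when the means are not whole numbers and the deviations would be awkward to compute by hand.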
Check Your Progress I
1) Pearson Product Moment Correlation is denoted as ............................... .
2) The variables used to compute ‘r’ are continuous in nature and the scales
of measurement are ............................... and ............................... .
3) The formula for the first method of computing Pearson’s product moment
correlation is ............................... .

7.3 SPEARMAN’S RANK ORDER CORRELATION


Another method to compute coefficient of correlation is Spearman’s rank order
correlation. This method is used when the assumptions of parametric statistics
are not met. The method is named after Charles Spearman, who is known for
his significant work on factor analysis and theory of intelligence besides
Spearman’s rank order correlation.

7.3.1 Assumptions for Spearman’s Rank Order Correlation


The assumptions of Spearman’s rank order correlation are as follows:
1) The variables are measured in terms of ordinal scale.
2) The relationship between the two variables is monotonic in nature; it need not be linear.
3) The observations are independent in nature, thus denoting that the sample
needs to be randomly selected.
4) The pairs of scores are independent in nature and are in no way connected
with other pairs.

7.3.2 Uses of Spearman’s Rank Order Correlation


1) It is used when the data is measured with the help of ordinal scale.
2) It is especially useful when the sample size is small, that is, less than
25-30 (Mohanty and Misra, 2016).
3) Often it is not possible to measure traits directly. Thus, they are
measured in terms of ranks. Spearman’s rank order correlation involves
separately ranking the scores in the two data sets, followed by computation
of the correlation between the ranks.

4) It can be used to study the degree of relationship between two variables
that are monotonic. A relationship is termed as monotonic when the
variables display a consistent, one-directional relationship: as one
variable increases, the other either never decreases or never increases,
though not necessarily at a constant rate.
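The difference between a monotonic and a linear relationship can be illustrated with a small sketch. The scores below are hypothetical (they are not from the unit): Y always rises as X rises, but along a curve. The rho computation uses the rank-difference formula presented in Section 7.3.3, and the function names are illustrative assumptions.

```python
import math

# Hypothetical scores (not from the unit): y rises with x, but along a curve.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]


def pearson_r(xs, ys):
    """Pearson's r computed from deviation scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sum_x2 = sum((a - mean_x) ** 2 for a in xs)
    sum_y2 = sum((b - mean_y) ** 2 for b in ys)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


def ranks(scores):
    """Rank 1 goes to the lowest score; assumes there are no tied scores."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 for s in scores]


def spearman_rho(xs, ys):
    """Spearman's rho from rank differences (no tied ranks)."""
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


print(spearman_rho(x, y))   # 1.0, because the two sets of ranks agree exactly
print(pearson_r(x, y) < 1)  # True, because the points do not lie on a straight line
```

In this illustration rho reaches its maximum because the ordering of the scores is identical in both variables, while ‘r’ falls short of 1 because the relationship is curved.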

7.3.3 Computation of Spearman’s Rank Correlation


There are two main methods that we will discuss for computing Spearman’s
rank order correlation, one without tied ranks and one with tied ranks. These
are discussed as follows:
Method 1 (without tied ranks): The formula for the first method is given
below,
$$\rho = 1 - \frac{6\sum d^2}{N(N^2 - 1)}$$
where,
$\rho$ = Spearman’s rank order correlation (rho)
$\sum d^2$ = Sum of the squared differences in ranks
$N$ = Total number of participants
Let us understand this method and steps involved in it, with the help of an
example,
A researcher wanted to study the relationship between data 1 (X) and data 2
(Y). The data is given below:

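| Participants (1) | Data 1 (X) (2) | Data 2 (Y) (3) | Rank for Data 1 (R1) (4) | Rank for Data 2 (R2) (5) | Difference in Ranks, ignoring sign (d = R1 − R2) (6) | Difference Squared (d²) (7) |
|---|---|---|---|---|---|---|
| 1 | 45 | 40 | 9 | 9 | 0 | 0 |
| 2 | 34 | 33 | 8 | 8 | 0 | 0 |
| 3 | 23 | 25 | 3 | 3 | 0 | 0 |
| 4 | 22 | 21 | 2 | 2 | 0 | 0 |
| 5 | 65 | 60 | 10 | 10 | 0 | 0 |
| 6 | 33 | 30 | 7 | 5 | 2 | 4 |
| 7 | 30 | 31 | 5 | 6 | 1 | 1 |
| 8 | 25 | 32 | 4 | 7 | 3 | 9 |
| 9 | 32 | 29 | 6 | 4 | 2 | 4 |
| 10 | 21 | 20 | 1 | 1 | 0 | 0 |
| N = 10 | | | | | | Σd² = 18 |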

Step 1: Ranks are assigned separately to scores under data 1 and those under
data 2. These ranks are mentioned in column four and five respectively. Ranks
can either be assigned in descending or ascending order. For instance in the
present example, rank 1 is assigned to the lowest value and rank 10 to the
highest value and this is followed in same way for both the data.
Step 2: Differences in ranks are calculated irrespective of their signs
(|R1 − R2| = |d|). These are then mentioned in column six. In the last column,
that is, column seven, the squared difference (d²) is computed and the total
of this is mentioned as Σd². In the case of the present example, Σd² is 18.
Step 3: The formula used to compute rho is
$$\begin{aligned}
\rho &= 1 - \frac{6\sum d^2}{N(N^2 - 1)} \\
     &= 1 - \frac{6 \times 18}{10(10^2 - 1)} \\
     &= 1 - \frac{108}{10(100 - 1)} \\
     &= 1 - \frac{108}{10 \times 99} \\
     &= 1 - \frac{108}{990} \\
     &= 1 - 0.11 \\
     &= 0.89
\end{aligned}$$
Thus, the coefficient of correlation (rho) obtained for the above data is 0.89,
denoting a high and positive correlation between data 1 and data 2.
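The three steps above can be sketched in Python as follows. The data are taken from the table; the function names `ranks_without_ties` and `spearman_rho` are illustrative assumptions, and the ranking helper is only valid when no score is repeated.

```python
# Scores from the worked example above.
data1 = [45, 34, 23, 22, 65, 33, 30, 25, 32, 21]
data2 = [40, 33, 25, 21, 60, 30, 31, 32, 29, 20]


def ranks_without_ties(scores):
    """Step 1: rank 1 goes to the lowest score; assumes all scores are distinct."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 for s in scores]


def spearman_rho(xs, ys):
    """Spearman's rho = 1 - 6Σd² / [N(N² - 1)] for data without tied ranks."""
    r1 = ranks_without_ties(xs)
    r2 = ranks_without_ties(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))   # Step 2: Σd²
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))         # Step 3


rho = spearman_rho(data1, data2)
print(ranks_without_ties(data1))  # [9, 8, 3, 2, 10, 7, 5, 4, 6, 1]
print(round(rho, 2))              # 0.89
```

Since the differences are squared, it makes no difference whether d is taken with or without its sign.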
Method 2 (with tied ranks): The formula used for computing rho with tied
ranks is the same. We only need to understand how ranks are assigned when
there are two or more identical scores in a given data set.
Let us understand this method and steps involved in it, with the help of an
example,
A researcher wanted to study the relationship between data 1 (X) and data 2
(Y). The data obtained is given below:

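| Participants (1) | Data 1 (X) (2) | Data 2 (Y) (3) | Rank for Data 1 (R1) (4) | Rank for Data 2 (R2) (5) | Difference in Ranks, ignoring sign (d = R1 − R2) (6) | Difference Squared (d²) (7) |
|---|---|---|---|---|---|---|
| 1 | 45 | 40 | 8 | 7 | 1 | 1 |
| 2 | 23@ | 33 | 2.5 | 6 | 3.5 | 12.25 |
| 3 | 23@ | 25 | 2.5 | 3 | 0.5 | 0.25 |
| 4 | 64 | 21 | 9 | 2 | 7 | 49 |
| 5 | 65 | 60# | 10 | 9 | 1 | 1 |
| 6 | 33 | 60# | 7 | 9 | 2 | 4 |
| 7 | 25# | 31 | 4.5 | 5 | 0.5 | 0.25 |
| 8 | 25# | 60# | 4.5 | 9 | 4.5 | 20.25 |
| 9 | 32 | 29 | 6 | 4 | 2 | 4 |
| 10 | 21 | 20 | 1 | 1 | 0 | 0 |
| N = 10 | | | | | | Σd² = 92 |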

As can be seen in the above table, some values are repeated under data 1: the
score 23 is obtained by participants 2 and 3, and the score 25 is obtained by
participants 7 and 8. Similarly, in data 2, participants 5, 6 and 8 have
obtained the score 60. In such a case, ranks are assigned in a slightly
different manner.
In data 1, 21 is assigned rank 1, and then there are two scores of 23 that
need to share ranks 2 and 3 equally. Thus, (2 + 3)/2 = 2.5, and the rank 2.5
is allotted to both these scores. The next score would ordinarily be allotted
rank 4, but in the present example ranks 4 and 5 are shared equally by the
score 25 obtained by the 7th and 8th participants. Thus, (4 + 5)/2 = 4.5
(divided by 2 because there are two identical scores), so 4.5 is allotted to
these two scores and the next score, that is, 32, is assigned rank 6.
In data 2, the score 60 equally shares ranks 8, 9 and 10. Thus,
(8 + 9 + 10)/3 = 9 (divided by 3 because there are three identical scores),
and each score of 60 is assigned rank 9.
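The rule for tied scores described above (each tied score receives the mean of the ranks it would otherwise occupy) can be sketched in Python. The function name `average_ranks` is an illustrative assumption; the data are those of the table.

```python
def average_ranks(scores):
    """Rank 1 goes to the lowest score; tied scores share the mean of their ranks."""
    ordered = sorted(scores)
    result = []
    for s in scores:
        first = ordered.index(s) + 1      # lowest rank position occupied by this score
        count = ordered.count(s)          # number of participants with this score
        last = first + count - 1          # highest rank position occupied by this score
        result.append((first + last) / 2)
    return result


data1 = [45, 23, 23, 64, 65, 33, 25, 25, 32, 21]
data2 = [40, 33, 25, 21, 60, 60, 31, 60, 29, 20]

print(average_ranks(data1))  # [8.0, 2.5, 2.5, 9.0, 10.0, 7.0, 4.5, 4.5, 6.0, 1.0]
print(average_ranks(data2))  # [7.0, 6.0, 3.0, 2.0, 9.0, 9.0, 5.0, 9.0, 4.0, 1.0]
```

Averaging the first and last rank positions of a run of consecutive ranks gives the same result as summing all the shared ranks and dividing by their number, as done in the text.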
Using the same formula, rho is computed as follows
$$\begin{aligned}
\rho &= 1 - \frac{6\sum d^2}{N(N^2 - 1)} \\
     &= 1 - \frac{6 \times 92}{10(10^2 - 1)} \\
     &= 1 - \frac{552}{10(100 - 1)} \\
     &= 1 - \frac{552}{10 \times 99} \\
     &= 1 - \frac{552}{990} \\
     &= 1 - 0.56 \\
     &= 0.44
\end{aligned}$$
Thus, the coefficient of correlation (rho) obtained for the above data is 0.44,
denoting a positive and moderate correlation between the two data sets.
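The computation with tied ranks can be checked with a short Python sketch that starts from the ranks in the table. As an addition that is not part of the unit: the rank-difference formula is exact only when there are no ties, so the sketch also computes Pearson’s ‘r’ on the ranks themselves, which is a common alternative when ties are present. The variable names are illustrative assumptions.

```python
import math

# Ranks from columns four and five of the table above.
R1 = [8, 2.5, 2.5, 9, 10, 7, 4.5, 4.5, 6, 1]
R2 = [7, 6, 3, 2, 9, 9, 5, 9, 4, 1]
n = len(R1)

# Rank-difference formula, as used in the unit.
sum_d2 = sum((a - b) ** 2 for a, b in zip(R1, R2))
rho_shortcut = 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Alternative (an assumption, not from the unit): Pearson's r applied to the ranks.
mean1, mean2 = sum(R1) / n, sum(R2) / n
sum_xy = sum((a - mean1) * (b - mean2) for a, b in zip(R1, R2))
sum_x2 = sum((a - mean1) ** 2 for a in R1)
sum_y2 = sum((b - mean2) ** 2 for b in R2)
rho_on_ranks = sum_xy / math.sqrt(sum_x2 * sum_y2)

print(sum_d2)                  # 92.0
print(round(rho_shortcut, 2))  # 0.44
print(round(rho_on_ranks, 2))  # 0.43
```

For these data the two values differ only slightly, so the interpretation of a positive, moderate correlation is unchanged.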
Check Your Progress II
1) The variables in Spearman’s Rho are measured in terms of ...................
scale.
2) A relationship is termed as monotonic when ............................... .
3) The formula for the first method of computing Spearman’s rho is
............................... .

7.4 LET US SUM UP


In the present unit, we discussed two methods of computing the coefficient of
correlation: Pearson’s product moment correlation and Spearman’s rank order
correlation. Pearson’s product moment correlation is mainly used when the
assumptions of parametric statistics are met. The method is named after Karl
Pearson, who developed it, and the coefficient is denoted by ‘r’. Spearman’s
rank order correlation is used when the assumptions of parametric statistics
are not met. The method is named after Charles Spearman, who is known for his
significant work on factor analysis and theory of intelligence. The
assumptions and uses of these methods were also discussed. The formulas and
computation for the two methods were discussed with the help of examples.

7.5 REFERENCES
Mangal, S. K. (2002). Statistics in Psychology and Education. New Delhi: PHI
Learning Private Limited.
Mohanty, B. and Misra, S. (2016). Statistics for Behavioural and Social
Sciences. Delhi: Sage.
Veeraraghavan, V. and Shetgovekar, S. (2016). Textbook of Parametric and
Nonparametric Statistics. Delhi: Sage.

7.6 ANSWERS TO CHECK YOUR PROGRESS


Check Your Progress I
1) Pearson Product Moment Correlation is denoted as r.
2) The variables used to compute r are continuous in nature and the scales of
measurement are interval and ratio.
3) The formula for the first method of computing Pearson’s Product Moment
Correlation is $r_{xy} = \frac{\sum xy}{N \sigma_x \sigma_y}$
Check Your Progress II
1) The variables in Spearman’s Rho are measured in terms of ordinal scale.
2) A relationship is termed as monotonic when the variables display a
consistent, one-directional relationship: as one variable increases, the
other either never decreases or never increases.
3) The formula for the first method of computing Spearman’s rho is
$\rho = 1 - \frac{6\sum d^2}{N(N^2 - 1)}$

7.7 UNIT END QUESTIONS


1) Differentiate between Parametric and Nonparametric Statistics.
2) Discuss the assumptions of Pearson’s product moment correlation.
3) Describe the uses of Pearson’s product moment correlation.
4) Discuss the assumptions of Spearman’s rank order correlation.
5) Describe the steps involved in computation of Spearman’s rho with the
help of an example.
