
8-CORRELATION

The document explains the concept of correlation, focusing on the relationship between independent and dependent variables in statistics. It describes how to calculate the correlation coefficient (r) to measure the strength and direction of linear relationships, along with examples and interpretations. Additionally, it introduces Spearman's rank correlation coefficient as a nonparametric measure of statistical dependence between two variables.

Uploaded by

Glaizel Panal
Copyright
© All Rights Reserved

CORRELATION

CORRELATION
a mutual relationship or connection between two or more things.
- Merriam-Webster

STATISTICS
a quantity measuring the extent of the interdependence of variable
quantities.
DEFINITION
■ Correlation deals with the relationship between two quantitative variables.
■ Bivariate data contain two sets of related data.
■ A dependent variable in an experiment is the variable that is affected by the independent variable or an outside factor.
Independent and Dependent Variable
Examples
1. The time spent by a student in reviewing a lesson can
increase his score in an examination.
Dependent variable : score in an examination
Independent variable : time spent in reviewing
2. Age affects human stamina.
Dependent variable: human stamina
Independent variable: age
Directions: Identify the independent and dependent variables in the following statements.
1. The more time people spend using social media, the less they read books.
Independent Variable: Time on social media
Dependent Variable: Number of books read
2. Drinking energy drinks makes people more aggressive.
Independent Variable: Drinking energy drinks
Dependent Variable: Aggressive behavior
3. Taking a nap in the afternoon makes people more focused for the rest of the day.
Independent Variable: Time spent napping in the afternoon
Dependent Variable: Focus
4. Spending time with a family dog decreases the amount of stress someone is feeling.
Independent Variable: Spending time with a family dog
Dependent Variable: Amount of stress someone is feeling
5. Eating breakfast in the morning increases the ability to learn in school.
Independent Variable: Eating breakfast
Dependent Variable: Ability to learn
CORRELATION
The data can be represented by the ordered pairs (x, y), where x is the independent (or explanatory) variable and y is the dependent (or response) variable.
A scatter plot can be used to determine whether a linear (straight line) correlation exists between two variables.
Example:
x    1    2    3    4    5
y   -4   -2   -1    0    2
[Figure: scatter plot of the example data, y against x]
Scatterplot of Correlation with Bivariate Data
[Figure: four scatter plots of y against x]
■ Negative Linear Correlation: As x increases, y tends to decrease. The value of r is close to -1 or r = -1.
■ Positive Linear Correlation: As x increases, y tends to increase. The value of r is close to 1 or r = 1.
■ No Correlation: The value of r is close to 0 or r = 0.
■ Nonlinear Correlation
LINEAR CORRELATION COEFFICIENT
The correlation coefficient, denoted by r, measures the strength and the direction of a linear relationship between two variables. This coefficient is sometimes called the Pearson product moment correlation since it was developed by the English mathematician and biostatistician Karl Pearson. The formula for r is

$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} $$

The range of the correlation coefficient is -1 to 1. If x and y have a strong positive linear correlation, r is close to 1. If x and y have a strong negative linear correlation, r is close to -1. If there is no linear correlation or a weak linear correlation, r is close to 0.
■ The value of r ranges between -1 and +1.
■ The value of r denotes the strength of the association, as illustrated by the following scale.

-1        -0.75     -0.25       0       0.25      0.75        1
perfect indirect correlation      no relation      perfect direct correlation
LINEAR CORRELATION
[Figure: four scatter plots of y against x]
■ Strong negative correlation: r = -0.91
■ Strong positive correlation: r = 0.88
■ Weak positive correlation: r = 0.42
■ Nonlinear Correlation: r = 0.07
Pearson’s r Product Moment Correlation Chart

Absolute value of coefficient Interpretation


1 perfect correlation
0.81 – 0.99 very strong correlation
0.61 – 0.80 strong correlation
0.41 – 0.60 moderate correlation
0.21 – 0.40 weak correlation
0.01 – 0.20 very weak correlation
0 no correlation
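The chart can be read as a simple lookup on the absolute value of r. The sketch below is one hedged way to encode it in Python; the function name `interpret_r` is made up for illustration, and because the chart leaves small gaps between bands (for example between 0.20 and 0.21), the sketch assumes each band's upper value is an inclusive cutoff.

```python
def interpret_r(r: float) -> str:
    """Describe r using the Pearson's r chart (bands applied to |r|).

    Assumption: the upper value of each band is treated as an inclusive
    cutoff, since the chart does not say how to handle values such as 0.205.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size == 0:
        return "no correlation"
    if size == 1:
        strength = "perfect"
    elif size > 0.80:
        strength = "very strong"
    elif size > 0.60:
        strength = "strong"
    elif size > 0.40:
        strength = "moderate"
    elif size > 0.20:
        strength = "weak"
    else:
        strength = "very weak"
    # A positive r is a direct relationship; a negative r is an indirect one.
    direction = "direct" if r > 0 else "indirect"
    return f"{direction} {strength} correlation"


print(interpret_r(0.25))   # direct weak correlation
print(interpret_r(-0.91))  # indirect very strong correlation
```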
Calculating a Correlation Coefficient
In Words (In Symbols)
1. Find the sum of the x-values. ($\sum x$)
2. Find the sum of the y-values. ($\sum y$)
3. Multiply each x-value by its corresponding y-value and find the sum. ($\sum xy$)
4. Square each x-value and find the sum. ($\sum x^2$)
5. Square each y-value and find the sum. ($\sum y^2$)
6. Use these five sums to calculate the correlation coefficient.

$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} $$
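The six steps above translate almost line for line into code. The following is a minimal Python sketch, not a library routine; the function name `pearson_r` and the small data set in the usage lines are illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Compute r from the five sums, following steps 1 to 6."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same number of values")
    n = len(xs)
    sum_x = sum(xs)                                # step 1
    sum_y = sum(ys)                                # step 2
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # step 3
    sum_x2 = sum(x * x for x in xs)                # step 4
    sum_y2 = sum(y * y for y in ys)                # step 5
    # step 6: combine the five sums
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    if denominator == 0:
        raise ValueError("r is undefined when all x-values or all y-values are equal")
    return numerator / denominator


# Illustrative data lying exactly on a rising line, so r should be very close to 1.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

The guard on the denominator is a design choice of this sketch: when every x-value (or every y-value) is the same, the formula divides by zero, so the function raises an error instead of returning a number.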
CORRELATION COEFFICIENT
Example:
Calculate the correlation coefficient r for the following data.

x     y     xy    x²    y²
1    -3    -3     1     9
2    -1    -2     4     1
3     0     0     9     0
4     1     4    16     1
5     2    10    25     4
∑x = 15, ∑y = -1, ∑xy = 9, ∑x² = 55, ∑y² = 15

$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} = \frac{5(9) - (15)(-1)}{\sqrt{5(55) - 15^2}\,\sqrt{5(15) - (-1)^2}} = \frac{60}{\sqrt{50}\,\sqrt{74}} \approx 0.986 $$

There is a direct, very strong correlation between x and y.
CORRELATION COEFFICIENT
Example 1:
Determine the correlation between the grades of students in Math and English as shown in the table.
a.) Display the scatter plot.
b.) Calculate the correlation coefficient r.
c.) Interpret the result based on the Pearson's r product moment correlation chart.

Student    Grade in Math (x)    Grade in English (y)
Jose       90                   82
Mario      92                   79
Lordy      88                   81
Danny      87                   78
Menchie    90                   88
Linda      93                   86
John       97                   82
CORRELATION COEFFICIENT
Example continued:
x    90   92   88   87   90   93   97
y    82   79   81   78   88   86   82
[Figure: scatter plot of Grade in English (y) against Grade in Math (x)]
CORRELATION COEFFICIENT
Solution: Construct a table

Grade in Math (x)    Grade in English (y)    xy      x²      y²
90                   82                      7380    8100    6724
92                   79                      7268    8464    6241
88                   81                      7128    7744    6561
87                   78                      6786    7569    6084
90                   88                      7920    8100    7744
93                   86                      7998    8649    7396
97                   82                      7954    9409    6724
∑x = 637, ∑y = 576, ∑xy = 52,434, ∑x² = 58,035, ∑y² = 47,474

$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} = \frac{7(52{,}434) - (637)(576)}{\sqrt{7(58{,}035) - 637^2}\,\sqrt{7(47{,}474) - 576^2}} = \frac{126}{\sqrt{476}\,\sqrt{542}} \approx 0.25 $$

An r value of 0.25 means that there is a direct weak correlation between the grades of students in Math and English.
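The column totals and the final value of r in this solution can be cross-checked with a few lines of Python. This is only a verification sketch; the variable names are chosen for illustration and the grades are copied from the Example 1 table.

```python
from math import sqrt

# Grades from Example 1 (Jose, Mario, Lordy, Danny, Menchie, Linda, John).
math_grades = [90, 92, 88, 87, 90, 93, 97]
english_grades = [82, 79, 81, 78, 88, 86, 82]

n = len(math_grades)
sum_x = sum(math_grades)
sum_y = sum(english_grades)
sum_xy = sum(x * y for x, y in zip(math_grades, english_grades))
sum_x2 = sum(x * x for x in math_grades)
sum_y2 = sum(y * y for y in english_grades)

numerator = n * sum_xy - sum_x * sum_y
r = numerator / (sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2))

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 637 576 52434 58035 47474
print(numerator)                             # 126
print(round(r, 2))                           # 0.25
```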
Testing a Population Correlation Coefficient
■ Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman, is another formula for the correlation coefficient and is denoted by the Greek letter ρ (rho) or r_s. It is a nonparametric measure of statistical dependence between two variables that assesses how well the relationship between the two variables can be described using a monotonic function. A perfect Spearman correlation of +1 or -1 occurs when each of the variables is a perfect monotonic function of the other.
The formula for the Spearman's rank correlation coefficient is:

$$ \rho = 1 - \frac{6\sum D^2}{N\left(N^2 - 1\right)} $$

where D is the difference in ranks of x and y values and N is the number of pairs of x and y.
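As a hedged illustration of this formula, the Python sketch below computes ρ directly from a list of rank differences D. The function name `spearman_rho_from_diffs` and the three-pair rankings in the usage lines are assumptions made for the example.

```python
def spearman_rho_from_diffs(diffs):
    """Apply rho = 1 - 6 * sum(D^2) / (N * (N^2 - 1)) to rank differences D."""
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are needed")
    sum_d2 = sum(d * d for d in diffs)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Identical rankings: every D is 0, so rho is 1.
print(spearman_rho_from_diffs([0, 0, 0]))    # 1.0
# Fully reversed rankings (1, 2, 3 against 3, 2, 1): D = -2, 0, 2, so rho is -1.
print(spearman_rho_from_diffs([-2, 0, 2]))   # -1.0
```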
Testing a Population Correlation Coefficient
SOLUTION
We first arrange the values of x and y from highest to lowest to find their ranks.

x     Rank     y     Rank
97    1        88    1
93    2        86    2
92    3        82    3.5
90    4.5      82    3.5
90    4.5      81    5
88    6        79    6
87    7        78    7

For x, there are two values of 90 which share the ranks 4 and 5. Thus, we get the average: (4 + 5)/2 = 4.5.
For y, there are two values of 82 which share the ranks 3 and 4. Thus, we get the average: (3 + 4)/2 = 3.5.
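The ranking rule used here (highest value gets rank 1, tied values share the average of the ranks they occupy) can be sketched in Python as follows. The helper name `average_ranks` is hypothetical, and the grades are those of Example 1.

```python
def average_ranks(values):
    """Rank values from highest (rank 1) to lowest; ties share the mean rank."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1   # first rank position held by this value
        count = ordered.count(value)       # how many times the value occurs
        # Mean of the consecutive ranks first, first + 1, ..., first + count - 1.
        rank_of[value] = first + (count - 1) / 2
    # Return the ranks in the original order of the data.
    return [rank_of[value] for value in values]


math_grades = [90, 92, 88, 87, 90, 93, 97]
english_grades = [82, 79, 81, 78, 88, 86, 82]
print(average_ranks(math_grades))     # [4.5, 3.0, 6.0, 7.0, 4.5, 2.0, 1.0]
print(average_ranks(english_grades))  # [3.5, 6.0, 5.0, 7.0, 1.0, 2.0, 3.5]
```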
Testing a Population Correlation Coefficient
Construct a table as follows:

x     y     Rank of x-value (Rx)    Rank of y-value (Ry)    Difference in Ranks (D = Rx - Ry)    D²
90    82    4.5                     3.5                      1                                    1
92    79    3                       6                       -3                                    9
88    81    6                       5                        1                                    1
87    78    7                       7                        0                                    0
90    88    4.5                     1                        3.5                                  12.25
93    86    2                       2                        0                                    0
97    82    1                       3.5                     -2.5                                  6.25
∑D² = 29.5
Testing a Population Correlation Coefficient
■ Here, we have N = 7 since there are 7 pairs of values for x and y. Computing for the Spearman's rank correlation coefficient, we have:

$$ \rho = 1 - \frac{6\sum D^2}{N\left(N^2 - 1\right)} = 1 - \frac{6(29.5)}{7\left(7^2 - 1\right)} \approx 0.47 $$
Thus, there is a moderate direct correlation between the ranks: a student whose grade in Math is high tends to have a high grade in English as well, and a student whose grade in Math is low tends to have a low grade in English.
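The arithmetic of this solution can be retraced in Python starting from the ranks in the table. This is a checking sketch only; the list names `rank_x` and `rank_y` are illustrative, and the ranks are copied from the table rather than recomputed.

```python
# Ranks of the Math (x) and English (y) grades, in the table's row order.
rank_x = [4.5, 3, 6, 7, 4.5, 2, 1]
rank_y = [3.5, 6, 5, 7, 1, 2, 3.5]

diffs = [rx - ry for rx, ry in zip(rank_x, rank_y)]   # D = Rx - Ry
sum_d2 = sum(d * d for d in diffs)                    # sum of D squared
n = len(diffs)                                        # N = 7 pairs

rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(diffs)          # [1.0, -3, 1, 0, 3.5, 0, -2.5]
print(sum_d2)         # 29.5
print(round(rho, 2))  # 0.47
```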