Lecture-5-Correlation
Correlation coefficients are used to measure how strong a relationship is between two variables. There are several types of
correlation coefficient, but the most popular is Pearson’s. Pearson’s correlation (also called Pearson’s R) is a correlation coefficient
commonly used in linear regression. If you’re starting out in statistics, you’ll probably learn about Pearson’s R first. In fact, when
anyone refers to the correlation coefficient, they are usually talking about Pearson’s.
Pearson's correlation coefficient, when applied to a sample, is commonly represented by the letter r and may be referred to as
the sample correlation coefficient or the sample Pearson correlation coefficient. If we have one dataset {x1, ..., xn} containing n values
and another dataset {y1, ..., yn} containing n values, then the formula for r is given by:
$$
r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}
$$
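where:
n = number of paired values
Σx = sum of the values of the first variable
Σy = sum of the values of the second variable
Σxy = sum of the products of the paired values of the first and second variables
Σx² = sum of the squared values of the first variable
Σy² = sum of the squared values of the second variable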
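The formula for r translates almost term by term into code. The following is a minimal sketch, not a reference implementation; the function name `pearson_r` and the sample values in the usage lines are assumptions made for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson correlation using the computational (sums) formula."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))   # Σxy
    sum_x2 = sum(a * a for a in x)              # Σx²
    sum_y2 = sum(b * b for b in y)              # Σy²

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # r is undefined when either variable has no variation
        raise ValueError("r is undefined when a variable is constant")
    return numerator / denominator


# Hypothetical values: y is exactly twice x, so the relationship is perfectly linear.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

Reversing the order of the y values in the usage line gives a perfectly negative relationship instead.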
Value of r        Interpretation
 1.0              Perfect positive correlation
 0.90 to  0.99    Highly positive correlation
 0.70 to  0.89    High positive correlation
 0.40 to  0.69    Moderate positive correlation
 0.20 to  0.39    Small positive correlation
-0.20 to  0.19    Very small; negligible
-0.40 to -0.21    Small negative correlation
-0.70 to -0.41    Moderate negative correlation
-0.90 to -0.71    High negative correlation
-0.99 to -0.91    Highly negative correlation
-1.0              Perfect negative correlation
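The interpretation table can be applied mechanically. The sketch below assumes that r is first rounded to two decimal places, since the table lists its ranges to two decimals, and that only exactly 1.0 or -1.0 counts as "perfect". Both choices, and the function name `interpret_r`, are assumptions rather than part of the lecture.

```python
def interpret_r(r):
    """Map a correlation coefficient to the label in the interpretation table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 1.0:
        return "perfect positive correlation"
    if r == -1.0:
        return "perfect negative correlation"

    # Work in whole hundredths to avoid floating-point comparisons;
    # clamp so that values such as 0.999 are not labeled "perfect".
    h = max(-99, min(99, round(r * 100)))

    if h >= 90:
        return "highly positive correlation"
    if h >= 70:
        return "high positive correlation"
    if h >= 40:
        return "moderate positive correlation"
    if h >= 20:
        return "small positive correlation"
    if h >= -20:
        return "very small; negligible"
    if h >= -40:
        return "small negative correlation"
    if h >= -70:
        return "moderate negative correlation"
    if h >= -90:
        return "high negative correlation"
    return "highly negative correlation"


print(interpret_r(0.75))
print(interpret_r(-0.5))
```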
• A positive correlation indicates a direct relationship between the two variables: as the first variable increases, the second
variable also increases. Likewise, a negative correlation indicates an inverse relationship between the two variables: as the
first variable increases, the second variable decreases.
• The Pearson product-moment correlation coefficient can be seriously affected by extreme observations (outliers). Moreover,
tests based on it rely for their validity on the assumption of normality.
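The sensitivity to extreme observations is easy to see numerically. The sketch below uses made-up data, not data from the lecture: five points that lie exactly on a line, then the same points plus one extreme observation. The helper `pearson_r` is a compact version of the sums formula for r.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from the computational (sums) formula."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


# Hypothetical data: y = 2x exactly, so r is 1.
x_clean = [1, 2, 3, 4, 5]
y_clean = [2, 4, 6, 8, 10]

# Same data with one extreme observation appended.
x_outlier = x_clean + [6]
y_outlier = y_clean + [-20]

r_clean = pearson_r(x_clean, y_clean)
r_outlier = pearson_r(x_outlier, y_outlier)
print(f"without the extreme point: r = {r_clean:.2f}")
print(f"with the extreme point:    r = {r_outlier:.2f}")
```

In this made-up case a single point is enough to turn a perfect positive correlation into a moderate negative one (roughly -0.44), which is why a scatter plot is worth checking before r is interpreted.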
Example: The data below show the examination results of five students in Research and Statistics. Compute the value of r for the given
data and interpret the result.
Student Research Statistics
A 17 16
B 12 23
C 14 32
D 15 16
E 22 29
Solution: Let Research be the x variable and Statistics the y variable. Compute x², y², and xy for each student, then take the sum of each column.
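The intermediate values called for in the solution can be written out as follows. Every entry is computed directly from the five pairs of scores in the example.

Student    x     y     x²     y²     xy
A         17    16    289    256    272
B         12    23    144    529    276
C         14    32    196   1024    448
D         15    16    225    256    240
E         22    29    484    841    638
Sum       80   116   1338   2906   1874

Substituting the sums into the formula for r with n = 5:

$$
r = \frac{5(1874) - (80)(116)}{\sqrt{\left[5(1338) - (80)^2\right]\left[5(2906) - (116)^2\right]}}
  = \frac{90}{\sqrt{(290)(1074)}}
  = \frac{90}{\sqrt{311460}}
  \approx \frac{90}{558.09}
  \approx 0.16
$$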
Interpretation: Since r = 0.16, there is a very small or negligible correlation between the students' examination results in
Research and Statistics.
Now, to test the significance of the Pearson correlation coefficient, compute Student's t statistic
using the formula:
$$
t = r\sqrt{\frac{n - 2}{1 - r^2}}, \qquad df = n - 2
$$
where:
t = Student's t statistic
n = number of paired samples
r = Pearson's product-moment correlation coefficient
df = degrees of freedom
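As a rough illustration, the t statistic can be computed for the five-student example, where r = 0.16 with five paired scores. The function name `t_statistic` is an assumption made for this sketch, and the rounded value of r reported in the interpretation is used as input.

```python
from math import sqrt


def t_statistic(r, n):
    """Return (t, df) for testing the significance of a Pearson r."""
    if n <= 2:
        raise ValueError("at least three paired samples are needed")
    if not -1.0 < r < 1.0:
        raise ValueError("t is undefined when r is exactly 1 or -1")
    df = n - 2                          # degrees of freedom
    t = r * sqrt(df / (1 - r ** 2))
    return t, df


t, df = t_statistic(0.16, 5)
print(f"t = {t:.2f}, df = {df}")
```

This gives t of about 0.28 with df = 3. The computed t is then compared with the critical value from a t table. Assuming a two-tailed test at the 0.05 level (the notes do not state a level), the critical value for df = 3 is about 3.182, so a correlation this small would not be judged significant with only five students.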