RM9 - Analysis of Correlation Explanation and Example Part 1
Learning Objectives: After reading this information sheet, you must be able to:
Correlation analysis is a method for describing the linear relationship between two different
variables. With this method, we can see the pattern in the data and measure how close to linear it is.
Two variables are associated if a change in one variable tends to be accompanied by a change in
the other variable.
Keep in mind that the correlation coefficient does not tell us whether variable A causes a change
in variable B or whether variable B causes a change in variable A.
It only describes a relationship; we cannot conclude that variable A causes a change in
variable B from correlation analysis alone. Do not do that!
From the chart, you can see that there is a relationship between age and weight. As age
increases, weight keeps rising. Likewise, the lower the age, the lower the weight. That’s how
correlation works!
It shows you the connection between two different things!
Lesson 2: Types of analysis of correlation
1. Positive correlation
A positive correlation is a relationship between two variables in which an increase in one variable
is accompanied by an increase in the other variable.
It also works the other way: when one variable decreases, the other variable decreases as
well.
For example, when growing fruit, using more fertilizer tends to go with higher
production. For part-time workers, the longer they work, the greater
the pay.
2. Negative correlation
A negative correlation is the opposite: a relationship between two variables in which an increase
in one variable is accompanied by a decrease in the other variable, and vice versa.
For example, when the price of rice continues to rise, people’s purchasing power
decreases. The longer a person studies, the fewer mistakes they make.
Correlation analysis always involves two variables that are tied together. Statisticians usually
start with a scatterplot, which gives an initial sign of what to analyze. A scatterplot provides a
general picture so that we can see the correlation between the two variables.
A scatterplot also helps check whether there are outliers in the data set. Outliers need to be checked
because they can affect the results of both descriptive and inferential analysis.
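To illustrate why outliers deserve a check, here is a minimal Python sketch that computes the correlation coefficient for a small data set with and without one extreme point. The data values and the function name `pearson_r` are made up for illustration and do not come from this sheet.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: y rises steadily with x.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 12]
r_clean = pearson_r(x, y)

# The same data plus one hypothetical outlier far away from the pattern.
r_with_outlier = pearson_r(x + [7], y + [-20])

print(r_clean, r_with_outlier)
```

In this made-up case a single outlier is enough to flip the sign of the coefficient, which is why it helps to look at the scatterplot before trusting the number.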
We need two sets of data, one plotted along the horizontal axis and the other along the vertical
axis. Both must be numerical data.
From the scatterplot, at a glance, you can see that price and the number of products sold move in
opposite directions. The higher the price, the fewer products are sold. We can conclude that this is
a negative correlation.
From the scatterplot above, you’ll see a different pattern from the previous one. We can see that
the older the child, the greater the weight. We can conclude that this is a positive correlation.
Basically, the closer the value is to 1, the stronger the positive relationship between the two variables.
As it approaches zero, the association between the two variables gets weaker. A
negative value means there is a negative correlation.
The interpretation works the same way. The closer to -1, the stronger the negative correlation.
The closer to 0, the weaker the negative correlation.
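The reading rule above can be sketched as a small Python helper. The function name `describe_correlation` is hypothetical, and the sketch deliberately reports only the direction and the magnitude, since this sheet does not fix cut-off values for "weak" or "strong".

```python
def describe_correlation(r):
    """Return (direction, magnitude) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    # The magnitude |r| measures strength: closer to 1 is stronger, closer to 0 is weaker.
    return direction, abs(r)

print(describe_correlation(-0.66))
print(describe_correlation(0.2))
```

Comparing the magnitudes of two coefficients tells you which relationship is stronger, regardless of sign.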
Lesson 5: Measuring analysis of correlation
We can see the pattern and direction of two variables from the scatterplot, but we also need to measure
how strong the relationship between the two variables is. We need a specific number to define the
strength of the correlation.
Usually, we use a correlation coefficient to calculate how strongly two variables are associated. Several
formulas exist; this information sheet covers two of them.
One of the most widely used is the Pearson Correlation Coefficient, often called simply the
correlation coefficient. This coefficient is calculated with the following terms:
Answer: Let x be age and y be weight. We can then define each part of the formula.
Based on the correlation value, we can conclude that there is a very strong positive correlation
between age and weight. The older someone is, the heavier they tend to be.
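As a rough illustration of the Pearson calculation, the sketch below uses the common raw-score form r = (nΣxy - ΣxΣy) / sqrt((nΣx² - (Σx)²)(nΣy² - (Σy)²)), which may be written differently from the formula in this sheet. The age and weight values are invented for the example and are not the sheet's data.

```python
from math import sqrt

def pearson_raw_score(xs, ys):
    """Pearson r from the sums n, Σx, Σy, Σxy, Σx² and Σy²."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data: x is age in years, y is weight in kilograms.
age = [2, 4, 6, 8, 10]
weight = [12, 16, 20, 25, 31]

r = pearson_raw_score(age, weight)
print(round(r, 3))
```

For this made-up data the coefficient comes out very close to 1, the kind of value that would be read as a very strong positive correlation.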
2. Spearman Correlation
Spearman Correlation is a correlation measurement method for data on an ordinal (rank)
scale. It is also used when both variables are quantitative but the normality condition is not met.
There are two cases for the Spearman correlation. First, when there are no double ranks or double data.
Second, when there are double ranks or double data (ties).
Now, take a look at the steps in using the Spearman correlation test:
Arrange the data and rank it from smallest to largest. If some values are the same, give
each of them the average of their ranks.
Find the difference between the rank of the first variable and the rank of the second variable.
Use the calculation formula that matches the data condition.
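The ranking and difference steps above can be sketched in Python as follows. The function name `average_ranks` and the sample scores are assumptions made for illustration only.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values equal to the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Average of the 1-based positions pos+1 .. end+1.
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

# Hypothetical scores for four students.
first = [70, 85, 85, 60]
second = [55, 90, 75, 65]

rank_first = average_ranks(first)
rank_second = average_ranks(second)
differences = [a - b for a, b in zip(rank_first, rank_second)]
print(rank_first, rank_second, differences)
```

Here the two scores of 85 would occupy ranks 3 and 4, so each receives the average rank 3.5.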
a) No double rank/data
Let us start with the first case, where there is no double rank or double data. The formula for the
Spearman correlation is:
We can rank the data from the largest or from the smallest before calculating the correlation, depending on
the needs and type of question, as long as both variables are ranked in the same direction.
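For the case without ties, a minimal Python sketch follows. It assumes the standard form r_s = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the two ranks of each observation and n is the number of observations. The scores and function names are hypothetical.

```python
def ranks(values, descending=False):
    """Rank distinct values; rank 1 goes to the smallest (or largest if descending)."""
    ordered = sorted(values, reverse=descending)
    return [ordered.index(v) + 1 for v in values]

def spearman_no_ties(xs, ys, descending=False):
    """Spearman correlation for data without tied values."""
    n = len(xs)
    rank_x = ranks(xs, descending)
    rank_y = ranks(ys, descending)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical scores for five students (no repeated values).
math_scores = [80, 65, 90, 70, 75]
science_scores = [72, 60, 88, 75, 70]

r_ascending = spearman_no_ties(math_scores, science_scores)
r_descending = spearman_no_ties(math_scores, science_scores, descending=True)
print(r_ascending, r_descending)
```

Both ranking directions give the same coefficient, because reversing both rankings only changes the sign of each difference, not its square.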
We are examining the math and science marks of ten students. We want to know whether there is a
relationship between the science score and the math score.
Use the formula above and you’ll find this result!
Based on the calculation, the correlation value between the science score and the math
score is -0.66.
There is a moderate negative correlation between the math score and the science score.
The higher the science score, the lower the math score tends to be.
b) Double rank/data
Sometimes there are double data (tied values) when ranking for the Spearman correlation test. In that
case, the formula is different and the ties need special treatment.
Because we have double data or double ranks, we need to apply a correction factor using the
following formula.
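One common textbook form of the tie correction, which may differ in notation from the one used in this sheet, is r_s = (Σx² + Σy² - Σd²) / (2 sqrt(Σx² Σy²)), with Σx² = (n³ - n)/12 - ΣT_x and T = (t³ - t)/12 for each group of t tied values (and likewise for y). The Python sketch below assumes that form; the function names and scores are invented for illustration.

```python
from collections import Counter
from math import sqrt

def average_ranks(values):
    """Rank from smallest (1) to largest; tied values share the average rank."""
    ranks = []
    for v in values:
        smaller = sum(1 for w in values if w < v)
        equal = sum(1 for w in values if w == v)
        # Tied values occupy positions smaller+1 .. smaller+equal; take their mean.
        ranks.append(smaller + (equal + 1) / 2)
    return ranks

def tie_term(values):
    """Sum of (t^3 - t) / 12 over every group of t equal values (0 when t = 1)."""
    return sum((t ** 3 - t) / 12 for t in Counter(values).values())

def spearman_with_ties(xs, ys):
    """Spearman correlation with the tie correction described in the lead-in."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    sum_x2 = (n ** 3 - n) / 12 - tie_term(xs)
    sum_y2 = (n ** 3 - n) / 12 - tie_term(ys)
    return (sum_x2 + sum_y2 - sum_d2) / (2 * sqrt(sum_x2 * sum_y2))

# Hypothetical scores for five students; each list contains one tied pair.
scores_a = [70, 85, 85, 60, 90]
scores_b = [65, 80, 75, 65, 95]

r_s = spearman_with_ties(scores_a, scores_b)
print(round(r_s, 3))
```

This form gives the same value as computing the Pearson coefficient on the average ranks, which is a convenient way to cross-check a hand calculation.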
Example:
Suppose we have the Biology and History scores of 10 students. We would like to
know how strong the correlation is. Let’s use the formula!
Conclusion: There is a weak negative correlation between the Biology score and the History score. The
correlation value is -0.146.
Lesson 6: How to do correlation analysis using Microsoft Excel
If you want to use a Microsoft Excel formula, it’s really easy. You can use a simple formula to find
the result instantly. Excel’s built-in CORREL function, for example, returns the Pearson correlation
coefficient of two ranges.
Just enter the formula, select the data ranges of the two variables, and press Enter. You’ll
get exactly the same value as in the Pearson example above.
However, you can’t use this formula directly for ordinal data or for the Spearman formula. For those, use
another statistical tool such as SPSS, SAS, Minitab, or others to find your correlation value.