Lecture 7 Correlation

Uploaded by

amtullahhadia02

Correlation

EXAMINING RELATIONSHIP BETWEEN VARIABLES

INSTRUCTOR: FATIMA ZAFAR
Correlation
Correlation refers to a process for establishing the relationship between two variables. One
way to get a general idea of whether two variables are related is to plot them on a
“scatter plot”.
Correlation analysis is used to measure the direction and strength of the relationship between
two variables. It's important to note that correlation does not equal causation: while a
relationship may be observed, correlation alone cannot show that one variable caused or
affected the other. The observed relationship may be due to other variables not accounted for
in the model.
Correlation Coefficient
A correlation coefficient is a number between -1 and 1 that tells you the strength and direction
of a relationship between variables.

Correlation coefficient value | Correlation type | Meaning
1 | Perfect positive correlation | When one variable changes, the other variable changes in the same direction.
0 | Zero correlation | There is no linear relationship between the variables.
-1 | Perfect negative correlation | When one variable changes, the other variable changes in the opposite direction.
Interpreting a correlation coefficient
The sign of the coefficient reflects whether the variables change in the same or opposite
directions: a positive value means the variables change together in the same direction, while a
negative value means they change together in opposite directions.
The absolute value of a number is equal to the number without its sign. The absolute value of a
correlation coefficient tells you the magnitude of the correlation: the greater the absolute value,
the stronger the correlation.
Interpreting a correlation coefficient
There are many different guidelines for interpreting the correlation coefficient because findings
can vary a lot between study fields. You can use the table below as a general guideline for
interpreting correlation strength from the value of the correlation coefficient.
Correlation coefficient | Correlation strength | Correlation type
-.7 to -1 | Very strong | Negative
-.5 to -.7 | Strong | Negative
-.3 to -.5 | Moderate | Negative
0 to -.3 | Weak | Negative
0 | None | Zero
0 to .3 | Weak | Positive
.3 to .5 | Moderate | Positive
.5 to .7 | Strong | Positive
.7 to 1 | Very strong | Positive
Cont..
-1 = perfect negative correlation
-.7 = strong negative correlation
-.5 = moderate negative correlation
-.3 = weak negative correlation
0 = no correlation
.3 = weak positive correlation
.5 = moderate positive correlation
.7 = strong positive correlation
1 = perfect positive correlation
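The guideline above can be sketched as a small lookup function. The name `describe_correlation` is hypothetical, and the handling of boundary values (.3 counts as weak, .5 as moderate, .7 as strong) is an assumption based on the list above; other fields may draw the lines differently.

```python
def describe_correlation(r: float) -> str:
    """Label a coefficient using the general guideline from the slides.

    Assumption: a value exactly on a boundary (.3, .5, .7) takes the
    weaker label, matching the list above.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if r == 0:
        return "no correlation"

    size = abs(r)
    if size <= 0.3:
        strength = "weak"
    elif size <= 0.5:
        strength = "moderate"
    elif size <= 0.7:
        strength = "strong"
    else:
        strength = "very strong"

    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


print(describe_correlation(0.25))   # weak positive
print(describe_correlation(-0.62))  # strong negative
```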
A correlation coefficient is a bivariate statistic when it summarizes the relationship between two
variables, and it’s a multivariate statistic when you have more than two variables.
A correlation coefficient is also an effect size measure, which tells you the practical significance
of a result.
The p-value helps us determine whether we can conclude that the population correlation
coefficient is different from zero, based on what we observe in the sample.
Examples
Is there an association/relationship between:
• Children’s IQ and Parents’ IQ
• Exam scores and preparation time
• Depression and life satisfaction
• Age and height
• Years of marriage and marital satisfaction
Visualizing linear correlations

The correlation coefficient tells you how closely your data fit on a line. If you have a linear
relationship, you’ll draw a straight line of best fit that takes all of your data points into account
on a scatter plot.
The closer your points are to this line, the higher the absolute value of the correlation coefficient
and the stronger your linear correlation.
If all points are perfectly on this line, you have a perfect correlation.
If all points are close to this line, the absolute value of your correlation coefficient is high.
If these points are spread far from this line, the absolute value of your correlation coefficient is low.
Cont..
Note that the steepness or slope of the line isn’t related to the correlation coefficient value. The
correlation coefficient doesn’t help you predict how much one variable will change based on a
given change in the other, because two datasets with the same correlation coefficient value can
have lines with very different slopes.
Types of correlation coefficients

You can choose from many different correlation coefficients based on the linearity of the
relationship, the level of measurement of your variables, and the distribution of your data.
For high statistical power and accuracy, it’s best to use the correlation coefficient that’s most
appropriate for your data.
The most commonly used correlation coefficient is Pearson’s r because it allows for strong
inferences. It’s parametric and measures linear relationships. But if your data do not meet
all assumptions for this test, you’ll need to use a non-parametric test instead.
Non-parametric rank correlation coefficients summarize non-linear (monotonic) relationships
between variables. Spearman’s rho and Kendall’s tau have the same conditions for use, but
Kendall’s tau is generally preferred for smaller samples, whereas Spearman’s rho is more widely
used.
Cont..
Correlation coefficient | Type of relationship | Levels of measurement | Data distribution
Pearson’s r | Linear | Two quantitative (interval or ratio) variables | Normal distribution
Spearman’s rho | Non-linear | Two ordinal, interval or ratio variables | Any distribution
Point-biserial | Linear | One dichotomous (binary) variable and one quantitative (interval or ratio) variable | Normal distribution
Cramér’s V (Cramér’s φ) | Non-linear | Two nominal variables | Any distribution
Kendall’s tau | Non-linear | Two ordinal, interval or ratio variables | Any distribution
Pearson’s r

The Pearson’s product-moment correlation coefficient, also known as Pearson’s r, describes the
linear relationship between two quantitative variables.
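For reference, Pearson's r can be computed from scratch as the sum of the products of the deviations from each mean, divided by the square root of the product of the summed squared deviations. The function name `pearson_r` and the sample data are hypothetical; this is a minimal sketch rather than a replacement for a statistics package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r for two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Deviations of each observation from its variable's mean.
    dev_x = [value - mean_x for value in x]
    dev_y = [value - mean_y for value in y]

    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    denominator = sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))
    if denominator == 0:
        raise ValueError("r is undefined when a variable is constant")
    return numerator / denominator


# Made-up ages (years) and heights (cm).
ages = [4, 6, 8, 10, 12]
heights = [102, 115, 128, 138, 149]
print(pearson_r(ages, heights))
```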
These are the assumptions your data must meet if you want to use Pearson’s r:
I. Both variables are on an interval or ratio level of measurement
II. Data from both variables follow normal distributions
III. Your data have no outliers
IV. Your data come from a random or representative sample
V. You expect a linear relationship between the two variables

Pearson’s r is a parametric measure, so it has high power when its assumptions hold. But it’s
not a good measure of correlation if your variables have a nonlinear relationship, or if your
data have outliers or skewed distributions, or come from categorical variables. If any of these
assumptions are violated, you should consider a rank correlation measure.