
Pearson Correlation Coefficient

The document discusses the Pearson r correlation coefficient, which measures the strength of a relationship between two variables. It provides examples of variables that may be correlated, such as the number of male deer and their mating success. The document then gives step-by-step instructions for students to calculate the Pearson r using data collected on the number of fallen logs and red-backed salamanders found in different areas. The students find a strong positive correlation between the two variables.

Statistical Analysis - 3

PEARSON R CORRELATION COEFFICIENT
Introduction:
Sometimes in scientific data, it appears that two variables are
connected in such a way that when one variable changes, the other
variable changes also. This connection is called a correlation.
Examples of this type of correlation include: (1) in deer populations,
large males seem to have more successful matings; and (2) larger
numbers of birds seem to nest in areas with dense vegetation.

Scientists measure the strength of a relationship between two
variables by calculating a correlation coefficient. The value of the
correlation coefficient indicates to what extent the change found in
one variable relates to change in another. There are several types
of correlation coefficients, but the one that is most widely used is
called the Pearson Product-Moment Correlation Coefficient, or
simply, the Pearson r.

Student Procedure
Example: Your students have done some classroom research on
amphibian species found in your area and have discovered that the
red-backed salamander uses fallen logs and debris on the forest
floor for its home. During their earlier census of their Biodiversity
Plot, they noticed that some quadrats have many fallen logs
whereas other quadrats have few or none. They expect that they
would find more red-backed salamanders in those quadrats with
many fallen logs and design an experiment to test this hypothesis.
This experiment measures: (1) the number of fallen logs in each
quadrat; and (2) the number of red-backed salamanders in each
quadrat. This is a table of the data your class has collected:


Quadrat Number   # Fallen Logs   # Salamanders


1 4 3
2 6 2
3 2 0
4 3 1
5 7 3
6 1 0
7 0 0
8 2 0
9 0 0
10 3 1
11 2 1
12 4 2
13 0 0
14 2 1
15 2 0
16 1 0
17 3 1
18 5 3
19 3 1
20 2 2
21 2 1
22 0 0
23 0 0
24 2 0
25 1 0

Step 1:
Graphing the data
Graph your data by computer, or by hand, by assigning the number
of fallen logs (the second column) as the X-axis and the number of
salamanders (the last column) as the Y-axis. For example, in
Quadrat 1, the X-value would be 4 and the Y-value would be 3. The
results, when we plot all 25 points on the graph, look like this:
[Figure: RELATIONSHIP OF FALLEN LOGS AND SALAMANDERS. Scatter plot with Number of Fallen Logs (0 to 8) on the X-axis and Number of Salamanders (0 to 3.5) on the Y-axis.]
Looking at this graph, there seems to be a positive relationship
between the number of fallen logs and the number of salamanders.
In other words, it appears that when the number of fallen logs
increases, the number of Red-Backed Salamanders also increases.
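
For classes graphing by computer without a charting program, the plot from Step 1 can be roughed out in plain text. This is only a sketch: the helper name text_scatter is hypothetical, and each cell shows how many quadrats share that (logs, salamanders) point rather than drawing a true scatter plot.

```python
from collections import Counter

# Data from the class table: one entry per quadrat (1 to 25).
logs = [4, 6, 2, 3, 7, 1, 0, 2, 0, 3, 2, 4, 0, 2, 2, 1, 3, 5, 3, 2, 2, 0, 0, 2, 1]
salamanders = [3, 2, 0, 1, 3, 0, 0, 0, 0, 1, 1, 2, 0, 1, 0, 0, 1, 3, 1, 2, 1, 0, 0, 0, 0]

# Count how many quadrats fall on each (X, Y) point.
counts = Counter(zip(logs, salamanders))

def text_scatter(counts, max_x, max_y):
    """Return a text grid: Y (salamanders) runs up the side, X (logs) along the bottom."""
    rows = []
    for y in range(max_y, -1, -1):
        cells = [str(counts[(x, y)]) if (x, y) in counts else "." for x in range(max_x + 1)]
        rows.append(f"{y} | " + " ".join(cells))
    rows.append("    " + " ".join(str(x) for x in range(max_x + 1)))
    return "\n".join(rows)

grid = text_scatter(counts, max(logs), max(salamanders))
print(grid)
```

The filled cells drift from the lower left toward the upper right, which is the positive relationship described above.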
Some things to remember about the Pearson r correlation:
• The Pearson r ranges from r = -1.00 to r = +1.00. The weakest
value it can have is r = 0.00. This means there is ZERO correlation,
and would indicate that X and Y are not linearly related to one another.
• The strongest values that the Pearson r can have are r = 1.00 and
r = -1.00. Either indicates a PERFECT correlation and would indicate
that X and Y are completely related to one another in the sample.
• Pearson r values can be either positive or negative. A positive
value indicates that increases in X correspond to increases in Y. A
negative value indicates that increases in one variable are
associated with decreases in the other variable.
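
These properties can be checked on small made-up data sets. The sketch below assumes the sum-based Pearson r formula given later in this handout; the function name pearson_r and the three Y lists are illustrative only and are not part of the class data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r for paired lists, using the sum-based formula from this handout."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

x = [1, 2, 3, 4, 5]
perfect_positive = pearson_r(x, [2, 4, 6, 8, 10])  # Y rises in step with X
perfect_negative = pearson_r(x, [10, 8, 6, 4, 2])  # Y falls in step with X
zero_correlation = pearson_r(x, [2, 1, 3, 1, 2])   # made-up values with no linear trend

print(perfect_positive, perfect_negative, zero_correlation)
```

The three results come out at 1, -1 and 0, matching the perfect positive, perfect negative and zero cases listed above.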

The following graphs illustrate some of the various types of correlations
possible:
a. This is an example of a perfect, positive
correlation, in that the data shows no
deviation from a straight line.

b. This is an example of a perfect, negative
correlation, in that the data shows no
deviation from a straight line.

c. This is an example of a high, positive
correlation. Since the data shows some
variability, a perfect prediction cannot be
made.

d. This is an example of a high, positive
correlation. Since the data shows some
variability, a perfect prediction cannot be
made.

e. This shows a low correlation. Although
predictions could be made, and those
predictions would be slightly better than
chance, estimates would still be imprecise.

f. This figure shows a zero correlation.
Prediction would be no better than chance.


Step 3:
Calculating the Pearson r Correlation Coefficient
The graph below was produced by Microsoft Excel (charting
function) which calculated a correlation coefficient from the data in
our example. The graph shows a trend indicating an increase in
salamanders where there are more fallen logs present. Note,
however, that the value calculated by this program is the Pearson r
value squared. You must take the square root of this figure to give
the size of the Pearson r value. The square root is always positive,
so take the sign of r from the direction of the trend (positive here,
since salamanders increase with fallen logs). From the graph:
R² = 0.72; Pearson r = 0.85. Because 0.85 is close to 1.0 (the
maximum value for the Pearson r), this demonstrates a strong,
positive correlation.
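
The conversion from the charted R² to the Pearson r can be sketched as follows. The variable names are illustrative, and the +1.0 sign is an assumption taken from the upward trend described in the text.

```python
from math import copysign, sqrt

r_squared = 0.7175   # R² value reported on the Excel chart in this example
trend_sign = 1.0     # assumption: +1.0 because the trend rises; use -1.0 for a falling trend

# The square root gives the size of r; copysign attaches the direction of the trend.
r = copysign(sqrt(r_squared), trend_sign)
print(round(r, 2))   # 0.85
```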

[Figure: RELATIONSHIP OF FALLEN LOGS AND SALAMANDERS. Scatter plot with trend line, Number of Fallen Logs (0 to 8) on the X-axis and Number of Salamanders on the Y-axis. Chart labels: R² = 0.7175, Pearson r = 0.85.]

If you are not using Excel or another graphing program, you can
calculate the Pearson r by using the following formula:

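FORMULA FOR CALCULATING THE PEARSON R CORRELATION COEFFICIENT

$$\text{Pearson } r = \frac{N\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N\left(\sum X^2\right) - \left(\sum X\right)^2\right]\left[N\left(\sum Y^2\right) - \left(\sum Y\right)^2\right]}}$$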


This formula looks complicated, but can be simplified by breaking it
into its separate components. Using your original data, create the
following table:

Quadrat Number (N)   # Fallen Logs (X)   # Salamanders (Y)   X²   Y²   XY
1 4 3 16 9 12
2 6 2 36 4 12
3 2 0 4 0 0
4 3 1 9 1 3
5 7 3 49 9 21
6 1 0 1 0 0
7 0 0 0 0 0
8 2 0 4 0 0
9 0 0 0 0 0
10 3 1 9 1 3
11 2 1 4 1 2
12 4 2 16 4 8
13 0 0 0 0 0
14 2 1 4 1 2
15 2 0 4 0 0
16 1 0 1 0 0
17 3 1 9 1 3
18 5 3 25 9 15
19 3 1 9 1 3
20 2 2 4 4 4
21 2 1 4 1 2
22 0 0 0 0 0
23 0 0 0 0 0
24 2 0 4 0 0
25 1 0 1 0 0
N = 25   ΣX = 57   ΣY = 22   ΣX² = 213   ΣY² = 46   ΣXY = 90
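
The extra columns and their totals can also be built by computer. This is a minimal sketch; the variable names are illustrative, and the two lists are the X and Y columns of the class table in quadrat order.

```python
# X = number of fallen logs, Y = number of salamanders, for quadrats 1 to 25.
logs = [4, 6, 2, 3, 7, 1, 0, 2, 0, 3, 2, 4, 0, 2, 2, 1, 3, 5, 3, 2, 2, 0, 0, 2, 1]
salamanders = [3, 2, 0, 1, 3, 0, 0, 0, 0, 1, 1, 2, 0, 1, 0, 0, 1, 3, 1, 2, 1, 0, 0, 0, 0]

n = len(logs)                                           # N, the number of quadrats
sum_x = sum(logs)                                       # total of the X column
sum_y = sum(salamanders)                                # total of the Y column
sum_x2 = sum(x * x for x in logs)                       # total of the X² column
sum_y2 = sum(y * y for y in salamanders)                # total of the Y² column
sum_xy = sum(x * y for x, y in zip(logs, salamanders))  # total of the XY column

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 25 57 22 213 46 90
```

These totals are the figures that go into the formula in the worked calculation.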

Using the values from the new table, complete the Pearson r
formula:

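$$\text{Pearson } r = \frac{N\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N\left(\sum X^2\right) - \left(\sum X\right)^2\right]\left[N\left(\sum Y^2\right) - \left(\sum Y\right)^2\right]}}$$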


Once we plug in all the numbers, the formula looks like this and
simplifies step by step:

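$$\text{Pearson } r = \frac{(25)(90) - (57)(22)}{\sqrt{\left[25(213) - (57)^2\right]\left[25(46) - (22)^2\right]}}$$

$$\text{Pearson } r = \frac{2250 - 1254}{\sqrt{\left[5325 - 3249\right]\left[1150 - 484\right]}}$$

$$\text{Pearson } r = \frac{996}{\sqrt{\left[2076\right]\left[666\right]}}$$

$$\text{Pearson } r = \frac{996}{\sqrt{1382616}}$$

$$\text{Pearson } r = \frac{996}{1175.85}$$

$$\text{Pearson } r \approx 0.847$$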
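
The same arithmetic can be checked by computer. This sketch starts from the totals used in the calculation above; the variable names are illustrative.

```python
from math import sqrt

# Totals used in the worked calculation.
n, sum_x, sum_y = 25, 57, 22
sum_x2, sum_y2, sum_xy = 213, 46, 90

numerator = n * sum_xy - sum_x * sum_y   # 2250 - 1254 = 996
x_term = n * sum_x2 - sum_x ** 2         # 5325 - 3249 = 2076
y_term = n * sum_y2 - sum_y ** 2         # 1150 - 484 = 666
denominator = sqrt(x_term * y_term)      # square root of 1382616, about 1175.85

r = numerator / denominator
print(round(r, 3))                       # 0.847
```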

Again, this Pearson r correlation coefficient (being close to
the maximum value of 1.0) demonstrates a strong positive correlation
between the number of fallen logs and the number of salamanders.


Step 4:
Determine if your calculations have statistical significance

You must determine whether or not your calculations have statistical
significance. To do this you must determine the ‘critical value’ for
your Pearson r correlation coefficient by using the table of critical
values below.

So for our example:

1. Calculate the degrees of freedom (DF) by subtracting 2 from
the number of paired observations you are comparing (DF = N - 2).

In our case, we are sampling fallen logs and salamanders in
25 quadrats (N = 25)

DF = 25 - 2 = 23

2. Find your DF in the table below and read off the critical value.

In our case, DF = 23 is not listed; the nearest DF listed is 25, with a
critical value of 0.3233. (The next lower listed DF, 20, gives a
stricter critical value of 0.3598.) Our calculated Pearson r
correlation coefficient is 0.847.

3. The calculated figure is greater than the critical value from the
table, so our findings are statistically significant at the 5% level.
This supports our hypothesis: there is a strong positive correlation
between the number of fallen logs and the number of salamanders,
and this correlation is unlikely to be due to chance.
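
The three steps above can be sketched in code. The function name is_significant is hypothetical, and the lookup dictionary holds only a few rows copied from this handout's table of critical values; a full version would include every row.

```python
# A few rows copied from the critical values table: DF -> critical value.
CRITICAL_R = {20: 0.3598, 25: 0.3233, 30: 0.2960}

def is_significant(r, n, table=CRITICAL_R):
    """Compare the size of r with the critical value for the nearest listed DF.

    If two listed DFs are equally near, the smaller (stricter) one is used.
    """
    df = n - 2
    nearest_df = min(table, key=lambda listed: (abs(listed - df), listed))
    return abs(r) > table[nearest_df], df, nearest_df

significant, df, used_df = is_significant(0.847, 25)
print(df, used_df, significant)   # 23 25 True
```

Comparing the absolute value of r with the critical value lets the same check work for negative correlations.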


CRITICAL VALUES FOR THE PEARSON R CORRELATION COEFFICIENT

DF (N - 2)   Critical Value (5% significance level)
1 .98769
2 .90000
3 .8054
4 .7293
5 .6694
6 .6215
7 .5822
8 .5494
9 .5214
10 .4973
11 .4762
12 .4575
13 .4409
14 .4259
15 .4124
16 .4000
17 .3887
18 .3783
19 .3687
20 .3598
25 .3233
30 .2960
35 .2746
40 .2573
45 .2428
50 .2306
60 .2108
70 .1954
80 .1829
90 .1726
100 .1638
