
Correlations: Islamic University of Gaza Statistics and Probability For Engineers (ENGC 6310)


Islamic University of Gaza

Statistics and Probability for Engineers


(ENGC 6310)
 
Lecture 10:

Correlations

Prof. Dr. Yunes Mogheir


Civil and Environmental Engineering Dept.

First Semester/2019
Definition

The linear correlation coefficient r measures the strength of the
linear relationship between paired x- and y-quantitative values in
a sample.

We can often see a relationship between two variables by
constructing a scatterplot.
Scatterplots of Paired Data
Requirements

1. The sample of paired (x, y) data is a random sample.
2. Visual examination of the scatterplot must
confirm that the points approximate a
certain pattern.
3. The outliers must be removed if they are
known to be errors.
Notation for the
Linear Correlation Coefficient
n represents the number of pairs of data present.
Σ denotes the addition of the items indicated.
Σx denotes the sum of all x-values.
Σx² indicates that each x-value should be squared and then those
squares added.
(Σx)² indicates that the x-values should be added and the total then
squared.
Σxy indicates that each x-value should be first multiplied by its
corresponding y-value. After obtaining all such products, find their
sum.
r represents the linear correlation coefficient for a sample.
ρ represents the linear correlation coefficient for a population.
Formula
The linear correlation coefficient r measures the
strength of a linear relationship between the
paired values in a sample.
nxy – (x)(y)
r=
n(x2) – (x)2 n(y2) – (y)2
Example: Calculating r
Using the simple random sample of data
below, find the value of r.
Data
x 3 1 3 5
y 5 8 6 4
Example: Calculating r - cont
Data
x 3 1 3 5
y 5 8 6 4

nxy – (x)(y)
r=
n(x2) – (x)2 n(y2) – (y)2

61 – (12)(23)
r=
4(44) – (12)2 4(141) – (23)2

-32
r= = -0.956
33.466
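
The hand calculation can be checked with a short script. This is a minimal sketch in Python, using only the standard library; the function name `linear_correlation` is an illustrative choice, not something defined in the lecture.

```python
import math

def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r, built from the sums in the formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from the example
x = [3, 1, 3, 5]
y = [5, 8, 6, 4]

r = linear_correlation(x, y)
print(round(r, 3))  # -0.956
```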
Properties of the
Linear Correlation Coefficient r
1. –1 ≤ r ≤ 1
2. The value of r does not change if all values
of either variable are converted to a different
scale.
3. The value of r is not affected by the choice of
x and y. Interchange all x- and y-values and the
value of r will not change.
4. r measures strength of a linear relationship.
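
Properties 1 to 3 can be checked numerically on the example data. The sketch below is illustrative only: the helper `pearson_r` and the conversion `1.8 * x + 32` are assumptions made for the demonstration, and the conversion uses a positive scale factor (a negative factor would flip the sign of r).

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [3, 1, 3, 5]
y = [5, 8, 6, 4]

r_original = pearson_r(x, y)

# Property 2: convert x to a different scale (hypothetical conversion).
x_rescaled = [1.8 * v + 32 for v in x]
r_rescaled = pearson_r(x_rescaled, y)

# Property 3: interchange the x- and y-values.
r_swapped = pearson_r(y, x)

print(r_original, r_rescaled, r_swapped)  # three values that agree up to rounding
```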
Interpreting r :
Explained Variation
The value of r² is the proportion of the
variation in y that is explained by the linear
relationship between x and y.

For example, if r = 0.926, we get r² = 0.857.
We conclude that 0.857 (or about 86%) of the
variation in y can be explained by the linear
relationship between x and y. This implies that about 14%
of the variation in y cannot be explained by x.
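
As a further illustration, the same reasoning can be applied to the four-pair sample from the calculation example, where r = -0.956. The percentages below are rounded:

$$r^2 = (-0.956)^2 \approx 0.914$$

About 91% of the variation in y in that sample is explained by the linear relationship with x, and about 9% is not.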
Formal Hypothesis Test

We wish to determine whether there is a
significant linear correlation between two
variables.
H0: ρ = 0 (no significant linear correlation)
H1: ρ ≠ 0 (significant linear correlation)
Test Statistic is t

Test statistic:

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

Critical values:
Use Tables with degrees of freedom = n – 2

P-value:
Use Tables with degrees of freedom = n – 2

Conclusion:
If |t| > critical value, reject H0 and conclude
that there is a linear correlation. If |t| ≤ critical
value, fail to reject H0; there is not sufficient
evidence to conclude that there is a linear
correlation.
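
The test can be sketched for the earlier example (r = -0.956, n = 4). The significance level is not stated in the slides, so the code assumes a two-tailed test at α = 0.05, for which a standard t table gives a critical value of 4.303 at 2 degrees of freedom. The function name `t_statistic` is illustrative.

```python
import math

def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

r = -0.956  # rounded value from the calculation example
n = 4       # number of (x, y) pairs

t = t_statistic(r, n)

# Assumption: two-tailed test at alpha = 0.05, df = n - 2 = 2 (value from a t table).
critical_value = 4.303

reject_h0 = abs(t) > critical_value
print(round(t, 2), reject_h0)  # t is about -4.6, so H0 is rejected under these assumptions
```

With only four pairs the decision is close to the boundary, so a different choice of α could change the conclusion.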
Test Statistic is t
(follows format of earlier chapters)
Covariance
Covariance is a measure of the linear relationship between variables.
If the relationship between the random variables is nonlinear,
the covariance might not be sensitive to the relationship.
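
A small sketch makes the point concrete. The data are hypothetical, and the code assumes the sample covariance with an n - 1 divisor. For a linear relationship the covariance is clearly nonzero; for a perfectly determined but symmetric quadratic relationship it is zero.

```python
def sample_covariance(xs, ys):
    """Sample covariance with an n - 1 divisor (an assumption of this sketch)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data
x = [-2, -1, 0, 1, 2]
y_linear = [2 * v + 1 for v in x]   # linear relationship
y_quadratic = [v ** 2 for v in x]   # nonlinear relationship, y fully determined by x

cov_linear = sample_covariance(x, y_linear)
cov_quadratic = sample_covariance(x, y_quadratic)

print(cov_linear)     # 5.0
print(cov_quadratic)  # 0.0
```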

Pearson’s Correlation Coefficient
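Pearson's correlation coefficient between two variables is defined
as the covariance of the two variables divided by the product of
their standard deviations:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}$$

The above formula defines the population correlation coefficient,
commonly represented by the Greek letter ρ (rho). Substituting
estimates of the covariances and variances based on a sample
gives the sample correlation coefficient, commonly denoted r:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$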
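
The definition can be sketched directly: compute the sample covariance, divide by the product of the sample standard deviations, and compare with the value found earlier from the sums formula. The helper names are illustrative, and the n - 1 divisor is an assumption (it cancels in the ratio, so any consistent divisor gives the same r).

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance with an n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sample_std(xs):
    """Sample standard deviation: square root of the covariance of xs with itself."""
    return math.sqrt(sample_covariance(xs, xs))

# Data from the calculation example
x = [3, 1, 3, 5]
y = [5, 8, 6, 4]

r = sample_covariance(x, y) / (sample_std(x) * sample_std(y))
print(round(r, 3))  # -0.956
```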

                                                             
                                                             
                                                             
                                                             
Spearman correlation coefficient
The Spearman correlation coefficient is often thought
of as being the Pearson correlation coefficient
between the ranked variables. In practice, however, a
simpler procedure is normally used to calculate ρ.
The n raw scores X_i, Y_i are converted to ranks x_i, y_i,
and the differences d_i = x_i − y_i between the ranks of
each observation on the two variables are calculated.
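
The rank-difference procedure can be sketched as follows. The slide does not show the closing formula, so this sketch assumes the standard shortcut ρ = 1 − 6Σd_i² / (n(n² − 1)), which holds only when there are no tied ranks. The data and function names are hypothetical.

```python
def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman coefficient from rank differences (no-ties shortcut formula)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical raw scores with no ties
raw_x = [10, 20, 30, 40, 50]
raw_y = [12, 25, 22, 41, 39]

print(ranks(raw_y))  # [1, 3, 2, 5, 4]
rho = spearman_rho(raw_x, raw_y)
print(round(rho, 3))  # 0.8
```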

Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
A Spearman correlation of 1 results when the two variables being
compared are monotonically related (one increases whenever the other does),
even if their relationship is not linear. In contrast, such a relationship
does not give a perfect Pearson correlation.
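
A quick numerical check, assuming hypothetical data y = x³ (monotonic but not linear) and the no-ties rank-difference formula for the Spearman coefficient:

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman coefficient from rank differences (no-ties shortcut formula)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical monotonic, nonlinear data
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

spearman = spearman_rho(x, y)
pearson = pearson_r(x, y)
print(spearman)           # 1.0
print(round(pearson, 3))  # about 0.943, below 1
```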
When the data are roughly elliptically distributed and there are no prominent
outliers, the Spearman correlation and Pearson correlation give similar
values.

The Spearman correlation is less sensitive than the Pearson
correlation to strong outliers that are in the tails of both samples.
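
The effect can be illustrated with a hypothetical sample in which six loosely related points are joined by one extreme point that lies far out in both x and y. The sketch assumes no tied values, so the rank-difference formula applies. Because ranking reduces the extreme point to simply "the largest", the Spearman value moves much less than the Pearson value.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman coefficient from rank differences (no-ties shortcut formula)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: the last pair is a strong outlier in both variables.
x = [1, 2, 3, 4, 5, 6, 50]
y = [3, 1, 4, 2, 6, 5, 60]

pearson = pearson_r(x, y)      # about 0.997, dominated by the outlier
spearman = spearman_rho(x, y)  # about 0.786
print(round(pearson, 3), round(spearman, 3))
```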
