
Lecture 29

Correlation analysis is used as a statistical tool to ascertain the association between two variables. A value of r = 0 implies a lack of linear relationship, not a lack of relationship altogether.

Uploaded by

hrishabh_agrawal
Copyright
© Attribution Non-Commercial (BY-NC)

QUANTITATIVE METHODS

LECTURE 29:
CORRELATION & REGRESSION

Correlation

By the end of this lesson, you should be able to
• understand the importance and also the limitations of correlation analysis.
• differentiate between simple, partial and multiple correlation.
• calculate and interpret the coefficient of correlation for individual observations as well as for bivariate grouped data.

What is correlation?
Correlation analysis is used as a statistical tool to ascertain the association between two variables. For example, take the case of income and consumption expenditure. With the help of correlation analysis we can be very specific by measuring the degree of relationship between the concerned variables.
In this lesson we will discuss the relationship between two variables.
We have already studied the measures of central tendency and dispersion. There, we dealt with data on a single variable or criterion. The measures of central tendency and dispersion provide us with information regarding one variable only. But in many business problems, we may be dealing with two variables which may have a relationship or association between them. For example, statistical data on demand and price, output and rainfall, volume of sales and expenditure on advertisement, etc. may be given. We may be interested in the following questions: "Does there exist an 'association' between the two variables? If yes, to what extent?" These questions may be answered by the use of the correlation technique. The correlation coefficient merely expresses the degree of closeness of the linear relationship between the two variables.

The Concept of Correlation
When dealing with the joint variation of two or more variables, a natural question arises as to whether the variables are related and how close the relationship is. Correlation measures the strength of the relationship between two variables, whereas regression gives the mathematical form of that relationship. In this Unit we confine ourselves to the linear relationship, and to the case of two variables. But correlation does not predict anything about the cause and effect relationship. Even a high degree of correlation does not necessarily imply that a cause and effect relationship exists between the two variables. The study of correlation becomes really useful only when it is combined with regression analysis. The study of both together is especially interesting while studying problems in social science, educational research, policy making and arriving at decisions, etc.
The correlation in a population is measured by the population coefficient of correlation and is denoted by ρ (Greek lower-case rho). The correlation in a sample is measured by the sample coefficient of correlation and is denoted by r. In this study we restrict ourselves to the study of correlation from the sample case.
When high values of one variable are associated with high values of the other variable and low values of one variable are associated with low values of the other, then they are said to be (directly or) positively correlated. On the other hand, if high values of one tend to accompany low values of the other, they are (inversely or) negatively correlated. If the values follow a random arbitrary pattern then we may conclude that there is no linear relationship between them. It is important to remember that the correlation coefficient between two variables is a measure of their linear relationship, and a value of r = 0 implies a lack of linearity and not a lack of relationship altogether.
The following are some of the properties of the correlation coefficient:
1. It is independent of the choice of both origin and the scale of observations.
2. It is a pure number and is independent of the units of measurement.
3. It lies between –1 and +1.

Correlation and Causation
Most often, one may jump to unjustified conclusions by mistaking an observed correlation for a cause-effect relationship. A high sample correlation coefficient does not necessarily signify any causal relation between two variables. A frequently quoted example concerns an observed high positive correlation between the number of storks sighted and the number of births in a European city. We cannot conclude from this evidence that storks bring babies or, worse yet, that killing storks would control population growth.
The observation that two variables tend to vary simultaneously in a certain direction does not imply the presence of a direct relationship between them. It may simply happen that a third variable is actually causing the observed correlation between the two variables under study. This third variable is called a lurking variable. The false correlation that it produces is called spurious correlation.
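The lurking-variable idea from the stork example can be illustrated with a small simulation. This is only a sketch on made-up data: the variable names (`district_size`, `storks`, `births`), the coefficients and the noise levels are all assumptions chosen for illustration, not figures from the lecture. Both series are generated from the same third variable and never from each other, yet they come out strongly correlated; correlating the residuals left after removing the straight-line effect of the third variable (one common way to compute a partial correlation) makes the association almost vanish.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample correlation coefficient r (standard product-moment formula)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(ys, zs):
    """What is left of ys after removing its least-squares line on zs."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    return [y - mean_y - slope * (z - mean_z) for y, z in zip(ys, zs)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical lurking variable: the size of each district.
district_size = [rng.uniform(1, 10) for _ in range(2000)]

# Storks and births each depend on district size plus independent noise.
storks = [3 * s + rng.gauss(0, 2) for s in district_size]
births = [50 * s + rng.gauss(0, 40) for s in district_size]

r_raw = pearson_r(storks, births)
r_controlled = pearson_r(residuals(storks, district_size),
                         residuals(births, district_size))

print(f"r(storks, births)               = {r_raw:.2f}")
print(f"r after removing district size = {r_controlled:.2f}")
```

The first figure is high even though storks play no part in generating births; the second is close to zero, which is what one would expect if the whole association runs through the lurking variable.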

© Copy Right: Rai University


INTRODUCTION

Measures of Association
There are many statistical tools to measure whether two or more variables are associated with each other.
Which tool should you use?
– Go to the Statistics Roadmap in the text by Berman.
Let's begin with data in which the independent and dependent variables are both continuous:
– Correlation Analysis
and
– Regression Analysis

Introduction to Correlation Analysis
With Correlation Analysis we attempt to answer the question:
Is there a relationship between two or more variables measured on a continuous scale (i.e. interval or ratio data)?
e.g. Is there a relationship between:
x: years of education possessed by a person
and
y: the beginning salary of a person
If these two variables are correlated, then……

Introduction to Correlation Analysis
…either:
– as x goes up, so does y: a positive correlation
[Figure: plot of y against x]
or


Introduction to Correlation Analysis
– as x goes down, so does y: also a positive correlation
[Figure: plot of y against x]
or

Introduction to Correlation Analysis
– as x goes up, y goes down: a negative correlation
[Figure: plot of y against x]
or

Introduction to Correlation Analysis
– as x goes down, y goes up: a negative correlation
[Figure: plot of y against x]

Introduction to Correlation Analysis
So, if the two variables move together:
– a positive correlation exists
And if the two variables move in opposite directions:
– a negative correlation exists.
[Figures: Positive Correlation and Negative Correlation, each plotting y against x]
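The link between the direction of movement and the sign of r can be checked numerically. The slides do not spell out a formula, so the sketch below assumes the standard product-moment definition of Pearson's r; the data values are made up purely for illustration. It also checks that changing the origin and scale of x leaves r unchanged.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: co-variation of x and y divided by their own variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative values only (not data from the lecture).
x = [1, 2, 3, 4, 5]
y_together = [2, 4, 5, 4, 5]   # tends to rise as x rises
y_opposite = [5, 4, 5, 4, 2]   # tends to fall as x rises

r_pos = pearson_r(x, y_together)
r_neg = pearson_r(x, y_opposite)

# Change of origin (+10) and scale (x 2.5) applied to x.
x_rescaled = [10 + 2.5 * value for value in x]
r_rescaled = pearson_r(x_rescaled, y_together)

print(f"move together:     r = {r_pos:+.3f}")
print(f"move in opposite:  r = {r_neg:+.3f}")
print(f"after rescaling x: r = {r_rescaled:+.3f}")
```

The first value is positive, the second is negative, and the third matches the first up to floating-point rounding.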

Introduction to Correlation Analysis
Pearson's r varies between –1 and +1:
-1 indicates a perfect negative correlation
0 indicates no linear relationship
+1 indicates a perfect positive correlation
From this we get a correlation coefficient scale….

The Correlation Coefficient Scale
r        Interpretation
-1       Perfect negative relationship
-0.60    Strong negative relationship
-0.30    Moderate negative relationship
-0.10    Weak negative relationship
0        No relationship
+0.10    Weak positive relationship
+0.30    Moderate positive relationship
+0.60    Strong positive relationship
+1       Perfect positive relationship

Introduction to Correlation Analysis
For example, if
x: years of education possessed by a person
and
y: the beginning salary of a person
results in a Pearson's r score of
r = 0.40
This means that:
there is a moderately positive correlation between the years of education and the beginning salary of a person.

Introduction to Correlation Analysis
Again, if
x: years of education possessed by a person
and
y: years of unemployment by a person over a lifetime
results in a Pearson's r score of
r = -0.60
This means that:
there is a strong negative correlation between the years of education and the years of unemployment of a person over their lifetime.

Introduction to Correlation Analysis
Which is the stronger correlation?
r1 = 0.40
or
r2 = -0.60
HINT: The strength of the correlation is indicated by the size (absolute value) of the number, and the direction of the relationship is indicated by the sign ("+" or "-").

Introduction to Correlation Analysis
Therefore, the stronger correlation is:
r2 = -0.60

Introduction to Correlation Analysis
Requirements for using Pearson's r:
1. A linear relationship exists between the two variables
2. The data is measured on a continuous scale
3. Random sampling is conducted
4. Both the x and y variables are normally distributed in the population.
– NOTE: as a rule of thumb, the Central Limit Theorem is commonly invoked to treat a sample size of 30 or more as sufficient to relax this requirement.
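The requirement of a linear relationship matters because Pearson's r can be zero even when y is completely determined by x. The sketch below uses made-up values with y = x² on x values symmetric about zero, and assumes the standard product-moment formula for r.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the two sample means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative values: y depends on x exactly, but not along a straight line.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = pearson_r(x, y)
print(f"r = {r:.3f}")  # the positive and negative cross-products cancel
```

Here the products of deviations on the left half of the curve cancel those on the right half, so r comes out as zero although the relationship is perfect. A scatter plot is therefore worth inspecting before r is interpreted.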
