SHS Correlation and Regression Final
1. Philippines
2. Thailand
3. Indonesia
4. Singapore
5. Malaysia
What’s the Rule?
A    B
1   -1
2    1
3    3
4    5
5    7
Rule: B = 2A - 3, or equivalently 2A - B = 3
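The rule can be checked against every row of the table. The following is a minimal Python sketch; the names `table` and `rule` are illustrative and do not come from the slides.

```python
# Rows of the table as (A, B) pairs, copied from the slide.
table = [(1, -1), (2, 1), (3, 3), (4, 5), (5, 7)]

def rule(a):
    """Slope-intercept form of the rule: B = 2A - 3."""
    return 2 * a - 3

# The rule holds if it reproduces B for every A in the table.
rule_fits = all(rule(a) == b for a, b in table)

# The standard form 2A - B = 3 is the same rule rearranged.
standard_form_fits = all(2 * a - b == 3 for a, b in table)

print(rule_fits, standard_form_fits)
```

Both checks pass for all five rows, which is what makes this a rule rather than a trend: every point lies exactly on the line.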
Who is he?
Michael Fred Phelps II
(born June 30, 1985) is an
American competition swimmer
and the most decorated
Olympian of all time, with a
total of 22 medals in three
Olympiads. Phelps also holds
the all-time records for Olympic
gold medals (18, double the
second highest record holders),
Olympic gold medals in
individual events (11), and
Olympic medals in individual
events for a male (13).
You and Michael
Roman writer, architect, and engineer Marcus
Vitruvius proposed, among other relationships, that
a person’s height and their arm span (herein called
“wingspan”) are approximately equal. In this
investigation, students will collect data to assess
whether or not Vitruvius’s proposal was reasonable.
Scatterplots will be drawn to illustrate the data and
a best-fit line will be overlain on the scatterplot.
The equation of the best-fit line will be determined,
and the slope interpreted in context.
Stephen Miller, Winchester Thurston School, http://www.amstat.org/education/stew/
Michael Phelps
Guide questions:
Assessment
1. How well does a line fit the wingspan vs. height
data? What does that mean?
2. Can we claim that the scatterplot represents the
relationship between height and “wingspan” in the
general population? Why or why not?
3. What about Michael Phelps – is he like us or is he
different? How?
4. How do your measurements compare to Michael
Phelps's?
Answers
1. The data points do seem to cluster closely to the best-fit
line; there is not a lot of deviation between the line and the
points.
2. No, we cannot claim that the scatterplot represents the
relationship between height and wingspan in the general
population. These data values were collected from students;
there is no guarantee that the same relationship between
height and wingspan holds for adults or younger children.
3. Although Michael seems to follow the same general trend,
his wingspan seems to be somewhat longer relative to his
height than that of typical students.
4. Answers may vary. One possible answer is “My height and
wingspan are closer to each other than are Michael’s height
and wingspan.”
Statistics @ Work
A businessperson may want to know whether the volume of
sales for a given month is related to the amount of advertising
the firm does that month.
Educators are interested in determining whether the number of
hours a student studies is related to the student’s score on a
particular exam.
Medical researchers are interested in questions such as, Is
caffeine related to heart damage? or Is there a relationship
between a person’s age and his or her blood pressure?
A zoologist may want to know whether the birth weight of a
certain animal is related to its life span.
Correlation Analysis
Correlation analysis is a method used to measure the
strength of the relationship between two variables.
Correlation is a statistical method used to determine
whether a linear relationship between variables
exists.
Examples of Correlated Variables
The students’ mental ability and academic
performance in school are related.
There is a close relationship between reading
comprehension and mathematical ability.
Bivariate data
Bivariate data is a fancy way to say, ‘two-variable
data.’ The easiest way to visualize bivariate data is
through a scatter plot.
Bivariate Data
Can you think of pairs of variables that may be linked?
Ice cream sales and temperature
Hours spent studying and marks in exams
The number of hours you work and the amount of money you earn
Why do you think there is a link between the variables you have chosen?
Law of Supply Law of Demand
Types of Correlation
1. A positive correlation exists when high scores in one
variable are associated with high scores in the
second variable. This is also true when low scores in
one variable are associated with low scores in the
other. Thus, a direct relationship exists between
positively correlated variables.
2. A negative correlation exists when high scores in one
variable are associated with low scores in the
second variable. This is also true when low scores in
one variable are associated with high scores in the
other.
3. A zero correlation exists when high scores in one
variable are associated with neither systematically high
nor systematically low scores in the other variable.
Examples
The more you study for a test, the higher your grade will be.
The more you practice a sport, the better you will become.
The more hours you work, the more money you'll have in
your bank account.
The more you go over your notes, the higher your test scores
will be.
The more you shoot a basketball, the easier it gets to score.
The more clubs you join in school, the more friends you can
make.
The more you exercise, the more weight you will lose.
0 No correlation
Jimenez, R. and Parreno, E. (2014). Basic Statistics. Quezon City. C & E Publishing, Inc.
Correlation Coefficient
Performance in History and Literature
STUDENT   HISTORY   LITERATURE
A         78        79
B         77        80
C         88        85
D         84        78
E         80        89
F         85        80
G         79        80
H         88        85
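The slides do not show a formula at this point. As a sketch, assuming the intended measure is the Pearson product-moment coefficient, r = (nΣxy - ΣxΣy) / sqrt((nΣx² - (Σx)²)(nΣy² - (Σy)²)), the value for the table above can be computed as follows. The function name `pearson_r` is illustrative.

```python
import math

# Scores copied from the table above (students A through H).
history = [78, 77, 88, 84, 80, 85, 79, 88]
literature = [79, 80, 85, 78, 89, 80, 80, 85]

def pearson_r(x, y):
    """Pearson correlation coefficient from the raw-score formula (assumed measure)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(history, literature)
print(round(r, 2))  # 0.3
```

For these eight students r is about 0.30. The sign is positive, so higher History scores tend to go with higher Literature scores, but a value this far from 1 is usually read as a weak linear relationship.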
Possible Relationships Between
Variables
When the null hypothesis has been rejected for a specific α
value, any of the following five possibilities can exist.
1. There is a direct cause-and-effect relationship between the
variables. That is, x causes y. For example, water causes
plants to grow, poison causes death, and heat causes ice to
melt.
2. There is a reverse cause-and-effect relationship between the
variables. That is, y causes x. For example, suppose a
researcher believes excessive coffee consumption causes
nervousness, but the researcher fails to consider that the
reverse situation may occur. That is, it may be that an
extremely nervous person craves coffee to calm his or her
nerves.
3. The relationship between the variables may be caused by a third variable.
For example, if a statistician correlated the number of deaths due to
drowning and the number of cans of soft drink consumed daily during the
summer, he or she would probably find a significant relationship. However,
the soft drink is not necessarily responsible for the deaths, since both
variables may be related to heat and humidity.
4. There may be a complexity of interrelationships among many variables. For
example, a researcher may find a significant relationship between students’
high school grades and college grades. But there probably are many other
variables involved, such as IQ, hours of study, influence of parents,
motivation, age, and instructors.
5. The relationship may be coincidental. For example, a researcher may be
able to find a significant relationship between the increase in the number of
people who are exercising and the increase in the number of people who
are committing crimes. But common sense dictates that any relationship
between these two values must be due to coincidence.
Thus, when the null hypothesis is rejected, the
researcher must consider all possibilities and select
the appropriate one as determined by the study.
Remember, correlation does not necessarily imply
causation.
Correlation does not necessarily imply
causation.
There is a strong positive correlation between ice
cream sales and murder rates in the summer.
As ice cream sales rise, so do murder rates.
Is this because eating ice cream makes us want to
murder people?
The actual explanation is that when the weather is hot,
more people buy ice cream, but they also go out more,
drink more, and socialize more, leading to an increase
in murder rates. Extreme temperatures observed in the
summer also have been shown to increase aggression.
Source: Boundless. “Correlational Research.” Boundless Psychology. Boundless, 26 May 2016. Retrieved 29 May 2016 from
https://www.boundless.com/psychology/textbooks/boundless-psychology-textbook/researching-psychology-2/types-of-research-studies-27/correlational-research-125-12660/
REGRESSION ANALYSIS
2017 MASS TRAINING
OF TEACHERS
for Senior High School
ROLDAN C. BANGALAN
What’s next?
x    1    2    3    4    5
y   -2    1    4    7    ?
Answer: ? = 10
Rule: y = 3x - 5
Follow the rule
x    2    3    4    5    7
y   -3   -5   -7   -9    ?
Answer: ? = -13
Rule: y = -2x + 1
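Both tables above follow an exact linear rule, so the rule can be recovered from any two points and then used to fill in the missing value. The following is a minimal Python sketch; the helper `line_through` and the dictionary `tables` are illustrative names, not part of the slides.

```python
def line_through(p, q):
    """Return (slope, intercept) of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

# Known (x, y) pairs from the two tables, plus the x whose y is missing.
tables = {
    "What's next?": ([(1, -2), (2, 1), (3, 4), (4, 7)], 5),
    "Follow the rule": ([(2, -3), (3, -5), (4, -7), (5, -9)], 7),
}

predictions = {}
for name, (points, missing_x) in tables.items():
    m, b = line_through(points[0], points[1])
    # Confirm every known point lies on the same line before predicting.
    assert all(m * x + b == y for x, y in points)
    predictions[name] = m * missing_x + b
    print(name, "slope", m, "intercept", b, "missing y", predictions[name])
```

The slopes and intercepts come out as 3 and -5 for the first table and -2 and 1 for the second, matching the stated rules. This two-point method works only because every point lies exactly on the line.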
What’s the pattern?
x 18 26 32 38 52 59
y 10 5 2 3 1.5 ??
y=?
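Unlike the two previous tables, these points do not fall exactly on any line, so no exact rule can be read off. One possible approach, which the slide itself does not prescribe, is to fit a least-squares line to the five known points and use it to estimate y at x = 59. The sketch below assumes that approach; the function name `least_squares_line` is illustrative.

```python
# Known (x, y) pairs from the table; the y for x = 59 is the unknown.
xs = [18, 26, 32, 38, 52]
ys = [10, 5, 2, 3, 1.5]

def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept

slope, intercept = least_squares_line(xs, ys)

# Estimate the missing value by evaluating the fitted line at x = 59.
predicted = slope * 59 + intercept
print(round(slope, 3), round(intercept, 3), round(predicted, 3))
```

The fitted slope is negative, consistent with y generally falling as x rises. Whether a straight line is a suitable model for these points is a separate question: the fit only gives the line that minimises the squared vertical distances, and extrapolating it beyond the observed x values can give implausible estimates.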
Scatter It! (Predict Billy’s Height)
In this lesson, students explore the relationship
between age and height in order to help a
hypothetical student predict his height in two years.
Students will examine data that will enable them to
create a scatterplot and approximate a line of best
fit. The scatterplot and line of best fit will be used
to predict height. The slope of the line of best fit
will be interpreted in context.
Susan Haller, St. Cloud State University
Explain to the class that Billy’s parents measured each of their
children’s heights on the first day of school every year