Unit 4 Correlation and Linear Regression

Uploaded by VN BomXanh
© All Rights Reserved

Warm up

Time studying   75   30   35   65  110   40   40   80   56   70   50  110   18
Test score      32   25   30   38   60   20   39   47   35   57   32   33   10

IB Maths AA
Correlation and Linear Regression

Unit 4 - Statistics

TOK Connections

Bivariate Data (two variables)
However, data in the real world is never perfect.

Data follows "trends" that can be approximately linear or non-linear (for example, exponential).

Check some data out!
(Income per person vs CO2 Emissions)
(Babies per woman vs Life Expectancy)
(Government Health Spending vs Life Expectancy)

Our goal in this section is to describe relationships and characteristics of real-world data in a more rigorous way (statistics).

[Scatter plot: Test score (vertical axis, 0 to 70) against Study time (horizontal axis, 0 to 120)]

Scatter Plot
➢ A scatter diagram is a way of graphing bivariate data
○ One variable will be on the x-axis and the other will be on the y-axis
○ The variable that can be controlled in the data collection is known as the independent or explanatory variable and is plotted on the x-axis
○ The variable that is measured or discovered in the data collection is known as the dependent or response variable and is plotted on the y-axis
➢ Scatter diagrams can contain outliers that do not follow the trend of the data

Scatter Plot
With Bivariate Data we create a Scatter Plot.
- Plot one set of data on the x-axis
- Plot the other set of data on the y-axis

Look for any "trend" or "correlation" in the data.

Scatter Plot
What can we say about this Correlation?
- Is it (approximately) Linear?
- How “Strong” is it? (does it follow an exact curve, or only loosely)
- What direction is the relation going in?

Lines of Best Fit
A line of best fit is a straight line drawn on the scatter plot that represents the general trend of the data.

Lines of Best Fit
Process of drawing a Line of Best Fit:
1) Find the average (mean) of each data set: x̄ and ȳ
2) Plot the coordinate (x̄, ȳ) on your scatter plot (mark it differently from the other points)
3) Draw a Line of Best Fit that goes through this "Average Point" and follows the trend of the data as closely as possible

Lines of Best Fit
Example: A company records the amount of money it spends on advertising and the number of products it sells in store. It wants to see whether there is a relationship between these two sets of data. From its records:

Average x (Advertising) = (45+55+47+75+90+100+100+95+88+50+45+58)/12 = 70.67 (2 d.p.)

Average y (Items Sold) = (15+25+17+34+41+47+50+46+37+22+20+30)/12 = 32

So create the scatter plot with the bivariate data.
Plot the average point (70.67, 32).
Draw a Line of Best Fit by hand that passes through this point.
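
The average point in this example can be checked with a few lines of Python. This is a minimal sketch: the two lists are read off the sums above, and the variable names are illustrative only.

```python
from statistics import mean

# Values taken from the two sums in the example above.
advertising = [45, 55, 47, 75, 90, 100, 100, 95, 88, 50, 45, 58]
items_sold = [15, 25, 17, 34, 41, 47, 50, 46, 37, 22, 20, 30]

# The "Average Point" (x-bar, y-bar) that the line of best fit must pass through.
mean_x = mean(advertising)
mean_y = mean(items_sold)
mean_point = (round(mean_x, 2), round(mean_y, 2))

print(mean_point)
```

The printed point should agree with the (70.67, 32) found by hand.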

Lines of Best Fit

Draw Scatter Plot and Average Point
Draw Line of Best Fit through Average Point

Describing Correlations
Descriptors of Correlation
Direction: Positive, Negative, No Correlation
Strength: Strong, Moderate, Weak

Pearson’s Product-Moment Correlation Coefficient (r)
The PMCC, also called “Pearson’s Coefficient” or simply denoted by r, is a value between -1 and 1 that tells us quantitatively how strong a linear correlation is.

r = 1 indicates a perfect positive correlation
r = -1 indicates a perfect negative correlation
r = 0 indicates no linear correlation

The closer r is to either 1 or -1, the stronger the correlation between the data.
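
To see where r comes from, here is a minimal Python sketch that applies the standard definition r = Sxy / sqrt(Sxx · Syy) to the warm-up data (time studying and test score). The function name `pearson_r` is illustrative, and this is not the method a graphing calculator necessarily uses internally.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Warm-up data from the start of the unit.
time_studying = [75, 30, 35, 65, 110, 40, 40, 80, 56, 70, 50, 110, 18]
test_score = [32, 25, 30, 38, 60, 20, 39, 47, 35, 57, 32, 33, 10]

r = pearson_r(time_studying, test_score)
print(round(r, 2))
```

Points that lie exactly on a rising line give r = 1, and points exactly on a falling line give r = -1, which matches the descriptions above.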

Pearson’s Coefficient (r)
These are “approximate” boundaries for r and what counts as “strong”, “moderate” and “weak”. There are no exact ranges, but this gives you an idea.

Calculating Pearson's Coefficient
You must use a graphing calculator to calculate r.

http://www.youtube.com/watch?v=DBWAmboDVtg

Linear Regression (Line of Best Fit on a Calculator)
We can also perform a “Linear Regression” on our data. This means finding the equation of the Line of Best Fit that lies closest to the data points. Note that r is a property of the data, not of the line that is chosen.

Regression

The equation of the regression line is found by minimising the sum of the squares of the vertical distances (residuals) between the points of the data set and the line.
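
As a rough illustration of what the calculator does, the sketch below fits a least-squares line to the advertising example using the standard formulas gradient = Sxy / Sxx and intercept = ȳ - gradient · x̄. Least squares means minimising the sum of the squared residuals. The data lists are read off the sums in that example, pairing x and y by position is an assumption, and the function names are illustrative.

```python
def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least-squares regression line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    gradient = s_xy / s_xx
    intercept = mean_y - gradient * mean_x  # forces the line through (x-bar, y-bar)
    return gradient, intercept

def sum_squared_residuals(xs, ys, gradient, intercept):
    """Total of the squared vertical distances between the points and the line."""
    return sum((y - (gradient * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Assumed pairing: the i-th advertising value goes with the i-th items-sold value.
advertising = [45, 55, 47, 75, 90, 100, 100, 95, 88, 50, 45, 58]
items_sold = [15, 25, 17, 34, 41, 47, 50, 46, 37, 22, 20, 30]

gradient, intercept = least_squares_line(advertising, items_sold)
best = sum_squared_residuals(advertising, items_sold, gradient, intercept)
# Any other line, such as one with a slightly steeper gradient, has a larger total.
nudged = sum_squared_residuals(advertising, items_sold, gradient + 0.05, intercept)

print(f"y = {gradient:.3f}x + {intercept:.3f}")
print(best < nudged)
```

Because the intercept is computed from the means, the fitted line always passes through the average point, which is why the hand-drawn method uses that point too.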

Practice
A teacher is interested in the relationship between the number of hours her students spend on a
phone per day and the number of hours they spend on a computer. She takes a sample of nine
students and records the results in the table below.

a) Draw a scatter diagram for the data, where x is hours spent on the phone and y is hours spent on the computer.
b) The relationship can be modelled by a regression equation. Find the value of r, the correlation coefficient.
c) Comment on the relationship.
d) If a student spent 5 hours on their phone, how long would we expect them to spend on the computer?

Practice
Revision Village - https://www.revisionvillage.com/ib-math/analysis-and-approaches-hl/questionbank/statistics-and-probability/bivariate-statistics/
Questions: 1, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14

Summary
The gradient of a straight line tells us how much y changes for each unit increase in x.
So the gradient of the line of regression tells us how much the dependent variable changes, on average, for each unit increase in the independent variable.

The y-intercept of the line of regression gives the predicted value of the dependent variable when the independent variable is 0.
This value is often an extrapolation and so should be used cautiously.
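
The sketch below illustrates both points with the warm-up data: the gradient is the change in predicted test score per extra unit of study time, and a prediction at x = 0 falls outside the observed study times, so it is an extrapolation. The helper `predict` and its flag are illustrative assumptions, not part of any calculator workflow.

```python
# Warm-up data from the start of the unit.
time_studying = [75, 30, 35, 65, 110, 40, 40, 80, 56, 70, 50, 110, 18]
test_score = [32, 25, 30, 38, 60, 20, 39, 47, 35, 57, 32, 33, 10]

# Least-squares regression line of test score on study time.
n = len(time_studying)
mean_x = sum(time_studying) / n
mean_y = sum(test_score) / n
gradient = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(time_studying, test_score))
    / sum((x - mean_x) ** 2 for x in time_studying)
)
intercept = mean_y - gradient * mean_x

def predict(x):
    """Return (predicted y, True if x lies outside the observed x-range)."""
    is_extrapolation = not (min(time_studying) <= x <= max(time_studying))
    return gradient * x + intercept, is_extrapolation

for x in (0, 60):
    y_hat, extrapolated = predict(x)
    label = "extrapolation" if extrapolated else "interpolation"
    print(f"x = {x}: predicted y = {y_hat:.1f} ({label})")
```

The prediction at x = 0 is just the y-intercept, and the flag is a reminder that no student in the sample studied for so little time.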
