Recall From Last Time: Section 8 Scatter Plots and Linear Regression

The document discusses topics from a math class including homework, a quiz, and final exam. It covers scatter plots, correlation, lines of best fit, and using regression equations to make predictions. Guidelines are provided for determining if a linear correlation exists between variables and interpreting the correlation coefficient.

Uploaded by

goflux pwns
MAT 128

4/17/2019 Wednesday

• Complete Chapter 7 homework, start Chapter 8 homework
• Take-home quiz on Chapter 8, due Friday 4/19
• Cumulative Final Exam Thursday 5/9 at 2 pm

Recall From Last Time: Section 8 Scatter Plots and Linear Regression
1. Scatter Plots

2. Correlation: measures how much two or more variables fluctuate together.


REMEMBER: Correlation does not imply cause and effect!!

3. Line of best fit (or trend line, regression line): a straight line that best represents the data on a scatter
plot.
This line may pass through some of the points, none of the points, or all of the points.

4. Linear correlation coefficient (r): measures the strength of the linear relationship between paired x and y
values.
-1 ≤ r ≤ 1. The sign indicates whether the variables are negatively or positively correlated.
r is very sensitive to outliers.

5. How to determine if correlation exists:


• Three conditions must be met:
#1 Quantitative Data Condition - must have paired (x, y) quantitative data.

#2 Straight Enough Condition - confirm that the scatter plot displays a straight-line pattern. The correlation
coefficient r is for LINEAR correlation only!

#3 Outlier Condition - Outliers are removed if they are known to be errors. Outliers are data values that
deviate significantly from the other data values.

• Calculate r - the calculations are tedious, so this is usually done with a calculator. Round r to 3 decimal places.

• Interpreting the linear correlation coefficient r


At what level of r do we conclude a relationship exists? That depends on the sample size.
The smaller the sample, the larger |r| must be to say the relationship between x and y is “unlikely to have
occurred by random chance”.
General rule: if $|r|\sqrt{n} > 3$, where n is the sample size, then we say correlation exists. We are saying the
relationship between the variables is unlikely to be due just to random chance.
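
As a rough illustration of the calculation a calculator performs, here is a minimal Python sketch that computes r from paired data and then applies the class rule of thumb $|r|\sqrt{n} > 3$. The function names and the data (hours studied vs. test score) are made up for this example and do not come from the course materials.

```python
import math

def correlation_coefficient(xs, ys):
    """Linear correlation coefficient r for paired (x, y) data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_exists(r, n):
    """Rule of thumb from these notes: correlation exists if |r| * sqrt(n) > 3."""
    return abs(r) * math.sqrt(n) > 3

# Hypothetical data, for illustration only
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scores = [52, 55, 61, 60, 68, 70, 75, 74, 82, 85]

r = correlation_coefficient(hours, scores)
print("r =", round(r, 3))  # rounded to 3 decimal places, as in the notes
print("correlation exists:", correlation_exists(r, len(hours)))
```

Note how the rule depends on n: the same r = 0.5 fails the test with 25 data pairs (0.5 × 5 = 2.5) but passes with 49 pairs (0.5 × 7 = 3.5).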
Today: finish Chapter 8
The best-fit line (regression line, least squares line) is $\hat{y} = b_0 + b_1 x$, also written $\hat{y} = mx + b$ or $\hat{y} = ax + b$.
It minimizes the sum of
$(\text{actual data point} - \text{predicted data point})^2$, that is, the sum of $(y - \hat{y})^2$.
This unexplained error $y - \hat{y}$ is called a “residual”.
The least squares regression line has the smallest sum of squared unexplained errors of any straight line.
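
The least squares idea can be sketched in a few lines of Python. The slope and intercept formulas used here (slope = sum of products of deviations divided by sum of squared x deviations, intercept = mean of y minus slope times mean of x) are the standard ones and are not stated in these notes. The function names and the five data points are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the line y-hat = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    return b0, b1

def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of (y - y-hat)^2 for the line y-hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)
print("y-hat = %.1f + %.1f x, sum of squared residuals = %.2f" % (b0, b1, best))

# Any other line gives a larger sum of squared residuals
print(sum_squared_residuals(xs, ys, b0, b1 + 0.1) > best)
print(sum_squared_residuals(xs, ys, b0 + 0.5, b1) > best)
```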

Coefficient of Determination $r^2$ - indicates how much of the variation in y is explained by the regression
equation. It is the proportion of the total variation in y that is explained by the linear model.

A higher coefficient indicates a better goodness of fit for the observations.
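
Written as a formula, the "explained over total" description looks like the following. This is the standard identity for a least squares line with an intercept; it is not spelled out in these notes, and $\bar{y}$ denotes the mean of the observed y values.

```latex
r^2 = \frac{\text{explained variation}}{\text{total variation}}
    = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2}
    = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2}
```

For example, with made-up numbers: if the total variation is 6 and the sum of squared residuals is 2.4, then $r^2 = 1 - 2.4/6 = 0.6$, so the line explains 60% of the variation in y.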


The regression equation is a useful model for making predictions. Use it only if:
1) The line shows a good fit to the data
2) The correlation coefficient r is significant
3) The prediction is not too far beyond the sample data (for example, with data for years 1990 to 2015, do not predict for year 2050)

***Otherwise use the mean value of the response variable, $\bar{y}$, as the best predicted value***
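
The decision rule above can be sketched as a small Python function. Everything named here is an assumption made for illustration: the function `best_prediction`, the `good_fit` flag (standing in for a visual check of the scatter plot), and the `max_extrapolation` allowance (the notes do not define how far is "too far"). Significance is checked with the class rule of thumb $|r|\sqrt{n} > 3$, and the example line and numbers are made up.

```python
import math

def best_prediction(x, b0, b1, r, n, x_min, x_max, mean_y,
                    good_fit=True, max_extrapolation=0.0):
    """Use y-hat = b0 + b1 * x only if all three conditions hold;
    otherwise fall back to the mean of the response variable."""
    # Condition 2: rule of thumb from these notes
    significant = abs(r) * math.sqrt(n) > 3
    # Condition 3: x must lie within (or close to) the sampled x range;
    # max_extrapolation is a hypothetical allowance, 0 by default
    in_range = (x_min - max_extrapolation) <= x <= (x_max + max_extrapolation)
    if good_fit and significant and in_range:
        return b0 + b1 * x
    return mean_y

# Hypothetical model fitted to yearly data from 1990 to 2015 (n = 26)
model = dict(b0=-1900.0, b1=1.0, r=0.95, n=26,
             x_min=1990, x_max=2015, mean_y=102.5)

print(best_prediction(2010, **model))  # inside the data range: uses the line
print(best_prediction(2050, **model))  # far outside the range: uses the mean
```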

Residual Plots
Residuals $y - \hat{y}$ should have no pattern. A “best fit” line will have some data above and some below.
A graph of $(x, y - \hat{y})$ should also have no pattern. This indicates that the line of best fit is a good
approximation for the data.
A residual plot with a non-random pattern (for example, a U shape, either right-side up or inverted) indicates that
the line of best fit is not a good approximation for the data; a non-linear model would be a better fit.
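
To see what a patterned residual plot looks like in numbers, the following Python sketch fits a least squares line to data that actually follow a curve ($y = x^2$, chosen only for illustration) and lists the residuals. The function name and data are assumptions made for this example. The slope and intercept formulas are the standard least squares ones.

```python
def fit_line(xs, ys):
    """Least squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b1 * mean_x, b1

# Hypothetical curved data: y = x^2
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Points for the residual plot: (x, y - y-hat)
for x, res in zip(xs, residuals):
    print(x, round(res, 3))
# Residuals run 5, 0, -3, -4, -3, 0, 5: positive at both ends and
# negative in the middle, a right-side-up U shape.
```

The residuals are not scattered randomly around zero, so a straight line is a poor model for these data even though it is the "best" straight line.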

Examples here -
