
Regression

PSYC 300B - Lecture 1


Dr. J. Nicol

Linear Regression
• Pearson's correlation (r) measures the degree to which a set of data forms a linear relationship
• Regression is a statistical procedure for determining the equation for the straight line that best fits a set of data
• The equation for the best-fitting straight line is called the regression equation
• The regression equation makes it possible to find the predicted value of Ŷ (called "Y hat") for any value of X

Ŷ = bX + a

The equation provides the best prediction of Ŷ for a value of X, and it results in the least squared error between the data points and the regression line

The Linear Equation

• Slope (b) - determines the direction and degree to which the best-fitting straight line is tilted
• Y-intercept (a) - determines the point where the best-fitting straight line crosses the Y-axis
• The regression equation provides the best prediction for a value of Ŷ for a given value of X
• e.g., for Ŷ = 2X - 7, if X = 3 then Ŷ = 2(3) - 7 = -1

The regression equation for the straight line that best fits the data produces the least sum of squared errors between the line and the actual data

X    Y
2    3
6    11
0    6
4    6
5    7
7    12
5    10
3    9

[Scatterplot of the eight (X, Y) data points, X on the horizontal axis (0-7), Y on the vertical axis (0-12)]

X    (X-MX)    (X-MX)²    Y    (Y-MY)    (Y-MY)²
2    -2        4          3    -5        25
6    2         4          11   3         9
0    -4        16         6    -2        4
4    0         0          6    -2        4
5    1         1          7    -1        1
7    3         9          12   4         16
5    1         1          10   2         4
3    -1        1          9    1         1
∑X = 32        SSX = 36   ∑Y = 64        SSY = 64
MX = 4                    MY = 8

sX² = SSX/(N-1) = 36/7 = 5.14    sY² = SSY/(N-1) = 64/7 = 9.14
sX = √5.14 = 2.27                sY = √9.14 = 3.02
(X-MX)    (Y-MY)    (X-MX)(Y-MY)
-2        -5        10
2         3         6
-4        -2        8
0         -2        0
1         -1        -1
3         4         12
1         2         2
-1        1         -1
                    SP = 36

Cov = SP/(N-1) = 36/7 = 5.14

r = Cov/(sXsY) = 5.14/((2.27)(3.02)) = 0.75

b = SP/SSX = 36/36 = 1
or
b = r(sY/sX) = 0.75(3.02/2.27) = 1
or
b = Cov/sX² = 5.14/5.14 = 1

a = MY-(b)MX = 8-(1)4 = 4

Ŷ = bX + a = (1)X + 4

Ŷ=X+4
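As a check on the hand calculation above, the following is a minimal Python sketch (an illustration, not part of the original lecture); it recomputes SSX, SSY, SP, r, b, and a for the eight data points using the same definitional formulas.

```python
# Minimal sketch (Python assumed; not from the lecture): reproduce the hand
# calculation of the regression equation for the eight (X, Y) pairs.
from math import sqrt

X = [2, 6, 0, 4, 5, 7, 5, 3]
Y = [3, 11, 6, 6, 7, 12, 10, 9]
n = len(X)

mx = sum(X) / n                                       # MX = 4
my = sum(Y) / n                                       # MY = 8
ss_x = sum((x - mx) ** 2 for x in X)                  # SSX = 36
ss_y = sum((y - my) ** 2 for y in Y)                  # SSY = 64
sp = sum((x - mx) * (y - my) for x, y in zip(X, Y))   # SP = 36

r = sp / sqrt(ss_x * ss_y)   # r = 36/48 = 0.75
b = sp / ss_x                # slope: b = SP/SSX = 1
a = my - b * mx              # intercept: a = MY - b*MX = 4

print(f"r = {r:.2f}, b = {b:.2f}, a = {a:.2f}, so Y-hat = {b:.0f}X + {a:.0f}")
```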
The Standard Error of the Estimate
• The variability in the outcome variable that is not predicted by the regression equation (SSRESIDUAL) can be used to indicate the average accuracy of the prediction
• The standard error of the estimate (sY-Ŷ) is the average distance between the regression line and the actual data (i.e., the average error when using the regression line to make predictions)

X    Y    Ŷ = X + 4    (Y - Ŷ)    (Y - Ŷ)²
2    3    6            -3         9
6    11   10           1          1
0    6    4            2          4
4    6    8            -2         4
5    7    9            -2         4
7    12   11           1          1
5    10   9            1          1
3    9    7            2          4
                       SSRESIDUAL = 28

sY-Ŷ = √(∑(Y-Ŷ)²/df) = √(SSRESIDUAL/(N-2)) = √(28/6) = 2.16
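The residual table and the standard error of the estimate can be reproduced the same way. A minimal Python sketch (an illustration, not part of the notes):

```python
# Minimal sketch (Python assumed): residuals and the standard error of the
# estimate for the fitted line Y-hat = X + 4.
from math import sqrt

X = [2, 6, 0, 4, 5, 7, 5, 3]
Y = [3, 11, 6, 6, 7, 12, 10, 9]
b, a = 1, 4

y_hat = [b * x + a for x in X]
ss_residual = sum((y - yh) ** 2 for y, yh in zip(Y, y_hat))  # 28
se_estimate = sqrt(ss_residual / (len(X) - 2))               # sqrt(28/6) ≈ 2.16

print(f"SS_residual = {ss_residual}, standard error of estimate = {se_estimate:.2f}")
```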

Assessing the Goodness of Fit

• The mean of the outcome variable is a model of "no relationship" between the predictor and outcome variables
• SSTOTAL represents how good the mean is as a model of the observed data
• SSRESIDUAL represents the degree of inaccuracy when the best model is fitted to the observed data
• We can use these two values to calculate how much better the regression model is than the baseline model (i.e., the mean)
• The improvement in prediction resulting from using the regression model rather than the mean is determined by calculating the difference between SSTOTAL and SSRESIDUAL
Assessing the Goodness of Fit
• SSMODEL (i.e., SSTOTAL - SSRESIDUAL) shows the reduction in the inaccuracy of the model that comes from fitting the regression model to the data
• If the value of SSMODEL is large, it means the regression model represents a big improvement in how well the outcome variable can be predicted
• If the value of SSMODEL is small, it means the regression model is not much better than the baseline model for making predictions about the outcome variable

When the model results in better prediction than using the baseline model, SSMODEL is much greater than SSRESIDUAL

SSTOTAL (total variance in the data) = SSMODEL (improvement due to the model) + SSRESIDUAL (error in the model)

R²
• R² measures the proportion of variability in Y scores that can be accounted for by the predictor variable (i.e., the variability in Y that the regression equation predicts, or can account for)
• R² = SSMODEL / SSTOTAL
• 1 - R² measures the proportion of variability in Y scores that cannot be accounted for by the predictor variable (i.e., the variability in Y that the regression equation does not predict, or cannot account for)
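A minimal Python sketch (an illustration, not part of the lecture) of the SSTOTAL = SSMODEL + SSRESIDUAL decomposition and R² for the worked dataset:

```python
# Minimal sketch (Python assumed): SS decomposition and R^2 for the worked
# example, using the fitted line Y-hat = X + 4.
X = [2, 6, 0, 4, 5, 7, 5, 3]
Y = [3, 11, 6, 6, 7, 12, 10, 9]
y_hat = [x + 4 for x in X]
my = sum(Y) / len(Y)

ss_total = sum((y - my) ** 2 for y in Y)                     # 64
ss_residual = sum((y - yh) ** 2 for y, yh in zip(Y, y_hat))  # 28
ss_model = ss_total - ss_residual                            # 36

r_squared = ss_model / ss_total  # 36/64 = 0.5625
print(f"R^2 = {r_squared:.3f}")
```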

Significance Testing and Regression

• The overall significance of the regression model is evaluated by computing the F-ratio
• A significant F-ratio indicates that the model predicts a significant portion of the variability in the Y scores (i.e., more than would be expected by chance alone)
• To compute the F-ratio, we first calculate a variance, called a mean square (MS), for the predicted variability and for the unpredicted variability

Analysis of Variance (ANOVA)

MSMODEL = SSMODEL/dfMODEL (dfMODEL = number of predictors = 1)
MSRESIDUAL = SSRESIDUAL/dfRESIDUAL (dfRESIDUAL = N - 2)

F = MSMODEL/MSRESIDUAL
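A minimal sketch of the F-ratio computation, assuming Python and the scipy library (neither appears in the lecture), using the sums of squares obtained on the earlier slides (SSMODEL = 64 - 28 = 36, SSRESIDUAL = 28, N = 8):

```python
# Minimal sketch (Python and scipy assumed): ANOVA F-ratio for a
# one-predictor regression, using the SS values from the earlier slides.
from scipy import stats

ss_model, ss_residual, n = 36, 28, 8
df_model, df_residual = 1, n - 2

ms_model = ss_model / df_model           # 36
ms_residual = ss_residual / df_residual  # ≈ 4.67
F = ms_model / ms_residual               # ≈ 7.71

p = stats.f.sf(F, df_model, df_residual)  # upper-tail p-value of F(1, 6)
print(f"F(1, {df_residual}) = {F:.2f}, p = {p:.3f}")  # p ≈ .032
```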
X    Y
2    3
6    11
0    6
4    6
5    7
7    12
5    10
3    9

• Compute R² (i.e., the proportion of the variance in Y scores that can be predicted from variable X) for the relationship
• Determine if the regression model accounts for a significant (α = 0.05, two-tailed) portion of the variance in Y scores, and if variable X is a significant predictor of variable Y
• Compute the standard error of the estimate (sY-Ŷ)
• r = 0.75, so R² = 0.563
• H0: the regression model does not account for a significant portion of the variance in Y scores (i.e., R² = 0)
• H1: the regression model accounts for a significant proportion of the variance in Y scores (i.e., R² ≠ 0)
• SSMODEL = R²SSY = 0.563(64) = 36
• SSRESIDUAL = (1 - R²)SSY = 0.438(64) = 28
• N = 8, so F-critical (1, 6) = 5.99 and t-critical (6) = 2.45
• MSMODEL = SSMODEL/dfMODEL = 36/1 = 36
• MSRESIDUAL = SSRESIDUAL/dfRESIDUAL = 28/6 = 4.67
• F = MSMODEL/MSRESIDUAL = 36/4.67 = 7.71
• Reject H0, and conclude the regression model accounts for a significant portion of the variance in Y scores; F(1, 6) = 7.71, p < .05; R² = 0.56, and the slope of the regression line is significant, so variable X is a significant predictor of variable Y; t(6) = 2.78, p < .05
• sY-Ŷ = √(SSRESIDUAL/df) = √(28/6) = 2.16

Double the one-tailed p-value for a two-tailed test (i.e., p = .032)
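The same result can be checked with a standard library routine. A minimal sketch (Python and scipy assumed; not part of the notes) using scipy.stats.linregress, which reports the slope, intercept, r, and the two-tailed p-value directly:

```python
# Minimal sketch (Python and scipy assumed): verify the worked example with
# scipy.stats.linregress.
from scipy import stats

X = [2, 6, 0, 4, 5, 7, 5, 3]
Y = [3, 11, 6, 6, 7, 12, 10, 9]

result = stats.linregress(X, Y)
print(f"b = {result.slope:.2f}, a = {result.intercept:.2f}")      # b = 1.00, a = 4.00
print(f"r = {result.rvalue:.2f}, R^2 = {result.rvalue ** 2:.3f}")  # r = 0.75, R^2 ≈ 0.563
print(f"two-tailed p = {result.pvalue:.3f}")                       # ≈ .032
```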


• A professor obtains SAT scores and first-year GPAs for a sample of N = 15 students. The SAT scores have MSAT = 580 with SSSAT = 22,400, the GPAs have MGPA = 3.10 with SSGPA = 1.26, and SP = 84
• Find the regression equation for predicting GPA from SAT scores
• Compute R² for the relationship
• Determine if the regression model accounts for a significant (α = 0.05, two-tailed) portion of the variance in GPA and if SAT scores are a significant predictor of GPA
• Compute the standard error of the estimate (sY-Ŷ)

• b = SP/SSSAT = 84/22,400 = 0.00375
• a = MGPA - (b)MSAT = 3.10 - (0.00375)(580) = 0.925
• Ŷ = bX + a = 0.00375X + 0.925
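A minimal Python sketch (an illustration, not part of the notes) of the same calculation from the summary statistics given in the problem:

```python
# Minimal sketch (Python assumed): GPA-on-SAT regression equation from the
# summary statistics (SP, SS_SAT, and the two means).
sp, ss_sat = 84, 22_400
m_sat, m_gpa = 580, 3.10

b = sp / ss_sat        # 0.00375
a = m_gpa - b * m_sat  # 3.10 - 2.175 = 0.925

print(f"GPA-hat = {b:.5f} * SAT + {a:.3f}")
```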

• sSAT² = 22,400/14 = 1600
• sSAT = √1600 = 40
• sGPA² = 1.26/14 = 0.09
• sGPA = √0.09 = 0.3
• Cov = SP/(N-1) = 84/14 = 6
• r = Cov/(sSAT)(sGPA) = 6/((40)(0.3)) = 0.5
• R² = 0.5² = 0.25

• H0: the regression model, with SAT scores as a predictor, does not account for a significant portion of the variance in GPA (i.e., R² = 0)
• H1: the regression model, with SAT scores as a predictor, accounts for a significant portion of the variance in GPA (i.e., R² ≠ 0)
• SSMODEL = R²SSGPA = (0.25)(1.26) = 0.315
• SSRESIDUAL = (1 - R²)SSGPA = (0.75)(1.26) = 0.945
• N = 15, so F-critical (1, 13) = 4.67 and t-critical (13) = 2.16
• MSMODEL = SSMODEL/dfMODEL = 0.315/1 = 0.315
• MSRESIDUAL = SSRESIDUAL/dfRESIDUAL = 0.945/13 = 0.073
• F = MSMODEL/MSRESIDUAL = 0.315/0.073 = 4.32
• t = √F = √4.32 = 2.08
• Fail to reject H0; the regression model does not account for a significant portion of the variance in GPA scores; F(1, 13) = 4.32, ns; R² = 0.25, and the slope of the regression line is not significant, so SAT scores do not predict GPA; t(13) = 2.08, ns
• sY-Ŷ = √(SSRESIDUAL/df) = √(0.945/13) = 0.27
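A minimal sketch of the F-test for this example, assuming Python and scipy (not used in the lecture), computed from R² and SSGPA:

```python
# Minimal sketch (Python and scipy assumed): F-test for the SAT/GPA example
# from R^2 and SS_GPA.
from scipy import stats

r_squared, ss_gpa, n = 0.25, 1.26, 15
df_model, df_residual = 1, n - 2

ss_model = r_squared * ss_gpa           # 0.315
ss_residual = (1 - r_squared) * ss_gpa  # 0.945

F = (ss_model / df_model) / (ss_residual / df_residual)  # ≈ 4.33
F_crit = stats.f.ppf(0.95, df_model, df_residual)         # ≈ 4.67

print(f"F(1, {df_residual}) = {F:.2f}, F-critical = {F_crit:.2f}")
# F < F-critical, so H0 is not rejected, matching the conclusion above.
```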

End of Lecture
