Introduction to Regression Analysis
Lecturer: Wilhemina Adoma Pels
KNUST
January 24, 2024
SIMPLE LINEAR REGRESSION
REGRESSION
Regression is a statistical method used to describe the nature of the relationship between variables, that is, positive or negative, linear or nonlinear.
Regression analysis is used to predict the value of one variable (the dependent variable) on the basis of other variables (the independent variables).
The variable we are trying to predict is called the response or dependent variable, denoted Y.
The variable used to predict it is called the explanatory or independent variable, denoted X.
If we only have one independent variable, the model is
y = β0 + β1 x + ε (1)
This model is referred to as simple linear regression.
Applications
Economics
Social Science
Engineering
Management
Life & Biological Sciences
SIMPLE LINEAR REGRESSION
A model that estimates the linear relationship between a single dependent variable Y and an independent variable X.
Model
Yi = β0 + β1 Xi + εi i = 1, · · · , n (2)
Variables:
X = independent variable (we provide this)
Y = dependent variable (we observe this)
Parameters:
β0 = Y-intercept
β1 = slope
ε = random error
In this model β0 and β1 are the parameters and εi is the random error term; Yi and Xi are measured values.
SIMPLE LINEAR REGRESSION
Required Conditions (Assumptions)
For these regression methods to be valid, the following four conditions for the error variable ε must be met:
The probability distribution of ε is normal.
The mean of the distribution is 0; that is, E (ε) = 0.
The standard deviation of ε is σε, which is constant regardless of the value of x.
The value of ε associated with any particular value of y is
independent of ε associated with any other value of y .
LEAST SQUARE ESTIMATION OF THE PARAMETERS
Estimating the Coefficients
In much the same way we base estimates of µ on x̄, we estimate β0 with β̂0 and β1 with β̂1, the y-intercept and slope respectively of the least squares or regression line given by:

ŷ = β̂0 + β̂1x (3)

This is an application of the least squares method, and it produces the straight line that minimizes the sum of the squared differences between the observations yi and the fitted line.
The Least Squares Line
Figure: Least Squares Line
LEAST SQUARE ESTIMATION OF THE PARAMETERS
\[ L = \min \sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \tag{4} \]

\[ L = \min \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 \tag{5} \]

\[ \frac{\partial L}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \tag{6} \]

\[ \frac{\partial L}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)\, x_i = 0 \tag{7} \]

Simplifying the equations yields

\[ n \hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \tag{8} \]
LEAST SQUARE ESTIMATION OF THE PARAMETERS
Cont’d
\[ \hat{\beta}_0 \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i x_i \tag{9} \]

The solution to these equations gives the least squares estimators of β0 and β1. The least squares estimates of the intercept and slope in the simple linear regression model are

\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \tag{10} \]

and

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i x_i - \frac{\left(\sum_{i=1}^{n} y_i\right)\left(\sum_{i=1}^{n} x_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}} \tag{11} \]
LEAST SQUARE ESTIMATION OF THE PARAMETERS
Cont’d
The formula for the slope can also be written using sums of squares:

\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \tag{12} \]

where

\[ S_{xy} = \sum_{i=1}^{n} y_i x_i - \frac{\left(\sum_{i=1}^{n} y_i\right)\left(\sum_{i=1}^{n} x_i\right)}{n} \tag{13} \]

and

\[ S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \tag{14} \]
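As an illustration, the estimates in equations (10)–(14) can be computed directly from the raw sums. A minimal Python sketch (the function name `least_squares_fit` is ours, not part of the slides):

```python
def least_squares_fit(x, y):
    """Least squares estimates (b0, b1) using the sum-of-squares
    formulas: b1 = S_xy / S_xx and b0 = y_bar - b1 * x_bar."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    s_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    b1 = s_xy / s_xx                  # slope estimate
    b0 = sum_y / n - b1 * sum_x / n   # intercept estimate
    return b0, b1
```

Calling `least_squares_fit` on any paired data returns the intercept and slope of the fitted line ŷ = β̂0 + β̂1x.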
SIMPLE LINEAR REGRESSION
Regression Equation
The regression equation describes the regression line mathematically through β̂0 and β̂1, the intercept and the slope. In the graph below, a corresponds to β̂0 and b to β̂1.
REGRESSION
Cont’d
Example
The amount of a chemical compound y, dissolved in 100 grams of water at various temperatures x, was recorded as follows:

x (°C)      6   5  10   7   8  12   5   9   7  11
y (grams)  21  19  31  25  28  33  20  29  22  32
1. Fit the linear regression model y = β0 + β1x + ε to these data, using the method of least squares.
2. Estimate the amount of the chemical compound that will dissolve in 100 grams of water at 7.5 °C.
Solution
xi   yi   xi·yi   xi²
6 21 126 36
5 19 95 25
10 31 310 100
7 25 175 49
8 28 224 64
12 33 396 144
5 20 100 25
9 29 261 81
7 22 154 49
11 32 352 121
Σ = 80 260 2193 694
Solution
Sxy = 2193 − (260 × 80)/10 = 113

Sxx = 694 − (80)²/10 = 54

β̂1 = Sxy/Sxx = 113/54 = 2.093

With x̄ = 80/10 = 8 and ȳ = 260/10 = 26:

β̂0 = ȳ − β̂1x̄ = 26 − (2.093 × 8) = 9.259

The regression model is ŷ = 9.259 + 2.093x

2. When x = 7.5:

ŷ = 9.259 + 2.093(7.5) = 24.954
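As a quick check, the computation above can be reproduced in a few lines of Python (a sketch, not part of the original slides):

```python
x = [6, 5, 10, 7, 8, 12, 5, 9, 7, 11]
y = [21, 19, 31, 25, 28, 33, 20, 29, 22, 32]
n = len(x)

s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n  # 2193 - 2080 = 113
s_xx = sum(a ** 2 for a in x) - sum(x) ** 2 / n                # 694 - 640 = 54
b1 = s_xy / s_xx                     # slope, about 2.093
b0 = sum(y) / n - b1 * sum(x) / n    # intercept, about 9.259

print(round(b0 + b1 * 7.5, 3))       # predicted amount at 7.5 degrees C -> 24.954
```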
Interpretation of Coefficients in Regression Analysis
The coefficients describe the mathematical relationship between each
independent variable and the dependent variable.
The size of the coefficient for each independent variable gives you the
size of the effect that variable is having on your dependent variable,
and the sign on the coefficient (positive or negative) gives you the
direction of the effect.
1 A positive coefficient indicates that as the value of the independent
variable increases, the dependent variable also tends to increase.
2 A negative coefficient suggests that as the independent variable
increases, the dependent variable tends to decrease or vice versa.
The intercept is the average value of the dependent variable when the independent variable is zero.
Interpretation of Coefficients in Regression Analysis
Cont’d
Now interpret this Regression Equation;
ŷ = 4.692 + 0.923x (15)
SIMPLE LINEAR REGRESSION
Line of best fit Plot
SIMPLE LINEAR REGRESSION
Estimating the Variance of the error term ε
The residual

\[ \hat{\varepsilon}_i = y_i - \hat{y}_i \tag{16} \]

is used to obtain an estimate of the error term. The sum of squares of the residuals (the error sum of squares) is

\[ SSE = \sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{17} \]

The expected value of the error sum of squares is

\[ E(SSE) = (n-2)\sigma^2 \tag{18} \]
REGRESSION
Cont’d
Therefore the unbiased estimator of σ² is

\[ \hat{\sigma}^2 = \frac{SSE}{n-2} \tag{19} \]

Also, the standard error of estimate is

\[ S_\varepsilon = \sqrt{\frac{SSE}{n-2}} \tag{20} \]
If Sε is zero, all the points fall on the regression line. If Sε is small, the fit is excellent and the linear model can be used for forecasting. If Sε is large, the model fits poorly.
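As a sketch, Sε can be computed from the observations and fitted values (the function name and the variable names `y`/`y_hat` are ours, not part of the slides):

```python
import math

def standard_error_of_estimate(y, y_hat):
    """S_eps = sqrt(SSE / (n - 2)), where SSE is the sum of
    squared residuals (y_i - y_hat_i)^2."""
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(sse / (len(y) - 2))
```

For the chemical-compound example earlier, this gives Sε ≈ 1.30, which is small relative to the y values (19 to 33), consistent with a good fit.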
Example
The following measurements of the specific heat of a certain chemical were
made in order to investigate the variation in specific heat with
temperature.
Temperature °C (x)   0     10    20    30    40
Specific heat (y)   0.51  0.55  0.57  0.59  0.63

Find the least squares regression line of specific heat on temperature, and hence estimate the value of the specific heat when the temperature is 25 °C.
Solution
x     y     xy     x²
0    0.51    0      0
10   0.55   5.5    100
20   0.57  11.4    400
30   0.59  17.7    900
40   0.63  25.2   1600
Σ   100    2.85  59.8  3000

Sxy = Σxy − (Σx)(Σy)/n = 59.8 − (100)(2.85)/5 = 2.8

Sxx = Σx² − (Σx)²/n = 3000 − (100)²/5 = 1000

β̂1 = Sxy/Sxx = 2.8/1000 = 0.0028
Solution
β̂0 = ȳ − β̂1x̄

ȳ = 2.85/5 = 0.57

x̄ = 100/5 = 20

β̂0 = 0.57 − 0.0028(20) = 0.514

The fitted least squares regression line is ŷ = β̂0 + β̂1x:

ŷ = 0.514 + 0.0028x
Solution
At 25 °C:

ŷ = 0.514 + 0.0028(25) = 0.584
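As a check, the fitted line can be reproduced in Python (a sketch, not part of the original slides):

```python
x = [0, 10, 20, 30, 40]
y = [0.51, 0.55, 0.57, 0.59, 0.63]
n = len(x)

s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n  # 59.8 - 57 = 2.8
s_xx = sum(a ** 2 for a in x) - sum(x) ** 2 / n                # 3000 - 2000 = 1000
b1 = s_xy / s_xx                     # slope, 0.0028
b0 = sum(y) / n - b1 * sum(x) / n    # intercept, 0.514

print(round(b0 + b1 * 25, 3))        # predicted specific heat at 25 degrees C -> 0.584
```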
REGRESSION
Testing the slope
If no linear relationship exists between the two variables, we would
expect the regression line to be horizontal, that is, to have a slope of
zero.
We want to see if there is a linear relationship, i.e. we want to see if the slope (β1) is something other than zero. Our hypotheses become:
H0 : β1 = 0 [no linear relationship]
H1 : β1 ≠ 0 [there is a linear relationship]
REGRESSION
Cont’d
We can use this test statistic to test the hypothesis:

\[ t = \frac{\hat{\beta}_1 - \beta_1}{S_{\hat{\beta}_1}} \tag{21} \]

where S_{β̂1} is the standard deviation of β̂1, defined as

\[ S_{\hat{\beta}_1} = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} \tag{22} \]

where

\[ S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \tag{23} \]
REGRESSION
Cont’d
If the error term ε is normally distributed, the test statistic has a Student t-distribution with n − 2 degrees of freedom. The rejection region depends on whether we are doing a one-tailed or two-tailed test (a two-tailed test is most typical).
We reject the null hypothesis H0 if |tcal| > tα/2, n−2.
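Using the chemical-compound data from the earlier example, the test can be sketched in Python (the critical value 2.306 is t at α/2 = 0.025 with 8 degrees of freedom, taken from a t-table; a library such as scipy could compute it, but it is hard-coded here to stay dependency-free):

```python
import math

x = [6, 5, 10, 7, 8, 12, 5, 9, 7, 11]
y = [21, 19, 31, 25, 28, 33, 20, 29, 22, 32]
n = len(x)

s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
s_xx = sum(a ** 2 for a in x) - sum(x) ** 2 / n
s_yy = sum(b ** 2 for b in y) - sum(y) ** 2 / n
b1 = s_xy / s_xx

sse = s_yy - b1 * s_xy            # shortcut: SSE = S_yy - b1 * S_xy
sigma2 = sse / (n - 2)            # unbiased estimate of sigma^2
s_b1 = math.sqrt(sigma2 / s_xx)   # standard deviation of b1-hat
t = b1 / s_b1                     # test statistic under H0: beta_1 = 0

t_crit = 2.306                    # t_{0.025, 8} from a t-table
print(round(t, 2), abs(t) > t_crit)
```

Here t ≈ 11.82 exceeds 2.306, so H0 is rejected: the data support a linear relationship between temperature and amount dissolved.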
Properties of the OLS estimates
These can be summarized as: the OLS estimator is BLUE
B - Best
L - Linear
U - Unbiased
E - Estimator
Note: The Gauss–Markov theorem is required for the proof.
GROUP ASSIGNMENT
1 PROVE THAT OLS IS BLUE
2 Estimate β0 and β1
Show working
Trial Questions
A study was made on the amount of converted sugar (y) in a certain process at various temperatures (x). The data were coded and recorded as follows:

(x) 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
(y) 8.1 7.8 8.5 9.8 9.5 8.9 8.6 10.2 9.3 9.2 10.5

a. Find the equation of the least squares regression line.
b. Estimate the converted sugar when the coded temperature is 1.75.
Trial Question
Regression methods were used to analyze the data from a study investigating the relationship between roadway surface temperature (x) and pavement deflection (y). Summary quantities were:

n = 20, Σyi = 12.75, Σyi² = 8.86, Σxi = 1478, Σxi² = 143215.8, and Σxi yi = 1083.67

a. Calculate the least squares estimates of the slope and intercept of the linear regression line.
b. Use the equation of the fitted regression line to predict the pavement deflection when the surface temperature is 75 °F.
Thank You.