L2 SLR model 2

The document discusses the Ordinary Least Squares (OLS) estimator for simple linear regression, detailing how to derive the estimators for the slope and intercept by minimizing the sum of squared residuals. It includes an example using data on samosa consumption in relation to price, demonstrating the regression analysis and interpretation of results. Additionally, it covers properties of the estimators, including unbiasedness and efficiency, and introduces the concepts of total, explained, and residual sums of squares.

GC code: trkn2ml

Simple Linear Regression Model cont.


The OLS estimator
Recap: Given the three assumptions outlined earlier, we showed that
we could choose β̂0 and β̂1 by OLS, i.e. by minimizing the sum of
squared residuals (SSR):

\[
\min_{\hat\beta_0,\hat\beta_1} \; S = \sum_{i=1}^{n} \left[ y_i - \hat\beta_0 - \hat\beta_1 x_i \right]^2
\]

First order conditions:


\[
\frac{\partial S}{\partial \hat\beta_0} = -2\sum_{i=1}^{n} \left[ y_i - \hat\beta_0 - \hat\beta_1 x_i \right] = 0
\;\Rightarrow\; \sum_{i=1}^{n} y_i = n\hat\beta_0 + \hat\beta_1 \sum_{i=1}^{n} x_i \qquad (1)
\]

\[
\frac{\partial S}{\partial \hat\beta_1} = -2\sum_{i=1}^{n} \left[ y_i - \hat\beta_0 - \hat\beta_1 x_i \right] x_i = 0
\;\Rightarrow\; \sum_{i=1}^{n} x_i y_i = \hat\beta_0 \sum_{i=1}^{n} x_i + \hat\beta_1 \sum_{i=1}^{n} x_i^2 \qquad (2)
\]

(1) and (2) are referred to as the normal equations.


Note that dividing (1) through by n gives
\[
\bar y = \hat\beta_0 + \hat\beta_1 \bar x \;\Rightarrow\; \hat\beta_0 = \bar y - \hat\beta_1 \bar x .
\]
Plugging this into (2) and recognizing that

• \(\sum_i x_i = n\bar x\), where \(\bar x\) is the sample mean of x
• \(\sum_i (x_i - \bar x)(y_i - \bar y) = \sum_i x_i y_i - n\bar x \bar y\)
• \(\sum_i (x_i - \bar x)^2 = \sum_i x_i^2 - n\bar x^2\)

we get \(\sum_i x_i y_i - n\bar x \bar y = \hat\beta_1 \left( \sum_i x_i^2 - n\bar x^2 \right)\), so that

\[
\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n}(x_i - \bar x)^2} .
\]
Together with \(\hat\beta_0 = \bar y - \hat\beta_1 \bar x\), these are the OLS estimators of β1 and β0.
There is a close relationship between the estimated slope
coefficient β̂1 and the correlation coefficient. Recall that the
sample correlation coefficient ρ̂ is given by
\[
\hat\rho = \frac{\sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y)}{\sqrt{\sum_{i=1}^{n}(x_i - \bar x)^2 \, \sum_{i=1}^{n}(y_i - \bar y)^2}}
\]
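The relationship is not spelled out on the slide, but it follows directly from the two formulas above, since β̂1 and ρ̂ share the same numerator:
\[
\hat\beta_1 = \hat\rho \cdot \sqrt{\frac{\sum_{i=1}^{n}(y_i - \bar y)^2}{\sum_{i=1}^{n}(x_i - \bar x)^2}} = \hat\rho \cdot \frac{s_y}{s_x},
\]
where s_y and s_x are the sample standard deviations of y and x. In other words, β̂1 is the correlation coefficient rescaled into units of y per unit of x.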
Example

Eco 221 students of 2024 were asked how many samosas they would buy
per month at various prices. Here are some summary statistics:

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    quantity |        336    7.639881    7.288393          1         22
       price |        336    14.85714    6.137028          5         25

We now want to ask: by how much would the average consumption of
samosas decrease if the price were increased by one rupee (or by ten
rupees)? To answer, we can regress quantity (y) on price (x).
. reg quantity price

------------------------------------------------------------------------------
quantity | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
price | -.2929574 .062975 -4.65 0.000 -.416835 -.1690799
_cons | 11.99239 1.012088 11.85 0.000 10.00152 13.98326
------------------------------------------------------------------------------

For now we focus only on the "Coefficient" column: β̂1 is
approximately -0.293 and β̂0 (also called the constant) is
approximately 12.
Interpretation: every ₹10 increase in the price of a samosa is
associated with a decline in average consumption of approximately
3 samosas (2.93, rounded up).
Deriving the OLS slope coefficient manually

. gen q1 = quantity - 7.639881    // This generates y_i - ybar

. gen p1 = price - 14.85714       // This generates x_i - xbar

. gen q1p1 = q1*p1                // This generates the product of the two

. gen p1sq = p1^2                 // This generates the square of x_i - xbar

. summ q1p1 p1sq

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        q1p1 |        336   -11.00085    41.75874  -141.5497   115.2241
        p1sq |        336    37.55102    40.58872    .020409   102.8776

. disp -11.00085*336    // This is the sum of q1p1 (the numerator of the OLS estimator)
-3696.2856

. disp 37.5512*336      // This is the sum of p1sq (the denominator of the OLS estimator)
12617.203

. disp -3696.2856/12617.203
-.29295602    // This is the estimated slope coefficient (with some rounding error)
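The same calculation can be done without typing the sample means in by hand. A minimal sketch (hypothetical variable and scalar names, assuming the samosa data are still in memory):

egen qbar = mean(quantity)               // sample mean of y
egen pbar = mean(price)                  // sample mean of x
gen num_i = (quantity - qbar)*(price - pbar)
gen den_i = (price - pbar)^2
quietly summ num_i
scalar numer = r(sum)                    // sum of (x_i - xbar)(y_i - ybar)
quietly summ den_i
scalar denom = r(sum)                    // sum of (x_i - xbar)^2
disp numer/denom                         // should reproduce -.2929574 up to rounding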
Algebraic properties of the SLR model

The normal equations from the first order conditions imply:

1. The estimated residuals sum to zero: \(\sum_i \hat u_i = 0\). This follows
   from the first order condition (1).
2. The regression line passes through the point of sample means, i.e.
   \(\bar y = \hat\beta_0 + \hat\beta_1 \bar x\), so the residual evaluated at the sample means is
   zero (and \(\bar{\hat y} = \bar y\)). This also follows from the first order condition (1).
3. \(\sum_i x_i \hat u_i = 0\), which follows from the first order condition (2).
   This in turn means that the sample covariance between x and û is zero.
   Why? \(\widehat{\mathrm{Cov}}(x, \hat u) = \overline{x\hat u} - \bar x\,\bar{\hat u} = \overline{x\hat u} = 0\), using property 1 (\(\bar{\hat u} = 0\)).
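These properties are easy to verify numerically. A minimal Stata sketch (hypothetical variable name uhat, assuming the samosa data are in memory):

quietly reg quantity price
predict uhat, resid        // estimated residuals
summ uhat                  // the mean (and hence the sum) should be zero up to rounding
correlate price uhat       // the sample correlation with x should be zero up to rounding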
Partitioning the sums of squares

One consequence of properties 1 and 3 is that the sample covariance
between ŷ and û is zero: ŷ is a linear function of x (\(\hat y_i = \hat\beta_0 + \hat\beta_1 x_i\)),
x is uncorrelated with û by property 3, and adding the constant \(\hat\beta_0\) does
not affect the covariance. Thus \(y_i = \hat y_i + \hat u_i\) is in effect a decomposition
of y into two pieces that are uncorrelated in the sample.
Next we want to ask: how good a job did we do in getting ŷ close to y?
To answer this, define:

• SST (total sum of squares) \(= \sum_{i=1}^{n} (y_i - \bar y)^2\)
• SSE (explained sum of squares) \(= \sum_{i=1}^{n} (\hat y_i - \bar y)^2\)
• SSR (residual sum of squares) \(= \sum_{i=1}^{n} \hat u_i^2\)

Then the above decomposition implies that SST = SSE + SSR.


Proof

\[
SST = \sum_{i=1}^{n} (y_i - \bar y)^2
    = \sum_{i=1}^{n} \left[ (y_i - \hat y_i) + (\hat y_i - \bar y) \right]^2
    = \sum_{i=1}^{n} \left[ \hat u_i + (\hat y_i - \bar y) \right]^2
\]
\[
    = \sum_{i=1}^{n} \hat u_i^2
      + 2\sum_{i=1}^{n} \hat u_i (\hat y_i - \bar y)
      + \sum_{i=1}^{n} (\hat y_i - \bar y)^2
    = SSR + 0 + SSE
\]
The cross term is zero because \(\sum_i \hat u_i \hat y_i = 0\) (by algebraic properties 1
and 3, as noted above) and \(\sum_i \hat u_i \bar y = \bar y \sum_i \hat u_i = 0\) (by property 1).
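This decomposition is easy to check on the samosa data. A sketch in Stata (hypothetical variable and scalar names; self-contained, so it re-runs the regression):

quietly reg quantity price
predict yhat, xb                  // fitted values
predict ures, resid               // residuals
egen ybar = mean(quantity)
gen sst_i = (quantity - ybar)^2
gen sse_i = (yhat - ybar)^2
gen ssr_i = ures^2
quietly summ sst_i
scalar SST = r(sum)
quietly summ sse_i
scalar SSE = r(sum)
quietly summ ssr_i
scalar SSR = r(sum)
disp "SST = " SST "   SSE + SSR = " (SSE + SSR)    // the two should agree up to rounding

The ANOVA block of the regression output shown below reports the same three quantities: Model = SSE, Residual = SSR, Total = SST.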
Goodness of fit

Goodness of fit, R², is then defined as the ratio of SSE to SST:

\[
R^2 = \frac{\sum_{i=1}^{n} (\hat y_i - \bar y)^2}{\sum_{i=1}^{n} (y_i - \bar y)^2}
    = 1 - \frac{\sum_{i=1}^{n} \hat u_i^2}{\sum_{i=1}^{n} (y_i - \bar y)^2}
\]

Clearly, 0 ≤ R² ≤ 1. It is the proportion of the variation in y that is
explained by x. It summarizes how good a job we did in minimizing the
squared-residual distance \(\sum_i \hat u_i^2\), scaled by the total variation in y.
The higher the R², the better the fit.

SST, SSR, SSE and R² in our model

      Source |       SS           df       MS      Number of obs   =       336
-------------+----------------------------------   F(1, 334)       =     21.64
       Model |  1082.85435         1  1082.85435   Prob > F        =    0.0000
    Residual |  16712.5712       334  50.0376384   R-squared       =    0.0609
-------------+----------------------------------   Adj R-squared   =    0.0580
       Total |  17795.4256       335  53.1206734   Root MSE        =    7.0737

------------------------------------------------------------------------------
quantity | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
price | -.2929574 .062975 -4.65 0.000 -.416835 -.1690799
_cons | 11.99239 1.012088 11.85 0.000 10.00152 13.98326
------------------------------------------------------------------------------
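As a check on the definition above, the R-squared reported in the header can be recovered from the ANOVA block: R² = SSE/SST = 1082.85435/17795.4256 ≈ 0.0609, or equivalently 1 − SSR/SST = 1 − 16712.5712/17795.4256 ≈ 0.0609. Price explains only about 6% of the sample variation in quantity.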
The OLS estimators of β0 and β1 again

Note that while the β0 and β1 are population parameters and are
therefore given scalars, the OLS estimators β̂0 and β̂1 are random
variables. The actual magnitudes of the estimated slopes and
constants depend on the given sample that happens to be drawn.

To see this, let’s randomly split our data set of 336 observations into 15
subsamples and run the same OLS regression on each of them.
. splitsample, generate(samp) nsplit(15)

. tab samp

       samp |      Freq.     Percent        Cum.
------------+-----------------------------------
1 | 22 6.55 6.55
2 | 23 6.85 13.39
3 | 22 6.55 19.94
4 | 23 6.85 26.79
5 | 22 6.55 33.33
6 | 22 6.55 39.88
7 | 23 6.85 46.73
8 | 22 6.55 53.27
9 | 23 6.85 60.12
10 | 22 6.55 66.67
11 | 22 6.55 73.21
12 | 23 6.85 80.06
13 | 22 6.55 86.61
14 | 23 6.85 93.45
15 | 22 6.55 100.00
------------+-----------------------------------
Total | 336 100.00
. reg quantity price if samp==1
------------------------------------------------------------------------------
quantity | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
price | -.3990968 .2349877 -1.70 0.104 -.8877804 .0895869
_cons | 14.91626 4.003795 3.73 0.001 6.58991 23.24261
------------------------------------------------------------------------------

. reg quantity price if samp==2

------------------------------------------------------------------------------
quantity | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
price | -.4074795 .2300374 -1.77 0.092 -.8873291 .0723702
_cons | 13.16623 4.239979 3.11 0.006 4.321793 22.01068
------------------------------------------------------------------------------

Rather than listing these out 15 times, it is easier to visualize the
estimated slope coefficient from each of the 15 subsamples with a
histogram.
Plotting the 15 estimated slopes
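One way to produce such a histogram is to collect the 15 slope estimates with statsby and then plot them. A minimal sketch (assuming the samp variable created above; variable names are illustrative):

preserve
statsby slope=_b[price], by(samp) clear: reg quantity price
histogram slope, bin(8) xtitle("Estimated slope coefficient on price")
restore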

Thus, β̂1 is a random variable which has its own sampling distribution
with a mean and variance. The same will be true of β̂0.
“Desirable” properties of estimators

• Unbiasedness
• Efficiency (minimum variance) [we will cover this later]
These are finite sample properties

An estimator θ̂ of θ is said to be unbiased if E(θ̂) = θ

In the SLR case, we want β̂0 and β̂1 to be unbiased, i.e. E[β̂1] = β1 and
E[β̂0] = β0.
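A sketch of why unbiasedness holds for β̂1 (not worked through on this slide; it relies on the zero-conditional-mean assumption E(u_i | x) = 0 from the assumptions referred to earlier): substituting y_i = β0 + β1 x_i + u_i into the formula for β̂1 gives

\[
\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2},
\qquad\text{so}\qquad
E\!\left[\hat\beta_1 \mid x_1,\dots,x_n\right]
  = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x)\, E[u_i \mid x]}{\sum_{i=1}^{n} (x_i - \bar x)^2}
  = \beta_1 .
\]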
