Topic 3a

Uploaded by

Edlyn Linet

Topic 3

Simple Linear Regression

Learning objectives
• Explain the assumptions of classical linear regression using the Gauss-Markov Theorem (GMT) framework.
• Estimate the parameters of a regression using the ordinary least squares (OLS) method.
• Interpret the coefficients of a regression.
• Understand the steps of hypothesis testing
• Standard errors
• Confidence intervals

Gauss-Markov Theorem:
Under the 5 Gauss-Markov assumptions, the OLS estimator is the best linear unbiased estimator of the true parameters (β’s), conditional on the sample values of the explanatory variables. In other words, the OLS estimator is BLUE.

5 Gauss-Markov Assumptions for the Simple Linear Model (Wooldridge, p. 65)
1. Linear in parameters: $y = \beta_0 + \beta_1 x + u$
2. Random sampling of n observations: $\{(x_i, y_i) : i = 1, 2, \dots, n\}$
3. Sample variation in the explanatory variable: the $x_i$’s ($x_1, x_2, x_3, \dots, x_n$) are not all the same value.
4. Zero conditional mean: the error u has an expected value of 0 given any value of the explanatory variable, $E(u \mid x) = 0$.
5. Homoskedasticity: the error has the same variance given any value of the explanatory variable, $\mathrm{Var}(u \mid x) = \sigma^2$.

How Good are the Estimates?
Properties of Estimators
• Small Sample Properties
  • True regardless of how much data we have
  • Most desirable characteristics
    • Unbiased
    • Efficient
    • BLUE (Best Linear Unbiased Estimator)

“Second Best” Properties of Estimators
• Asymptotic (or large sample) Properties
  • True in the hypothetical instance of infinite data
  • In practice applicable if N > 50 or so
  • Asymptotically unbiased
  • Consistency
  • Asymptotic efficiency

Bias
• An estimator is unbiased if
  $E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, \dots, k$
• In other words, the average value of the estimator in repeated sampling equals the true parameter.
• Note that whether an estimator is biased or not implies nothing about its dispersion.
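
A minimal simulation sketch of repeated sampling, in Python with only the standard library. The true parameters (intercept 1, slope 2), the x values, the error standard deviation and the number of repetitions are hypothetical choices for illustration, not values from the slides; the helper names `ols_slope` and `simulate_slopes` are likewise made up here.

```python
import random


def ols_slope(xs, ys):
    """OLS slope: sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    return ss_xy / ss_xx


def simulate_slopes(beta0, beta1, xs, sigma, reps, seed=0):
    """Slope estimates from `reps` samples of y = beta0 + beta1*x + u, u ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        slopes.append(ols_slope(xs, ys))
    return slopes


# Hypothetical true parameters and design, chosen only for illustration.
TRUE_BETA0, TRUE_BETA1 = 1.0, 2.0
xs = [1.0, 2.0, 3.0, 4.0, 5.0]

slopes = simulate_slopes(TRUE_BETA0, TRUE_BETA1, xs, sigma=1.0, reps=5000)
mean_slope = sum(slopes) / len(slopes)
print(f"average slope estimate over {len(slopes)} samples: {mean_slope:.3f}")
print(f"smallest and largest single estimates: {min(slopes):.3f}, {max(slopes):.3f}")
```

With these settings the average of the simulated slope estimates should land close to the true slope of 2, while individual estimates still scatter around it. That scatter is the dispersion that unbiasedness says nothing about.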

Efficiency
• An estimator is efficient if its variance is smaller than that of any other estimator of the parameter.
• This criterion is only useful in combination with others (e.g. $\hat{\beta}_j = 2$ has low variance, but is biased).
• $\hat{\beta}_j$ is the “best” unbiased estimator if $\mathrm{Var}(\hat{\beta}_j) \le \mathrm{Var}(\tilde{\beta}_j)$, where $\tilde{\beta}_j$ is any other unbiased estimator of β.

[Figure: sampling distributions of three estimators of β. An unbiased and efficient estimator is centred on the true β; a biased estimator is centred on β + bias; an estimator with high sampling variance is an inefficient estimator of β.]
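
BLUE (Best Linear Unbiased Estimator)
• An estimator $\hat{\beta}_j$ is BLUE if:
  • $\hat{\beta}_j$ is a linear function of the observed $y_i$
  • $\hat{\beta}_j$ is unbiased: $E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, \dots, k$
  • $\hat{\beta}_j$ is the most efficient: $\mathrm{Var}(\hat{\beta}_j) \le \mathrm{Var}(\tilde{\beta}_j)$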

Large Sample Properties
• Asymptotically Unbiased
  • As n becomes larger, $E(\hat{\beta}_j)$ tends toward $\beta_j$
• Consistency
  • If the bias and variance both shrink toward zero as n gets larger, the estimator is consistent.
• Asymptotic Efficiency
  • has an asymptotic distribution with finite mean and variance
  • is consistent
  • no other consistent estimator has a smaller asymptotic variance

[Figure: Demonstration of consistency. Sampling distributions of the estimator for n = 4, n = 16 and n = 50, drawn around the true β.]
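
The same idea can be sketched numerically. The Python snippet below simulates the slope estimator for the three sample sizes named in the figure (n = 4, 16, 50). Everything else is an assumption made for illustration: the population line y = 1 + 2x, standard normal errors, x values cycling through 1 to 4, and the helper names.

```python
import random
import statistics


def ols_slope(xs, ys):
    """OLS slope: sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    return ss_xy / ss_xx


def slope_distribution(n, reps=2000, seed=1):
    """Mean and variance of the slope estimate across `reps` samples of size n."""
    rng = random.Random(seed)
    xs = [float(i % 4 + 1) for i in range(n)]  # x cycles through 1, 2, 3, 4
    slopes = []
    for _ in range(reps):
        # Hypothetical population: y = 1 + 2x + u, with u ~ N(0, 1).
        ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
        slopes.append(ols_slope(xs, ys))
    return statistics.mean(slopes), statistics.variance(slopes)


results = {n: slope_distribution(n) for n in (4, 16, 50)}
for n, (mean, var) in results.items():
    print(f"n = {n:2d}: mean slope = {mean:.3f}, variance of slope = {var:.4f}")
```

Under these assumptions the simulated variance of the slope estimate should fall as n grows while its mean stays near 2, which is the pattern the consistency figure illustrates.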
Linear Regression Model

Types of Regression Models
• Simple (1 explanatory variable)
  • Linear
  • Non-linear
• Multiple (2+ explanatory variables)
  • Linear
  • Non-linear

Linear Equations
Y = mX + b
• m = slope = (change in Y) / (change in X)
• b = Y-intercept
[Figure: a straight line plotted on X and Y axes.]

Linear Regression Model
• 1. Relationship between variables is a linear function
  $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
  • $Y_i$: dependent (response) variable (e.g., CD+ c.)
  • $X_i$: independent (explanatory) variable (e.g., Years s. serocon.)
  • $\beta_0$: population Y-intercept
  • $\beta_1$: population slope
  • $\varepsilon_i$: random error

Population & Sample Regression Models
[Figure: a random sample is drawn from a population in which the relationship $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ is unknown.]
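
Population Linear Regression Model
[Figure: observed values scattered around the population line $E(Y) = \beta_0 + \beta_1 X_i$. Each observed value is $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $\varepsilon_i$ is the random error.]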
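
Sample Linear Regression Model
[Figure: observed values and one unsampled observation around the fitted line $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$. The residual $\hat{e}_i = Y_i - \hat{Y}_i$ is the estimated random error.]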

Estimating Parameters: Ordinary Least Squares Method

Scatter plot
• 1. Plot of all (Xi, Yi) pairs
• 2. Suggests how well the model will fit
[Figure: scatter plot of (X, Y) points, with both axes running from 0 to 60.]

Thinking Challenge
How would you draw a line through the points? How do you determine which line ‘fits best’?
[Figure: the same scatter plot (both axes from 0 to 60) shown with alternative candidate lines: slope changed with intercept unchanged; slope unchanged with intercept changed; both slope and intercept changed.]

Ordinary Least Squares
• 1. ‘Best fit’ means the differences between actual Y values and predicted Y values are a minimum. But positive differences offset negative ones, so square the errors:
  $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2$
• 2. LS minimizes the sum of the squared differences (errors) (SSE)
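
To make the squaring step concrete, here is a small Python sketch that compares three candidate lines on the five (X, Y) pairs from the estriol example used later in this deck. The first two candidate lines are hypothetical, picked only for illustration; the third is the least squares line obtained in the worked solution. The function names are made up here.

```python
def raw_error_sum(intercept, slope, xs, ys):
    """Sum of (actual - predicted); positive and negative errors can cancel."""
    return sum(y - (intercept + slope * x) for x, y in zip(xs, ys))


def sse(intercept, slope, xs, ys):
    """Sum of squared errors; squaring stops the cancellation."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]  # estriol (mg/24h)
ys = [1, 1, 2, 2, 4]  # birthweight (g/1000)

candidates = [
    ("flat line at Y = 2", 2.0, 0.0),             # hypothetical candidate
    ("intercept 0.5, slope 0.5", 0.5, 0.5),       # hypothetical candidate
    ("least squares line", -0.10, 0.70),          # from the worked solution
]

for label, b0, b1 in candidates:
    total = raw_error_sum(b0, b1, xs, ys)
    squared = sse(b0, b1, xs, ys)
    print(f"{label:26s} sum of errors = {total:6.2f}   SSE = {squared:5.2f}")
```

All three lines pass through the point of means (3, 2), so their raw errors sum to approximately zero and cannot separate them. Their SSE values differ (6.00, 1.50 and 1.10), and the least squares line has the smallest.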
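
Ordinary Least Squares Graphically
OLS minimizes $\sum_{i=1}^{n} \hat{e}_i^2 = \hat{e}_1^2 + \hat{e}_2^2 + \hat{e}_3^2 + \hat{e}_4^2$
[Figure: four points and the fitted line $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$, with the residuals $\hat{e}_1, \dots, \hat{e}_4$ marked, e.g. $\hat{e}_2 = Y_2 - \hat{Y}_2$.]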
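
Coefficient Equations
• Prediction equation
  $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
• Sample slope
  $\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
• Sample Y-intercept
  $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$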
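
A minimal Python sketch of these coefficient equations follows. The function name `ols_fit` and the four test points are hypothetical; the points lie exactly on y = 1 + 2x so that the expected answer is easy to see. The check on SS_xx mirrors the sample-variation assumption: if all x values are equal, the slope is undefined.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) using SS_xy / SS_xx and y_bar - slope * x_bar."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    if ss_xx == 0:
        raise ValueError("no sample variation in x: the slope is undefined")
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical points lying exactly on y = 1 + 2x.
b0, b1 = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

Because the points sit exactly on a line, the fit recovers an intercept of 1 and a slope of 2; with noisy data the same function returns the least squares estimates.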

Interpretation of Coefficients
• 1. Slope ($\hat{\beta}_1$)
  • Estimated Y changes by $\hat{\beta}_1$ for each 1 unit increase in X
  • A 1 unit increase in X leads to a (+/-) $\hat{\beta}_1$ unit change in Y
  • If $\hat{\beta}_1 = 2$, then Y is expected to increase by 2 for each 1 unit increase in X
• 2. Y-intercept ($\hat{\beta}_0$)
  • Average value of Y when X = 0
  • If $\hat{\beta}_0 = 4$, then average Y is expected to be 4 when X is 0

Parameter Estimation Example
• Obstetrics: What is the relationship between mother’s estriol level and birthweight, using the following data?

Estriol (mg/24h)   Birthweight (g/1000)
1                  1
2                  1
3                  2
4                  2
5                  4

Exercise
• Plot a scatter diagram
• Estimate the linear regression
• Interpret your results based on economic theory
• Show that
• Show that the SSE ≈ 0

Scatterplot
[Figure: scatterplot of birthweight vs. estriol level, with estriol level (0 to 6) on the horizontal axis and birthweight (0 to 4) on the vertical axis.]

Parameter Estimation Solution Table

        Xi     Yi     Xi²    Yi²    XiYi
        1      1      1      1      1
        2      1      4      1      2
        3      2      9      4      6
        4      2      16     4      8
        5      4      25     16     20
Total   15     10     55     26     37
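
Parameter Estimation Solution
$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} X_i Y_i - \frac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}} = \frac{37 - \frac{(15)(10)}{5}}{55 - \frac{(15)^2}{5}} = 0.70$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 2 - (0.70)(3) = -0.10$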
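
The arithmetic of this solution can be checked with a few lines of Python. This is a sketch using the sums form of the slope formula on the estriol data; the variable names are illustrative and not from the slides.

```python
xs = [1, 2, 3, 4, 5]  # estriol (mg/24h)
ys = [1, 1, 2, 2, 4]  # birthweight (g/1000)
n = len(xs)

# Column totals, matching the solution table.
sum_x = sum(xs)                              # 15
sum_y = sum(ys)                              # 10
sum_x2 = sum(x * x for x in xs)              # 55
sum_xy = sum(x * y for x, y in zip(xs, ys))  # 37

# Slope from the sums form, then intercept from the sample means.
slope = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
intercept = sum_y / n - slope * sum_x / n

# Residuals: actual minus fitted values.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"sum of residuals = {sum(residuals):.6f}")
```

The slope and intercept agree with the hand calculation (0.70 and -0.10), and the residuals of the fitted line sum to zero up to floating-point rounding.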

Coefficient Interpretation Solution
• 1. Slope ($\hat{\beta}_1$)
  • Birthweight (Y) is expected to increase by 0.7 units for each 1 unit increase in estriol (X)
• 2. Intercept ($\hat{\beta}_0$)
  • Average birthweight (Y) is -0.10 units when estriol level (X) is 0
  • Difficult to explain: birthweight (or any weight for that matter) should always be positive

Limitations of simple linear regression
• Only considers one independent variable.
• The dependent variable must be continuous.
• Cannot show causation.
• Sensitive to outliers.
• Can only describe linear relationships.
• Only looks at the mean of the dependent variable.

Statistical Inference

Hypothesis testing
Confidence intervals

Hypothesis Testing
Two-Tailed Test about a Population Mean: Small n
[Figure: t distribution centred at 0 with a "Reject H0" region of area α/2 in each tail, beyond the critical values $-t_{\alpha/2}$ and $t_{\alpha/2}$. Source: Anderson, Sweeney, and Williams]

Student’s t-test
• The t-test is used to test hypotheses about means when the population variance is unknown (the usual case). It is closely related to z, the unit normal.
• Remember: if the sample is small (n < 30) and the population variance $\sigma^2$ is unknown, then we use the t-test and not the z-test.

Steps of Hypothesis Testing
1. Determine the null and alternative hypotheses.
2. Specify the level of significance α.
3. Select and calculate the test statistic that will be used to test the hypothesis.
Using the Test Statistic
4. Use α to determine the critical value for the test statistic. The critical value comes from the Student’s t-distribution table.
5. State the rejection rule for H0. Use the value of the test statistic and the rejection rule to determine whether to reject H0.
6. Make a conclusion on the statistical significance of the coefficient.

How do we compute the test statistic?
For our cases, the hypothesized value of the coefficient under H0 is 0.

How do we get the t-critical?
• From the Student’s t-distribution tables
• Recall that this is a 2-tailed test, so check α = 0.05 from tables
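
As a hedged sketch, the usual t statistic for a single regression coefficient can be written as below. The symbols $\beta_j^{H_0}$ (the value of the coefficient under the null hypothesis) and $SE(\hat{\beta}_j)$ (the standard error of the estimate) are notation assumed here, not taken from the slides.

```latex
t = \frac{\hat{\beta}_j - \beta_j^{H_0}}{SE(\hat{\beta}_j)}
\qquad\Longrightarrow\qquad
t = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)} \quad \text{when } \beta_j^{H_0} = 0
```

In simple linear regression this statistic is compared with a critical value from the t distribution with n - 2 degrees of freedom.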

Finding the Standard Errors
The standard error (SE) of each coefficient estimate is the square root of its estimated variance.
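
For reference, the textbook expressions for the standard errors of the intercept and slope in simple linear regression, under the Gauss-Markov assumptions, are sketched below. The notation ($\hat{\sigma}^2$ for the estimated error variance, $SS_{xx}$ as in the slope formula) is an assumption here and may differ from the course's own.

```latex
\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} \hat{e}_i^2}{n - 2},
\qquad
SE(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{SS_{xx}}},
\qquad
SE(\hat{\beta}_0) = \sqrt{\frac{\hat{\sigma}^2 \sum_{i=1}^{n} x_i^2}{n \, SS_{xx}}},
\qquad
SS_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2
```

The divisor n - 2 reflects the two parameters (intercept and slope) estimated from the sample.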

How do we find the critical values?
t distribution values, with comparison to the Z value

Confidence level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   Z
.80                1.372         1.325         1.310         1.28
.90                1.812         1.725         1.697         1.64
.95                2.228         2.086         2.042         1.96
.99                3.169         2.845         2.750         2.58

Note: t → Z as n increases

from “Statistics for Managers Using Microsoft® Excel”, 4th Edition, Prentice-Hall 2004

Confidence Intervals
Confidence interval: an interval of values computed from the sample that is almost sure to cover the true population value.

We make confidence intervals using values computed from the sample, because the true values from the population are unknown.

The confidence level is the probability that we do not find a statistically significant effect if the effect of an independent variable is zero.

It is related to the significance level α and is defined as 1 - α.

Confidence Intervals
Interpretation: In 95% of the samples we take, the true population proportion (or mean) will be in the interval.

We are 95% confident that the true parameter lies between the lower limit and the upper limit.

This is the same as saying we are 95% confident that the true population proportion (or mean) will be in the interval.
How do we compute the intervals?
Single population mean (small n, normally distributed)
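
A common textbook form of this interval is given below as an assumption, since the notation is not spelled out in the text: $\bar{x}$ is the sample mean, $s$ the sample standard deviation, and $t_{\alpha/2,\,n-1}$ the two-tailed critical value with n - 1 degrees of freedom.

```latex
\bar{x} \;\pm\; t_{\alpha/2,\, n-1} \cdot \frac{s}{\sqrt{n}}
```

The same pattern (estimate ± critical value × standard error) carries over to a regression coefficient, with n - 2 degrees of freedom in simple linear regression.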
Hypothesis Testing Example
• Obstetrics: What is the relationship between mother’s estriol level and birthweight, using the following data?

Estriol (mg/24h)   Birthweight (g/1000)
1                  1
2                  1
3                  2
4                  2
5                  4

Exercise 2
• Compute the standard errors for $\hat{\beta}_0$ and $\hat{\beta}_1$
• Test the statistical significance of the slope at 5% level (α = 0.05)
• Compute the confidence intervals for $\beta_0$ and $\beta_1$
• Write out the compact form of the regression equation:

() (SE)
=?
n =?
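
One way to work through these steps on the estriol data is sketched below in Python. It assumes the standard simple-regression formulas for the error variance and standard errors (which the slides do not spell out) and a two-tailed 5% critical value of 3.182 for n - 2 = 3 degrees of freedom, taken from a standard t table rather than from the excerpt in this deck, which starts at 10 d.f. Variable names are illustrative.

```python
import math

xs = [1, 2, 3, 4, 5]  # estriol (mg/24h)
ys = [1, 1, 2, 2, 4]  # birthweight (g/1000)
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
ss_xx = sum((x - x_bar) ** 2 for x in xs)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

b1 = ss_xy / ss_xx        # slope estimate
b0 = y_bar - b1 * x_bar   # intercept estimate

# Estimated error variance: SSE divided by n - 2 degrees of freedom (assumed formula).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
sigma2_hat = sse / (n - 2)

# Standard errors of the slope and intercept (assumed textbook formulas).
se_b1 = math.sqrt(sigma2_hat / ss_xx)
se_b0 = math.sqrt(sigma2_hat * sum(x * x for x in xs) / (n * ss_xx))

# Two-tailed test of H0: slope = 0 at the 5% level.
T_CRIT = 3.182  # assumed: t table value for 3 d.f., 0.025 in each tail
t_stat = b1 / se_b1
reject_h0 = abs(t_stat) > T_CRIT

# 95% confidence interval for the slope.
ci_b1 = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)

print(f"b0 = {b0:.2f} (SE {se_b0:.3f}), b1 = {b1:.2f} (SE {se_b1:.3f}), n = {n}")
print(f"t = {t_stat:.3f}, reject H0 at 5%: {reject_h0}")
print(f"95% CI for slope: ({ci_b1[0]:.3f}, {ci_b1[1]:.3f})")
```

Under these assumptions the t statistic for the slope exceeds the critical value, so the slope is statistically significant at the 5% level, and the 95% interval for the slope excludes zero.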

Exercise 3
• The following data relates to the quantity demanded and price of a commodity collected from five markets.

Price               1    2    3    4    5
Quantity demanded   15   10   14   8    3

Exercise 3 (continued)
• Plot a scatter diagram
• Estimate the linear regression
• Interpret your results based on economic theory
• Show that
• Show that the SSE ≈ 0
• Compute the standard error
• Test the statistical significance of the slope at 5% level (α = 0.05)
• Write out the compact form of the regression equation
• Compute the confidence intervals

Conclusion from Statistical Analysis
Types of Statistical Errors: Type I and Type II Error
• False Positive (Type I Error)
  • Interpretation: You predicted positive and it’s false.
  • You predicted that a man is pregnant but he actually is not.
• False Negative (Type II Error)
  • Interpretation: You predicted negative and it’s false.
  • You predicted that a woman is not pregnant but she actually is.
