P1.T2. Quantitative Analysis: Stock & Watson, Introduction To Econometrics Linear Regression With One Regressor

The document provides an overview of linear regression with one regressor using ordinary least squares (OLS). It defines key concepts in regression analysis including the population and sample regression functions, regression coefficients, slope, intercept, and error terms. It describes how OLS finds the intercept and slope that minimize the sum of squared errors between predicted and actual values. The document also outlines assumptions of the classical linear regression model including that the error term has a mean of zero and constant variance.


P1.T2.

Quantitative Analysis

Stock & Watson, Introduction to Econometrics

Linear Regression with one regressor

Bionic Turtle FRM Video Tutorials


By David Harper, CFA FRM
Linear Regression with one regressor

• Explain how regression analysis in econometrics measures the relationship
between dependent and independent variables.
• Define and interpret a population regression
function, regression coefficients, parameters,
slope, intercept, and the error term.
• Define and interpret a sample regression function,
regression coefficients, parameters, slope, intercept,
and the error term.
• Describe the key properties of a linear regression.

Page 2
Linear Regression with one regressor -- continued

• Define an ordinary least squares (OLS) regression and calculate the
intercept and slope of the regression.
• Describe the method and three key
assumptions of OLS for estimation of parameters.
• Summarize the benefits of using OLS estimators.
• Describe the properties of OLS estimators and their sampling
distributions, and explain the properties of consistent estimators in
general.
• Define and interpret the explained sum of squares, the total sum of
squares, the residual sum of squares, the standard error of the
regression, and the regression R².
• Interpret the results of an OLS regression.

Page 3
Explain how regression analysis in econometrics measures the
relationship between dependent and independent variables.
A linear regression may have one or more of the following objectives:
• To estimate the (conditional) mean or average value of the dependent variable
• To test hypotheses about the nature of the dependence
• To predict or forecast the dependent variable
• One or more of the above

Page 4
Explain how regression analysis in econometrics measures the
relationship between dependent and independent variables.
A linear regression may have one or more of the following objectives:
• To estimate the (conditional) mean or average value of the dependent variable
• To test hypotheses about the nature of the dependence
• To predict or forecast the dependent variable
• One or more of the above

The linear regression model is given by:

Y(i) = β0 + β1×X(i) + u(i)

In Stock and Watson, the authors regress student test scores against class size:

TestScore(i) = β0 + β1×ClassSize(i) + u(i)
Page 5
Define and interpret a population regression function, regression
coefficients, parameters, slope, intercept, and the error term.
In a univariate linear regression, there are two important coefficients:
• The slope coefficient measures the average change in the dependent variable
given a unit change in the independent variable. For example, if we regress
average weekly lotto spending against weekly income and the slope is 0.080, then a
one-dollar increase in income is associated with an 8 cent increase in lotto spending.

Page 6
Define and interpret a population regression function, regression
coefficients, parameters, slope, intercept, and the error term.
In a univariate linear regression, there are two important coefficients:
• The intercept is the predicted value of the dependent variable if the independent
variable is equal to zero. Assume an intercept of 7.62. This indicates that if income
were zero, the average lotto spending would be $7.62.
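To make the interpretation concrete, here is a minimal Python sketch (the function name is ours; the 7.62 intercept and 0.080 slope are the illustrative values above):

```python
# Minimal sketch: prediction from the illustrative lotto-spending line above.
def predicted_spending(weekly_income):
    """Predicted average weekly lotto spending for a given weekly income."""
    intercept = 7.62   # predicted spending when income is zero
    slope = 0.080      # each extra dollar of income adds about 8 cents of spending
    return intercept + slope * weekly_income

print(predicted_spending(0))    # 7.62
print(predicted_spending(100))  # 7.62 + 0.080 * 100 = 15.62
```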

Page 7
Define and interpret a sample regression function, regression
coefficients, parameters, slope, intercept, and the error term.

In theory, there is one unobserved population with one set of parameters. In
practical applications, the intercept and slope of the population regression are
unknown. We use sample data to infer (estimate) the unknown slope and intercept
of the population regression line. But there are many samples, and each sample has a:

Sample Regression Function → Estimator (statistic) → Estimate

Stochastic population regression function (PRF): Y(i) = β0 + β1×X(i) + u(i)

Sample regression function (SRF): Ypred(i) = b0 + b1×X(i)
Stochastic sample regression function (SRF): Y(i) = b0 + b1×X(i) + e(i)

Page 8
Define and interpret a sample regression function, regression
coefficients, parameters, slope, intercept, and the error term
(continued)

Each sample produces its own scatterplot. Through this sample scatterplot, we can
plot a sample regression line (SRL). The sample regression function (SRF)
characterizes this line; the SRF is analogous to the PRF, but for each sample.

• b0 = intercept = sample regression coefficient
• b1 = slope = sample regression coefficient
• e(i) = the residual term of the sample

One population, but each sample is different (sampling variation).

The regression coefficients b0 and b1 of the sample that minimize the sum of
squared errors between the observed values and those estimated by the
sample regression equation are called the ordinary least squares estimators of
β0 and β1, respectively. These OLS estimators are sample counterparts of the
population coefficients β0 and β1.

Page 9
Define and interpret a sample regression function, regression
coefficients, parameters, slope, intercept, and the error term
(continued)

Note the correspondence between error term and the residual. As we specify
the model, we ex ante anticipate an error; after we analyze the observations, we
ex post observe residuals.

Unlike the PRF, which is presumed to be unobserved but stable, the SRF varies
with each sample. The chart below contains two samples (each n = 20) drawn
from the same population of 1,000 observations. Although we can expect
similarities, each sample and its coefficients will vary.

Page 10
Describe the key properties of a linear regression.

According to Stock and Watson, the linear regression model requires at least
three assumptions:

• The error term, u(i), has a conditional mean of zero given X(i); that is,
E[u(i)|X(i)] = 0
• [X(i), Y(i)] are independent and identically distributed (i.i.d.) draws from their
joint distribution; and
• Large outliers are unlikely: X(i) and Y(i) have nonzero finite fourth moments

Page 11
Describe the key properties of a linear regression (continued)

For reference, the classical linear regression model (CLRM) makes the
following assumptions:

• Regression model is correctly specified
• Linear in the parameters (although variables may be non-linear)
• Regressor(s) are uncorrelated with the error term
• Expected value of the error term is zero: E(u) = 0

Page 12
Describe the key properties of a linear regression (continued)

For reference, the classical linear regression model (CLRM) makes the
following assumptions:

• The variance of the error is constant; i.e., homoscedastic
• There is no correlation between error terms; i.e., no serial- or auto-correlation
• Error term (u) is normally distributed
• In the multivariate case, there exists no exact linear relationship between
two regressors. This is the requirement of no perfect multicollinearity.

Page 13
Describe the key properties of a linear regression (continued)

For our purposes, the key ideas here include:

• There are many ways to draw a straight line through a scatterplot, but doing so
using ordinary least squares (OLS) regression produces estimates (the values
generated by estimators) that are BLUE: best linear unbiased estimators (which
includes efficient estimators).

• OLS is unreliable if there are large outliers

Page 14
Describe the key properties of a linear regression (continued)

For our purposes, the key ideas here include:

• There are many ways to draw a straight line through a scatterplot, but doing so
using ordinary least squares (OLS) regression produces estimates (the values
generated by estimators) that are BLUE: best linear unbiased estimators (which
includes efficient estimators).

• OLS is unreliable if there are large outliers

• Multicollinearity is when the regressors (independent variables) are correlated.
Perfect multicollinearity is a problem, but some multicollinearity may not be a
problem
• Heteroscedasticity is when the variance of the error is not constant
• Autocorrelation is when the error terms are correlated
• Don't forget that the OLS regression line passes through the average of each
variable: Yavg = b0 + b1×Xavg

Page 15
Define an ordinary least squares (OLS) regression and calculate the
intercept and slope of the regression.

In a linear regression model, let b0 and b1, obtained from a sample, be the
estimators of the population parameters β0 and β1, respectively.

The regression coefficients chosen from the sample data are such that
these estimators of intercept and slope minimize the sum of squared
mistakes between the actual values and those estimated by the sample
regression equation. That is, the sum of the squared prediction errors is
minimized by these estimators:

Σ [Y(i) − Ypred(i)]² = Σ [Y(i) − b0 − b1×X(i)]²

Page 16
Define an ordinary least squares (OLS) regression and calculate the
intercept and slope of the regression (continued)
The regression coefficients b0 and b1 of the sample that minimize the squared errors
are called the ordinary least squares estimators of β0 and β1, respectively. The OLS
estimators of intercept and slope are given as:

b1 = Σ [(X(i) − Xavg)(Y(i) − Yavg)] / Σ [(X(i) − Xavg)²]

b0 = Yavg − b1×Xavg

where Xavg and Yavg are the sample averages of the independent and dependent
variables, respectively. The OLS regression line, aka the sample regression line or
sample regression function, is the straight line constructed using these OLS
estimators:

Ypred(i) = b0 + b1×X(i)

The OLS residual e(i) is the difference between the actual and predicted values of Y:

e(i) = Y(i) − Ypred(i)
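As a rough illustration of these formulas, here is a minimal Python sketch (the function name ols_fit and the variable names are ours, not from the reading):

```python
# Minimal sketch of the OLS estimator formulas above (names are ours).
def ols_fit(x, y):
    """Return (b0, b1), the OLS intercept and slope for y regressed on x."""
    n = len(x)
    x_avg = sum(x) / n
    y_avg = sum(y) / n
    # b1 = Sum[(X(i) - Xavg)(Y(i) - Yavg)] / Sum[(X(i) - Xavg)^2]
    b1 = (sum((xi - x_avg) * (yi - y_avg) for xi, yi in zip(x, y))
          / sum((xi - x_avg) ** 2 for xi in x))
    # b0 = Yavg - b1 * Xavg
    b0 = y_avg - b1 * x_avg
    return b0, b1
```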

Page 17
Define an ordinary least squares (OLS) regression and calculate the
intercept and slope of the regression (continued)
While detailed OLS estimation is beyond our scope, consider the following two
tips. To illustrate, assume this really simple sample of ten data points (which is also
plotted below):
X(i) 1 2 3 4 5 6 7 8 9 10
Y(i) 2 4 6 7 4 5 7 9 6 9

Page 18
Define an ordinary least squares (OLS) regression and calculate the
intercept and slope of the regression (continued)

• If we are given either the slope or intercept, we can use Yavg = b0 + b1×Xavg to
infer the other. In this case, if we are given the intercept assumption of 2.80, then
because Xavg = 5.50 and Yavg = 5.90, we can infer the slope is given by
(5.90 − 2.80)/5.50 = 0.5636

• If we are told (or can calculate) the standard deviations of the variables and the
correlation between them, we know the slope. In this case, σ(X) = 2.872, σ(Y) = 2.120,
and ρ(X,Y) = 0.7640. Because the slope is equal to σ(X,Y)/σ²(X) = ρ(X,Y)×σ(Y)/σ(X),
we can see that the slope here is given by 0.7640 × 2.120 / 2.872 = 0.5636
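A quick Python check of both tips on this ten-point sample (variable names are ours; population standard deviations are used, matching the figures quoted above):

```python
from statistics import mean, pstdev

# Quick check of the two tips above on the ten-point sample (population std devs).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 4, 6, 7, 4, 5, 7, 9, 6, 9]
x_avg, y_avg = mean(x), mean(y)                             # 5.50 and 5.90
b1 = (sum((xi - x_avg) * (yi - y_avg) for xi, yi in zip(x, y))
      / sum((xi - x_avg) ** 2 for xi in x))                 # 0.5636
b0 = y_avg - b1 * x_avg                                     # 2.80, matching tip #1
# Tip #2: slope = rho(X, Y) * sigma(Y) / sigma(X)
sx, sy = pstdev(x), pstdev(y)                               # about 2.872 and 2.120
rho = (sum((xi - x_avg) * (yi - y_avg) for xi, yi in zip(x, y))
       / (len(x) * sx * sy))                                # about 0.764
print(round(b1, 4), round(rho * sy / sx, 4))                # both about 0.5636
```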

Page 19
Describe the method and three key assumptions of OLS for
estimation of parameters.

The process of ordinary least squares estimation seeks to achieve the minimum
value for the residual sum of squares (squared residuals).

Page 20
Describe the method and three key assumptions of OLS for
estimation of parameters (continued)
The three key assumptions of the ordinary least squares (OLS) linear
regression model are:
1. Assumption #1: The conditional distribution of the error term u(i), given X(i),
has a mean of zero. This assumption is a formal mathematical statement about the
“other factors” contained in the error term and asserts that these other factors are
unrelated to the independent variable, X(i), in the following sense: given a value of
X(i), the mean of the distribution of these other factors is zero.

2. Assumption #2: [X(i), Y(i)], i = 1, ..., n, are independently and identically
distributed across observations. This assumption is a statement about how the
sample is drawn. If the observations are drawn by simple random sampling from a
single large population, then [X(i), Y(i)] are i.i.d.

3. Assumption #3: The third least squares assumption is that large outliers are
unlikely. Large outliers can make OLS regression results misleading. Another way
to state this assumption is that X and Y have finite kurtosis. The assumption of
finite kurtosis is used in the mathematics to justify the large-sample approximations
to the distributions of the OLS test statistics.

Page 21
Summarize the benefits of using OLS estimators.

There are both practical and theoretical reasons to use the OLS
estimators.
• Commonly accepted and widely familiar: Because OLS
is the dominant method used in practice, it has become
the common language for regression analysis throughout
economics, finance and the social sciences. Presenting
results using OLS means that you are “speaking the same
language” as other economists and statisticians. The OLS
formulas are built into virtually all spreadsheet and
statistical software packages, making OLS easy to use.

• Desirable theoretical properties (unbiased and consistent): The OLS estimators
also have desirable theoretical properties. They are analogous to the desirable
properties of the sample average as an estimator of the population mean.
Specifically, the OLS estimator is unbiased and consistent. The OLS estimator is
also efficient among a certain class of unbiased estimators; however, this
efficiency result holds under some additional special conditions.

Page 22
Describe the properties of OLS estimators and their sampling
distributions, and explain the properties of consistent estimators in
general.
Given the assumptions of the classical linear regression model, the least-
squares (OLS) estimates possess ideal properties. These properties are
contained in the well-known Gauss–Markov theorem. To understand this
theorem, we need to consider the best linear unbiasedness property of an
estimator. An OLS estimator is said to be a best linear unbiased estimator
(BLUE) if the following hold:

• It is linear, that is, a linear function of a random variable, such as the dependent
variable Y in the regression model.
• It is unbiased, that is, its average or expected value is equal to the true value
• It has minimum variance in the class of all such linear unbiased estimators; an
unbiased estimator with the least variance is known as an efficient estimator.

In the regression context, it can be proved that the OLS estimators are BLUE. This is the gist
of the famous Gauss–Markov theorem, which can be stated as follows:
• Gauss–Markov Theorem: Given the assumptions of the classical linear regression
model, the least-squares estimators, in the class of unbiased linear estimators, have
minimum variance, that is, they are BLUE.

Page 23
Define and interpret the explained sum of squares (ESS), the total
sum of squares (TSS), the sum of squared residuals (SSR), the
standard error of the regression (SER), and the regression R².
We can break the regression equation into three parts:
1. The explained sum of squares (ESS) is the sum of squared deviations, or distances,
between the predicted values, Ypred(i), and the mean of Y, Yavg:

ESS = Σ [Ypred(i) − Yavg]²

2. The sum of squared residuals (SSR) is the summation of each squared deviation
between the observed or actual Y(i) and the predicted Ypred(i). Thus, the SSR is
also the sum of the squared residuals e(i). The ordinary least squares (OLS)
approach minimizes the SSR.

SSR = Σ [Y(i) − Ypred(i)]² = Σ e(i)²

3. The total sum of squares (TSS) is the sum of squared deviations of observed
Y(i) from its mean, Yavg. Therefore, TSS is equal to the sum of ESS and SSR.

TSS = Σ [Y(i) − Yavg]² = ESS + SSR
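A minimal Python sketch of this decomposition (the function and argument names are ours; b0 and b1 are the fitted OLS intercept and slope):

```python
# Minimal sketch of the ESS / SSR / TSS decomposition above (names are ours).
def sums_of_squares(x, y, b0, b1):
    """Return (ESS, SSR, TSS) for the fitted line Ypred(i) = b0 + b1 * X(i)."""
    y_avg = sum(y) / len(y)
    y_pred = [b0 + b1 * xi for xi in x]
    ess = sum((yp - y_avg) ** 2 for yp in y_pred)            # explained sum of squares
    ssr = sum((yi - yp) ** 2 for yi, yp in zip(y, y_pred))    # sum of squared residuals
    tss = sum((yi - y_avg) ** 2 for yi in y)                  # total sum of squares
    return ess, ssr, tss   # for an OLS fit (with intercept), TSS = ESS + SSR
```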

Page 24
Define and interpret the explained sum of squares (ESS), the total sum of
squares (TSS), the sum of squared residuals (SSR), the standard error of
the regression (SER), and the regression R² (continued)

Measures of Fit: R² and the standard error of the regression (SER) are called the
measures of fit, as they measure how well the OLS regression line fits the data.

R²: It measures the fraction of the variance of the dependent variable, Y, that is
explained by the independent variable, X. Mathematically, it can be written as the
ratio of the explained sum of squares (ESS) to the total sum of squares (TSS).
Alternatively, since ESS = TSS − SSR, it can be given as:

R² = ESS/TSS = 1 − SSR/TSS

R² ranges from 0 to 1, and a larger value indicates that the regressor is good at
predicting the dependent variable.

Note: For a linear regression with only one independent variable, R² is the
square of the correlation between the dependent variable Y and the
independent variable X.

Page 25
Define and interpret the explained sum of squares (ESS), the total sum of
squares (TSS), the sum of squared residuals (SSR), the standard error of
the regression (SER), and the regression R² (continued)

SER: It is the standard deviation of the Y values around the regression line.
That is, it is an estimator of the standard deviation of the regression error u. So,
the SSR and the standard error of the regression (SER) are directly related as:

SER = sqrt[ SSR / (n − 2) ]

Therefore, SSR is related to SER as: SSR = SER² × (n − 2)

Note the use of (n − 2) instead of n in the denominator. Division by this smaller
number, (n − 2) instead of n, is a degrees-of-freedom correction that makes the
estimator of the error variance unbiased.
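Continuing the sketch (names are ours), the two measures of fit follow directly from the sums of squares and the sample size:

```python
import math

# Minimal sketch of the measures of fit above (names are ours).
def measures_of_fit(ess, ssr, tss, n):
    """Return (R^2, SER) given the sums of squares and the sample size n."""
    r2 = ess / tss                   # equivalently, 1 - SSR/TSS
    ser = math.sqrt(ssr / (n - 2))   # note the (n - 2) degrees-of-freedom correction
    return r2, ser
```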

Page 26
Define and interpret the explained sum of squares (ESS), the total sum of
squares (TSS), the sum of squared residuals (SSR), the standard error of
the regression (SER), and the regression R² (continued)

The denominator (n − 2) is used because the two-variable regression has (n − 2)
degrees of freedom (d.f.). In order to compute the slope and intercept estimates,
two independent observations are consumed.

• If k = the number of explanatory variables plus the intercept (e.g., 2 if one
explanatory variable; 3 if two explanatory variables), then:

SER = sqrt[ SSR / (n − k) ]

• If k = the number of slope coefficients (excluding the intercept), then similarly:

SER = sqrt[ SSR / (n − k − 1) ]

Page 27
Interpret the results of an ordinary least squares (OLS) regression.

Taking the authors' example of regressing test scores against class size, the
PRF is:

TestScore(i) = β0 + β1×STR(i) + u(i)

Here we do not know the population value of β1, the slope of the unknown
population regression line relating X (class size) and Y (test scores). But it is
possible to estimate the unknown slope and intercept of the population
regression line by using sample data.

Page 28
Interpret the results of an ordinary least squares (OLS) regression
(continued)
The sample data analyzed here consist of test scores (districtwide average) and
class sizes (districtwide student-teacher ratio) in 1999 in 420 California school
districts (n = 420). The table below summarizes the distributions of test scores and
class sizes for this sample. A scatterplot of these observations is shown in the
graph below.

Page 29
Interpret the results of an ordinary least squares (OLS) regression
(continued)
• The sample correlation is −0.23, indicating a weak negative relationship
between the two variables. Despite this low correlation, the sample estimates a
straight regression line which can predict the test scores based on the class size
(or STR).
• The sample regression coefficients found using the equations given for b0 and
b1 are 698.93 and −2.2798, respectively.
• Thus, the sample regression line estimated using these OLS estimators is:

TestScorePred = 698.93 − 2.2798 × STR
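A minimal Python sketch of using the estimated line for prediction (the function name is ours; the coefficients are those reported above):

```python
# Prediction from the estimated California test-score line above.
def predicted_test_score(student_teacher_ratio):
    """Predicted districtwide test score for a given student-teacher ratio (STR)."""
    return 698.93 - 2.2798 * student_teacher_ratio

print(predicted_test_score(20.0))  # 698.93 - 2.2798 * 20 = about 653.3
```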

Page 30
Interpret the results of an ordinary least squares (OLS) regression
(continued)
• From the slope of the regression equation, we can interpret
that an increase in the student-teacher ratio by one student
per class is, on average, associated with a decline in
districtwide test scores by approximately 2.28 points on the
test. The negative slope indicates that more students per
teacher (larger classes) is associated with poorer
performance on the test.

• In this example, strictly speaking, the intercept is the predicted value of test
scores when there are no students in the class! Thus, the intercept in this
particular case has no real-world meaning, and it is best to think of it
mathematically as the coefficient that determines the level of the regression line.

Page 31
Interpret the results of an ordinary least squares (OLS) regression
(continued)
The regression results are:
Excel LINEST(.) regression output, with the legend for each cell shown at right:

  STR        intercept      Legend
  (2.280)    698.933        β1         β0
  0.480      9.467          SE(β1)     SE(β0)
  0.051      18.581         R^2        SER
  22.575     418            F          df
  7,794      144,315        ESS        SSR

TestScore = 698.9 − 2.280 × STR,   R^2 = 0.051, SER = 18.58
            (9.47)   (0.48)

(standard errors of the intercept and slope are shown in parentheses)

Note: Both the slope and intercept are significant at the 95% level, at least. The test
statistics are 73.8 (= 698.93/9.47) for the intercept and 4.75 (= 2.28/0.48) for the
slope. Given these large test statistics, both p-values are approximately zero.
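A quick arithmetic check of those test statistics, using the coefficient and standard-error values from the output above:

```python
# t-statistic = coefficient / standard error (values from the LINEST output above).
t_intercept = 698.933 / 9.467   # about 73.8
t_slope = -2.280 / 0.480        # about -4.75 (large in absolute value)
print(round(t_intercept, 1), round(t_slope, 2))
```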

Page 32
Interpret the results of an ordinary least squares (OLS) regression
(continued)
Manual regression calculations for Test Score
Data (See XLS for raw data) Note
Sum [X(i)*Y(i)] 5,392,704.6
Sum [X(i)^2 = str(i)^2] 163,513.0
Sum [Δx(i) = X(i) - Xavg] 0.0 by definition (average)
Sum [Δy(i) = Y(i) - Yavg] 0.0 by definition (average)
Sum [Δx(i)^2] 1,499.6
Sum [Δy(i)^2] 152,109.6
Sum [Δx(i)Δy(i)] (3,418.8)
Sum [(Y(i)Pred - Yavg)^2] 7,794.1 Explained sum of squares (ESS)
Sum [e(i) = Y(i) - fitted Y(i)] 0.0 by definition (OLS)
Sum [e(i)^2] 144,315.5 Sum of squared residuals (SSR)
Sum [X(i)*e(i)] 0.0 by OLS design per uncorrelated X(i), e(i)

TestScore = 698.9 − 2.280 × STR,   R^2 = 0.051, SER = 18.58
            (9.47)   (0.48)

Measures of fit (S&W section 4.3) Note
Number of observations 420.0
Explained sum of squares (ESS) 7,794.1 = Sum[(Y(i)Pred - Yavg)^2]
Sum of squared residuals (SSR) 144,315.5 = Sum[e(i)^2]
Total sum of squares (TSS) 152,109.6 = Sum[Δy(i)^2] = ESS + SSR
Coefficient of determination (R^2) 5.12% = ESS/TSS = 7,794.1 / 152,109.6
Variance of error 345.3 = SSR/df = SSR/(n-2) = 144,315.5 / (420-2)
Standard error of the regression (SER) 18.6 = sqrt(variance of error) = sqrt(345.3)
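A quick Python check of this measures-of-fit table, using only the ESS and SSR values computed above:

```python
import math

# Quick check of the measures-of-fit table above (ESS and SSR from the table).
ess, ssr, n = 7794.1, 144315.5, 420
tss = ess + ssr                 # 152,109.6
r2 = ess / tss                  # about 0.0512, i.e., 5.12%
var_error = ssr / (n - 2)       # about 345.3
ser = math.sqrt(var_error)      # about 18.6
print(round(tss, 1), round(r2, 4), round(var_error, 1), round(ser, 1))
```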

Page 33
The End

P1.T2. Quantitative Analysis

Stock & Watson, Introduction to Econometrics

Linear Regression with one regressor
