Quantitative Methods in Finance

Lecture 5 : Simple Linear Regression


Multiple Linear Regression

November 2, 2023

Douglas Turatti
[email protected]
Aalborg University Business School
Denmark
Variance of OLS Estimates

▶ The true parameters β0 and β1 are not random variables. They are fixed, unknown constants.

▶ However, the estimators β̂0 and β̂1 are random variables with a distribution called the sampling distribution. Why are they random?

▶ We have stated that, because the estimators are unbiased under some assumptions, the expected value of this distribution is the true parameter.

▶ Now we will characterize the variance of the estimator.

▶ The fact that the OLS estimator has a variance means that there is a range of possible values for β̂, not a single result.

▶ The variance of the estimators is very important, as it is the basis of hypothesis testing.

Variance of OLS Estimates

▶ The variance also tells us how far from β1 the estimator β̂1 can be.

▶ To find the variance of the estimator, we need to add a new assumption. This assumption is not strictly necessary, but it greatly simplifies the analysis.

Assumption
All error terms u_i have the same variance, var[u_i] = σ², and they are all uncorrelated with each other, i.e. cov[u_i, u_j] = 0 for i ≠ j.

$$\operatorname{var}[u_i \mid X] = \sigma^2, \quad \text{for all } i = 1, \dots, N \tag{1}$$
$$\operatorname{var}[u \mid X] = E[uu' \mid X] = \sigma^2 I \quad \text{(in matrix form)} \tag{2}$$

Then the error terms are homoscedastic (i.e. equal variance) and uncorrelated.

▶ This means that every error term u_i, i = 1, …, N, has the same variance.

▶ If the error terms have different variances, then they are heteroscedastic.

Variance of OLS Estimates

Homoscedastic errors: The dispersion of the dependent variable Y around the conditional mean does not depend on the value of X.

Variance of OLS Estimates

Heteroscedastic errors: The dispersion of the dependent variable Y around the conditional mean depends on the value of X.

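The contrast between the two cases can be illustrated with simulated errors. This is only a sketch under assumed, hypothetical settings (the seed, the range of x and the scale factor 0.3 are not from the lecture): it draws errors whose spread is constant and errors whose spread grows with x, then compares the error variance for small and large x.

```python
import numpy as np

rng = np.random.default_rng(5)            # seeded so the sketch is reproducible
N = 10_000
x = rng.uniform(1.0, 10.0, size=N)        # hypothetical regressor values

u_homo = rng.normal(scale=1.0, size=N)    # var[u | x] = 1 for every x
u_hetero = rng.normal(size=N) * 0.3 * x   # var[u | x] = (0.3 x)^2 grows with x

low, high = x < 4.0, x > 7.0
# Ratio of error variances between large-x and small-x observations
ratio_homo = u_homo[high].var() / u_homo[low].var()        # close to 1
ratio_hetero = u_hetero[high].var() / u_hetero[low].var()  # well above 1
print(ratio_homo, ratio_hetero)
```

A ratio near 1 is what homoscedasticity predicts; a ratio far from 1 signals that the dispersion depends on x.
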
Variance of OLS Estimates

▶ Serial correlation in the error terms normally occurs when using time series data.

▶ For example, if you omit a regressor, the error terms (residuals) may appear correlated, since the pattern of the omitted regressor is left out of the conditional mean.

▶ Another possibility arises with geographic data: observations located close to each other are correlated, and so are their error terms.

▶ The error terms may be correlated even when we have specified the correct model. More on that later.

Variance of OLS Estimates

Now I will obtain the variance of the OLS estimates. Again, I will use matrix form, because it makes the results easier and faster to show. The results are also more intuitive.

For the OLS case, the variance is defined as

$$E\big[(\hat\beta - \beta)(\hat\beta - \beta)' \mid X\big] \tag{3}$$

This is the variance conditional on X. It is a 2 × 2 matrix in the simple linear regression case.

We know that β̂ − β equals

$$(X'X)^{-1} X'u \tag{4}$$

Using E[uu′ | X] = σ²I (Assumption V), it is possible to show that

$$\operatorname{var}[\hat\beta] = \sigma^2 (X'X)^{-1} \tag{5}$$

This is the variance matrix of the OLS estimators. It depends on two terms:
1. Variance of the error terms, σ²: the more spread out the error terms are, the higher the uncertainty in the OLS estimates.
2. Variability in X: the term (X′X)⁻¹ is the inverse of the variability in X. The more spread out the values of X, the more precise the estimate becomes.

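The step from (4) to (5) can be written out explicitly. The lines below are the standard argument, sketched under the slide's assumptions: X is treated as fixed because we condition on it, and E[uu′ | X] = σ²I.

```latex
\begin{aligned}
\operatorname{var}[\hat\beta \mid X]
  &= E\big[(\hat\beta-\beta)(\hat\beta-\beta)' \mid X\big] \\
  &= E\big[(X'X)^{-1}X'\,uu'\,X(X'X)^{-1} \mid X\big] \\
  &= (X'X)^{-1}X'\,E[uu' \mid X]\,X(X'X)^{-1} \\
  &= \sigma^2 (X'X)^{-1}X'X(X'X)^{-1}
   = \sigma^2 (X'X)^{-1}.
\end{aligned}
```

The second line uses the fact that (X′X)⁻¹ is symmetric, so the transpose of (X′X)⁻¹X′u is u′X(X′X)⁻¹.
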
Variance of OLS Estimates

▶ With the formula for the variance of the estimates, it is possible to obtain confidence intervals and carry out hypothesis tests.

▶ However, these depend on the true variance of the error terms, which is unknown. Hence, we need to estimate it and then obtain an estimator for the variance of the estimator.

▶ Applying the sample variance formula to the residuals yields

$$\widehat{\operatorname{var}}[u] = \hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{N} \hat u_i^2 = \frac{SSR}{n-2} \tag{6}$$

where n = N is the sample size.

▶ Why divide by n − 2? Two parameters (β0 and β1) are estimated, which uses up two degrees of freedom; dividing by n − 2 yields an unbiased estimator for σ².

▶ We then get the estimator for the variance of the OLS estimates as

$$\widehat{\operatorname{var}}[\hat\beta] = \hat\sigma^2 (X'X)^{-1} \tag{7}$$

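Equations (6) and (7) translate almost line by line into NumPy. The sketch below uses a small set of hypothetical toy numbers (not data from the lecture) and is meant only to show the mechanics for the simple regression case.

```python
import numpy as np

# Hypothetical toy data (not from the lecture): n = 6 observations.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 5.9])

n = x.size
X = np.column_stack([np.ones(n), x])     # n x 2 design matrix: intercept and x

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y             # OLS estimates (intercept, slope)

resid = y - X @ beta_hat                 # residuals u-hat
ssr = float(resid @ resid)               # sum of squared residuals
sigma2_hat = ssr / (n - 2)               # equation (6)

var_beta_hat = sigma2_hat * XtX_inv      # equation (7): a 2 x 2 matrix
se = np.sqrt(np.diag(var_beta_hat))      # standard errors of intercept and slope
print(beta_hat, sigma2_hat, se)
```

The diagonal of `var_beta_hat` holds the estimated variances of β̂0 and β̂1; the off-diagonal entry is their estimated covariance.
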
Additional Topics
Regression through the Origin


▶ Sometimes we wish to impose the restriction that, when x = 0, the expected value of y is zero.

▶ A regression through the origin is

$$y = \tilde\beta_1 x + u \tag{8}$$

▶ It is called regression through the origin because the line passes through the point x = 0, y = 0. The estimate of β̃1 is still obtained by OLS.

▶ Obtaining an estimate of β̃1 using regression through the origin is not done very often in applied work, and for good reason: if the intercept β0 ≠ 0, then the resulting slope estimator is a biased estimator of β1.

▶ The conservative approach is to include β0 even if you think it is 0.

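A small numerical illustration of the point about the intercept. The data are hypothetical and noise-free (y = 2 + 3x), chosen only so that the numbers are easy to follow; a single sample illustrates the mechanism but is not a proof of bias.

```python
import numpy as np

# Hypothetical noise-free data with a nonzero intercept: y = 2 + 3x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 3.0 * x

# Regression through the origin: minimising sum (y - b*x)^2 gives b = sum(x*y) / sum(x^2)
slope_origin = float(x @ y) / float(x @ x)

# OLS with an intercept, for comparison
X = np.column_stack([np.ones(x.size), x])
intercept, slope = np.linalg.lstsq(X, y, rcond=None)[0]

print(slope_origin)      # larger than 3: the omitted intercept is absorbed by the slope
print(intercept, slope)  # approximately 2 and 3
```
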
Additional Topics
The Effects of Changing Units of Measurement


▶ The OLS estimates change in entirely expected ways when the units of measurement of the dependent and independent variables change.

▶ If the dependent variable is multiplied by the constant c, then the OLS intercept and slope estimates are also multiplied by c.

▶ If the independent variable is divided or multiplied by some nonzero constant c, then the OLS slope coefficient is multiplied or divided by c, respectively.

$$y = \beta_0 + \beta_1 x + u \tag{9}$$
$$y = \beta_0 + \beta_2 (c x) + u \tag{10}$$
$$\hat\beta_2 = \hat\beta_1 / c \tag{11}$$

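The scaling rules can be checked numerically. The helper `ols` and the toy numbers below are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical toy data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
c = 100.0

b0, b1 = ols(x, y)           # original units
b0_x, b1_x = ols(c * x, y)   # regressor multiplied by c: slope divided by c
b0_y, b1_y = ols(x, c * y)   # dependent variable multiplied by c: both estimates times c
print(b1, b1_x, b1_y)
```

Rescaling x leaves the intercept unchanged, while rescaling y rescales both the intercept and the slope.
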
Additional Topics
Log-Log model


▶ We can define a linear regression model in logs as

$$\ln y = \beta_0 + \beta_1 \ln x + u \tag{12}$$

▶ Is this a linear model in econometrics?

▶ How to estimate this model?

▶ β1 is interpreted as the elasticity, i.e. the expected percentage change in y when x changes by 1%.

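Because the model is linear in the parameters once both variables are logged, it can be estimated by ordinary OLS on the transformed data. The sketch below assumes hypothetical noise-free data with constant elasticity 1.5, so that the recovered slope is easy to verify.

```python
import numpy as np

# Hypothetical noise-free data with constant elasticity 1.5: y = 2 * x**1.5
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = 2.0 * x ** 1.5

# Regress ln(y) on a constant and ln(x) with plain OLS.
X = np.column_stack([np.ones(x.size), np.log(x)])
beta0_hat, beta1_hat = np.linalg.lstsq(X, np.log(y), rcond=None)[0]
print(beta0_hat, beta1_hat)  # approximately ln(2) and 1.5
```
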
Additional Topics
Linear-Log model


▶ We can define a linear-log regression model as

$$y = \beta_0 + \beta_1 \ln x + u \tag{13}$$

▶ Is this a linear model in econometrics?

▶ How to estimate this model?

▶ β1/100 is approximately the expected change in y (in units of y) when x changes by 1%.

Additional Topics
Log-Linear model


▶ We can define a log-linear regression model as

$$\ln y = \beta_0 + \beta_1 x + u \tag{14}$$

▶ Is this a linear model in econometrics?

▶ How to estimate this model?

▶ 100·β1 is approximately the expected percentage change in y when x changes by 1 unit.

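The interpretations of β1 in the three functional forms can be placed side by side. These are the usual approximations, valid for small changes in x; the notation %Δ for a percentage change is introduced here only for this summary.

```latex
\begin{aligned}
\text{log-log:}\quad    & \ln y = \beta_0 + \beta_1 \ln x + u, & \%\Delta y &\approx \beta_1\, \%\Delta x \\
\text{linear-log:}\quad & y = \beta_0 + \beta_1 \ln x + u,     & \Delta y   &\approx (\beta_1/100)\, \%\Delta x \\
\text{log-linear:}\quad & \ln y = \beta_0 + \beta_1 x + u,     & \%\Delta y &\approx 100\,\beta_1\, \Delta x
\end{aligned}
```
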
Multiple Regression Analysis
Introduction


▶ The simple linear regression is useful and provides a framework.

▶ However, the fact that there is only one regressor is too restrictive for applied work.

▶ If there are more factors that explain y, then the error terms are likely to incorporate them. Hence, the error terms will not be pure random shocks.

▶ A regression model with one regressand and several regressors is called a multiple regression model. Multiple (several regressors) is not the same as multivariate (several dependent variables).

Multiple Regression Analysis
Introduction


▶ The multiple regression model is probably the most widely used vehicle for empirical analysis in the social sciences.

▶ The previous results apply, and multiple regression is just an extension of the simple linear case.

▶ Let’s consider one case of much interest in finance.

Multiple Regression Analysis
Motivation in Finance: Fama-French Model


▶ The Fama-French style of regression is still one of the most used techniques in asset management, corporate finance and the like.

▶ The Fama-French model extends the CAPM by assuming that the expected return of an asset is driven by two extra factors:

▶ SMB: size premium. Small (market capitalization) Minus Big.

▶ HML: value premium. High (book-to-market ratio) Minus Low.

$$E[R_{i,t}] - R_f = \alpha + \beta_1 [R_{m,t} - R_f] + \beta_2\, SMB_t + \beta_3\, HML_t \tag{15}$$

▶ This is a linear regression model with 3 regressors.

Multiple Regression Analysis
Regression Model with 2 regressors


▶ A regression model with 2 regressors is

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u \tag{16}$$

▶ β0 is the intercept.

▶ β1 = ∂E[y | x1, x2]/∂x1 measures the expected change in y with respect to x1, holding other factors fixed.

▶ β2 = ∂E[y | x1, x2]/∂x2 measures the expected change in y with respect to x2, holding other factors fixed.

▶ u is the error term (disturbance).

Multiple Regression Analysis
Regression Model with 2 regressors


▶ Mechanically, there is no difference between using OLS to estimate this model with 2 regressors and estimating the simple linear regression.

▶ As before, we need to make assumptions on the error terms so that the parameter set (β0, β1, β2) is unique.

▶ This is called identification.

▶ The assumptions are basically the same as in the simple regression. However, now we have 3 parameters, so we need 3 assumptions:

$$E[u] = 0 \tag{17}$$
$$E[u \mid x_1] = 0 \;\Rightarrow\; \operatorname{cov}(u, x_1) = 0 \tag{18}$$
$$E[u \mid x_2] = 0 \;\Rightarrow\; \operatorname{cov}(u, x_2) = 0 \tag{19}$$

▶ Hence, the error terms must have unconditional mean equal to 0 and be uncorrelated with every regressor. This assumption also means that the error terms are uncorrelated with any linear combination of the regressors.

Multiple Regression Analysis
Regression Model with K regressors


▶ The general multiple linear regression model (also called the multiple regression model) can be written as the data generating process (DGP)

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u \tag{20}$$

▶ βj measures the expected change in y given a unit change in xj when all other variables are kept fixed.

▶ β1, …, βk are referred to as slope parameters; β0 is the intercept.

▶ The error term satisfies the same assumptions:

$$E[u] = 0 \tag{21}$$
$$E[u \mid x_j] = 0, \quad j = 1, \dots, k \tag{22}$$

Multiple Regression Analysis
OLS Estimators


▶ The OLS objective function is

$$\min_{\beta_0, \beta_1, \dots, \beta_k} \sum_{i=1}^{N} (y_i - \beta_0 - \beta_1 x_{i,1} - \dots - \beta_k x_{i,k})^2 \tag{23}$$

▶ To find the minimum, we set all partial derivatives equal to 0.

▶ This is a system of k + 1 equations. It is much simpler in matrix form.

Multiple Regression Analysis
A Little on Matrix Form


▶ Let y be an N × 1 vector,

$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix} \tag{24}$$

Then X can be written as an N × (k + 1) matrix,

$$X = \begin{pmatrix}
1 & x_{1,1} & x_{2,1} & \dots & x_{k,1} \\
1 & x_{1,2} & x_{2,2} & \dots & x_{k,2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{1,N} & x_{2,N} & \dots & x_{k,N}
\end{pmatrix} \tag{25}$$

where, in this matrix, the first index is the regressor and the second is the observation.

▶ Let β be the (k + 1) × 1 vector of parameters, i.e.

$$\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix} \tag{26}$$

Multiple Regression Analysis
A Little on Matrix Form


▶ Let u be an N × 1 vector,

$$u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{pmatrix} \tag{27}$$

▶ Then the regression model is represented as

$$\underset{(N \times 1)}{y} = \underset{(N \times (k+1))}{X}\ \underset{((k+1) \times 1)}{\beta} + \underset{(N \times 1)}{u} \tag{28}$$

▶ The OLS estimator is

$$\underset{((k+1) \times 1)}{\hat\beta} = \Big(\underset{((k+1) \times N)}{X'}\ \underset{(N \times (k+1))}{X}\Big)^{-1}\ \underset{((k+1) \times N)}{X'}\ \underset{(N \times 1)}{y} \tag{29}$$

▶ The OLS estimator cannot be simplified further in the multiple regression case.

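Equations (28) and (29) map directly onto a few lines of NumPy. The sketch below simulates a hypothetical DGP (the sample size, the parameter values and the seed are assumptions made for illustration) and then applies the matrix formula. It solves the normal equations (X′X)b = X′y instead of forming the inverse explicitly, which is numerically preferable but gives the same estimator.

```python
import numpy as np

rng = np.random.default_rng(0)                  # seeded for reproducibility
N, k = 200, 3                                   # hypothetical sample size and number of regressors

regressors = rng.normal(size=(N, k))
X = np.column_stack([np.ones(N), regressors])   # N x (k+1); first column is the intercept
beta = np.array([1.0, 0.5, -2.0, 3.0])          # hypothetical true parameters, (k+1) x 1
u = rng.normal(scale=0.1, size=N)               # N x 1 error vector
y = X @ beta + u                                # equation (28)

# Equation (29), written as the solution of the normal equations (X'X) b = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)
```
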
Multiple Regression Analysis
OLS Estimators


▶ The fitted values are

$$\hat y = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2 + \dots + \hat\beta_k x_k \tag{30}$$

▶ The residual for observation i is defined as û_i = y_i − ŷ_i.

▶ Some mathematical properties of the residuals:
1. The sample average of the residuals is zero, so the sample average of the fitted values equals ȳ.
2. The sample covariance between each independent variable and the OLS residuals is zero.
3. The point (ȳ, x̄1, …, x̄k) is always on the regression line, since ȳ = β̂0 + β̂1 x̄1 + · · · + β̂k x̄k.

▶ The expected effect on y when xj changes, holding the other regressors fixed, is

$$\Delta\hat y = \hat\beta_j \Delta x_j \tag{31}$$

▶ The expected effect on y when all x change is

$$\Delta\hat y = \hat\beta_1 \Delta x_1 + \dots + \hat\beta_k \Delta x_k \tag{32}$$

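The three algebraic properties of the residuals hold in every sample in which the regression includes an intercept, so they can be verified on any dataset. The sketch below uses a hypothetical simulated DGP with two regressors; the parameter values and the seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50
x1 = rng.normal(size=N)
x2 = rng.normal(size=N)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=N)   # hypothetical DGP

X = np.column_stack([np.ones(N), x1, x2])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
y_fit = X @ beta_hat
resid = y - y_fit

mean_resid = resid.mean()                        # property 1: zero up to rounding
cov_x1 = np.mean((x1 - x1.mean()) * resid)       # property 2: zero sample covariance
cov_x2 = np.mean((x2 - x2.mean()) * resid)
# Property 3: the fitted value at the regressor means equals the mean of y
y_at_means = beta_hat @ np.array([1.0, x1.mean(), x2.mean()])
print(mean_resid, cov_x1, cov_x2, y_at_means - y.mean())
```
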
Multiple Regression Analysis
Comparison of Simple and Multiple Regression Estimates


▶ Suppose an estimated simple linear model,

$$\tilde y = \tilde\beta_0 + \tilde\beta_1 x_1 \tag{33}$$

and an estimated multiple linear model,

$$\hat y = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2 \tag{34}$$

▶ Is β̃1 equal to β̂1?

▶ No! In general they are different. They are equal when the sample correlation between x1 and x2 is 0 (or when β̂2 = 0).

▶ What about the intercepts? They are equal when β̂2 = 0, i.e. when there is no estimated relationship between x2 and y.

▶ Thus, as a rule, you should include x2 if there is an indication that β2 may be different from 0.

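The link between the two slope estimates is an exact algebraic identity: the simple-regression slope equals β̂1 + β̂2·δ̃1, where δ̃1 is the slope from regressing x2 on x1 (the symbol δ̃1 is introduced here for illustration and does not appear on the slide). The sketch below checks this on hypothetical simulated data.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for a design matrix X that already contains a column of ones."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(2)
N = 100
x1 = rng.normal(size=N)
x2 = 0.6 * x1 + rng.normal(size=N)                 # x2 correlated with x1 (hypothetical)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=N)

ones = np.ones(N)
b0_tilde, b1_tilde = ols(np.column_stack([ones, x1]), y)          # simple regression
b0_hat, b1_hat, b2_hat = ols(np.column_stack([ones, x1, x2]), y)  # multiple regression
_, delta1 = ols(np.column_stack([ones, x1]), x2)                  # slope of x2 on x1

print(b1_tilde, b1_hat + b2_hat * delta1)   # the two numbers coincide
```

When x1 and x2 are uncorrelated in the sample, δ̃1 = 0 and the two slopes are equal, which is the case described above.
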
Multiple Regression Analysis
Comparison of Simple and Multiple Regression Estimates


▶ Goodness-of-fit: R² is defined in the same way as in the simple model.

▶ Regression through the origin: in multiple regression models we may also set the intercept equal to 0.

▶ It is still advised to avoid this type of regression. You should include the intercept even if you think it is zero.

Properties of OLS in Multiple Regression

▶ As before, we need to find the statistical properties of the OLS estimator.

▶ Remember that statistical properties have little to do with a particular sample; they describe the behaviour of estimators when random sampling is done repeatedly.

▶ Here we will only state the conditions for unbiasedness. They follow our previous discussion.

Properties of OLS in Multiple Regression
Unbiasedness of OLS


Assumption
Linear in parameters: In the DGP, the dependent variable y is related to the independent variables x1, …, xk and the error u via a specification that is linear in the parameters,

$$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u \tag{35}$$

Assumption
Correct specification: In the DGP, the dependent variable y is related to the independent variables x1, …, xk and the error u as

$$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u \tag{36}$$

and there is no variable with a systematic effect on y other than x1, …, xk.

▶ This assumption implies that the error terms are unsystematic, i.e. they have no obvious pattern. They must be purely random.

Properties of OLS in Multiple Regression
Unbiasedness of OLS


Assumption
No perfect collinearity: In the sample and in the DGP, none of the independent variables is constant, and there are no exact linear relationships among the independent variables.

▶ This assumption is needed only for the mathematical calculation (it guarantees that X′X can be inverted). In economics/finance the regressors xi and xj are usually correlated, but their correlation is never exactly 1.

Assumption
Zero mean: The error terms must have unconditional mean equal to 0. Additionally, the conditional mean given each column of the matrix X must be 0,

$$E[u] = E[u \mid x_j] = 0 \tag{37}$$

for all j = 1, …, k.

▶ This is the previous assumption used to identify all the β’s.

Properties of OLS in Multiple Regression
Unbiasedness of OLS


Theorem
Under Assumptions (1)-(4) it is possible to show that the OLS estimator in the multiple regression case is unbiased,

$$E[\hat\beta_j] = \beta_j, \quad \text{for all } j = 0, 1, \dots, k, \tag{38}$$

for any values of the population parameter βj. In other words, the OLS estimators are unbiased estimators of the DGP parameters.

Properties of OLS in Multiple Regression
Including Irrelevant Variables


▶ Overspecified model: inclusion of irrelevant variables in the regression model.

▶ Suppose we specify the model as

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u \tag{39}$$

▶ Suppose the true β3 = 0. What is the impact on the OLS estimators?

▶ OLS is still unbiased!

▶ Note that E[β̂3] = 0. Even though β̂3 itself will never be exactly zero, its average value across all random samples will be zero.

▶ However, including irrelevant variables may affect the variances of the OLS estimators. Estimates get less precise when irrelevant variables are included.

Properties of OLS in Multiple Regression
Omitted Variable Bias


▶ Underspecified model: excluding a relevant variable.

▶ This is an example of misspecification.

▶ Suppose the DGP is

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u \tag{40}$$

but you estimate

$$y = \beta_0 + \beta_1 x_1 + u \tag{41}$$

Then the OLS estimator is (ignoring the intercept to keep the notation short)

$$\hat\beta_1 = (X_1'X_1)^{-1} X_1' (\beta_1 X_1 + \beta_2 X_2 + u) \tag{42}$$
$$\hat\beta_1 = \beta_1 + (X_1'X_1)^{-1} X_1' \beta_2 X_2 + (X_1'X_1)^{-1} X_1' u \tag{43}$$

The term β2 x2 induces a bias in the estimator β̂1.

▶ This is a serious problem. Omitted variables may lead to poor results.

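Taking the conditional expectation of (43) makes the size of the bias explicit. This is a sketch that relies on the zero conditional mean assumption, E[u | X] = 0, where X collects both X1 and X2:

```latex
\begin{aligned}
E[\hat\beta_1 \mid X]
  &= \beta_1 + \beta_2 (X_1'X_1)^{-1} X_1'X_2 + (X_1'X_1)^{-1} X_1'\,E[u \mid X] \\
  &= \beta_1 + \beta_2 (X_1'X_1)^{-1} X_1'X_2 .
\end{aligned}
```

The bias term β2 (X1′X1)⁻¹X1′X2 vanishes only if β2 = 0 (the omitted variable is irrelevant) or X1′X2 = 0 (the omitted variable is orthogonal to the included one).
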
Variance of OLS Estimators in Multiple
Regression

▶ In addition to knowing the central tendency of the β̂j, we also need a measure of the spread of their sampling distribution.

▶ The variances of the OLS estimates are fundamental for obtaining confidence intervals and for hypothesis testing.

▶ As before, to simplify for now, we assume homoscedastic and uncorrelated errors.

Assumption
The error u has the same conditional variance given any value of the explanatory variables.

$$\operatorname{var}[u_i \mid x_1, \dots, x_k] = \sigma^2 \tag{44}$$
$$\operatorname{cov}[u_i, u_j] = 0 \quad \text{for all } i \neq j \tag{45}$$
$$\text{or in matrix form:} \tag{46}$$
$$\operatorname{var}[u \mid x_1, \dots, x_k] = \sigma^2 I \tag{47}$$

Variance of OLS Estimators in Multiple
Regression
Homoscedasticity vs Heteroscedasticity

[Figure: Variance of errors. Panel (a): homoscedastic errors. Panel (b): heteroscedastic errors.]

Variance of OLS Estimators in Multiple
Regression

▶ This assumption implies that all error terms have the same variance.

▶ It is violated if y becomes more spread out as we move along the values of x.

▶ This assumption also means that the error terms do not depend on each other. Hence, they are completely random.

▶ This last implication is usually violated when we have omitted variables.

Variance of OLS Estimators in Multiple
Regression

▶ As before, the variance of the estimators is

$$\operatorname{var}[\hat\beta] = \sigma^2 (X'X)^{-1} \tag{48}$$

This is a (k + 1) × (k + 1) matrix.

▶ We also need an estimate of σ²:

$$\widehat{\operatorname{var}}[u] = \hat\sigma^2 = \frac{1}{n-k-1} \sum_{i=1}^{N} \hat u_i^2 = \frac{SSR}{n-k-1} \tag{49}$$

Why n − k − 1?

▶ Thus, the estimator for the variance of the estimator is

$$\widehat{\operatorname{var}}[\hat\beta] = \hat\sigma^2 (X'X)^{-1} \tag{50}$$

▶ The estimated standard error of the estimator β̂i is

$$\widehat{\operatorname{se}}(\hat\beta_i) = \sqrt{\widehat{\operatorname{var}}[\hat\beta]_{i,i}} \tag{51}$$

i.e. the square root of the ith element on the diagonal of the matrix.

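Equations (49) to (51) can be collected into one small helper. The function name `ols_with_se` and the simulated data are hypothetical, introduced only for this sketch; the degrees of freedom n − k − 1 appear as n minus the number of columns of X, because X already includes the column of ones.

```python
import numpy as np

def ols_with_se(X, y):
    """OLS estimates, estimated variance matrix and standard errors.

    X must already contain a column of ones, so it has k + 1 columns.
    """
    n, p = X.shape                               # p = k + 1
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = (resid @ resid) / (n - p)       # SSR / (n - k - 1), equation (49)
    var_hat = sigma2_hat * XtX_inv               # equation (50)
    se = np.sqrt(np.diag(var_hat))               # equation (51)
    return beta_hat, var_hat, se

rng = np.random.default_rng(3)
n, k = 120, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=n)   # hypothetical DGP

beta_hat, var_hat, se = ols_with_se(X, y)
print(beta_hat, se)
```
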
Variance of OLS Estimators in Multiple
Regression
Multicollinearity

▶ Multicollinearity refers to a situation in which two or more explanatory variables in a multiple regression model are highly correlated, for example,

$$\operatorname{corr}[X_1, X_2] = 0.7 \tag{52}$$

▶ It is not necessarily a problem, but a characteristic of the dataset.

▶ However, if the correlation is too high, for example higher than 0.8, we may need to take a look at the data. If it is higher than 0.9, there must be a problem in the data construction.

▶ What happens when the correlation between regressors is too high?

▶ It becomes difficult to separate the effects of x1 and x2, so the uncertainty about the parameters, and hence their estimated variance, increases.

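How quickly the variance grows with the correlation can be quantified in a simplified setting. Assume two standardized regressors with correlation ρ (an assumption made for this sketch, not a statement from the slide): the relevant diagonal entry of the inverse correlation matrix is then 1/(1 − ρ²), and the variance of each slope estimator is proportional to this factor, commonly called the variance inflation factor.

```python
import numpy as np

def variance_inflation(rho):
    """Diagonal entry of the inverse 2 x 2 correlation matrix, equal to 1 / (1 - rho**2)."""
    R = np.array([[1.0, rho], [rho, 1.0]])
    return float(np.linalg.inv(R)[0, 0])

# Correlations mentioned in the text, plus an extreme case
for rho in (0.0, 0.7, 0.8, 0.9, 0.99):
    print(rho, variance_inflation(rho))
```

The factor rises slowly at moderate correlations and very steeply as ρ approaches 1, where X′X becomes close to singular.
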
Inference in Regression Models
Introduction


▶ In applied work, we are usually interested in testing hypotheses about the parameters of the DGP.

▶ OLS provides the point estimator. However, the estimator is a random variable with a variance, so there is a range of possible values.

▶ For example, suppose the estimated CAPM is

$$R_{i,t} = 0.02 + 0.98 R_{m,t} + \varepsilon_t \tag{53}$$

▶ Is α = 0? Is β = 1? Without more information you cannot answer these questions.

▶ You need to find the sampling distribution of the estimators α̂ and β̂.

Inference in Regression Models
Introduction


▶ The unbiasedness of OLS does not tell us the shape of the distribution of the estimators.

▶ In order to carry out hypothesis testing, we need more than the mean and variance. We need a distributional assumption.

▶ Conditional on the values of the regressors x1, …, xk, the only random variables are the error terms. Thus we need to look at their distribution.

Assumption
The error term u is independent of the explanatory variables x1, x2, …, xk and is normally distributed with zero mean and variance σ², i.e. u_i ∼ N(0, σ²) for all i = 1, …, N. In matrix form this is represented as

$$u \sim N(0, \sigma^2 I) \tag{54}$$

The error terms are homoscedastic, uncorrelated and follow a multivariate normal distribution.

▶ This is the normality assumption on the error term. It is actually not a requirement for hypothesis testing, but it makes things simpler.

Inference in Regression Models
CLM


▶ A regression model that satisfies the previous assumptions plus normality is called the Classical Linear Model (CLM).

▶ This is a very strong assumption. We will be able to drop it in a further step.

▶ Without normality, it is only when the sample size is very large that β̂ can be shown to have an approximately normal distribution (under some conditions). For now, we assume normality holds for every sample size.

▶ Under normality of the error terms,

$$y \mid x \sim N(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k,\ \sigma^2) \tag{55}$$

Thus, conditional on x, y has a normal distribution with a mean that is linear in x1, …, xk and a constant variance.

▶ In any application, whether normality of u can be assumed is really an empirical matter. This is complicated in financial models.

Inference in Regression Models
Sampling Distribution

▶ Under the assumptions of the CLM it is possible to show that,
$\hat{\beta}_i \mid x \sim N\big(\beta_i, \operatorname{var}[\hat{\beta}]_{i,i}\big)$ (56)
or the OLS estimator $\hat{\beta}_i$ is normally distributed with mean $\beta_i$ and variance $\operatorname{var}[\hat{\beta}]_{i,i}$, the $i$th diagonal element of the variance matrix of $\hat{\beta}$.

▶ The intuition is not complicated. Recall the formula for the OLS estimator
$\hat{\beta} = \beta + (X'X)^{-1} X'u$ (57)
Hence, the distribution of $\hat{\beta}$ conditional on $x$ depends on the distribution of $u$. Since $\hat{\beta}$ is a linear function of the normally distributed vector $u$, it is itself normally distributed.

▶ Additionally, we can standardize the estimator,
$\dfrac{\hat{\beta}_i - \beta_i}{\operatorname{sd}[\hat{\beta}_i]} \sim N(0, 1)$ (58)

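The sampling distribution in (56) to (58) can be illustrated with a small Monte Carlo experiment. This is only a sketch under assumed values: the sample size `n`, the true coefficients `beta`, the error standard deviation `sigma` and the simulated regressors are hypothetical choices made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)

n, sigma = 50, 2.0
beta = np.array([1.0, 0.5, -0.3])      # hypothetical true (beta0, beta1, beta2)

# Design matrix: a column of ones plus two regressors. It is held fixed
# across replications because the result is conditional on x.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
XtX_inv = np.linalg.inv(X.T @ X)
theoretical_var = sigma**2 * XtX_inv   # var[beta_hat | X] under the CLM

reps = 5000
estimates = np.empty((reps, beta.size))
for r in range(reps):
    u = rng.normal(0.0, sigma, size=n)   # u ~ N(0, sigma^2 I), as in (54)
    y = X @ beta + u
    estimates[r] = XtX_inv @ X.T @ y     # OLS estimate (X'X)^{-1} X'y

# Standardize the estimates of beta1 with the true standard deviation, as in (58).
z = (estimates[:, 1] - beta[1]) / np.sqrt(theoretical_var[1, 1])

print("mean of estimates:    ", estimates.mean(axis=0))
print("empirical variances:  ", estimates.var(axis=0))
print("theoretical variances:", np.diag(theoretical_var))
print("mean and sd of z:     ", z.mean(), z.std())
```

Across replications the average estimate should be close to `beta`, the empirical variances close to the diagonal of `theoretical_var`, and `z` close to a standard normal.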
Hypothesis Testing: The t-test
▶ When using an estimate for the standard deviation (the standard error), it is possible to show that,
$\dfrac{\hat{\beta}_i - \beta_i}{\operatorname{se}[\hat{\beta}_i]} \sim t_{n-k-1}$ (59)
The standardized estimator follows a student-t distribution with $n - k - 1$ degrees of freedom.

▶ This result allows us to do hypothesis testing.

▶ Hypothesis testing uses the estimated parameters to make inferences on the DGP.

▶ A test made on the assumptions of the CLM (including normality) is called an exact test. For now, we only consider exact tests.

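The ingredients of the t-statistic in (59) can be computed directly from the data. The sketch below assumes the usual estimate of the error variance, the sum of squared residuals divided by $n - k - 1$, and uses a tiny hypothetical data set; the function name `ols_t_statistics` is made up for this illustration.

```python
import numpy as np

def ols_t_statistics(X, y, beta_null):
    """Return OLS estimates, standard errors, t-statistics and degrees of freedom.

    X is assumed to already contain a column of ones for the intercept,
    so it has k + 1 columns and the degrees of freedom are n - k - 1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape                               # p = k + 1
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    residuals = y - X @ beta_hat
    df = n - p
    sigma2_hat = residuals @ residuals / df      # estimate of sigma^2
    se = np.sqrt(sigma2_hat * np.diag(XtX_inv))  # standard errors
    t_stats = (beta_hat - np.asarray(beta_null, dtype=float)) / se
    return beta_hat, se, t_stats, df

# Tiny hypothetical data set: an intercept and one regressor.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 2.0, 4.0])
X = np.column_stack([np.ones_like(x), x])

beta_hat, se, t_stats, df = ols_t_statistics(X, y, beta_null=[0.0, 0.0])
print("estimates:", beta_hat)
print("standard errors:", se)
print("t-statistics:", t_stats, "with", df, "degrees of freedom")
```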
Hypothesis Testing: The t-test
CAPM example

▶ Let's take the CAPM as an example,
$R_{i,t} = \alpha + \beta R_{m,t} + \varepsilon_t$ (60)
How can we test that $\alpha = 0$?

▶ We state a null hypothesis that $\alpha = 0$ in the DGP,
$H_0: \alpha = 0$ (61)

▶ Using the estimated parameters, we compute the t-statistic (59) assuming the null hypothesis is correct,
$t = \dfrac{\hat{\alpha} - 0}{\operatorname{se}[\hat{\alpha}]} \sim t_{n-k-1}$ (62)
Note that the estimator is assumed unbiased.

▶ Now, we ask how likely this value of the t-statistic is to come from a t distribution when the null is indeed correct. If the null is correct, we would expect values close to the center of the distribution; values far out in the tails are unlikely.

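The CAPM test of $\alpha = 0$ can be sketched on simulated data. Everything numerical here is an assumption made for illustration: the returns are simulated rather than real, and the values of `true_alpha`, `true_beta` and the volatilities are hypothetical. NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical data: 250 periods of simulated returns.
n = 250
true_alpha, true_beta = 0.0, 1.2
r_m = rng.normal(0.0005, 0.01, size=n)     # market return
r_i = true_alpha + true_beta * r_m + rng.normal(0.0, 0.01, size=n)

# OLS for the simple regression R_i = alpha + beta * R_m + eps, as in (60).
X = np.column_stack([np.ones(n), r_m])
coef, *_ = np.linalg.lstsq(X, r_i, rcond=None)
alpha_hat, beta_hat = coef

residuals = r_i - X @ coef
df = n - 2                                 # n - k - 1 with k = 1
sigma2_hat = residuals @ residuals / df
cov_hat = sigma2_hat * np.linalg.inv(X.T @ X)
se_alpha = np.sqrt(cov_hat[0, 0])

t_alpha = (alpha_hat - 0.0) / se_alpha     # t-statistic (62) under H0: alpha = 0

# Probability of a t-statistic at least this far from zero when H0 is true.
tail_prob = 2 * stats.t.sf(abs(t_alpha), df)

print(f"alpha_hat = {alpha_hat:.5f}, se = {se_alpha:.5f}")
print(f"t = {t_alpha:.3f} with {df} d.f., two-sided tail probability = {tail_prob:.3f}")
```

Because the data are generated with `true_alpha = 0`, the t-statistic will usually land near the center of the t distribution, although any single simulated sample can fall in the tails by chance.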
Hypothesis Testing: The t-test
General Case

▶ Suppose the model,
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$ (63)

▶ How can we test if $\beta_i = 0$? State the null and alternative hypotheses,
$H_0: \beta_i = 0$ (64)
$H_1: \beta_i > 0$ (65)
Thus under the null the value of $\beta_i$ in the DGP is equal to 0. Under the alternative, the value is greater than 0. This is a one-sided test.

▶ Compute the t-statistic,
$t_{\beta_i} = \dfrac{\hat{\beta}_i - 0}{\operatorname{se}[\hat{\beta}_i]} \sim t_{n-k-1}$ (66)
Note that the estimator is again assumed unbiased.

▶ Check how likely this value is to come from the student-t distribution.

Hypothesis Testing: The t-test
General Case: Rejecting the Null

▶ How do we check how likely the value $t_{\beta_i}$ is?

▶ We need a significance level $\alpha$: a probability threshold below which an outcome is considered very unlikely to have occurred given the null hypothesis.

▶ Equivalently, it is the probability of rejecting the null hypothesis when the null hypothesis is in fact true.

▶ The convention in statistics is 0.05. A probability lower than this number indicates that estimates like these would be unlikely if the null hypothesis were true.

▶ Decision rule: in a one-sided test against $H_1: \beta_i > 0$, reject the null hypothesis if
$t_{\beta_i} > c_\alpha$ (67)
where $c_\alpha$ is the critical value, which is the $(1 - \alpha)$th quantile of the student-t distribution.

Hypothesis Testing: The t-test
General Case: Rejecting the Null

▶ By the symmetry of the student-t distribution, the test works in the same way for,
$H_0: \beta_i = 0$ (68)
$H_1: \beta_i < 0$ (69)
Here we reject the null if $t_{\beta_i} < -c_\alpha$, i.e. if the t-statistic is negative and its absolute value exceeds $c_\alpha$.

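The one-sided decision rules can be written as a short function. This is a minimal sketch, assuming SciPy is available for the student-t quantile; the function name `reject_one_sided` and the example t-statistics are hypothetical.

```python
from scipy import stats

def reject_one_sided(t_stat, df, alpha=0.05, alternative="greater"):
    """Decision for H0: beta_i = 0 in a one-sided t-test.

    alternative="greater" tests H1: beta_i > 0,
    alternative="less" tests H1: beta_i < 0.
    """
    c_alpha = stats.t.ppf(1.0 - alpha, df)   # (1 - alpha) quantile of t_df
    if alternative == "greater":
        return bool(t_stat > c_alpha)
    if alternative == "less":
        return bool(t_stat < -c_alpha)
    raise ValueError("alternative must be 'greater' or 'less'")

df = 25
print("critical value:", round(stats.t.ppf(0.95, df), 3))   # about 1.708
print(reject_one_sided(1.90, df, alternative="greater"))
print(reject_one_sided(-1.90, df, alternative="less"))
print(reject_one_sided(-1.90, df, alternative="greater"))
```

The last call shows why the sign matters: a large negative t-statistic is no evidence in favour of $H_1: \beta_i > 0$, so the null is not rejected in that direction.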
Hypothesis Testing: The t-test
General Case: Two-sided tests

▶ Suppose the model,
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$ (70)

▶ How can we test if $\beta_i = 0$ against $\beta_i \neq 0$?

▶ State the null and alternative hypotheses,
$H_0: \beta_i = 0$ (71)
$H_1: \beta_i \neq 0$ (72)

▶ We calculate the t-statistic under the null. Rule for rejecting the null: reject the null hypothesis in a two-sided test if
$|t_{\beta_i}| > c_{\alpha/2}$ (73)
where $c_{\alpha/2}$ is the critical value, which is the $(1 - \alpha/2)$th quantile of the student-t distribution.

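The two-sided rule (73) differs from the one-sided rule only in the quantile used. The sketch below, again assuming SciPy and using hypothetical t-statistics, compares the two critical values at the same significance level.

```python
from scipy import stats

def reject_two_sided(t_stat, df, alpha=0.05):
    """Decision for H0: beta_i = 0 against H1: beta_i != 0, as in (73)."""
    c_half = stats.t.ppf(1.0 - alpha / 2.0, df)   # (1 - alpha/2) quantile of t_df
    return bool(abs(t_stat) > c_half)

df = 25
c_two = stats.t.ppf(0.975, df)   # two-sided critical value at 5%
c_one = stats.t.ppf(0.95, df)    # one-sided critical value at 5%
print(f"two-sided critical value: {c_two:.3f}, one-sided: {c_one:.3f}")

# A t-statistic between the two critical values is significant
# in a one-sided test but not in a two-sided test.
print(reject_two_sided(1.90, df))
print(reject_two_sided(-2.50, df))
```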
Hypothesis Testing: The t-test
General Case: Rejecting the Null

▶ If we reject the null we usually say that the result is statistically significant at the 5% level.

▶ Note on terminology: you reject or you do not reject a null hypothesis. You never accept a hypothesis.

Hypothesis Testing: The t-test
General Hypothesis

▶ Suppose the model,
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$ (74)

▶ How can we test if $\beta_i = \beta_i^*$ against $\beta_i \neq \beta_i^*$?

▶ State the null and alternative hypotheses,
$H_0: \beta_i = \beta_i^*$ (75)
$H_1: \beta_i \neq \beta_i^*$ (76)

▶ Calculate the t-statistic under the null,
$t_{\beta_i} = \dfrac{\hat{\beta}_i - \beta_i^*}{\operatorname{se}[\hat{\beta}_i]} \sim t_{n-k-1}$ (77)

▶ Proceed in the same way as before.

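As an illustration of (77), consider testing whether a coefficient equals 1 rather than 0, for instance a CAPM beta. The estimate, standard error and degrees of freedom below are hypothetical numbers chosen for the example, not results from real data, and the helper `t_test_value` is made up for this sketch.

```python
from scipy import stats

def t_test_value(beta_hat, se, beta_star, df, alpha=0.05):
    """Two-sided t-test of H0: beta_i = beta_star, following (77)."""
    t_stat = (beta_hat - beta_star) / se
    c_half = stats.t.ppf(1.0 - alpha / 2.0, df)
    return t_stat, c_half, bool(abs(t_stat) > c_half)

# Hypothetical regression output: estimate 1.15 with standard error 0.06
# and n - k - 1 = 58 degrees of freedom.
t_stat, c_half, reject = t_test_value(beta_hat=1.15, se=0.06, beta_star=1.0, df=58)
print(f"t = {t_stat:.2f}, critical value = {c_half:.3f}, reject H0: {reject}")
```

Note that the same estimate would give a very different t-statistic against $\beta_i^* = 0$, so the hypothesized value must always be stated.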
Hypothesis Testing: The t-test
P-Values

▶ The choice of significance level can be somewhat arbitrary. Instead, we can ask: what is the smallest significance level for which the test rejects? This is the p-value.

▶ The p-value for a two-tailed test is calculated as,
$\text{p-value} = 2 \cdot P(T > |t|)$ (78)
where $T$ follows the student-t distribution with $n - k - 1$ degrees of freedom, so that $P(T > |t|) = 1 - F(|t|)$ with $F$ the cumulative distribution function.

▶ For example: suppose a calculated t-statistic of $t = -2.01$ with 25 d.f. The p-value for a two-tailed test is twice the probability of values greater than the absolute value of the calculated t-statistic,
$\text{p-value} = 2 \cdot P(T > 2.01) = 2 \times 0.0277 \approx 0.0553$ (79)
where the probability is taken under the t-distribution with 25 d.f. The p-value for a one-tailed test against $H_1: \beta_i < 0$ with this t-statistic is
$\text{p-value} = P(T < -2.01) = P(T > 2.01) = 0.0277$ (80)

▶ Decision rule: Reject the null if
p-value < significance level (81)
otherwise do not reject the null.

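The numbers in the p-value example can be reproduced as follows. This is a sketch assuming SciPy is available; `stats.t.sf` is the survival function, i.e. one minus the cumulative distribution function.

```python
from scipy import stats

t_stat, df = -2.01, 25

# Two-tailed p-value, equation (78): twice the upper-tail probability of |t|.
p_two_sided = 2 * stats.t.sf(abs(t_stat), df)

# One-tailed p-values depend on the direction of the alternative.
p_lower = stats.t.cdf(t_stat, df)   # H1: beta_i < 0, P(T < -2.01)
p_upper = stats.t.sf(t_stat, df)    # H1: beta_i > 0, P(T > -2.01)

print(f"two-sided p-value:        {p_two_sided:.4f}")
print(f"one-sided (H1: beta < 0): {p_lower:.4f}")
print(f"one-sided (H1: beta > 0): {p_upper:.4f}")

alpha = 0.05
print("reject two-sided at 5%:", p_two_sided < alpha)
print("reject H1: beta < 0 at 5%:", p_lower < alpha)
```

With these values the two-sided test does not reject at the 5% level, while the one-sided test against $H_1: \beta_i < 0$ does, which shows how much the choice of alternative matters.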
Hypothesis Testing: The t-test
Testing Hypotheses about a Single Linear Combination of the Parameters

▶ It is also possible to use the student-t test for linear combinations such as,
$H_0: \beta_1 = \beta_2$ (82)
$H_1: \beta_1 < \beta_2$ (83)
This means,
$H_0: \beta_1 - \beta_2 = 0$ (84)
$H_1: \beta_1 - \beta_2 < 0$ (85)

▶ The test for these cases is described in the main textbook.

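One common way to carry out such a test, given here as an assumption since the lecture defers the details to the textbook, is to build the t-statistic from the estimated difference and its standard error, using $\operatorname{var}[\hat{\beta}_1 - \hat{\beta}_2] = \operatorname{var}[\hat{\beta}_1] + \operatorname{var}[\hat{\beta}_2] - 2\operatorname{cov}[\hat{\beta}_1, \hat{\beta}_2]$. The estimates, covariance matrix and degrees of freedom below are hypothetical numbers for illustration only.

```python
import numpy as np
from scipy import stats

# Hypothetical OLS output for (beta0, beta1, beta2) and the
# estimated covariance matrix of the estimator.
beta_hat = np.array([0.20, 0.50, 0.90])
cov_hat = np.array([
    [0.10, 0.00, 0.00],
    [0.00, 0.04, 0.01],
    [0.00, 0.01, 0.09],
])
df = 40                                  # hypothetical n - k - 1

r = np.array([0.0, 1.0, -1.0])           # weights that pick out beta1 - beta2
diff = r @ beta_hat
se_diff = np.sqrt(r @ cov_hat @ r)       # sqrt(var1 + var2 - 2 * cov12)
t_stat = diff / se_diff

# One-sided test of H0: beta1 - beta2 = 0 against H1: beta1 - beta2 < 0.
c_alpha = stats.t.ppf(0.95, df)
reject = bool(t_stat < -c_alpha)
print(f"difference = {diff:.2f}, se = {se_diff:.4f}, t = {t_stat:.3f}, reject H0: {reject}")
```

The covariance term is easy to forget: using only the two individual standard errors would give the wrong standard error whenever the two estimates are correlated.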
