Econometrics 2 Module 5 Video 2 Canvas

Uploaded by

Maarten Overeem

FEB22005(X): Econometrics 2

Module 5 – Video 2:
Binary Variables
Andreas Pick

Erasmus University Rotterdam


Econometric Institute
Non-linear Model for Probability of Success

Instead, use a non-linear model for $p_i$:

$P(y_i = 1) = F(x_i'\beta)$,

with $0 \le F(z) \le 1$ for all possible values of $z$

The choice of function $F(\cdot)$ gives two typical variants, logit and probit

For both variants, there are two common interpretations:


1. In terms of probability distributions
2. In terms of latent (underlying/unobserved) variables

ERASMUS SCHOOL OF ECONOMICS 1/12


1. Interpretation w/ Probability Distributions

The linear model

$y_i = x_i'\beta + \varepsilon_i$, with $\varepsilon_i \sim N(0, \sigma^2)$

can also be written as

$y_i \sim N(\mu_i, \sigma^2)$ with $\mu_i = x_i'\beta$

Likewise, the model for the 0/1 variable can be written as

$y_i \sim B(p_i)$ with $p_i = P(y_i = 1) = F(x_i'\beta)$,

where $B(p)$ is the Bernoulli distribution with success probability $p$



2. Interpretation w/ Latent Variables

We introduce an additional variable: $y_i^*$

$y_i^*$ is a continuous but unobserved variable

$y_i^* = x_i'\beta + \varepsilon_i$, with $E[\varepsilon_i] = 0$

$y_i^*$ is, e.g., the “urge” to give a donation to a charity, or the “urge” to work

The variable $y_i^*$ is related to the outcome $y_i$ through a threshold, usually set at 0:

$y_i = \begin{cases} 0 & \text{if } y_i^* \le 0 \\ 1 & \text{if } y_i^* > 0 \end{cases}$
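The threshold rule can be illustrated with a small simulation. This is only a sketch under stated assumptions: a single regressor, coefficients -2 and 1, and standard normal errors are illustrative choices, and the function name `draw_outcome` is hypothetical.

```python
import random

def draw_outcome(x, beta0, beta1, rng):
    """Draw the latent y* = beta0 + beta1*x + eps and map it to the observed 0/1 outcome."""
    eps = rng.gauss(0.0, 1.0)          # assumption: eps ~ N(0, 1)
    y_star = beta0 + beta1 * x + eps   # latent variable, never observed in practice
    y = 1 if y_star > 0 else 0         # threshold at 0
    return y_star, y

rng = random.Random(0)
share_of_ones = {}
for x in (-4, -2, 0, 2, 4):
    draws = [draw_outcome(x, -2.0, 1.0, rng) for _ in range(2000)]
    # Fraction of draws with y = 1 at this value of x
    share_of_ones[x] = sum(y for _, y in draws) / len(draws)
    print(x, share_of_ones[x])
```

Only the 0/1 outcome would be available in real data; the share of ones rises with x because a larger index makes it more likely that the latent variable crosses the threshold.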



2. Interpretation w/ Latent Variables

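The probability of “success” in the model with the latent variable becomes:

$$\begin{aligned}
P(y_i = 1) &= P(y_i^* > 0) \\
&= P(x_i'\beta + \varepsilon_i > 0) \\
&= P(\varepsilon_i > -x_i'\beta) \\
&= P(\varepsilon_i \le x_i'\beta)
\end{aligned}$$

assuming $\varepsilon_i$ has a symmetric distribution

⇒ If $F$ is the CDF of $\varepsilon_i$, then

$P(y_i = 1) = F(x_i'\beta)$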
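The symmetry step can be written out in terms of the CDF. This is a sketch that additionally assumes the error is continuous and symmetric around 0, so that $F(-a) = 1 - F(a)$ for every $a$:

```latex
\begin{align*}
P(\varepsilon_i > -x_i'\beta)
  &= 1 - F(-x_i'\beta) \\
  &= 1 - \bigl(1 - F(x_i'\beta)\bigr)
     && \text{since } F(-a) = 1 - F(a) \\
  &= F(x_i'\beta).
\end{align*}
```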



2. Interpretation w/ Latent Variables

[Figure: latent variable $y_i^*$ plotted against $x_i$; horizontal axis from -4 to 4]



Two Variants: Logit and Probit

Two variants of the model (two choices for F (·))

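Logit: $\varepsilon_i \sim \mathrm{LOG}(0, 1)$, such that

$F(x_i'\beta) = \Lambda(x_i'\beta) = \dfrac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} = \dfrac{1}{1 + \exp(-x_i'\beta)}$

(Machine Learning: standard choice for the “sigmoid” function)

Probit: $\varepsilon_i \sim N(0, 1)$, such that

$F(x_i'\beta) = \Phi(x_i'\beta) = \displaystyle\int_{-\infty}^{x_i'\beta} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) dz$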
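Both link functions are easy to evaluate numerically. The sketch below uses only the Python standard library; the function names `logit_cdf` and `probit_cdf` are illustrative, and the probit CDF is obtained from the error function rather than by integrating the density directly.

```python
import math

def logit_cdf(z):
    """Logistic CDF; the two algebraically equal forms are used to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))   # 1 / (1 + exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)                  # exp(z) / (1 + exp(z))

def probit_cdf(z):
    """Standard normal CDF via the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-2.0, 0.0, 2.0):
    print(z, logit_cdf(z), probit_cdf(z))
```

Both functions return 0.5 at an index of 0 and stay between 0 and 1, which is the property required of $F$.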



Normal and Logistic Densities

[Figure: Exhibit 6.2, Normal and logistic densities. $f(x)$ against $x$ for the standardized logistic and the standard normal distribution. Caption: Densities of the standard normal distribution (dashed line) and of the logistic distribution (solid line, scaled so that both densities have standard deviation equal to 1). As compared with the normal density, the logistic density has larger values around the mean ($x = 0$) and also in both tails (for values of $x$ far away from 0). Surrounding textbook text: "... whereas the cumulative distribution function $F = \Phi$ of the probit model should be computed numerically by approximating the integral".]

⇒ Difference between logit and probit is not big; choice is mostly a matter of “taste” (for the logit model more things can be derived straightforwardly)
⇒ $\beta_{\text{LOGIT}} \approx 1.8\,\beta_{\text{PROBIT}}$, as $\sigma_{\text{LOGIT}} = \pi/\sqrt{3}$
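The factor of roughly 1.8 is the standard deviation of the standard logistic distribution, $\pi/\sqrt{3}$. As a rough check, the sketch below rescales the logistic CDF by that factor and measures how far it is from the standard normal CDF on a grid; the grid and the variable names are arbitrary choices for illustration.

```python
import math

# Standard deviation of the standard logistic distribution, about 1.814
SCALE = math.pi / math.sqrt(3.0)

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Compare Phi(z) with Lambda(SCALE * z) on a grid from -4 to 4
grid = [i / 100.0 for i in range(-400, 401)]
max_gap = max(abs(normal_cdf(z) - logistic_cdf(SCALE * z)) for z in grid)
print(round(SCALE, 3), max_gap)
```

The largest gap between the two curves is a few percentage points at most, which is why the two models tend to give similar fitted probabilities once the coefficients are rescaled.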



Logit Function

[Figure: logit function $\Lambda(\beta_0 + \beta_1 x_i)$ plotted against $x_i$ (from -4 to 4) for $(\beta_0, \beta_1) = (-2, 1)$, $(-2, 2)$, and $(-4, 1)$; vertical axis from 0.0 to 1.0]



Parameter Estimation

The parameters of the logit/probit model can be estimated with Maximum Likelihood (ML)

As $y_i$ has a Bernoulli distribution, $y_i \sim B(p_i)$, where $p_i = P(y_i = 1)$, the probability density function (pdf; NL: “kansdichtheid”) is

$f(y_i) = p_i^{y_i} (1 - p_i)^{1 - y_i}$,

such that

$f(y_i) = \begin{cases} 1 - p_i, & \text{if } y_i = 0, \\ p_i, & \text{if } y_i = 1 \end{cases}$

Continue with ML:

→ Formulate a likelihood and maximize this over the parameters in $p_i = F(x_i'\beta)$, that is, maximize over $\beta$



ML Estimation
Under the assumption that $y_1, y_2, \ldots, y_n$ are independent, the likelihood function (NL: “aannemelijkheidsfunctie”) is

$L(\beta) = f(y_1, \ldots, y_n) = \prod_{i=1}^{n} f(y_i) = \prod_{i=1}^{n} p_i^{y_i} (1 - p_i)^{1 - y_i}$

⇒ Estimate $\beta$ by maximizing $L(\beta)$

More convenient: maximize the log of $L(\beta)$:

$\ell(\beta) = \sum_{i=1}^{n} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right]$

$\hat\beta$ is consistent and asymptotically normal, such that $\hat\beta \approx N(\beta, \hat V)$, with $\hat V$ the estimated covariance matrix; this asymptotic distribution is also used for testing the significance of parameters (“z-scores”)
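To make the estimation step concrete, here is a minimal sketch of ML for a logit model with an intercept and one regressor, using Newton-Raphson steps in plain Python. The simulated data, the true coefficients (-2, 1), and all function names are assumptions chosen for illustration; standard errors come from the inverse of the information matrix at the estimate.

```python
import math
import random

def sigmoid(z):
    """Logistic CDF, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(b0, b1, xs, ys):
    """Sum over i of y*log(p) + (1 - y)*log(1 - p), with p = sigmoid(b0 + b1*x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def score_and_information(b0, b1, xs, ys):
    """Gradient of the log-likelihood and the 2x2 information matrix (minus the Hessian)."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        i00 += w
        i01 += w * x
        i11 += w * x * x
    return (g0, g1), (i00, i01, i11)

def fit_logit(xs, ys, max_iter=50, tol=1e-10):
    """Newton-Raphson: beta <- beta + information^{-1} * gradient, starting from zero."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (g0, g1), (i00, i01, i11) = score_and_information(b0, b1, xs, ys)
        det = i00 * i11 - i01 * i01
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    # Estimated covariance matrix: inverse information at the final estimate
    _, (i00, i01, i11) = score_and_information(b0, b1, xs, ys)
    det = i00 * i11 - i01 * i01
    return (b0, b1), (math.sqrt(i11 / det), math.sqrt(i00 / det))

# Simulated data under assumed true values beta0 = -2, beta1 = 1
rng = random.Random(1)
xs = [rng.uniform(-4.0, 4.0) for _ in range(2000)]
ys = [1 if rng.random() < sigmoid(-2.0 + 1.0 * x) else 0 for x in xs]

(b0_hat, b1_hat), (se0, se1) = fit_logit(xs, ys)
z_scores = (b0_hat / se0, b1_hat / se1)
print(b0_hat, b1_hat, se0, se1, z_scores)
```

Because the logit log-likelihood is concave, Newton-Raphson usually converges in a handful of iterations when the data are not perfectly separated. The z-scores are the estimates divided by their standard errors.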



Estimates for Example

Dependent Variable: RECESSION
Method: ML - Binary Logit (Newton-Raphson / Marquardt steps)
Sample (adjusted): 1948M02 2019M12
Included observations: 863 after adjustments
Convergence achieved after 7 iterations
Coefficient covariance computed using observed Hessian

Variable          Coefficient   Std. Error   z-Statistic   Prob.
C                 -0.598467     0.441854     -1.354445     0.1756
CPI(-1)            0.228173     0.042293      5.395090     0.0000
IP(-1)            -0.499122     0.051346     -9.720766     0.0000
UNEMPLRATE(-1)    -0.309816     0.079585     -3.892892     0.0001

McFadden R-squared       0.238644    Mean dependent var      0.141367
S.D. dependent var       0.348602    S.E. of regression      0.311886
Akaike info criterion    0.629680    Sum squared resid       83.55739
Schwarz criterion        0.651745    Log likelihood         -267.7070
Hannan-Quinn criter.     0.638126    Deviance                535.4140
Restr. deviance          703.2377    Restr. log likelihood  -351.6188
LR statistic             167.8237    Avg. log likelihood    -0.310205
Prob(LR statistic)       0.000000

Obs with Dep=0           741         Total obs               863
Obs with Dep=1           122

Dependent Variable: RECESSION
Method: ML - Binary Probit (Newton-Raphson / Marquardt steps)
Sample (adjusted): 1948M02 2019M12
Included observations: 863 after adjustments
Convergence achieved after 6 iterations
Coefficient covariance computed using observed Hessian

Variable          Coefficient   Std. Error   z-Statistic   Prob.
C                 -0.354163     0.243458     -1.454718     0.1457
CPI(-1)            0.130144     0.022965      5.666971     0.0000
IP(-1)            -0.284369     0.026476     -10.74060     0.0000
UNEMPLRATE(-1)    -0.179566     0.044180     -4.064437     0.0000

McFadden R-squared       0.248137    Mean dependent var      0.141367
S.D. dependent var       0.348602    S.E. of regression      0.310818
Akaike info criterion    0.621945    Sum squared resid       82.98605
Schwarz criterion        0.644009    Log likelihood         -264.3692
Hannan-Quinn criter.     0.630390    Deviance                528.7383
Restr. deviance          703.2377    Restr. log likelihood  -351.6188
LR statistic             174.4994    Avg. log likelihood    -0.306337
Prob(LR statistic)       0.000000

Obs with Dep=0           741         Total obs               863
Obs with Dep=1           122
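Several of the reported logit statistics can be reproduced from the other numbers in the output. The sketch below recomputes the z-statistic for CPI(-1), the LR statistic, the McFadden R-squared, and the Akaike criterion. The formulas are the usual textbook definitions, with the Akaike criterion assumed to be scaled per observation; they are not taken from the slides. Small differences arise because the inputs are rounded.

```python
import math

# Values copied from the logit output
loglik = -267.7070
loglik_restricted = -351.6188
n_obs, n_params = 863, 4
coef_cpi, se_cpi = 0.228173, 0.042293

# z-statistic and two-sided p-value from the asymptotic normal distribution
z_cpi = coef_cpi / se_cpi
p_value_cpi = math.erfc(abs(z_cpi) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|z|))

# Likelihood-ratio test against the model with only a constant
lr_stat = 2.0 * (loglik - loglik_restricted)

# McFadden R-squared: 1 - logL / logL_restricted
mcfadden_r2 = 1.0 - loglik / loglik_restricted

# Akaike criterion per observation (assumed definition: (-2 logL + 2k) / n)
aic_per_obs = (-2.0 * loglik + 2.0 * n_params) / n_obs

print(z_cpi, p_value_cpi, lr_stat, mcfadden_r2, aic_per_obs)
```

The recomputed values agree with the reported 5.395090, 167.8237, 0.238644, and 0.629680 up to rounding.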



Output Logit

[Figure: series RECESSION and RECESSIONF plotted over time; horizontal axis from 1950 to 2015, vertical axis from 0.0 to 1.0]
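A fitted probability from the logit model is the logistic function applied to the estimated index. The sketch below assumes that this is how a fitted series such as RECESSIONF would be obtained; the regressor values are hypothetical and only serve to show the arithmetic, while the coefficients are the reported logit estimates.

```python
import math

# Reported logit coefficients
COEF = {"C": -0.598467, "CPI(-1)": 0.228173, "IP(-1)": -0.499122, "UNEMPLRATE(-1)": -0.309816}

def fitted_probability(cpi_lag, ip_lag, unempl_lag):
    """P(recession) = Lambda(x'beta) for given lagged regressor values."""
    index = (COEF["C"]
             + COEF["CPI(-1)"] * cpi_lag
             + COEF["IP(-1)"] * ip_lag
             + COEF["UNEMPLRATE(-1)"] * unempl_lag)
    return 1.0 / (1.0 + math.exp(-index))

# Hypothetical regressor values, not taken from the data set
p_weak = fitted_probability(cpi_lag=3.0, ip_lag=-2.0, unempl_lag=5.0)
p_strong = fitted_probability(cpi_lag=3.0, ip_lag=2.0, unempl_lag=5.0)
print(p_weak, p_strong)
```

Since the coefficient on IP(-1) is negative, the scenario with the higher IP(-1) value gives the lower fitted probability.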

