
additional-cheatsheet-en (1)

The document is an econometrics cheat sheet that covers various topics including the variance-covariance matrix, OLS matrix notation, proxy variables, instrumental variables, and the two-stage least squares method. It also discusses the importance of model specification, information criteria, and statistical definitions such as mean, variance, and covariance. Additionally, it provides insights on the implications of variable omission and the use of logistic regression for binary dependent variables.


Additional Cheat Sheet

By Marcelo Moreno - Universidad Rey Juan Carlos
The Econometrics Cheat Sheet Project

OLS matrix notation

The general econometric model:
$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki} + u_i$$
Can be written in matrix notation as:
$$y = X\beta + u$$
Let's call $\hat{u}$ the vector of estimated residuals ($\hat{u} \neq u$):
$$\hat{u} = y - X\hat{\beta}$$
The objective of OLS is to minimize the SSR:
$$\min SSR = \min \sum_{i=1}^{n} \hat{u}_i^2 = \min \hat{u}^T \hat{u}$$
• Defining $\hat{u}^T \hat{u}$:
$$\hat{u}^T \hat{u} = (y - X\hat{\beta})^T (y - X\hat{\beta}) = y^T y - 2\hat{\beta}^T X^T y + \hat{\beta}^T X^T X \hat{\beta}$$
• Minimizing $\hat{u}^T \hat{u}$:
$$\frac{\partial \hat{u}^T \hat{u}}{\partial \hat{\beta}} = -2X^T y + 2X^T X \hat{\beta} = 0$$
$$\hat{\beta} = (X^T X)^{-1} (X^T y)$$
$$\begin{pmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \vdots \\ \hat{\beta}_k \end{pmatrix} = \begin{pmatrix} n & \sum x_1 & \cdots & \sum x_k \\ \sum x_1 & \sum x_1^2 & \cdots & \sum x_1 x_k \\ \vdots & \vdots & \ddots & \vdots \\ \sum x_k & \sum x_k x_1 & \cdots & \sum x_k^2 \end{pmatrix}^{-1} \cdot \begin{pmatrix} \sum y \\ \sum y x_1 \\ \vdots \\ \sum y x_k \end{pmatrix}$$
The second derivative $\frac{\partial^2 \hat{u}^T \hat{u}}{\partial \hat{\beta}^2} = 2 X^T X > 0$ (positive definite, so it is a minimum).

Variance-covariance matrix of $\hat{\beta}$

Has the following shape:
$$\mathrm{Var}(\hat{\beta}) = \hat{\sigma}_u^2 \cdot (X^T X)^{-1} = \begin{pmatrix} \mathrm{Var}(\hat{\beta}_0) & \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) & \cdots & \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_k) \\ \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_0) & \mathrm{Var}(\hat{\beta}_1) & \cdots & \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_k) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(\hat{\beta}_k, \hat{\beta}_0) & \mathrm{Cov}(\hat{\beta}_k, \hat{\beta}_1) & \cdots & \mathrm{Var}(\hat{\beta}_k) \end{pmatrix}$$
where: $\hat{\sigma}_u^2 = \frac{\hat{u}^T \hat{u}}{n - k - 1}$
The standard errors are the square roots of the diagonal:
$$\mathrm{se}(\hat{\beta}) = \sqrt{\mathrm{Var}(\hat{\beta})}$$

Error measurements

• $SSR = \hat{u}^T \hat{u} = y^T y - \hat{\beta}^T X^T y = \sum (y_i - \hat{y}_i)^2$
• $SSE = \hat{\beta}^T X^T y - n\bar{y}^2 = \sum (\hat{y}_i - \bar{y})^2$
• $SST = SSR + SSE = y^T y - n\bar{y}^2 = \sum (y_i - \bar{y})^2$

Variance-covariance matrix of u

Has the following shape:
$$\mathrm{Var}(u) = \begin{pmatrix} \mathrm{Var}(u_1) & \mathrm{Cov}(u_1, u_2) & \cdots & \mathrm{Cov}(u_1, u_n) \\ \mathrm{Cov}(u_2, u_1) & \mathrm{Var}(u_2) & \cdots & \mathrm{Cov}(u_2, u_n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cov}(u_n, u_1) & \mathrm{Cov}(u_n, u_2) & \cdots & \mathrm{Var}(u_n) \end{pmatrix}$$
Under no heteroscedasticity and no autocorrelation, the variance-covariance matrix:
$$\mathrm{Var}(u) = \sigma_u^2 \cdot I_n = \begin{pmatrix} \sigma_u^2 & 0 & \cdots & 0 \\ 0 & \sigma_u^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_u^2 \end{pmatrix}$$
where $I_n$ is an identity matrix of $n \times n$ elements.
Under heteroscedasticity and autocorrelation, the variance-covariance matrix:
$$\mathrm{Var}(u) = \sigma_u^2 \cdot \Omega = \begin{pmatrix} \sigma_{u_1}^2 & \sigma_{u_{12}} & \cdots & \sigma_{u_{1n}} \\ \sigma_{u_{21}} & \sigma_{u_2}^2 & \cdots & \sigma_{u_{2n}} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{u_{n1}} & \sigma_{u_{n2}} & \cdots & \sigma_{u_n}^2 \end{pmatrix}$$
where $\Omega \neq I_n$.
• Heteroscedasticity: $\mathrm{Var}(u_i) = \sigma_{u_i}^2 \neq \sigma_u^2$
• Autocorrelation: $\mathrm{Cov}(u_i, u_j) = \sigma_{u_{ij}} \neq 0, \ \forall i \neq j$

Variable omission

Most of the time, it is hard to get all relevant variables for an analysis. For example, a true model with all variables:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + v$$
where $\beta_2 \neq 0$, $v$ is the error term and $E(v \mid x_1, x_2) = 0$.
The model with the available variables:
$$y = \alpha_0 + \alpha_1 x_1 + u$$
where $u = v + \beta_2 x_2$.
Relevant variable omission can cause OLS estimators to be biased and inconsistent, because there is no weak exogeneity, $\mathrm{Cov}(x_1, u) \neq 0$. Depending on $\mathrm{Corr}(x_1, x_2)$ and the sign of $\beta_2$, the bias on $\hat{\alpha}_1$ could be:

             Corr(x1, x2) > 0    Corr(x1, x2) < 0
β2 > 0       (+) bias            (−) bias
β2 < 0       (−) bias            (+) bias

• (+) bias: $\hat{\alpha}_1$ will be higher than it should be (it includes the effect of $x_2$) → $\hat{\alpha}_1 > \beta_1$
• (−) bias: $\hat{\alpha}_1$ will be lower than it should be (it includes the effect of $x_2$) → $\hat{\alpha}_1 < \beta_1$
If $\mathrm{Corr}(x_1, x_2) = 0$, there is no bias on $\hat{\alpha}_1$, because the effect of $x_2$ will be fully picked up by the error term, $u$.

Variable omission correction

Proxy variables

This is the approach to use when a relevant variable is not available because it is non-observable and there is no data for it.
• A proxy variable is an observable variable, related to the non-observable one, for which data is available.
For example, GDP per capita is a proxy variable for quality of life (non-observable).

Instrumental variables

When the variable of interest ($x$) is observable but endogenous, the proxy variables approach is no longer valid.
• An instrumental variable (IV) is an observable variable ($z$) that is related to the endogenous variable of interest ($x$) and meets these requirements:
$\mathrm{Cov}(z, u) = 0$ → instrument exogeneity
$\mathrm{Cov}(z, x) \neq 0$ → instrument relevance
Instrumental variables leave the omitted variable in the error term but, instead of estimating the model by OLS, use a method that recognizes the presence of an omitted variable. The approach can also solve measurement error problems.
• Two-Stage Least Squares (TSLS) is a method to estimate a model with multiple instrumental variables. The $\mathrm{Cov}(z, u) = 0$ requirement can be relaxed, but there has to be a minimum number of instruments that satisfy it.
The TSLS estimation procedure is as follows:
1. Estimate a model regressing $x$ on $z$ using OLS, obtaining $\hat{x}$:
$$\hat{x} = \hat{\pi}_0 + \hat{\pi}_1 z$$
2. Replace $x$ by $\hat{x}$ in the final model and estimate it by OLS:
$$y = \beta_0 + \beta_1 \hat{x} + u$$
There are some important things to know about TSLS:
– TSLS estimators are less efficient than OLS when the explanatory variables are exogenous. The Hausman test can be used to check it:
H0: OLS estimators are consistent.
If H0 is not rejected, the OLS estimators are better than TSLS, and vice versa.
– Some (or all) of the instruments could be invalid. When there are more instruments than endogenous variables (over-identification), the Sargan test can be used to check it:
H0: all instruments are valid.
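The OLS formulas and the two TSLS steps described on this page can be sketched numerically. The block below is a minimal illustration, not the author's implementation: the function names `ols` and `tsls`, the simulated data and every coefficient value are assumptions made for the example. It builds a case where an omitted variable is positively correlated with $x$ and has a positive effect on $y$, so OLS should show a (+) bias while TSLS with a valid instrument should not.

```python
import numpy as np

def ols(X, y):
    """OLS in matrix form: beta_hat = (X'X)^{-1} X'y.

    X must already contain a column of ones for the intercept.
    Returns beta_hat, Var(beta_hat) = sigma2_hat * (X'X)^{-1}, and se(beta_hat).
    """
    n, n_params = X.shape                    # n_params = k + 1
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta                     # u_hat = y - X beta_hat
    sigma2 = resid @ resid / (n - n_params)  # u_hat'u_hat / (n - k - 1)
    cov = sigma2 * XtX_inv
    se = np.sqrt(np.diag(cov))
    return beta, cov, se

def tsls(y, x, z):
    """Two-stage least squares with one endogenous x and one instrument z.

    Only the coefficients are returned: the standard errors of a naive
    second-stage OLS are not valid for TSLS.
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    pi_hat, _, _ = ols(Z, x)                 # stage 1: regress x on z
    x_hat = Z @ pi_hat
    X_hat = np.column_stack([np.ones(n), x_hat])
    beta, _, _ = ols(X_hat, y)               # stage 2: regress y on x_hat
    return beta

# Simulated data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                       # instrument: relevant and exogenous
omitted = rng.normal(size=n)                 # unobserved, ends up in the error
x = 0.8 * z + omitted + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 2.0 * x + 1.5 * omitted + rng.normal(size=n)

beta_ols, cov_ols, se_ols = ols(np.column_stack([np.ones(n), x]), y)
beta_iv = tsls(y, x, z)
print("OLS slope :", beta_ols[1])            # expected to overstate the true 2.0
print("TSLS slope:", beta_iv[1])             # expected to be close to 2.0
```

In practice, a dedicated IV routine should be preferred over this two-step shortcut whenever standard errors or tests are needed.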

ADD-25.01-EN - github.com/marcelomijas/econometrics-cheatsheet - CC-BY-4.0 license


Information criterion

It is used to compare models with a different number of parameters ($p$). The general formula:
$$Cr(p) = \log\left(\frac{SSR}{n}\right) + c_n \varphi(p)$$
where:
• SSR is the Sum of Squared Residuals from a model of order $p$.
• $c_n$ is a sequence indexed by the sample size.
• $\varphi(p)$ is a function that penalizes large $p$ orders.
It is interpreted as the relative amount of information lost by the model. The order $p$ that minimizes the criterion is chosen.
There are different $c_n \varphi(p)$ functions:
• Akaike: $AIC(p) = \log\left(\frac{SSR}{n}\right) + \frac{2}{n} p$
• Hannan-Quinn: $HQ(p) = \log\left(\frac{SSR}{n}\right) + \frac{2 \log(\log(n))}{n} p$
• Schwarz / Bayesian: $BIC(p) = \log\left(\frac{SSR}{n}\right) + \frac{\log(n)}{n} p$
The selected orders satisfy $\hat{p}_{BIC} \leq \hat{p}_{HQ} \leq \hat{p}_{AIC}$ (for $n \geq 16$), because BIC penalizes additional parameters the most and AIC the least.

The non-restricted hypothesis test

It is an alternative to the F test when there are few hypotheses to test on the parameters. Let $\beta_i, \beta_j$ be parameters and $a, b, c \in \mathbb{R}$ constants.
• $H_0: a\beta_i + b\beta_j = c$
• $H_1: a\beta_i + b\beta_j \neq c$
Under $H_0$:
$$t = \frac{a\hat{\beta}_i + b\hat{\beta}_j - c}{\mathrm{se}(a\hat{\beta}_i + b\hat{\beta}_j)} = \frac{a\hat{\beta}_i + b\hat{\beta}_j - c}{\sqrt{a^2 \mathrm{Var}(\hat{\beta}_i) + b^2 \mathrm{Var}(\hat{\beta}_j) + 2ab\,\mathrm{Cov}(\hat{\beta}_i, \hat{\beta}_j)}}$$
If $|t| > |t_{n-k-1, \alpha/2}|$, there is evidence to reject $H_0$.

ANOVA

Decompose the total sum of squares into the sum of squared residuals and the explained sum of squares: $SST = SSR + SSE$

Variation origin   Sum Sq.   df           Sum Sq. Avg.
Regression         SSE       k            SSE/k
Residuals          SSR       n − k − 1    SSR/(n − k − 1)
Total              SST       n − 1

The F statistic:
$$F = \frac{\text{Sum Sq. Avg. of } SSE}{\text{Sum Sq. Avg. of } SSR} = \frac{SSE}{SSR} \cdot \frac{n - k - 1}{k} \sim F_{k, n-k-1}$$
If $F > F_{k, n-k-1}$, there is evidence to reject $H_0$: there is no difference among group means.

Incorrect functional form

To check if the model functional form is correct, we can use Ramsey's RESET (Regression Specification Error Test). It tests the original model against a model with variables in powers.
H0: the model is correctly specified.
Test procedure:
1. Estimate the original model and obtain $\hat{y}$ and $R^2$:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k$$
2. Estimate a new model adding powers of $\hat{y}$ and obtain the new $R^2_{new}$:
$$\tilde{y} = \hat{y} + \tilde{\gamma}_2 \hat{y}^2 + \cdots + \tilde{\gamma}_l \hat{y}^l$$
3. Define the test statistic, under $\gamma_2 = \cdots = \gamma_l = 0$ as null hypothesis:
$$F = \frac{R^2_{new} - R^2}{1 - R^2_{new}} \cdot \frac{n - (k+1) - l}{l} \sim F_{l, n-(k+1)-l}$$
If $F > F_{l, n-(k+1)-l}$, there is evidence to reject $H_0$.

Logistic regression

When there is a binary (0, 1) dependent variable, the linear regression model is no longer valid; we can use logistic regression instead. For example, a logit model:
$$P_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_i + u_i)}} = \frac{e^{\beta_0 + \beta_1 x_i + u_i}}{1 + e^{\beta_0 + \beta_1 x_i + u_i}}$$
where $P_i = P(y_i = 1 \mid x_i)$ and $(1 - P_i) = P(y_i = 0 \mid x_i)$.
The odds ratio (in favor of $y_i = 1$):
$$\frac{P_i}{1 - P_i} = \frac{1 + e^{\beta_0 + \beta_1 x_i + u_i}}{1 + e^{-(\beta_0 + \beta_1 x_i + u_i)}} = e^{\beta_0 + \beta_1 x_i + u_i}$$
Taking the natural logarithm of the odds ratio, we obtain the logit:
$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = \beta_0 + \beta_1 x_i + u_i$$
$P_i$ is between 0 and 1, but $L_i$ goes from $-\infty$ to $+\infty$.
If $\beta_1$ is positive, the probability of $y_i = 1$ increases when $x_i$ increases, and vice versa.
[Figure: logistic curve of $P_i$ against $x$, bounded between 0 and 1]

Statistical definitions

Let $\xi, \eta$ be random variables, $a, b \in \mathbb{R}$ constants, and $P$ denotes probability.

Mean

Definition: $E(\xi) = \sum_{i=1}^{n} \xi_i \cdot P[\xi = \xi_i]$
Population mean: $E(\xi) = \frac{1}{N} \sum_{i=1}^{N} \xi_i$
Sample mean: $E(\xi) = \frac{1}{n} \sum_{i=1}^{n} \xi_i$
Some properties:
• $E(a) = a$
• $E(\xi + a) = E(\xi) + a$
• $E(a \cdot \xi) = a \cdot E(\xi)$
• $E(\xi \pm \eta) = E(\xi) \pm E(\eta)$
• $E(\xi \cdot \eta) = E(\xi) \cdot E(\eta)$ if $\xi$ and $\eta$ are independent.
• $E(\xi - E(\xi)) = 0$
• $E(a \cdot \xi + b \cdot \eta) = a \cdot E(\xi) + b \cdot E(\eta)$

Variance

Definition: $\mathrm{Var}(\xi) = E[(\xi - E(\xi))^2]$
Population variance: $\mathrm{Var}(\xi) = \frac{\sum_{i=1}^{N} (\xi_i - E(\xi))^2}{N}$
Sample variance: $\mathrm{Var}(\xi) = \frac{\sum_{i=1}^{n} (\xi_i - E(\xi))^2}{n - 1}$
Some properties:
• $\mathrm{Var}(a) = 0$
• $\mathrm{Var}(\xi + a) = \mathrm{Var}(\xi)$
• $\mathrm{Var}(a \cdot \xi) = a^2 \cdot \mathrm{Var}(\xi)$
• $\mathrm{Var}(\xi \pm \eta) = \mathrm{Var}(\xi) + \mathrm{Var}(\eta) \pm 2 \cdot \mathrm{Cov}(\xi, \eta)$
• $\mathrm{Var}(a \cdot \xi \pm b \cdot \eta) = a^2 \cdot \mathrm{Var}(\xi) + b^2 \cdot \mathrm{Var}(\eta) \pm 2ab \cdot \mathrm{Cov}(\xi, \eta)$

Covariance

Definition: $\mathrm{Cov}(\xi, \eta) = E[(\xi - E(\xi)) \cdot (\eta - E(\eta))]$
Population covariance: $\frac{\sum_{i=1}^{N} (\xi_i - E(\xi)) \cdot (\eta_i - E(\eta))}{N}$
Sample covariance: $\frac{\sum_{i=1}^{n} (\xi_i - E(\xi)) \cdot (\eta_i - E(\eta))}{n - 1}$
Some properties:
• $\mathrm{Cov}(\xi, a) = 0$
• $\mathrm{Cov}(\xi + a, \eta + b) = \mathrm{Cov}(\xi, \eta)$
• $\mathrm{Cov}(a \cdot \xi, b \cdot \eta) = ab \cdot \mathrm{Cov}(\xi, \eta)$
• $\mathrm{Cov}(\xi, \xi) = \mathrm{Var}(\xi)$
• $\mathrm{Cov}(\xi, \eta) = \mathrm{Cov}(\eta, \xi)$
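The sample variance and covariance definitions, and the property for the variance of a linear combination, can be checked numerically. The sketch below is only an illustration: the helper names `sample_var` and `sample_cov`, the simulated series and the constants `a` and `b` are assumptions introduced for the example. With the $n - 1$ denominators used in the definitions, the identity holds exactly for sample moments, up to floating-point error.

```python
import numpy as np

def sample_var(v):
    """Sample variance with the n - 1 denominator."""
    return np.sum((v - v.mean()) ** 2) / (len(v) - 1)

def sample_cov(v, w):
    """Sample covariance with the n - 1 denominator."""
    return np.sum((v - v.mean()) * (w - w.mean())) / (len(v) - 1)

# Two correlated series (illustrative values only).
rng = np.random.default_rng(1)
xi = rng.normal(size=200)
eta = 0.5 * xi + rng.normal(size=200)
a, b = 2.0, -3.0

# Var(a*xi + b*eta) = a^2 Var(xi) + b^2 Var(eta) + 2ab Cov(xi, eta)
lhs = sample_var(a * xi + b * eta)
rhs = (a ** 2 * sample_var(xi)
       + b ** 2 * sample_var(eta)
       + 2 * a * b * sample_cov(xi, eta))
print(lhs, rhs)
```

The same helpers can be compared against `np.var(xi, ddof=1)` and `np.cov(xi, eta)[0, 1]`, which use the same $n - 1$ convention.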



Hypothesis testing

                    H0 true                             H0 false
Reject H0           False positive: Type I Error (α)    True positive: (1 − β)
Do not reject H0    True negative: (1 − α)              False negative: Type II Error (β)

Typical one-tail test:
[Figure: H0 and H1 distributions with critical value C, showing the areas 1 − α, α, β and 1 − β]
where (1 − α) is the confidence level, α is the significance level, C is the critical value, and (1 − β) is the statistical power.

Bootstrapping

Problem - Asymptotic approximations to the distributions of test statistics do not work well on small samples.
Solution - Bootstrap is basically sampling with replacement. The observed data is treated like a population, and multiple samples are extracted from it to recalculate an estimator or test statistic multiple times (this improves accuracy).
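Sampling with replacement from the observed data can be sketched in a few lines. The block below is a minimal illustration under stated assumptions: the function name `bootstrap_se`, the number of replications, the simulated exponential sample and the choice of the mean as the statistic are all hypothetical. It estimates the standard error of an estimator and a percentile interval from the bootstrap replicates.

```python
import numpy as np

def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Nonparametric bootstrap: resample the data with replacement
    n_boot times and recompute the statistic on every resample."""
    rng = np.random.default_rng(seed)
    n = len(data)
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(data, size=n, replace=True)
        replicates[b] = statistic(resample)
    # The spread of the replicates estimates the standard error.
    return replicates.std(ddof=1), replicates

# Small skewed sample (illustrative values only).
rng = np.random.default_rng(42)
data = rng.exponential(scale=2.0, size=30)

se_boot, reps = bootstrap_se(data, np.mean)
se_formula = data.std(ddof=1) / np.sqrt(len(data))   # textbook se of the mean
ci_low, ci_high = np.percentile(reps, [2.5, 97.5])   # percentile interval

print("bootstrap se:", se_boot)
print("formula se  :", se_formula)
print("95% percentile interval:", ci_low, ci_high)
```

For the sample mean the bootstrap and the textbook formula should roughly agree; the approach is more useful for statistics without a simple standard error formula, such as the median.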

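VAR (Vector Autoregressive)

A VAR model captures dynamic interactions between time series. The VAR(p):
$$y_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + B x_t + C D_t + u_t$$
where:
• $y_t = (y_{1t}, \ldots, y_{Kt})^T$ is a vector of $K$ observable endogenous time series.
• The $A_i$'s are $K \times K$ coefficient matrices.
• $x_t = (x_{1t}, \ldots, x_{Mt})^T$ is a vector of $M$ observable exogenous time series.
• $B$ is a $K \times M$ coefficient matrix.
• $D_t$ is a vector that contains all deterministic terms: a constant, linear trend, seasonal dummy, and/or any other user-specified dummy variables.
• $C$ is a coefficient matrix of suitable dimension.
• $u_t = (u_{1t}, \ldots, u_{Kt})^T$ is a vector of $K$ white noise series.
Stability condition:
$$\det(I_K - A_1 z - \cdots - A_p z^p) \neq 0 \quad \text{for } |z| \leq 1$$
that is, there are no roots inside or on the complex unit circle.
For example, a VAR model with two endogenous variables ($K = 2$), two lags ($p = 2$), an exogenous contemporaneous variable ($M = 1$), a constant (const) and a trend ($\text{Trend}_t$):
$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} a_{11,1} & a_{12,1} \\ a_{21,1} & a_{22,1} \end{pmatrix} \cdot \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} a_{11,2} & a_{12,2} \\ a_{21,2} & a_{22,2} \end{pmatrix} \cdot \begin{pmatrix} y_{1,t-2} \\ y_{2,t-2} \end{pmatrix} + \begin{pmatrix} b_{11} \\ b_{21} \end{pmatrix} \cdot x_t + \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} \cdot \begin{pmatrix} \text{const} \\ \text{Trend}_t \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$
Visualizing the separate equations:
$$y_{1t} = a_{11,1} y_{1,t-1} + a_{12,1} y_{2,t-1} + a_{11,2} y_{1,t-2} + a_{12,2} y_{2,t-2} + b_{11} x_t + c_{11} + c_{12} \text{Trend}_t + u_{1t}$$
$$y_{2t} = a_{21,1} y_{1,t-1} + a_{22,1} y_{2,t-1} + a_{21,2} y_{1,t-2} + a_{22,2} y_{2,t-2} + b_{21} x_t + c_{21} + c_{22} \text{Trend}_t + u_{2t}$$
If there is a unit root, the determinant is zero for $z = 1$; then some or all variables are integrated and a VAR model is no longer appropriate (it is unstable).

SVAR (Structural VAR)

In a VAR model, causal interpretation is not explicit and results are sensitive to variable ordering. An SVAR extends VAR by imposing theory-based restrictions on the $A$ and/or $B$ matrices. This can enable causal interpretation and shock analysis without reliance on arbitrary ordering.
For example, a basic SVAR(p) model:
$$A y_t = A[A_1, \ldots, A_p] y_{t-1} + B \varepsilon_t$$
where:
• $u_t = A^{-1} B \varepsilon_t$
• $A$, $B$ are $(K \times K)$ matrices.

VECM (Vector Error Correction Model)

If cointegrating relations are present in a system of variables, the VAR form is not the most convenient. It is better to use a VECM, that is, the levels VAR after subtracting $y_{t-1}$ from both sides. The VECM(p − 1):
$$\Delta y_t = \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + B x_t + C D_t + u_t$$
where:
• $\Delta y_t = (\Delta y_{1t}, \ldots, \Delta y_{Kt})^T$ is the vector of first differences of the $K$ observable endogenous time series.
• $\Pi y_{t-1}$ is the long-term part.
  ⋄ $\Pi = -(I_K - A_1 - \cdots - A_p)$
  ⋄ $\Pi = \alpha \beta^T$
  ⋄ $\alpha$ is the loading matrix ($K \times r$). It represents the speed of adjustment.
  ⋄ $\beta$ is the cointegration matrix ($K \times r$).
  ⋄ $\beta^T y_{t-1}$ is the cointegrating equation. It represents the long-run equilibrium.
  ⋄ $\mathrm{rk}(\Pi) = \mathrm{rk}(\alpha) = \mathrm{rk}(\beta) = r$ is the cointegrating rank.
• $\Gamma_i = -(A_{i+1} + \cdots + A_p)$ for $i = 1, \ldots, p - 1$ are the short-term parameters.
• $x_t$, $B$, $C$, $D_t$ and $u_t$ are as in VAR.
For example, a VECM with three endogenous variables ($K = 3$), two lags ($p = 2$) and two cointegrating relations ($r = 2$):
$$\Delta y_t = \Pi y_{t-1} + \Gamma_1 \Delta y_{t-1} + u_t$$
where:
$$\Pi y_{t-1} = \alpha \beta^T y_{t-1} = \begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \\ \alpha_{31} & \alpha_{32} \end{pmatrix} \begin{pmatrix} \beta_{11} & \beta_{21} & \beta_{31} \\ \beta_{12} & \beta_{22} & \beta_{32} \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \\ y_{3,t-1} \end{pmatrix} = \begin{pmatrix} \alpha_{11} ec_{1,t-1} + \alpha_{12} ec_{2,t-1} \\ \alpha_{21} ec_{1,t-1} + \alpha_{22} ec_{2,t-1} \\ \alpha_{31} ec_{1,t-1} + \alpha_{32} ec_{2,t-1} \end{pmatrix}$$
$$ec_{1,t-1} = \beta_{11} y_{1,t-1} + \beta_{21} y_{2,t-1} + \beta_{31} y_{3,t-1}$$
$$ec_{2,t-1} = \beta_{12} y_{1,t-1} + \beta_{22} y_{2,t-1} + \beta_{32} y_{3,t-1}$$
$$\Gamma_1 \Delta y_{t-1} = \begin{pmatrix} \gamma_{11} & \gamma_{12} & \gamma_{13} \\ \gamma_{21} & \gamma_{22} & \gamma_{23} \\ \gamma_{31} & \gamma_{32} & \gamma_{33} \end{pmatrix} \begin{pmatrix} \Delta y_{1,t-1} \\ \Delta y_{2,t-1} \\ \Delta y_{3,t-1} \end{pmatrix} \qquad u_t = \begin{pmatrix} u_{1t} \\ u_{2t} \\ u_{3t} \end{pmatrix}$$
Visualizing the separate equations:
$$\Delta y_{1t} = \alpha_{11} ec_{1,t-1} + \alpha_{12} ec_{2,t-1} + \gamma_{11} \Delta y_{1,t-1} + \gamma_{12} \Delta y_{2,t-1} + \gamma_{13} \Delta y_{3,t-1} + u_{1t}$$
$$\Delta y_{2t} = \alpha_{21} ec_{1,t-1} + \alpha_{22} ec_{2,t-1} + \gamma_{21} \Delta y_{1,t-1} + \gamma_{22} \Delta y_{2,t-1} + \gamma_{23} \Delta y_{3,t-1} + u_{2t}$$
$$\Delta y_{3t} = \alpha_{31} ec_{1,t-1} + \alpha_{32} ec_{2,t-1} + \gamma_{31} \Delta y_{1,t-1} + \gamma_{32} \Delta y_{2,t-1} + \gamma_{33} \Delta y_{3,t-1} + u_{3t}$$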
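The VAR stability condition and the mapping from the VAR coefficient matrices to the VECM matrices $\Pi$ and $\Gamma_i$ can be sketched as follows. This is a minimal illustration, not a full estimation routine: the function names and the example coefficient matrices are assumptions made for the example. The stability check relies on a standard equivalence: $\det(I_K - A_1 z - \cdots - A_p z^p) \neq 0$ for $|z| \leq 1$ holds exactly when all eigenvalues of the companion matrix have modulus below one.

```python
import numpy as np

def companion_matrix(coefs):
    """Stack A_1..A_p (each K x K) into the Kp x Kp companion matrix."""
    K = coefs[0].shape[0]
    p = len(coefs)
    top = np.hstack(coefs)                        # [A_1 A_2 ... A_p]
    if p == 1:
        return top
    bottom = np.hstack([np.eye(K * (p - 1)),      # identity blocks shift the lags
                        np.zeros((K * (p - 1), K))])
    return np.vstack([top, bottom])

def var_is_stable(coefs):
    """True when every companion eigenvalue lies strictly inside the unit circle."""
    eigenvalues = np.linalg.eigvals(companion_matrix(coefs))
    return bool(np.max(np.abs(eigenvalues)) < 1.0)

def var_to_vecm(coefs):
    """Pi = -(I_K - A_1 - ... - A_p); Gamma_i = -(A_{i+1} + ... + A_p)."""
    K = coefs[0].shape[0]
    p = len(coefs)
    Pi = -(np.eye(K) - np.sum(coefs, axis=0))
    # coefs is 0-indexed, so coefs[i:] holds A_{i+1}, ..., A_p.
    Gammas = [-np.sum(coefs[i:], axis=0) for i in range(1, p)]
    return Pi, Gammas

# Illustrative VAR(2) with K = 2 (coefficient values are assumptions).
A1 = np.array([[0.5, 0.1],
               [0.4, 0.5]])
A2 = np.array([[0.0, 0.0],
               [0.25, 0.0]])

stable = var_is_stable([A1, A2])
Pi, Gammas = var_to_vecm([A1, A2])
random_walk_stable = var_is_stable([np.eye(2)])   # unit roots: not stable

print("stable:", stable)
print("Pi:\n", Pi)
print("Gamma_1:\n", Gammas[0])
print("random walk stable:", random_walk_stable)
```

A bivariate random walk ($A_1 = I_2$) has eigenvalues equal to one, so it fails the check, which matches the unit-root case where the VECM form is the more convenient one.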
