Econometrics Cheat Sheet - Assumptions and properties - Ordinary Least Squares
By Marcelo Moreno - Universidad Rey Juan Carlos
The Econometrics Cheat Sheet Project

Basic concepts

Definitions
Econometrics - a social science discipline whose objective is to quantify the relationships between economic agents, test economic theories, and evaluate and implement government and business policies.
Econometric model - a simplified representation of reality used to explain economic phenomena.
Ceteris paribus - all other relevant factors remain constant.

Data types
Cross section - data taken at a given moment in time, a static photo. Order doesn't matter.
Time series - observations of variables across time. Order does matter.
Panel data - a time series for each observation of a cross section.
Pooled cross sections - combines cross sections from different time periods.

Phases of an econometric model
1. Specification. 2. Estimation. 3. Validation. 4. Utilization.

Regression analysis
Study and predict the mean value of a variable (the dependent variable, y) on the basis of fixed values of other variables (the independent variables, x's). In econometrics it is common to use Ordinary Least Squares (OLS) for regression analysis.

Correlation analysis
Correlation analysis does not distinguish between dependent and independent variables.
• Simple correlation measures the degree of linear association between two variables:
  r = Cov(x, y) / (σx · σy) = Σ(xi − x̄)(yi − ȳ) / sqrt( Σ(xi − x̄)² · Σ(yi − ȳ)² )
• Partial correlation measures the degree of linear association between two variables while controlling for a third.

Econometric model assumptions
Under these assumptions, the OLS estimator presents good properties. Gauss-Markov assumptions:
1. Parameters linearity (and weak dependence in time series). y must be a linear function of the β's.
2. Random sampling. The sample has been randomly taken from the population. (Only for cross sections.)
3. No perfect collinearity.
   • No independent variable is constant: Var(xj) ≠ 0, ∀j = 1, …, k
   • There is no exact linear relation between independent variables.
4. Conditional mean zero and correlation zero.
   a. There are no systematic errors: E(u | x1, …, xk) = E(u) = 0 → strong exogeneity (a implies b).
   b. There are no relevant variables left out of the model: Cov(xj, u) = 0, ∀j = 1, …, k → weak exogeneity.
5. Homoscedasticity. The variability of the residuals is the same for all levels of x: Var(u | x1, …, xk) = σu²
6. No autocorrelation. Residuals don't contain information about any other residuals: Corr(ut, us | x1, …, xk) = 0, ∀t ≠ s
7. Normality. Residuals are independent and identically distributed: u ∼ N(0, σu²)
8. Data size. The number of observations available must be greater than the (k + 1) parameters to estimate. (Already satisfied in asymptotic situations.)

Asymptotic properties of OLS
Under the econometric model assumptions and the Central Limit Theorem (CLT):
• Hold 1 to 4a: OLS is unbiased. E(β̂j) = βj
• Hold 1 to 4: OLS is consistent. plim(β̂j) = βj (holding only 4b, weak exogeneity, with 4a left out: biased but consistent)
• Hold 1 to 5: asymptotic normality of OLS (then 7 is necessarily satisfied): u is asymptotically N(0, σu²)
• Hold 1 to 6: unbiased estimate of σu². E(σ̂u²) = σu²
• Hold 1 to 6: OLS is BLUE (Best Linear Unbiased Estimator) or efficient.
• Hold 1 to 7: hypothesis testing and confidence intervals can be done reliably.

Ordinary Least Squares
Objective - minimize the Sum of Squared Residuals (SSR): min Σ ûi², where ûi = yi − ŷi

Simple regression model
Equation:   yi = β0 + β1 xi + ui
Estimation: ŷi = β̂0 + β̂1 xi
where:
  β̂0 = ȳ − β̂1 x̄
  β̂1 = Cov(y, x) / Var(x)

Multiple regression model
Equation:   yi = β0 + β1 x1i + ⋯ + βk xki + ui
Estimation: ŷi = β̂0 + β̂1 x1i + ⋯ + β̂k xki
where:
  β̂0 = ȳ − β̂1 x̄1 − ⋯ − β̂k x̄k
  β̂j = Cov(y, resid xj) / Var(resid xj)
Matrix form: β̂ = (XᵀX)⁻¹(Xᵀy)

Interpretation of coefficients
Model        Dependent  Independent  β1 interpretation
Level-level  y          x            Δy = β1 Δx
Level-log    y          log(x)       Δy ≈ (β1 / 100)(%Δx)
Log-level    log(y)     x            %Δy ≈ (100 β1) Δx
Log-log      log(y)     log(x)       %Δy ≈ β1 (%Δx)
Quadratic    y          x + x²       Δy = (β1 + 2 β2 x) Δx

Error measurements
Sum of Squared Residuals: SSR = Σ ûi² = Σ(yi − ŷi)²
Explained Sum of Squares: SSE = Σ(ŷi − ȳ)²
Total Sum of Squares:     SST = SSE + SSR = Σ(yi − ȳ)²
Standard Error of the Regression: σ̂u = sqrt( SSR / (n − k − 1) )
Standard Error of the β̂'s: se(β̂) = sqrt( σ̂u² · (XᵀX)⁻¹ )
Root Mean Squared Error:  RMSE = sqrt( Σ(yi − ŷi)² / n )
Absolute Mean Error:      AME = Σ|yi − ŷi| / n
Mean Percentage Error:    MPE = ( Σ|ûi / yi| / n ) · 100
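As a minimal numeric sketch of the matrix formula β̂ = (XᵀX)⁻¹(Xᵀy): the snippet below (NumPy, simulated data with known coefficients; all names are illustrative) recovers the β's of a two-regressor model.

```python
import numpy as np

# Simulated data from a known model: y = 1.0 + 2.0*x1 - 0.5*x2 + u
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)

# Design matrix X with a leading column of ones for the intercept
X = np.column_stack([np.ones(n), x1, x2])

# OLS estimator in matrix form: beta_hat = (X'X)^(-1) X'y
# (np.linalg.solve is numerically safer than forming the inverse)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to [1.0, 2.0, -0.5]
```

np.linalg.lstsq(X, y, rcond=None) returns the same coefficients and is the preferred routine when X is ill-conditioned.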
CS-25.01-EN - github.com/marcelomijas/econometrics-cheatsheet - CC-BY-4.0 license
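The error measurements listed above can be computed directly from the residuals. A short NumPy sketch on simulated data (the model and its coefficients are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 2                                   # n observations, k regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([0.5, 1.5, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat                        # residuals

SSR = np.sum(u_hat**2)                          # sum of squared residuals
SSE = np.sum((X @ beta_hat - y.mean())**2)      # explained sum of squares
SST = np.sum((y - y.mean())**2)                 # total: SST = SSE + SSR
sigma_u = np.sqrt(SSR / (n - k - 1))            # standard error of the regression
se_beta = np.sqrt(np.diag(sigma_u**2 * np.linalg.inv(X.T @ X)))
RMSE = np.sqrt(SSR / n)
R2 = SSE / SST                                  # goodness of fit
```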
R-squared
A measure of the goodness of fit, i.e. how well the regression fits the data:
  R² = SSE / SST = 1 − SSR / SST
• Measures the percentage of the variation of y that is linearly explained by the variations of the x's.
• Takes values between 0 (no linear explanation) and 1 (total explanation).
When the number of regressors increases, the R-squared also increases, whether or not the new variables are relevant. To solve this problem, there is an adjusted (or corrected) R-squared, by degrees of freedom:
  R̄² = 1 − (n − 1)/(n − k − 1) · SSR/SST = 1 − (n − 1)/(n − k − 1) · (1 − R²)
For big sample sizes: R̄² ≈ R²

Hypothesis testing

Definitions
A hypothesis test is a rule designed to decide, from a sample, whether or not there is evidence to reject a hypothesis made about one or more population parameters.
Elements of a hypothesis test:
• Null hypothesis (H0) - the hypothesis to be tested.
• Alternative hypothesis (H1) - the hypothesis that cannot be rejected when H0 is rejected.
• Test statistic - a random variable whose probability distribution is known under H0.
• Critical value (C) - the value against which the test statistic is compared to determine whether H0 is rejected or not. It sets the frontier between the acceptance and rejection regions of H0.
• Significance level (α) - the probability of rejecting the null hypothesis when it is true (Type I Error). It is chosen by whoever conducts the test; commonly 10%, 5% or 1%.
• p-value - the highest significance level at which H0 cannot be rejected.
[Figure: H0 distributions for a two-tailed test (acceptance region between −C and C, α/2 in each tail) and a one-tailed test (acceptance region up to C, α in one tail).]
The rule is: if p-value < α, there is evidence to reject H0, and thus evidence to accept H1.

Individual tests
Test whether a parameter is significantly different from a given value, ϑ.
• H0: βj = ϑ
• H1: βj ≠ ϑ
Under H0: t = (β̂j − ϑ) / se(β̂j) ∼ t(n−k−1)
If |t| > |t(n−k−1, α/2)|, there is evidence to reject H0.
Individual significance test - tests whether a parameter is significantly different from zero.
• H0: βj = 0
• H1: βj ≠ 0
Under H0: t = β̂j / se(β̂j) ∼ t(n−k−1)
If |t| > |t(n−k−1, α/2)|, there is evidence to reject H0.

The F test
Simultaneously tests multiple (linear) hypotheses about the parameters. It makes use of a non-restricted model and a restricted model:
• Non-restricted model - the model on which we want to test the hypothesis.
• Restricted model - the model on which the hypothesis we want to test has been imposed.
Then, looking at the errors, there are:
• SSR_UR - the SSR of the non-restricted model.
• SSR_R - the SSR of the restricted model.
Under H0: F = (SSR_R − SSR_UR) / SSR_UR · (n − k − 1) / q ∼ F(q, n−k−1)
where k is the number of parameters of the non-restricted model and q is the number of linear hypotheses tested.
If F > F(q, n−k−1), there is evidence to reject H0.
Global significance test - tests whether all the parameters associated with the x's are simultaneously equal to zero.
• H0: β1 = β2 = ⋯ = βk = 0
• H1: β1 ≠ 0 and/or β2 ≠ 0 … and/or βk ≠ 0
In this case, the formula for the F statistic simplifies:
Under H0: F = R² / (1 − R²) · (n − k − 1) / k ∼ F(k, n−k−1)
If F > F(k, n−k−1), there is evidence to reject H0.

Confidence intervals
The confidence interval at the (1 − α) confidence level can be calculated:
  β̂j ∓ t(n−k−1, α/2) · se(β̂j)

Dummy variables
Dummy (or binary) variables are used for qualitative information like sex, civil status, country, etc.
• A dummy takes the value 1 in a given category and 0 in the rest.
• Dummies are used to analyze and model structural changes in the model parameters.
If a qualitative variable has m categories, only (m − 1) dummy variables need to be included.

Structural change
Structural change refers to changes in the values of the parameters of the econometric model produced by the effect of different sub-populations. Structural change can be included in the model through dummy variables.
The location of the dummy variable (D) matters:
• On the intercept (additive effect) - represents the mean difference between the values produced by the structural change:
    y = β0 + δ1 D + β1 x1 + u
• On the slope (multiplicative effect) - represents the effect (slope) difference between the values produced by the structural change:
    y = β0 + β1 x1 + δ1 D · x1 + u
Chow's structural test - analyzes the existence of structural changes in all the model parameters. It is a particular expression of the F test, where H0: No structural change (all δ = 0).

Changes of scale
Changes in the measurement units of the variables:
• In the endogenous variable, y* = y · λ - affects all model parameters: βj* = βj · λ, ∀j = 0, 1, …, k
• In an exogenous variable, xj* = xj · λ - only affects the parameter linked to that exogenous variable: βj* = βj / λ
• The same scale change on endogenous and exogenous - only affects the intercept: β0* = β0 · λ

Changes of origin
Changes in the measurement origin of the variables (endogenous or exogenous) only affect the model's intercept; for y* = y + λ, the new intercept is β0* = β0 + λ
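A sketch of the individual significance test and confidence interval using SciPy's Student's t distribution; the estimate, standard error and degrees of freedom below are hypothetical numbers, not taken from any real regression:

```python
from scipy import stats

beta_j, se_j = 0.8, 0.3      # hypothetical estimate and its standard error
df, alpha = 30, 0.05         # hypothetical n - k - 1 degrees of freedom

t_stat = beta_j / se_j                    # H0: beta_j = 0
t_crit = stats.t.ppf(1 - alpha / 2, df)   # two-tailed critical value
reject = abs(t_stat) > t_crit             # evidence against H0?

# p-value: highest significance level at which H0 cannot be rejected
p_value = 2 * stats.t.sf(abs(t_stat), df)

# (1 - alpha) confidence interval: beta_j -+ t_crit * se(beta_j)
ci = (beta_j - t_crit * se_j, beta_j + t_crit * se_j)
```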
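The simplified F statistic of the global significance test can be sketched the same way; R², n and k below are hypothetical values:

```python
from scipy import stats

R2, n, k = 0.4, 50, 3        # hypothetical R-squared, sample size, regressors
alpha = 0.05

# F = R^2/(1-R^2) * (n-k-1)/k ~ F(k, n-k-1) under H0: beta_1 = ... = beta_k = 0
F = (R2 / (1 - R2)) * ((n - k - 1) / k)
F_crit = stats.f.ppf(1 - alpha, k, n - k - 1)
reject = F > F_crit          # are the regressors jointly significant?
```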
Multicollinearity
• Perfect multicollinearity - there are independent variables that are constant and/or there is an exact linear relation between independent variables. This breaks the third (3) econometric model assumption.
• Approximate multicollinearity - there are independent variables that are approximately constant and/or there is an approximately linear relation between independent variables. It does not break any econometric model assumption, but it has an effect on OLS.

Consequences
• Perfect multicollinearity - the OLS equation system cannot be solved due to infinite solutions.
• Approximate multicollinearity:
  – Small sample variations can induce big variations in the OLS estimations.
  – The variance of the OLS estimators of the collinear x's increases, so the inference of the parameter is affected: the estimation of the parameter is very imprecise (big confidence interval).

Detection
• Correlation analysis - look for high correlations between independent variables, |r| > 0.7.
• Variance Inflation Factor (VIF) - indicates the increment of Var(β̂j) because of the multicollinearity:
    VIF(β̂j) = 1 / (1 − Rj²)
  where Rj² denotes the R-squared from a regression between xj and all the other x's.
  – Values between 4 and 10 - there might be multicollinearity problems.
  – Values > 10 - there are multicollinearity problems.
One typical characteristic of multicollinearity is that the regression coefficients of the model aren't individually different from zero (due to high variances), but jointly they are different from zero.

Correction
• Delete one of the collinear variables.
• Perform factorial analysis (or any other dimension-reduction technique) on the collinear variables.
• Interpret coefficients with multicollinearity jointly.

Heteroscedasticity
The residuals ui of the population regression function do not have the same variance σu²:
  Var(u | x1, …, xk) = Var(u) ≠ σu²
This breaks the fifth (5) econometric model assumption.

Consequences
• OLS estimators are still unbiased.
• OLS estimators are still consistent.
• OLS is no longer efficient, but it is still a LUE (Linear Unbiased Estimator).
• Variance estimations of the estimators are biased: the construction of confidence intervals and hypothesis testing is not reliable.

Detection
• Graphs - look for scatter patterns on x vs. u or x vs. y plots.
  [Figure: example scatter plots of x vs. u and x vs. y.]
• Formal tests - White, Bartlett, Breusch-Pagan, etc. Commonly, H0: No heteroscedasticity.

Correction
• Use OLS with a variance-covariance matrix estimator robust to heteroscedasticity (HC), for example the one proposed by White.
• If the variance structure is known, make use of Weighted Least Squares (WLS) or Generalized Least Squares (GLS):
  – Supposing that Var(u) = σu² · xi, divide the model variables by the square root of xi and apply OLS.
  – Supposing that Var(u) = σu² · xi², divide the model variables by xi (the square root of xi²) and apply OLS.
• If the variance structure is not known, make use of Feasible Weighted Least Squares (FWLS), which estimates a possible variance, divides the model variables by it and then applies OLS.
• Make a new model specification, for example a logarithmic transformation (lower variance).

Autocorrelation
The residual of any observation, ut, is correlated with the residual of any other observation. The observations are not independent:
  Corr(ut, us | x1, …, xk) = Corr(ut, us) ≠ 0, ∀t ≠ s
The "natural" context of this phenomenon is time series. This breaks the sixth (6) econometric model assumption.

Consequences
• OLS estimators are still unbiased.
• OLS estimators are still consistent.
• OLS is no longer efficient, but it is still a LUE (Linear Unbiased Estimator).
• Variance estimations of the estimators are biased: the construction of confidence intervals and hypothesis testing is not reliable.

Detection
• Graphs - look for scatter patterns on ut−1 vs. ut, or make use of a correlogram.
  [Figure: scatter plots of ut−1 vs. ut under no autocorrelation (Ac.), positive autocorrelation (Ac. +) and negative autocorrelation (Ac. −).]
• Formal tests - Durbin-Watson, Breusch-Godfrey, etc. Commonly, H0: No autocorrelation.

Correction
• Use OLS with a variance-covariance matrix estimator robust to heteroscedasticity and autocorrelation (HAC), for example the one proposed by Newey-West.
• Use Generalized Least Squares. Supposing yt = β0 + β1 xt + ut, with ut = ρ ut−1 + εt, where |ρ| < 1 and εt is white noise:
  – If ρ is known, create a quasi-differentiated model where ut is white noise and estimate it by OLS.
  – If ρ is not known, estimate it by (for example) the Cochrane-Orcutt method, then create a quasi-differentiated model where ut is white noise and estimate it by OLS.
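The VIF detection rule can be sketched by regressing one regressor on the others; the data below are simulated, with x2 deliberately built to be nearly collinear with x1 (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + rng.normal(scale=0.3, size=n)  # nearly collinear with x1

def vif(xj, others):
    """VIF(beta_j) = 1 / (1 - Rj^2), where Rj^2 comes from
    regressing xj on an intercept and the other regressors."""
    X = np.column_stack([np.ones(len(xj))] + list(others))
    b = np.linalg.solve(X.T @ X, X.T @ xj)
    resid = xj - X @ b
    rj2 = 1 - resid @ resid / np.sum((xj - xj.mean())**2)
    return 1 / (1 - rj2)

print(vif(x2, [x1]))  # well above the warning threshold of 4
```

For an independent regressor, Rj² is near 0 and the VIF is near its minimum of 1.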