EMF CheatSheet V4
The standard error of the regression (SER) is just the standard deviation of the residuals. R2 measures the fit relative to the variance of the dependent variable, while the SER measures the fit in absolute terms. When comparing regressions with different dependent variables, the rankings (1st best, 2nd, ...) would only necessarily be the same under both R2 and SER if all the dependent variables had the same variance.
Decomposition of the total sum of squares: TSS = ESS + RSS  ⇔  1 = ESS/TSS + RSS/TSS, with R² = ESS/TSS  ⇔  R² = 1 − RSS/TSS
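A minimal numpy sketch of this decomposition and of the SER, on simulated data (the numbers and variable names are illustrative assumptions, not course data):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
x = rng.normal(size=T)
y = 1.0 + 0.5 * x + rng.normal(scale=2.0, size=T)   # simulated data

# OLS fit of y on a constant and x
X = np.column_stack([np.ones(T), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

TSS = np.sum((y - y.mean()) ** 2)    # total sum of squares
RSS = np.sum(resid ** 2)             # residual sum of squares
ESS = TSS - RSS                      # explained sum of squares
k = X.shape[1]                       # number of estimated parameters

R2 = 1 - RSS / TSS                   # R2 = ESS/TSS = 1 - RSS/TSS
SER = np.sqrt(RSS / (T - k))         # standard error of the regression
print(R2, SER)
```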
R2 – Coefficient of determination
- R2 cannot be compared when the dependent variables are different (e.g. Y vs. logY).
- R2 increases with the number of variables, even if the additional variables are not statistically significant; a bigger R2 therefore does not mean a better model, compute the Adjusted R2 instead (see formula).
- Only compare an LPM's R2 with an LPM's R2, a Logit's R2 with a Logit's R2, a Probit's R2 with a Probit's R2.
- For logit and probit use McFadden's pseudo-R2: R2 = 1 − lnL1/lnL0 (log-likelihood of the fitted model over the log-likelihood of the constant-only model).

Properties of the OLS estimators (BLUE):
- Unbiased: on average the estimators are the true values, E(β̂) = β.
- Linear: the estimators are linear combinations of random variables.
- Efficient: the variance of the coefficients is minimized.
- Consistent: for an infinite number of observations, the estimators converge to their true values.

GAUSS-MARKOV Assumptions / Implications / Tests / Solutions
- They need to be fulfilled, otherwise the model (regression) might not be adequate/efficient!

I. Linearity: errors have zero mean, E(u_t) = 0.

II. Exogeneity: the independent variables and the error term are uncorrelated.
Endogeneity: some of the explanatory variables are correlated with the equation's error term.
IMPLICATIONS: the OLS estimator is biased and inconsistent.
TEST: Durbin-Wu-Hausman test.
SOLUTION: Instrumental Variables (IV) when #instruments = #endogenous variables (exactly identified system); or Two-Stage Least Squares (2SLS) when #instruments > #endogenous variables (overidentified system).

III. Homoskedasticity: the variance of the errors is constant and finite.
Heteroskedasticity: the variance of the errors is not constant across observations.
IMPLICATIONS: the estimators are still unbiased and consistent, but no longer efficient (not minimum variance); hence the t- and F-tests are no longer reliable, so inferences about statistical significance may be erroneous and the standard errors wrong.
TEST: White's test.
SOLUTION: Huber-White correction, "robust estimation" / "adjust: robust standard errors"; robust variance of the slope in a simple regression: Var(β̂1) = Σ(xi − x̄)²·ûi² / [Σ(xi − x̄)²]².

IV. No autocorrelation: no pattern in the errors (residuals).
Autocorrelation: a pattern in the residuals (assumes a relationship between an error and the previous one).
IMPLICATIONS: the estimators are no longer BLUE. Static model (y_t = a·x1_t + b·x2_t): inefficient estimators. Dynamic model (y_t = a·x1_t + b·x2_{t−1}): inconsistent estimators (have to change the model).
TEST: Durbin-Watson (DW-stat bounds) / Breusch-Godfrey test (LM).
SOLUTION: Newey-West HAC robust standard errors (static models).

Multicollinearity: the explanatory variables are very highly correlated with each other (hints: high R2 and high SEs).
TEST: Variance Inflation Factors.
SOLUTION: drop collinear variables / regress one on another and take the residuals / transform into a ratio.

Q: Why could the researcher use OLS if u and v are independent? A: If u and v are independent, then cov(u, v) = 0, which implies that cov(u, v) = E(uv) − E(u)·E(v) = 0. And since the expected value of the error term is always equal to zero when using OLS and including an intercept in the regression, we may conclude that cov(u, v) = E(uv) = 0.

Tests for significance ("coefficient, constant = 0")
T-test: inference about the statistical significance of an individual regression coefficient (t-value).
H0: βi = 0, H1: βi ≠ 0; if |t| > critical value (mostly 1.96), reject H0: the coefficient is significant.
t-value = coefficient estimate / standard error, i.e. t = (β̂ − β0)/SE(β̂) with β0 = 0 under H0.
A Type I error is rejecting the null when it is true; a Type II error is accepting the null when it is false. The probability of a Type I error is the significance level (e.g. 5%).

Indication of the p-value:
- Low value (< 0.05): strong evidence against H0, reject: the coefficient is significant.
- High value (> 0.05-0.1): little evidence against H0: not significant.

Confidence-interval significance:
H0: constant or βi = 0, H1: constant or βi ≠ 0; t = (β̂ − β0)/SE(β̂).
Confidence interval: [β̂ − Z_{1−α/2}·SE(β̂); β̂ + Z_{1−α/2}·SE(β̂)].
If 0 ∉ CI (not part of the CI), then reject H0.

F-test: joint, overall significance test of the unrestricted regression (test of the validity of the restrictions).
H0: β1 = ... = βk = 0; H1: β1 ≠ 0 ∨ ... ∨ βk ≠ 0 (at least one is non-zero). A high F-value means overall significance.
F = [(RSS_R − RSS_UR)/r] / [RSS_UR/(T − k)] ∼ F(r, T − k); compare with the critical value.
- Unrestricted: the model contains all variables exactly as in the overall regression.
- Restricted: all regressors whose coefficients have been set to "0" are excluded.
- k = #parameters of the unrestricted model (including the constant); T = #observations (sample size).
- Degrees of freedom: v1 = r = number of restrictions; v2 = T − k.
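A hedged statsmodels sketch of the t-test, confidence interval and overall F-test described above (data simulated, variable names illustrative):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)
T = 120
X = sm.add_constant(rng.normal(size=(T, 2)))            # constant + 2 regressors
y = X @ np.array([0.2, 0.8, 0.0]) + rng.normal(size=T)  # simulated dependent variable

res = sm.OLS(y, X).fit()

# t-test for each coefficient: t = estimate / standard error
t_values = res.params / res.bse                  # same as res.tvalues
crit = stats.t.ppf(0.975, df=res.df_resid)       # two-sided 5% critical value (~1.96 for large T)
print(np.abs(t_values) > crit)                   # True -> reject H0: beta_i = 0

# 95% confidence interval: beta_hat +/- crit * SE; reject H0 if 0 lies outside
print(res.conf_int(alpha=0.05))

# F-test of overall significance: H0: all slope coefficients = 0
print(res.fvalue, res.f_pvalue)
```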
Significance Tests
- H0: βi = 0 vs. H1: βi < 0 → negative correlation between x1 and y
- H0: βi = 0 vs. H1: βi > 0 → positive correlation between x1 and y
- H0: βi = 1 vs. H1: βi ≠ 1 → one-to-one relationship
- H0: βi = βj vs. H1: βi < βj → difference in impact on y
Coefficient Interpretation:
- Level-Level: if x (independent variable) varies by 1 unit, y (dependent variable) varies by β1 units on average, ceteris paribus.
- Constant: if all regressors are zero (x1 = x2 = x3 = 0), then y equals the constant on average (e.g. the excess return).
- Level-Log: if x varies by 1%, y varies by β1/100 units on average, ceteris paribus.
- Log-Level: if x varies by 1 unit, y varies by 100·β1% on average, ceteris paribus.
- Log-Log: if x varies by 1%, y varies by β1% on average, ceteris paribus (elasticity).
- Logit/Probit: if xk increases by 1 unit and βk > 0, the probability P(y = 1) increases (decreases if βk < 0); the size of the change is the marginal effect βk·f(xi′β), not βk itself.
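A tiny illustrative sketch of these interpretation rules (the coefficient value 0.05 is made up):

```python
b1 = 0.05  # hypothetical estimated slope

print(f"level-level: +1 unit in x -> {b1} units change in y")
print(f"level-log  : +1% in x     -> {b1 / 100} units change in y")
print(f"log-level  : +1 unit in x -> {100 * b1}% change in y (approx.)")
print(f"log-log    : +1% in x     -> {b1}% change in y (elasticity)")
```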
Detailed tests for the OLS problems (heteroskedasticity / autocorrelation / endogeneity):

Heteroskedasticity – WHITE'S TEST (tests whether the variance of the error term is unequal across observations)
How to detect: run an auxiliary regression with the squared residuals (RESID²) from the original regression as the dependent variable; the regressors, their squares and all cross-products are added as regressors (table: E_MKT / E_MKT^2 / E_MKT*HML / etc.).
1) Original model: y_t = α + β1·x1t + β2·x2t + u_t
2) Auxiliary model, unrestricted: û_t² = γ0 + γ1·x1t + γ2·x2t + γ3·x1t² + γ4·x2t² + γ5·x1t·x2t + v_t; restricted: û_t² = γ0 + v_t
H0: γ1 = ... = γ5 = 0 | homoskedasticity (the γ's are the auxiliary coefficients without the constant)
H1: γ1 ≠ 0 ∨ ... ∨ γ5 ≠ 0 | heteroskedasticity
LM = T·R² ∼ χ²(K), where K = #regressors of the auxiliary model (excluding the constant γ0), T = #observations and R² = unadjusted R² of the auxiliary (RESID²) model; if the LM value > χ² critical value, reject H0.
Rejecting H0 = evidence for heteroskedasticity → adjust with robust standard errors (Huber-White).
The RESID² regression is White's test; the closest alternative test is the F-test of overall significance provided in the auxiliary regression output (look at the p-value of the overall significance). The hypotheses above are just the ones associated with the test (only needed if you are asked to write them).
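A sketch of White's test, both by hand (auxiliary RESID² regression, LM = T·R²) and via statsmodels' het_white; the data and the heteroskedasticity pattern are simulated assumptions:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white
from scipy import stats

rng = np.random.default_rng(2)
T = 300
x1, x2 = rng.normal(size=T), rng.normal(size=T)
u = rng.normal(size=T) * (1 + 0.8 * np.abs(x1))   # heteroskedastic errors (simulated)
y = 1 + 0.5 * x1 - 0.3 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
res = sm.OLS(y, X).fit()
uhat2 = res.resid ** 2

# Auxiliary regression: squared residuals on levels, squares and the cross-product
Z = sm.add_constant(np.column_stack([x1, x2, x1**2, x2**2, x1 * x2]))
aux = sm.OLS(uhat2, Z).fit()
LM = T * aux.rsquared                      # LM = T * R2 of the auxiliary model
K = Z.shape[1] - 1                         # auxiliary regressors excluding the constant
print(LM, stats.chi2.ppf(0.95, K))         # reject homoskedasticity if LM > critical value

# Same test via statsmodels (returns LM stat, LM p-value, F stat, F p-value)
print(het_white(res.resid, X))

# Huber-White robust standard errors as the fix
print(sm.OLS(y, X).fit(cov_type="HC1").bse)
```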
Autocorrelation – DURBIN-WATSON (tests 1st-order autocorrelation; based on the regression residuals)
It assumes that the relationship is between an error and the previous one: u_t = ρ·u_{t−1} + v_t.
DW statistic: DW = Σ_{t=2..T} (û_t − û_{t−1})² / Σ_{t=1..T} û_t²
The test actually tests H0: ρ = 0 against H1: ρ ≠ 0. Look up the bounds in the statistical tables for n and k ((!) k not including the constant).
Limitations of the test: 1) it only tests 1st-order autocorrelation; 2) it is not valid in dynamic models or under endogeneity.
Closest alternative test: BREUSCH-GODFREY (autocorrelation up to rth order, e.g. 4th). Lagged residuals are added to the regression (table: [lag1resid] / [resid-1]):
Unrestricted: û_t = γ0 + γ1·x1t + γ2·x2t + ... + γk·xkt + δ1·û_{t−1} + δ2·û_{t−2} + ... + δr·û_{t−r} + v_t
Restricted: û_t = γ0 + v_t
H0: δ1 = ... = δr = 0 → no autocorrelation; H1: δ1 ≠ 0 ∪ ... ∪ δr ≠ 0 → autocorrelation
LM = T·R², compared with the χ²(r) critical value (r = number of lags); if LM > CV, reject H0 (problem of serial autocorrelation).
Choice of the number of lagged error terms: monthly 11, quarterly 3, annually 1.
Second-order serial correlation is a relation between the residuals of the form u_t = ρ1·u_{t−1} + ρ2·u_{t−2} + e_t, which is a problem because it indicates the model is likely to be misspecified. If the p-value < 0.05, there is serial correlation.
Autocorrelation in a dynamic model causes endogeneity.
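A sketch of the Durbin-Watson statistic and the Breusch-Godfrey LM test on simulated AR(1) errors (names and lag choice are assumptions):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(3)
T = 250
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):                 # AR(1) errors: u_t = 0.6 u_{t-1} + v_t
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1 + 0.5 * x + u

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# DW = sum (u_t - u_{t-1})^2 / sum u_t^2 ; values near 2 mean no 1st-order autocorrelation
print(durbin_watson(res.resid))

# Breusch-Godfrey LM test with r lagged residuals (here r = 4)
lm, lm_pval, fval, f_pval = acorr_breusch_godfrey(res, nlags=4)
print(lm, lm_pval)                    # reject H0 (no autocorrelation) if the p-value is small

# Newey-West HAC standard errors as the fix for a static model
print(sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4}).bse)
```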
Endogeneity – DURBIN-WU-HAUSMAN TEST
When a regressor is exogenous it is not correlated with the disturbances, i.e. E(x_t·u_t) = 0; under endogeneity cov(x_t, u_t) ≠ 0, so do not use OLS (biased and inconsistent estimators). The numbers of instruments and of endogenous variables are given in the exam.
Steps:
1) Estimate the original regression by OLS and save the residuals v̂_t, e.g. Y_t = β0 + β1·x1t + β2·x2t + β3·x3t + v_t.
2) Estimate the regression of the potentially endogenous variable on the instruments by OLS and save the residuals û_t, e.g. x1t = α0 + γ1·z1t + γ2·z2t + γ3·z3t + u_t.
3) Regress the residuals from step 1 by OLS: v̂_t = δ0 + δ1·x1t + δ2·x2t + δ3·x3t + δ4·û_t + ε_t.
4) Test whether the coefficient δ4 from step 3 is significant: H0: δ4 = 0 | no endogeneity; H1: δ4 ≠ 0 | endogeneity. Alternatively LM = T·R², compared with χ²(#endogenous variables) (here just 1), with R² taken from the step-3 regression; if LM > χ² critical value, reject H0.
Advantage of using an IV estimator: it is always consistent, i.e. both in the presence and in the absence of endogeneity. Therefore, if the researcher is not sure whether the error terms are independent, an IV estimator would still provide valid estimates of beta even if the variable (e.g. Skill) were endogenous.
Disadvantage of using IV: efficiency. Whether in the presence of endogeneity or not, the OLS estimator is always at least as efficient as the IV one. Thus, if the researcher believes that the error terms are independent, OLS estimators should be used.
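A by-hand sketch following the steps listed above, on simulated data (all names are made up; note that the first stage here also includes the exogenous regressor x2 next to the instrument, a common practice that the cheat sheet does not spell out):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
T = 500
z = rng.normal(size=T)                        # instrument (assumed exogenous)
e = rng.normal(size=T)
x1 = 0.8 * z + 0.7 * e + rng.normal(size=T)   # endogenous regressor: correlated with the error
x2 = rng.normal(size=T)
y = 1 + 0.5 * x1 + 0.3 * x2 + e

# 1) original regression by OLS, save residuals v_hat
X = sm.add_constant(np.column_stack([x1, x2]))
v_hat = sm.OLS(y, X).fit().resid

# 2) regress the potentially endogenous variable on the instrument (and x2), save residuals u_hat
Z = sm.add_constant(np.column_stack([z, x2]))
u_hat = sm.OLS(x1, Z).fit().resid

# 3) regress v_hat on the original regressors plus u_hat
W = sm.add_constant(np.column_stack([x1, x2, u_hat]))
step3 = sm.OLS(v_hat, W).fit()

# 4) H0: coefficient on u_hat = 0 (no endogeneity); reject if it is significant
print(step3.tvalues[-1], step3.pvalues[-1])
```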
Instrumental Variables (to use when endogeneity has been detected and the system is exactly identified)
z_t = instrumental variable (IV): it has to be (1) strongly correlated with the endogenous variable and (2) uncorrelated with the error term.
If z_t were added to the regression directly, the regression would be changed; instead the instrument is used in the IV estimator (obtained from the first-order conditions).
Exactly identified system: #instruments = #endogenous variables. With an exactly identified system it is possible to use either the IV estimator or the Two-Stage Least Squares approach (2SLS).
IV estimator (simple regression): β̂_IV = Σ(y_t − ȳ)(z_t − z̄) / Σ(x_t − x̄)(z_t − z̄)
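A numpy sketch of the simple IV estimator formula above (one regressor, one instrument; data simulated):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 1000
z = rng.normal(size=T)                       # instrument: correlated with x, not with the error
e = rng.normal(size=T)
x = 1.0 * z + 0.8 * e + rng.normal(size=T)   # endogenous regressor
y = 2.0 + 0.5 * x + e

# beta_IV = sum (y - ybar)(z - zbar) / sum (x - xbar)(z - zbar)
beta_iv = np.sum((y - y.mean()) * (z - z.mean())) / np.sum((x - x.mean()) * (z - z.mean()))
beta_ols = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
print(beta_iv, beta_ols)   # IV is close to the true 0.5; OLS is biased upwards here
```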
Two-Stage Least Squares, 2SLS (to use when endogeneity has been detected and the system is overidentified: #instruments > #endogenous variables)
1) Estimate the regression of the potentially endogenous variable on the instruments by OLS and save the fitted values, e.g. x1t = α0 + γ1·z1t + γ2·z2t + γ3·z3t + u_t.
   Note: the instruments are/must be 1) strongly correlated with the endogenous variable, 2) uncorrelated with the error term, 3) not necessary for (i.e. excluded from) the original model.
2) Estimate the regression of the original model BUT with the fitted values replacing the endogenous variable: Y_t = ϑ0 + ϑ1·x̂1t + ϑ2·x2t + ϑ3·x3t + v_t.
   Notes: 1) the coefficients are denoted differently because they differ from the first regression; 2) running 2SLS on a model without endogeneity makes it less efficient.
Output panels 1 and 2 correspond to the 2SLS approach: in the 1st panel the endogenous variable is regressed on the set of IVs, while in the 2nd stage the dependent variable is regressed on the fitted values from the 1st stage.
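A two-step statsmodels sketch of 2SLS as described above, on simulated data with two instruments (overidentified). Caveat: running the two OLS stages by hand gives the 2SLS coefficients, but the second-stage standard errors printed here are not the correct 2SLS ones; a dedicated routine (e.g. IV2SLS in the linearmodels package) would fix that. The exogenous regressor x2 is also included in the first stage, a standard practice assumed here:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
T = 800
z1, z2 = rng.normal(size=T), rng.normal(size=T)            # two instruments -> overidentified
e = rng.normal(size=T)
x1 = 0.6 * z1 + 0.6 * z2 + 0.7 * e + rng.normal(size=T)    # endogenous regressor
x2 = rng.normal(size=T)                                    # exogenous regressor
y = 1 + 0.5 * x1 + 0.3 * x2 + e

# Stage 1: regress the endogenous variable on the instruments (and exogenous regressors), keep fitted values
Z = sm.add_constant(np.column_stack([z1, z2, x2]))
x1_hat = sm.OLS(x1, Z).fit().fittedvalues

# Stage 2: original model with the fitted values replacing the endogenous variable
X2 = sm.add_constant(np.column_stack([x1_hat, x2]))
stage2 = sm.OLS(y, X2).fit()
print(stage2.params)   # coefficient on x1_hat should be close to 0.5
```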
Binary dependent variables
Need to transform the dichotomous (binary) Y into a continuous variable Y′ ∈ (−∞, ∞): a link function F(Y) takes the dichotomous Y and gives us a continuous, real-valued Y′.
Probit model: which function does that? The cumulative normal distribution Φ: given any z-score it returns Φ(Z) ∈ [0, 1]. It follows that F(Y) = Φ⁻¹(Y).
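A sketch of estimating a logit and a probit with statsmodels on simulated binary data; `.prsquared` is McFadden's pseudo-R² (1 − lnL/lnL0) used in the R² section above, and the log-likelihoods feed the LR test of the joint-tests section below (data and coefficients are assumptions):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(7)
T = 1000
x = rng.normal(size=T)
p = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))   # true logistic probabilities (simulated)
y = rng.binomial(1, p)                    # dichotomous dependent variable

X = sm.add_constant(x)
logit = sm.Logit(y, X).fit(disp=0)
probit = sm.Probit(y, X).fit(disp=0)

print(logit.params, probit.params)
print(logit.prsquared, probit.prsquared)  # McFadden pseudo-R2 = 1 - lnL/lnL0
print(logit.llf, logit.llnull)            # log-likelihoods: fitted vs. constant-only model

# Overall LR test: LR = 2 (L_u - L_r) ~ chi2(number of restrictions), here 1 slope
LR = 2 * (logit.llf - logit.llnull)
print(LR, chi2.ppf(0.95, df=1))
```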
Linear (OLS, linear probability model) vs. non-linear approach
Linear (OLS): y_i = α + β·xi + u_i, so P[y = 1] = xi′β; the derivative with respect to xi yields βi.
- Not useful in the extremes (it can produce negative "probabilities" or values above 1).
- Prone to heteroskedasticity (the error u behaves differently for y = 0 and y = 1).
- Needs a threshold t: if the fitted value < t, favour 0; if > t, favour 1.
Non-linear approach: P[y = 1] = F(xi′β).
- There is always a constant in the model.
- The function F gives outcomes only between 0 and 1 (failure; success), so no negative probabilities.
- The derivative with respect to xi yields f(xi′β)·βi: the beta is weighted by the density function.
- In a Probit model xi′β is taken to be the z-value of a normal distribution.

Non-linear approach (formulas)
Logit (logistic distribution), based on the odds ratio OR(p) = p/(1 − p); taking the log: logit(Y) = log[O(Y)] = log[y/(1 − y)].
- Cumulative distribution: F(xi′β) = 1 / (1 + e^(−xi′β))
- Density (use for the marginal effect): f(xi′β) = e^(xi′β) / (1 + e^(xi′β))²
Probit (probability unit):
- Cumulative distribution: F(y) = ∫ from −∞ to y of φ(u) du, i.e. the standard normal PDF integrated up to y (u is a dummy of integration); read Φ(z) from the statistical tables.
- Density (use for the marginal effect): f(xi′β) = (1/√(2π))·e^(−(xi′β)²/2)
Non-linear approach (applications)
1) Express the marginal effect (formula, e.g. for unemployment):
   ME of β_unempl = ∂P(y = 1)/∂x_unempl = ∂F(xi′β)/∂x_unempl = f(xi′β)·β_unempl
2) Calculate an exact marginal effect (evaluate the model at the sample means):
   Logit: ME_logit = β1 · e^(logit model at the sample means) / (1 + e^(logit model at the sample means))²
   Probit: a) insert the sample means into the probit equation and solve; b) ME_probit = β1 · (1/√(2π))·e^(−(result of a)²/2)
   Example with a dummy: ŷ_i = α + β·xi(Gender); P[y = 1 | x(Gender) = 0] = F(xi′β) = 1/(1 + e^(−xi′β)) for the logit.
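A sketch of step 2) above: the logit and probit marginal effects evaluated at the sample means, ME = β1·f(x̄′β) (data, names and the fitted models are assumptions):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(8)
T = 1000
x = rng.normal(loc=1.0, size=T)
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.8 * x))))   # simulated binary outcome

X = sm.add_constant(x)
logit = sm.Logit(y, X).fit(disp=0)
probit = sm.Probit(y, X).fit(disp=0)

xbar = np.array([1.0, x.mean()])   # sample means (constant = 1)

# Logit: ME = beta1 * e^(xbar'b) / (1 + e^(xbar'b))^2
idx_l = xbar @ logit.params
me_logit = logit.params[1] * np.exp(idx_l) / (1 + np.exp(idx_l)) ** 2

# Probit: ME = beta1 * (1/sqrt(2*pi)) * e^(-(xbar'b)^2 / 2) = beta1 * phi(xbar'b)
idx_p = xbar @ probit.params
me_probit = probit.params[1] * norm.pdf(idx_p)

print(me_logit, me_probit)   # the two marginal effects should be similar
```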
3) Calculate the concrete value of xi for a known probability level P:
   Logit: a) set the known P level equal to the logistic cumulative function, P = 1/(1 + e^(−(logit function with xi))); b) solve for xi.
   Probit: a) to be solved: Φ(probit function) = P level; b) short-cut: look up the z-score for the P level in the statistical table; c) set probit function = obtained z-score and solve for xi.

Joint tests (logit/probit)
- The F-test is not valid under Probit or Logit, since the model is non-linear.
- Overall likelihood-ratio test (the equivalent of the F-test): H0: β1 = ... = βk = 0 (no overall significance); H1: β1 ≠ 0 ∨ ... ∨ βk ≠ 0 (overall significance).
- LR = 2·[L_u − L_r] (log-likelihood unrestricted minus restricted) ∼ χ²(k), with k = number of restrictions. If the LR statistic > χ² critical value, reject H0 and conclude overall significance.
- Goodness of fit (for both logit and probit): pseudo-R² = 1 − ln L_model / ln L_0. Note: this R² cannot be used to choose between Logit and Probit, as these two have different likelihoods.
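A sketch of step 3) above, recovering the x value that yields a chosen probability level (the coefficients α = −0.5 and β1 = 0.8 are hypothetical):

```python
import numpy as np
from scipy.stats import norm

alpha, beta1 = -0.5, 0.8   # hypothetical logit/probit coefficients
p_level = 0.75             # known probability level

# Logit: p = 1 / (1 + e^-(alpha + beta1*x))  ->  alpha + beta1*x = ln(p / (1 - p))
x_logit = (np.log(p_level / (1 - p_level)) - alpha) / beta1

# Probit: Phi(alpha + beta1*x) = p  ->  alpha + beta1*x = z-score of p
x_probit = (norm.ppf(p_level) - alpha) / beta1

print(x_logit, x_probit)
```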