EUC1502 Module1 Machine Learning
Classical vs machine learning econometrics
THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION
Eurostat
▪ Overview of classical econometrics:
➢ What is econometrics
➢ Multiple regression
➢ Time series
▪ Models of inference
1. What is econometrics
Overview of classical econometrics
$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + \varepsilon$
Aspects:
▪ Uncertainty regarding an outcome
▪ Relationships suggested by (economic) theory
▪ Assumptions and hypotheses to be specified
▪ Sampling process including functional form
▪ Obtaining data for the analysis
▪ Estimation rule with good statistical properties
▪ Fit and test model using software package
▪ Analyse and evaluate implications of the results
▪ Problems suggest approaches for further research
▪ Production Functions
▪ Cost Functions
▪ Etc.
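Demand model
$\ln y_t^d = \beta_1 + \beta_2 \ln x_t + \varepsilon_t$
Supply model
$\ln y_t^s = \beta_1 + \beta_2 \ln x_t + \varepsilon_t$
where $y_t^s$ is the quantity supplied, $y_t^d$ the quantity demanded and $x_t$ the price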
Production function
$\ln y_t = \beta_1 + \beta_2 \ln x_t + \varepsilon_t$
where $y_t$ is the output and $x_t$ the input
Cost function
$y_t = \beta_1 + \beta_2 x_t^2 + \varepsilon_t$
$\ln y = \ln \beta_1 + \beta_2 \ln x + u = \alpha + \beta_2 \ln x + u$, where $\alpha = \ln \beta_1$
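As a minimal sketch of why the log-log form is convenient, the snippet below generates noise-free data from a hypothetical constant-elasticity relation y = β1·x^β2 (the values 2.0 and 0.5 are illustrative assumptions, not taken from the course) and recovers β2 as the slope of a simple least squares fit of ln y on ln x.

```python
import math

# Hypothetical parameters of y = beta1 * x**beta2 (illustrative values only)
beta1, beta2 = 2.0, 0.5

x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [beta1 * xi ** beta2 for xi in x]  # noise-free for clarity (u = 0)

# Taking logs makes the model linear: ln y = alpha + beta2 * ln x
lx = [math.log(v) for v in x]
ly = [math.log(v) for v in y]

# Simple least squares: slope = cov(ln x, ln y) / var(ln x)
mx = sum(lx) / len(lx)
my = sum(ly) / len(ly)
slope = sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum((a - mx) ** 2 for a in lx)
alpha = my - slope * mx

# The slope estimates the elasticity beta2; exp(alpha) recovers beta1
print(round(slope, 6), round(math.exp(alpha), 6))
```

In the log-log specification the slope reads directly as an elasticity: a 1% change in x is associated with a β2% change in y.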
2. The multiple linear regression model
$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + \varepsilon$
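A minimal sketch of estimating such a model by ordinary least squares, assuming NumPy is available. The data are simulated and the coefficient values (1.0, 2.0, -3.0) are hypothetical, chosen only so that the estimates can be compared with known values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical regressors and coefficients (illustrative values only)
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
eps = rng.normal(scale=0.1, size=n)
Y = 1.0 + 2.0 * X1 - 3.0 * X2 + eps

# Design matrix with a column of ones for the intercept beta_0
X = np.column_stack([np.ones(n), X1, X2])

# Least squares solution of min ||Y - X b||^2
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat)
```

With small noise and 200 observations the estimates should land close to the values used in the simulation.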
Assumptions:
▪ $E(\varepsilon_i \mid X_i) = 0$: $\varepsilon_i$ has conditional mean zero
Goodness of Fit
Total sum of squares (total variation of y): $TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2$
Explained sum of squares: $ESS = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
Goodness of Fit
$R^2 = \frac{ESS}{TSS}$, with $0 \le R^2 \le 1$
It can also be written as $R^2 = 1 - \frac{RSS}{TSS}$
Adjusted $R^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k}$
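The goodness-of-fit measures can be checked numerically. The sketch below uses a small hypothetical data set and a simple regression with an intercept; it assumes that k in the adjusted R2 formula counts the estimated coefficients (here the intercept and one slope), which is one common reading of the n - k denominator.

```python
# Hypothetical data (illustrative values only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(y)
k = 2  # assumption: k counts the estimated coefficients (intercept and slope)

# Simple least squares fit of y on x
mx, my = sum(x) / n, sum(y) / n
b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
b0 = my - b1 * mx
y_hat = [b0 + b1 * a for a in x]

tss = sum((b - my) ** 2 for b in y)                 # total variation of y
ess = sum((f - my) ** 2 for f in y_hat)             # variation explained by the fit
rss = sum((b - f) ** 2 for b, f in zip(y, y_hat))   # residual variation

r2 = ess / tss
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
print(round(r2, 4), round(adj_r2, 4))
```

For a least squares fit with an intercept, TSS = ESS + RSS, which is why the two expressions for R2 agree.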
Collinearity
Some solutions:
3. Time series models
Examples:
- Unemployment rate over time
- Inflation rate
- Production indices
- Number of deaths/births
- Etc.
$Y_t = \beta_0 + \beta_1 Y_{t-1} + \delta_1 X_{t-1} + \varepsilon_t$
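A minimal sketch of this dynamic regression, assuming NumPy is available: the series is simulated with hypothetical coefficients (0.5, 0.6, 0.3) and the model is then estimated by regressing Y_t on a constant, Y_{t-1} and X_{t-1}.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 2000
b0, b1, d1 = 0.5, 0.6, 0.3  # hypothetical coefficients (illustrative only)

X = rng.normal(size=T)
eps = rng.normal(scale=0.1, size=T)
Y = np.zeros(T)
for t in range(1, T):
    Y[t] = b0 + b1 * Y[t - 1] + d1 * X[t - 1] + eps[t]

# Regress Y_t on a constant, Y_{t-1} and X_{t-1}
Z = np.column_stack([np.ones(T - 1), Y[:-1], X[:-1]])
coef, *_ = np.linalg.lstsq(Z, Y[1:], rcond=None)
print(coef)
```

The lagged terms are built by shifting the arrays by one period, so the first observation is lost to the lag.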
MA(q)
Nonstationarity
▪ Most economic variables (GDP, consumption, price level, etc.) are non-stationary: they show an upward or downward trend over time
▪ Types of nonstationarity:
➢ Deterministic
➢ Stochastic
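For a series with a stochastic trend (integrated of order one), the first difference $\Delta Y_t = Y_t - Y_{t-1}$ is stationary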
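A minimal sketch of differencing, assuming NumPy is available. It uses a simulated random walk, a textbook case of a stochastic trend, as a hypothetical series: the level wanders, while the first difference is just the underlying white-noise shock.

```python
import numpy as np

rng = np.random.default_rng(2)
shocks = rng.normal(size=500)

# Random walk: Y_t = Y_{t-1} + e_t (hypothetical example of a stochastic trend)
Y = np.cumsum(shocks)

# First difference: dY_t = Y_t - Y_{t-1}
dY = np.diff(Y)

# Compare the sample variance in the two halves of each series
half = len(dY) // 2
print(np.var(Y[:half]), np.var(Y[half:]))    # levels
print(np.var(dY[:half]), np.var(dY[half:]))  # first differences
```

For the differenced series the two half-sample variances should be of similar size, whereas for the levels they can differ widely from one run to another.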
The Box-Jenkins approach:
▪ Identification
Inspect the data for stationarity, take first differences if needed, and identify p and q
▪ Estimation
Apply the least squares method (linear or nonlinear)
▪ Validation
Check that the estimated model fits well and that the residuals show no autocorrelation
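The three steps can be sketched on simulated data, assuming NumPy is available. The example is deliberately simplified: the series is a hypothetical ARIMA(1,1,0) with AR coefficient 0.5, so identification reduces to taking first differences and choosing p = 1, q = 0, and estimation is a linear least squares fit. The helper `acf1` is defined here for illustration and is not a library function.

```python
import numpy as np

rng = np.random.default_rng(3)
T, phi = 3000, 0.5  # hypothetical sample size and AR coefficient

# Simulate an ARIMA(1,1,0): the first difference follows an AR(1)
e = rng.normal(size=T)
d = np.zeros(T)
for t in range(1, T):
    d[t] = phi * d[t - 1] + e[t]
Y = np.cumsum(d)

def acf1(z):
    """Lag-1 sample autocorrelation of a series."""
    z = z - z.mean()
    return float(np.dot(z[1:], z[:-1]) / np.dot(z, z))

# Identification: the level is non-stationary, so work with first differences
dY = np.diff(Y)

# Estimation: AR(1) on the differences by linear least squares (no intercept)
phi_hat = float(np.dot(dY[1:], dY[:-1]) / np.dot(dY[:-1], dY[:-1]))

# Validation: the residuals should show no remaining autocorrelation
resid = dY[1:] - phi_hat * dY[:-1]
print(round(phi_hat, 3), round(acf1(resid), 3))
```

In practice a dedicated time series package would normally be used for these steps, including models with MA terms, which require nonlinear estimation.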
4. Econometric methods in Official statistics
Regression methods - Hedonic prices
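$p_i = h(z_i) + \varepsilon_i$
where $p_i$ is the price of item $i$, $h(z_i)$ is a function of the quality characteristics $z_i$, and $\varepsilon_i$ is the error term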
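A minimal sketch of a hedonic regression, assuming NumPy is available. The functional form is an assumption: h is taken to be log-linear in two hypothetical dwelling characteristics (floor area and a garden indicator), and all data and coefficients are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500

# Hypothetical quality characteristics z_i of dwellings
area = rng.uniform(40, 150, size=n)   # floor area in m2
garden = rng.integers(0, 2, size=n)   # 1 if the dwelling has a garden

# Assumed log-linear hedonic function with illustrative coefficients
log_p = 11.0 + 0.008 * area + 0.10 * garden + rng.normal(scale=0.05, size=n)

# Estimate the hedonic function by least squares on log prices
Z = np.column_stack([np.ones(n), area, garden])
coef, *_ = np.linalg.lstsq(Z, log_p, rcond=None)

# Predicted price of a hypothetical 100 m2 dwelling with a garden
# (simple exponentiation, ignoring retransformation bias)
p_hat = float(np.exp(coef @ np.array([1.0, 100.0, 1.0])))
print(coef, round(p_hat))
```

The estimated coefficients act as implicit prices of the characteristics, which is what allows prices to be compared at constant quality.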
Hedonic modelling
Applications
• Housing prices
• ICT product prices
• Producer prices
Advantages
Difficulties
✓ Excluded variables
✓ New features
✓ Multicollinearity
✓ Small quantities
Hedonic prices and machine learning
i.e. for each observation, compare the price predicted by the model using all land characteristics with the price predicted for an ad hoc observation in which each explanatory variable of this group is set to its mean value
Figure 1. Effect of parcel characteristics on land prices.
i.e. for each observation, compare the price predicted by the model using all accessibility variables with the price predicted for an ad hoc observation in which each explanatory variable of this group is set to its mean value
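One possible reading of this comparison can be sketched as follows. Everything in the example is hypothetical: the fitted predictor is a simple linear function with made-up coefficients standing in for whatever model was actually used, and the variable names and group membership are illustrative assumptions.

```python
# Hypothetical fitted linear predictor for log price (illustrative coefficients)
coefs = {"const": 3.0, "area": 0.002, "slope": -0.05, "dist_cbd": -0.03}

# Variable groups named after the slide wording; membership is an assumption
groups = {"parcel": ["area", "slope"], "accessibility": ["dist_cbd"]}

data = [
    {"area": 300.0, "slope": 1.0, "dist_cbd": 5.0},
    {"area": 500.0, "slope": 3.0, "dist_cbd": 12.0},
    {"area": 400.0, "slope": 2.0, "dist_cbd": 7.0},
]

def predict(obs):
    """Predicted log price for one observation."""
    return coefs["const"] + sum(coefs[v] * obs[v] for v in obs)

def group_effect(obs, group):
    """Prediction for obs minus the prediction with the group's variables at their means."""
    ref = dict(obs)
    for v in groups[group]:
        ref[v] = sum(row[v] for row in data) / len(data)
    return predict(obs) - predict(ref)

effects = [group_effect(row, "accessibility") for row in data]
print([round(e, 3) for e in effects])
```

With a machine learning model the same comparison would be made by calling the fitted model's prediction function in place of `predict`; for a linear predictor the effects average to zero by construction.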
Deseasonalisation
Classical vs machine learning econometrics
| | Traditional econometrics | Machine learning econometrics |
|---|---|---|
| Data features | Small to medium size; monthly or quarterly data (in Official Statistics); "reasonable" number of variables | Large size; high frequency; high-dimensional datasets |
| Model definition (usual) | Model-based relationships between variables, grounded in (economic) theory, e.g. multiple linear regression, time series (ARIMA) | Algorithm-based |
| Model selection and estimation | Expert knowledge; significance methods based on asymptotic distributions | Artificial intelligence (but one still needs to know which type of technique to apply!); automatic optimization ("regularization") of models |
| Assumptions | Rigid distributional and independence assumptions | No assumptions |
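To make the "regularization" entry concrete, here is a minimal sketch contrasting ordinary least squares with ridge regression, one common regularized estimator, assuming NumPy is available. The data are simulated, the helper `ridge` is defined here for illustration, and the penalty value 10.0 is an arbitrary assumption; in machine learning practice the penalty is usually tuned automatically, for example by cross-validation.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 50, 10

X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:2] = [1.5, -2.0]  # hypothetical: only two of the ten variables matter
y = X @ beta + rng.normal(scale=0.5, size=n)

def ridge(X, y, lam):
    """Closed-form ridge estimate without intercept: (X'X + lam*I)^{-1} X'y."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

b_ols = ridge(X, y, 0.0)     # lam = 0 reproduces ordinary least squares
b_ridge = ridge(X, y, 10.0)  # lam > 0 shrinks the coefficients towards zero
print(np.linalg.norm(b_ols), np.linalg.norm(b_ridge))
```

The penalty trades a little bias for lower variance, which matters most when the number of variables is large relative to the number of observations.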
5. Models of inference
The context of Official Statistics
The objectives of statistical inference
Overview of different modes of inference (paradigms)
▪ Design-based
▪ Model-assisted
▪ Model-based (predictive)
▪ Algorithm-based
Design-based inference
Design-based inference: theoretical example
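$\hat{Y}_{HT} = \sum_{i \in S} \frac{1}{\pi_i} y_i$
where $S$ is the sample and $\pi_i$ is the inclusion probability of unit $i$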
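A minimal sketch of the Horvitz-Thompson estimator of a population total: each sampled value is weighted by the inverse of its inclusion probability. The sample values and probabilities are hypothetical, and the function name is chosen here for illustration.

```python
# Hypothetical sample: observed values y_i and inclusion probabilities pi_i
sample = [
    (12.0, 0.10),
    (8.0, 0.05),
    (20.0, 0.25),
    (15.0, 0.10),
]

def horvitz_thompson_total(units):
    """Sum of y_i / pi_i over the sampled units."""
    return sum(y / pi for y, pi in units)

total_hat = horvitz_thompson_total(sample)
print(total_hat)
```

A unit selected with probability 0.05 stands for 20 population units, so rarely selected units carry large weights.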
Design-based inference: limitations
Model-assisted inference
Model-assisted inference: estimation
Design-based and model-assisted: examples of application in official statistics
▪ Generalised regression estimator (GREG) widely used by NSIs for calibration
➢ Adjusts totals for sub-populations (consistency across tables)
➢ Adjusts to known totals
▪ Small Area Estimation (estimation borrowing strength over space)
▪ Surveys based on panels (estimation borrowing strength from the past)
▪ Modelling survey discontinuities
▪ Integration of sources in National Accounts
▪ Hedonic Price Indices
▪ Seasonal adjustment of statistical series
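The idea of adjusting weights to known totals can be illustrated with a ratio adjustment, which is the simplest special case of calibration and not the full GREG estimator. All figures below are hypothetical.

```python
# Hypothetical sample: design weight d, auxiliary variable x, study variable y
sample = [
    {"d": 10.0, "x": 2.0, "y": 5.0},
    {"d": 10.0, "x": 3.0, "y": 9.0},
    {"d": 20.0, "x": 1.0, "y": 2.0},
]
X_known = 80.0  # known population total of x (illustrative value)

# Estimate of the x total using the design weights
x_hat = sum(u["d"] * u["x"] for u in sample)

# Ratio adjustment: scale all weights by the same factor
g = X_known / x_hat
weights = [u["d"] * g for u in sample]

# The adjusted weights reproduce the known x total and are reused for y
x_cal = sum(w * u["x"] for w, u in zip(weights, sample))
y_cal = sum(w * u["y"] for w, u in zip(weights, sample))
print(x_cal, y_cal)
```

Using one set of adjusted weights for every variable is what gives consistency across tables.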
Algorithm-based inference
Algorithm-based inference: types of data
Algorithm-based inference: theoretical examples
References
▪ OECD, Eurostat, ILO, IMF, The World Bank, UNECE (2013). Handbook on
Residential Property Price Indices (RPPIs)
▪ Peter Hein van Mulligen (2003). Quality aspects in price indices and
international comparisons: Applications of the hedonic method
▪ C. Goytia and G. Dorna (Universidad Torcuato Di Tella)(2016). Big data
and a Spatial Hedonic Approach: Addressing the land market information
gap and estimating land prices determinants in metropolitan regions from
developing countries