EUC1502 Module1 Machine Learning

This document provides an overview of classical econometrics, including multiple regression analysis and time series models. It defines econometrics as the application of statistics and mathematics to identify relationships between predicted and predictor variables. Multiple regression analysis uses the ordinary least squares method to estimate coefficients and predict variable values while minimizing residuals. Time series models analyze variables measured sequentially over time, such as GDP, employment, births and population. The document also discusses issues like collinearity among predictor variables.

Module 1. Classical vs machine learning econometrics

THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION

Eurostat
▪ Overview of classical econometrics:

➢ What is econometrics

➢ Multiple regression

➢ Time series

▪ Econometric methods in Official statistics

▪ Models of inference
2
Eurostat
1. What is econometrics

Eurostat
Overview of classical econometrics
What is econometrics

Econometrics is an application of statistics and mathematics aimed at identifying and quantifying the relationship between two sets of variables:

▪ The predicted variables

▪ The predictor variables

$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + \varepsilon$
Eurostat
Overview of classical econometrics
What is econometrics

Aspects:
▪ Uncertainty regarding an outcome
▪ Relationships suggested by (economic) theory
▪ Assumptions and hypotheses to be specified
▪ Sampling process including functional form
▪ Obtaining data for the analysis
▪ Estimation rule with good statistical properties
▪ Fit and test model using software package
▪ Analyse and evaluate implications of the results
▪ Problems suggest approaches for further research

5
Eurostat
Overview of classical econometrics
What is econometrics

Examples of econometric models:

▪ Demand and supply Models

▪ Production Functions

▪ Cost Functions

▪ Etc.

6
Eurostat
Overview of classical econometrics
What is econometrics

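Demand model

$\ln y_t^d = \beta_1 + \beta_2 \ln x_t + \varepsilon_t$

where $y_t^d$ is the quantity demanded and $x_t$ is the price.

Supply model

$\ln y_t^s = \beta_1 + \beta_2 \ln x_t + \varepsilon_t$

where $y_t^s$ is the quantity supplied and $x_t$ is the price.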
Eurostat
Overview of classical econometrics
What is econometrics

Production function (Cobb-Douglas production function)

$\ln y_t = \beta_1 + \beta_2 \ln x_t + \varepsilon_t$

where $y_t$ is the output and $x_t$ is the input.
Eurostat
Overview of classical econometrics
What is econometrics

Cost function

$y_t = \beta_1 + \beta_2 x_t^2 + \varepsilon_t$

where $y_t$ is the total cost and $x_t$ is the output.
Eurostat
Overview of classical econometrics
What is econometrics

There are also non-linear models:

$y = \beta_1 \alpha^{\beta_2 x} e^{u}$

And models that can be linearised:

$y = \beta_1 x^{\beta_2} e^{u}$

$\ln y = \ln \beta_1 + \beta_2 \ln x + u = \alpha + \beta_2 \ln x + u$, where $\alpha = \ln \beta_1$
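As a rough illustration of linearisation, the sketch below fits $\ln y = \alpha + \beta_2 \ln x + u$ by least squares and maps the intercept back with $\beta_1 = e^{\alpha}$. The data, the parameter values (2 and 0.5) and the helper name `fit_log_linear` are assumptions made for this example and do not come from the slides.

```python
import numpy as np

def fit_log_linear(x, y):
    """Fit ln y = alpha + beta2 * ln x by OLS and return (beta1, beta2), beta1 = exp(alpha)."""
    X = np.column_stack([np.ones(len(x)), np.log(x)])
    coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    alpha, beta2 = coef
    return float(np.exp(alpha)), float(beta2)

# Hypothetical data generated from y = 2 * x**0.5 * exp(u) with a small error u
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 50)
u = rng.normal(scale=0.05, size=x.size)
y = 2.0 * x**0.5 * np.exp(u)

beta1_hat, beta2_hat = fit_log_linear(x, y)
print(beta1_hat, beta2_hat)  # estimates of beta1 and beta2
```

The same transformation does not help when the parameters cannot be separated by taking logarithms, which is what distinguishes an intrinsically non-linear model.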
Eurostat
2. The multiple linear
regression model

Eurostat
Overview of classical econometrics
The multiple linear regression model

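$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + \varepsilon$

▪ $Y$: predicted variable (dependent variable)
▪ $\beta_0$: intercept
▪ $\beta_1, \dots, \beta_k$: slopes
▪ $X_1, \dots, X_k$: predictor variables (independent variables)
▪ $\varepsilon$: error term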
Eurostat
Overview of classical econometrics
The multiple linear regression model

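Ordinary Least Squares method

Minimise $\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik})^2$

The minimisers $\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k$ are the OLS estimators of $\beta_0, \beta_1, \dots, \beta_k$.

Each slope estimator gives the variation of $y$ for a one-unit variation of the corresponding $x_j$, holding the other variables constant:
$\Delta \hat y = \hat\beta_j \Delta x_j$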
Eurostat
Overview of classical econometrics
The multiple linear regression model

Ordinary Least Squares method

Predicted value of $y_i$: $\hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \dots + \hat\beta_k x_{ik}$

Residual (estimate of the error term): $\hat\varepsilon_i = y_i - \hat y_i$
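The OLS steps can be sketched with NumPy's least-squares solver. This is a minimal illustration on simulated data, not the procedure of any particular software package; the function name `ols_fit` and the coefficient values are assumptions.

```python
import numpy as np

def ols_fit(X, y):
    """Return (beta_hat, y_hat, residuals) for y = b0 + b1*x1 + ... + bk*xk + e."""
    X1 = np.column_stack([np.ones(X.shape[0]), X])     # prepend the intercept column
    beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)  # minimises the sum of squared residuals
    y_hat = X1 @ beta_hat                              # predicted values
    return beta_hat, y_hat, y - y_hat                  # residuals = observed - predicted

# Hypothetical data with two predictor variables
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

beta_hat, y_hat, resid = ols_fit(X, y)
print(beta_hat)      # estimates of (beta0, beta1, beta2)
print(resid.sum())   # numerically zero because the model includes an intercept
```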
Eurostat
Overview of classical econometrics
The multiple linear regression model

Assumptions:
▪ $E(\varepsilon_i | X_i) = 0$: $\varepsilon_i$ has conditional mean zero

▪ $(X_i, Y_i)$, $i = 1, \dots, n$, are i.i.d.

▪ $X_i$ and $\varepsilon_i$ have nonzero finite fourth moments

▪ There is no perfect multicollinearity (see later)

▪ $\mathrm{var}(\varepsilon_i | X_i) = \sigma_\varepsilon^2$: homoskedasticity

▪ The conditional distribution of $\varepsilon_i$ given $X_i$ is normal

15
Eurostat
Overview of classical econometrics
The multiple linear regression model

Goodness of Fit

Total sum of squares (total variation of y): $TSS = \sum_{i=1}^{n} (y_i - \bar y)^2$

Or

$TSS = ESS + RSS$

▪ Explained Sum of Squares: variation explained by the model, i.e. variation of Y explained by X: $ESS = \sum_{i=1}^{n} (\hat y_i - \bar y)^2$
▪ Residual Sum of Squares: residual variation, i.e. variation explained by the residuals: $RSS = \sum_{i=1}^{n} u_i^2 = \sum_{i=1}^{n} (y_i - \hat y_i)^2$
Overview of classical econometrics
The multiple linear regression model

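Goodness of Fit

$R^2 = \frac{ESS}{TSS}$, with $0 \le R^2 \le 1$

It can also be written as $R^2 = 1 - \frac{RSS}{TSS}$

Adjusted $R^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k}$, where $k$ here counts all estimated coefficients including the intercept (with $k$ slopes plus an intercept, the denominator is $n - k - 1$)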
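The decomposition and both versions of $R^2$ can be checked numerically. The sketch below is an illustration on simulated data; it assumes `k` counts only the slope coefficients, so the adjusted $R^2$ uses $n - k - 1$ in the denominator. The function name `goodness_of_fit` is hypothetical.

```python
import numpy as np

def goodness_of_fit(X, y):
    """Return TSS, ESS, RSS, R2 and adjusted R2 for an OLS fit with intercept."""
    n, k = X.shape                                   # k = number of slopes (assumption)
    X1 = np.column_stack([np.ones(n), X])
    beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)
    y_hat = X1 @ beta_hat
    tss = np.sum((y - y.mean()) ** 2)                # total variation of y
    ess = np.sum((y_hat - y.mean()) ** 2)            # variation explained by the model
    rss = np.sum((y - y_hat) ** 2)                   # residual variation
    r2 = ess / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return {"TSS": tss, "ESS": ess, "RSS": rss, "R2": r2, "adj_R2": adj_r2}

# Hypothetical data: the third predictor is irrelevant
rng = np.random.default_rng(11)
X = rng.normal(size=(60, 3))
y = 0.5 + 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=60)

fit = goodness_of_fit(X, y)
print(fit)
```

Adding a predictor never lowers $R^2$, whereas the adjusted version penalises the extra coefficient, which is why it is the safer quantity for comparing models of different size.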
Eurostat
Overview of classical econometrics
Collinearity

▪ The term “independent variable” means an

explanatory variable is independent of the
error term, but not necessarily independent
of other explanatory variables.

▪ Since economists typically have no control over

the implicit “experimental design”, explanatory
variables tend to move together which often
makes sorting out their separate influences
rather problematic.

18
Eurostat
Overview of classical econometrics
Collinearity

Evidence of high collinearity includes:

▪ a high pairwise correlation between two explanatory variables
▪ a high R-squared when regressing one explanatory variable at a time on the remaining explanatory variables
▪ a statistically significant F-value when the t-values are statistically insignificant
▪ an R² that doesn’t fall by much when dropping any of the explanatory variables
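The first two diagnostics in the list can be sketched as follows. The example computes pairwise correlations and the R-squared of each auxiliary regression, and also reports the variance inflation factor $1/(1 - R_j^2)$, a standard companion measure that the slides do not mention. Data and function name are assumptions for illustration.

```python
import numpy as np

def auxiliary_r2(X):
    """R-squared from regressing each column of X on the remaining columns plus an intercept."""
    n, k = X.shape
    out = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        rss = np.sum((target - others @ coef) ** 2)
        tss = np.sum((target - target.mean()) ** 2)
        out.append(1.0 - rss / tss)
    return np.array(out)

# Hypothetical explanatory variables: x2 moves almost one-for-one with x1
rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)                    # unrelated to the other two
X = np.column_stack([x1, x2, x3])

r2 = auxiliary_r2(X)
vif = 1.0 / (1.0 - r2)                       # variance inflation factors
print(np.corrcoef(X, rowvar=False).round(2)) # pairwise correlations
print(r2.round(3), vif.round(1))
```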
Eurostat
Overview of classical econometrics
Collinearity

▪ Collinearity doesn’t mean the model is misspecified

▪ It is an especially common problem in time series regressions

▪ It stems from a lack of adequate information in the sample

20
Eurostat
Overview of classical econometrics
Collinearity

Some solutions:

➢ collect more data with better information

➢ impose economic restrictions as appropriate

➢ impose statistical restrictions when justified

➢ if all else fails at least point out that the poor

model performance might be due to the
collinearity problem (or it might not).
21
Eurostat
3. Time series models

Eurostat
Overview of classical econometrics
Time series models

A collection of observations made

sequentially in time (stochastic process)

Examples:
- Unemployment rate over time
- Inflation rate
- Production indices
- Number of deaths/births
- Etc.

23
Eurostat
Overview of classical econometrics
Time series models

[Figure: Spanish quarterly GDP from 1995 to 2011]

[Figure: Total employees in Spain from 1980 to 2004, quarterly variation]

[Figure: Number of births in Spain from 1975 to 2013, monthly data]

[Figure: Total population of Spain from 1971 to 2016]
Eurostat
Overview of classical econometrics
Time series models

Univariate Time Series describe the behaviour of a variable in terms of its own past values

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \varepsilon_t$

where $\beta_0$ is the intercept, $\beta_1$ the coefficient and $\varepsilon_t$ the random error (white noise).

Multivariate Time Series describe the behaviour of a variable in terms of its own past values and the past values of other variables

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \delta_1 X_{t-1} + \varepsilon_t$
Eurostat
Overview of classical econometrics
Time series models

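First order autoregression, AR(1)

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \varepsilon_t$

Second order autoregression, AR(2)

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \varepsilon_t$

pth order autoregression, AR(p)

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + \varepsilon_t$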
Eurostat
Overview of classical econometrics
Time series models

We use OLS to estimate the coefficients

Forecast: $\hat Y_t = \hat\beta_0 + \hat\beta_1 Y_{t-1}$

Forecast error: $Y_t - \hat Y_t$

▪ The forecast error is not a residual

▪ The forecast and the forecast errors pertain to “out-of-sample” observations (in contrast to “in-sample” observations)
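The difference between in-sample estimation and an out-of-sample forecast can be sketched for an AR(1). The example holds out the last observation, estimates the coefficients by OLS on the rest, and computes the forecast error as actual minus forecast. The simulated series, the parameter values and the function names are assumptions for illustration.

```python
import numpy as np

def fit_ar1(y):
    """OLS estimates (b0, b1) of Y_t = b0 + b1 * Y_{t-1} + e_t."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])   # intercept and first lag
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return float(coef[0]), float(coef[1])

def simulate_ar1(b0, b1, n, rng, sigma=1.0):
    """Simulate a stationary AR(1) (|b1| < 1), starting at its unconditional mean."""
    y = np.empty(n)
    y[0] = b0 / (1.0 - b1)
    for t in range(1, n):
        y[t] = b0 + b1 * y[t - 1] + rng.normal(scale=sigma)
    return y

rng = np.random.default_rng(3)
y = simulate_ar1(1.0, 0.5, 301, rng)
train, actual_next = y[:-1], y[-1]            # the last observation is kept out of sample

b0_hat, b1_hat = fit_ar1(train)
forecast = b0_hat + b1_hat * train[-1]        # one-step-ahead, out-of-sample forecast
forecast_error = actual_next - forecast       # not a residual: y[-1] was not used in the fit
print(b0_hat, b1_hat, forecast, forecast_error)
```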
Eurostat
Overview of classical econometrics
Time series models

Lag length selection (choosing the order p):

▪ F-statistic approach
▪ BIC (Bayes Information Criterion)
▪ AIC (Akaike Information Criterion)
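A possible way to apply the information criteria is sketched below, using the textbook forms $BIC(p) = \ln(SSR(p)/T) + (p+1)\ln T / T$ and $AIC(p) = \ln(SSR(p)/T) + (p+1)\,2/T$, where $SSR(p)$ is the sum of squared residuals of the AR(p) fit. All candidate models are fitted on the same sample so that the criteria are comparable. The simulated series and the function name are assumptions for illustration.

```python
import numpy as np

def ar_information_criteria(y, max_p):
    """Return {p: (BIC, AIC)} for AR(1)..AR(max_p), all fitted on a common sample."""
    T = len(y) - max_p                        # common estimation sample size
    target = y[max_p:]
    results = {}
    for p in range(1, max_p + 1):
        # lag j of the target series, aligned to the common sample
        lags = [y[max_p - j : len(y) - j] for j in range(1, p + 1)]
        X = np.column_stack([np.ones(T)] + lags)
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        ssr = np.sum((target - X @ coef) ** 2)
        bic = np.log(ssr / T) + (p + 1) * np.log(T) / T
        aic = np.log(ssr / T) + (p + 1) * 2.0 / T
        results[p] = (bic, aic)
    return results

# Hypothetical AR(1) series
rng = np.random.default_rng(4)
y = np.zeros(400)
for t in range(1, 400):
    y[t] = 0.6 * y[t - 1] + rng.normal()

crit = ar_information_criteria(y, max_p=4)
best_p_bic = min(crit, key=lambda p: crit[p][0])   # order with the smallest BIC
best_p_aic = min(crit, key=lambda p: crit[p][1])   # order with the smallest AIC
print(crit, best_p_bic, best_p_aic)
```

The BIC penalty per extra lag, $\ln T / T$, exceeds the AIC penalty $2/T$ whenever $T > e^2$, so BIC tends to select shorter lag lengths.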

33
Eurostat
Overview of classical econometrics
Time series models

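Moving average process

$Y_t = \mu + \varepsilon_t + \theta \varepsilon_{t-1}$ (MA(1))
...
$Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$ (MA(q))

MA processes depend not on the level of the last time point, but rather on the level of the last time point’s error (ε)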
Eurostat
Overview of classical econometrics
Time series models

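ARMA process

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$

The terms in $Y_{t-1}, \dots, Y_{t-p}$ form the AR(p) part and the terms in $\varepsilon_{t-1}, \dots, \varepsilon_{t-q}$ form the MA(q) part.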
Eurostat
Overview of classical econometrics
Time series models

Nonstationarity
▪ Most economic variables (GDP, consumption, price level, etc.) are non-stationary (upward or downward trend over time)

▪ A time series is nonstationary when the probability distribution of $Y_t$ changes over time

▪ Many nonstationary time series can be made stationary by differencing them one or more times (integrated processes)
Eurostat
Overview of classical econometrics
Time series models

Nonstationarity
▪ Deterministic
▪ Stochastic

Random walk: $Y_t = Y_{t-1} + \varepsilon_t$

Random walk with drift: $Y_t = \beta_0 + Y_{t-1} + \varepsilon_t$

Specific case of AR(1) with $\beta_1 = 1$
Eurostat
Overview of classical econometrics
Time series models

If $\beta_1 = 1$: nonstationary time series

If $|\beta_1| < 1$: stationary time series

$\beta_1 = 1$ is called a unit root
Eurostat
Overview of classical econometrics
Time series models

If a time series has a stochastic trend (i.e. a unit root), the first difference of the series does not have a trend

$Y_t - Y_{t-1} = \beta_0 + \varepsilon_t$

$\Delta Y_t$ is stationary

$Y_t$ is said to be integrated of order one, I(1)
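The effect of first differencing can be checked on a simulated random walk with drift. In the sketch below the drift value 0.2 and the series length are assumptions; the point is only that the differenced series equals $\beta_0 + \varepsilon_t$ and no longer trends.

```python
import numpy as np

rng = np.random.default_rng(5)
beta0 = 0.2                                  # assumed drift
eps = rng.normal(size=500)                   # white noise errors

# Random walk with drift: Y_t = beta0 + Y_{t-1} + eps_t, started from zero
y = np.cumsum(beta0 + eps)

dy = np.diff(y)                              # first difference Y_t - Y_{t-1}

print(np.allclose(dy, beta0 + eps[1:]))      # the difference is exactly beta0 + eps_t
print(y[:250].mean(), y[250:].mean())        # the level of Y differs between the two halves
print(dy[:250].mean(), dy[250:].mean())      # the mean of the difference stays near beta0
```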
Eurostat
Overview of classical econometrics
Time series models

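▪ $Y_t$ is said to be integrated of order d, I(d), if it becomes stationary after being differenced d times
▪ The resulting model is an ARIMA model
$\Delta^d Y_t = \beta_0 + \beta_1 \Delta^d Y_{t-1} + \beta_2 \Delta^d Y_{t-2} + \dots + \beta_p \Delta^d Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$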
Eurostat
Overview of classical econometrics
Time series models
The Box-Jenkins approach:
▪ Identification
Inspect the data for stationarity, take first differences where needed, and identify p and q
▪ Estimation
Apply the least squares method (linear or non-linear)

▪ Validation
Check that the estimated model fits well and that the residuals show no autocorrelation
Eurostat
4. Econometric methods in
Official statistics

Eurostat
Econometric methods in Official statistics
Regression methods - Hedonic prices

Price comparisons over time and across countries are strongly affected by the statistical treatment of changes in product quality over time and differences in product quality across countries

The matching method is not adequate for dealing with substantial changes or differences in quality, which bias the price index:
• Inside-the-sample bias: prices of non-identical products are matched
• Outside-the-sample bias: price changes of matched items are not representative of price changes of unmatched items
Eurostat
Econometric methods in Official statistics
Regression methods - Hedonic prices

A Hedonic Price Index uses regression analysis to estimate the effect of individual characteristics, the determinants of quality, on a product’s price.

$p_i = h(z_i) + \varepsilon_i$

where $h(z_i)$ is a function of the quality characteristics $z_i$ and $\varepsilon_i$ is the error term.
Eurostat
Econometric methods in Official statistics
Regression methods - Hedonic prices

Hedonic modelling

Fully linear model

$p_n^t = \beta_0^t + \sum_{k=1}^{K} \beta_k^t z_{nk}^t + \varepsilon_n^t$

Logarithmic-linear model

$\ln p_n^t = \beta_0^t + \sum_{k=1}^{K} \beta_k^t z_{nk}^t + \varepsilon_n^t$

Both are estimated as multiple linear regressions.
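A logarithmic-linear hedonic regression for a single period can be sketched as an ordinary multiple regression of log price on characteristics. The characteristics (floor area, number of rooms), the coefficient values and the function name `fit_hedonic` are hypothetical and chosen only to make the example runnable.

```python
import numpy as np

def fit_hedonic(Z, log_price):
    """OLS fit of ln p_n = b0 + sum_k b_k * z_nk + e_n for one period."""
    X = np.column_stack([np.ones(len(Z)), Z])
    beta_hat, *_ = np.linalg.lstsq(X, log_price, rcond=None)
    return beta_hat

# Hypothetical dwellings: floor area in m2 and number of rooms
rng = np.random.default_rng(6)
n = 300
area = rng.uniform(40, 150, size=n)
rooms = rng.integers(1, 6, size=n).astype(float)
Z = np.column_stack([area, rooms])

true_beta = np.array([11.0, 0.008, 0.05])    # assumed values, for illustration only
log_price = true_beta[0] + Z @ true_beta[1:] + rng.normal(scale=0.1, size=n)

beta_hat = fit_hedonic(Z, log_price)
# In a log-linear model, exp(b_k) - 1 is the relative price change per extra unit of z_k
room_premium = np.exp(beta_hat[2]) - 1.0
print(beta_hat, room_premium)
```

Estimating the regression separately for each period gives period-specific coefficients $\beta_k^t$, which is what the superscript $t$ in the model denotes.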
Eurostat
Econometric methods in Official statistics
Regression methods - Hedonic prices

Applications
• Housing prices
• ICT- product prices
• Producer prices

46
Eurostat
Econometric methods in Official statistics
Regression methods - Hedonic prices

Advantages

✓ Offers a solution for the quality problem in price indices and international comparisons, provided sufficient information on characteristics can be obtained

✓ It is used to estimate the willingness to pay for, or marginal cost of producing, the characteristics, or the underlying demand or supply functions of these characteristics and the corresponding consumer or producer surplus
Eurostat
Econometric methods in Official statistics
Regression methods - Hedonic prices

Difficulties

✓ Characteristics should represent user value and user cost
✓ Needs large datasets (Big Data)
✓ Excluded variables
✓ Other price determining variables: price mark-ups
✓ New features
✓ Multicollinearity
✓ Small quantities
Eurostat
Econometric methods in Official statistics
Hedonic prices and machine learning

Data Sources: GIS Land data (Big Data)

Hedonic function: to estimate the value associated with land characteristics, accessibility, externalities and expectations of future land developments

$\ln \text{Price} = \alpha + \text{LandCharacteristics}'\beta_1 + \text{Accessibility}'\beta_2 + \text{Externalities}'\beta_3 + \text{Expectations}'\beta_4 + \text{Zoning}'\beta_5 + \text{Controls}'\beta_6 + \varepsilon$
Eurostat
Econometric methods in Official statistics
Hedonic prices and machine learning

Figure 1. Effect of parcel characteristics on land prices.

The map shows log price differentials generated by different values associated with land characteristics, i.e. it compares the price predicted by the model for each observation, combining all land characteristics, with the price predicted for an ad-hoc observation with mean values for each explanatory variable in this group.

▪ Observations that reduce prices are mainly seen towards the west, where price differentials reach -148%
▪ The majority of observations in the city have predicted prices 20% to 57% higher than the observation with average land characteristics in the sample
Eurostat
Econometric methods in Official statistics
Hedonic prices and machine learning

Figure 2. Accessibility values.

The map shows log price differentials generated by different accessibility values, i.e. it compares the price predicted by the model for each observation, combining all accessibility variables, with the price predicted for an ad-hoc observation with mean values for each explanatory variable in this group.

▪ Proximity to the city center adds value to the land
Eurostat
Econometric methods in Official statistics
Deseasonalisation

54
Eurostat
Econometric methods in Official statistics
Classical vs machine learning econometrics

| | Traditional econometrics | Machine Learning econometrics |
|---|---|---|
| Data features | Small to medium size; monthly, quarterly data (in Official Statistics); “reasonable” number of variables | Large size; high frequency; high dimensional datasets |
| Model definition (usual) | Model-based relationships between variables, grounded on (economic) theory, e.g. multiple linear regression, time series (ARIMA) | Algorithm based |
| Model selection and estimation | Expert’s knowledge; distribution of asymptotic significance methods | Artificial intelligence (but it does not avoid knowing what type of technique to apply!); automatic optimization (“regularization”) of models |
| Assumptions | Rigid distributional, independence assumptions | No assumptions |
Eurostat
5. Models of inference

Eurostat
Models of inference
The context of Official Statistics

• Budget restrictions to carry out traditional surveys

• Increasing concern for response burden
• Increasing non-response
• New sources of available data:
➢ Administrative data
➢ Big Data sources: traffic sensors, M2M transactions, social
media, satellite images…
• Development of mathematical-statistical methods and IT
tools that allow for other forms of data treatment

57
Eurostat
Models of inference
The objectives of statistical inference

▪ The purpose of statistical inference is to obtain information

about a population (finite or infinite) from a sample from
this population

▪ Stochastic assumptions about the individual observations

and/or the population are made

▪ Statistical information of interest includes totals, means,

proportions, ratios, quantiles, etc. or the probability
distribution of a random variable

58
Eurostat
Models of inference
Overview of different modes of inference (paradigms)

▪ Design-based

▪ Model-assisted

▪ Model-based
predictive
▪ Algorithm-based

59
Eurostat
Models of inference
Design-based inference

▪ Traditionally used by National Statistical

Institutes
➢ Use of surveys to collect data
➢ NSIs prefer not to rely on model assumptions,
particularly if they are not verifiable
➢ Statistical (mathematical) models may be difficult to
understand, communicate or even calculate in a
production environment
➢ The concepts of random sample, sampling error,
weighting observations, etc. are familiar to (educated)
users of Official Statistics
60
Eurostat
Models of inference
Design-based inference: estimation

▪ Estimators (of a mean, a total, a proportion) are obtained

by expanding or weighting the observations in the sample
with survey weights
➢ Survey weights are derived from the sample design and
available auxiliary information

▪ The statistical properties of estimators are based on the

probability distributions from the sampling design
➢ Design-based estimators have «good» statistical
properties such as asymptotic unbiasedness

61
Eurostat
Models of inference
Design-based inference: theoretical example

▪ Horvitz-Thompson estimator of a total

$\hat Y_{HT} = \sum_{i \in S} \frac{1}{\pi_i} y_i$

where $\pi_i$ is the probability of selection of unit i, and $1/\pi_i$ is the weight of unit i calculated on the basis of the design:
➢ Stratification (auxiliary variables that define the strata)
➢ Sample size
➢ Corrections for non-response, calibration, etc.
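The weighting idea can be sketched in a few lines. The example assumes a stratified design in which 2 of 10 units are sampled in one stratum and 3 of 30 in another, so the weights are 5 and 10; the values and the function name are hypothetical, and no non-response or calibration correction is applied.

```python
def horvitz_thompson_total(y, pi):
    """Estimate a population total from sample values y and selection probabilities pi."""
    return sum(y_i / pi_i for y_i, pi_i in zip(y, pi))

# Hypothetical stratified sample: 2 of 10 units in stratum A, 3 of 30 units in stratum B
y_sample = [12.0, 15.0, 4.0, 6.0, 5.0]
pi_sample = [2 / 10, 2 / 10, 3 / 30, 3 / 30, 3 / 30]

total_hat = horvitz_thompson_total(y_sample, pi_sample)
# weights are 5 in stratum A and 10 in stratum B: 5*(12+15) + 10*(4+6+5)
print(total_hat)
```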
Eurostat
Models of inference
Design-based inference: limitations

▪ Design-based inference may not be suitable when

➢ samples are small
➢ non-sampling errors are present
➢ there are discontinuities in the survey design (e.g. change in data collection mode, new classifications, methodological change of concepts)
➢ Design-based estimators do not take such changes into account and cannot separate the «real» change from the methodological change
Eurostat
Models of inference
Model-assisted inference

▪ Design-based estimators of the parameter of a variable can

be improved by using auxiliary information and modelling
the relationship between the variable and the auxiliary
information (=model-assisted)

64
Eurostat
Models of inference
Model-assisted inference: estimation

▪ The HT estimator is adjusted with a linear regression model that relates the target variable to auxiliary information
▪ $(x_k; y_k)$ are observed for a sample S (e.g. administrative and survey data); x is observed for the whole universe U
▪ $\hat X_{HT} = \sum_{i \in S} \frac{1}{\pi_i} x_i$ is the grossed-up total of observed auxiliary x values
▪ $X = \sum_{i \in U} x_i$ is the known total of auxiliary x values
▪ $\hat Y_{HT} = \sum_{i \in S} \frac{1}{\pi_i} y_i$ is the Horvitz-Thompson estimate
▪ $\hat Y_R = \hat Y_{HT} + b \cdot (X - \hat X_{HT})$ is the regression (=model-based) estimate, based on the regression model $y = a + b \cdot x$ estimated from the sample of observed $(x_k; y_k)$
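The regression estimate can be sketched for a single auxiliary variable. The example assumes a simple random sample of 4 units from a universe of 8, an unweighted OLS slope for $b$, and made-up survey values; the function names are hypothetical.

```python
def ht_total(values, pi):
    """Horvitz-Thompson total: sum of values weighted by 1/pi."""
    return sum(v / p for v, p in zip(values, pi))

def ols_slope(x, y):
    """Slope b of the simple regression y = a + b*x estimated on the sample."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def regression_estimate(x_s, y_s, pi, x_total):
    """Y_R = Y_HT + b * (X - X_HT), with X the known universe total of x."""
    b = ols_slope(x_s, y_s)
    return ht_total(y_s, pi) + b * (x_total - ht_total(x_s, pi))

# Hypothetical universe of N = 8 units with known auxiliary values x
x_universe = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
sample_idx = [0, 3, 5, 7]                  # simple random sample of n = 4, so pi = 4/8
x_s = [x_universe[i] for i in sample_idx]
y_s = [3.1, 8.9, 13.2, 16.8]               # survey values, observed for the sample only
pi = [4 / 8] * 4

print(ht_total(y_s, pi), regression_estimate(x_s, y_s, pi, sum(x_universe)))
```

The correction term moves the HT estimate in the direction implied by the gap between the known total of x and its grossed-up sample total, which is how the auxiliary information improves the estimate.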
Eurostat
Models of inference
Design-based and model-assisted: examples of
application in official statistics
▪ Generalised regression estimator (GREG) widely used by
NSIs for calibration
➢ Adjusts totals for sub-populations (consistency across tables)
➢ Adjusts to known totals
▪ Small Area Estimation (estimation borrowing strength
over space)
▪ Surveys based on panels (estimation borrowing strength
from the past)
▪ Modelling survey discontinuities
▪ Integration of sources in National Accounts
▪ Hedonic Price Indices
▪ Seasonal adjustment of statistical series
Eurostat
Models of inference
Algorithm-based inference

• In the algorithmic approach, the equivalent of fitting a

model is tuning an algorithm, so that it predicts well

• It is generally impossible to express algorithmic methods

analytically in terms of a mathematical expression

• In the algorithmic approach, the data for which both x and

y are known is split into two parts
• TRAINING SET: part is used to tune the algorithm
• TEST SET: part used to evaluate – or test – the predictive capabilities
of the trained algorithm

67
Eurostat
Models of inference
Algorithm-based inference: types of data

▪ collected from units through a targeted survey (e.g.

Structural Business Survey, Labour Force survey)

▪ collected from units in support of some administrative

process (e.g. tax records, unemployment benefits)

▪ other types, registering events (e.g. a transaction, an e-

mail, a Tweet) generated as by-products of processes
unrelated to statistics or administration

68
Eurostat
Models of inference
Algorithm-based inference: types of data

| Feature | Survey data | Admin data | Other data |
|---|---|---|---|
| Records are units of a target population | Yes | Yes | No |
| Target variables are directly available | Yes | Yes | No |
| Auxiliary variables are directly available | Yes | Often | No |
| Data preparation/conversion is needed | No | No | Yes |
| Data covers the complete target population | No | Often | Rarely |
| Data are (almost) representative | Usually | Usually | No |
| Susceptibility to measurement error | High | Medium | Low |

Source: Buelens et al. (2012)
Eurostat
Models of inference
Algorithm-based inference: theoretical examples

• Similar to the model-based estimator, the algorithmic estimator is

$\hat Y_{Alg} = \sum_{k \in S} y_k + \sum_{k \in R} F(x_k)$

• for some function F() which maps the observed x to the corresponding y within S (the training set of units for which y is known); the set R contains the population units with unknown y.

• Uncertainty of this estimator arises from the imperfect predictive power of the algorithm, and is assessed on the test set using some cost function.
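The estimator and the training/test split can be sketched with a deliberately simple stand-in for the algorithm. Here F is a one-nearest-neighbour predictor and the cost function is the mean squared error on the test set; the population, the 60/20 split and all names are assumptions for illustration, and any tuned algorithm could take the place of F.

```python
import random

def nearest_neighbour_predictor(train_x, train_y):
    """Return F(x), which predicts the y of the closest training x."""
    def F(x):
        j = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
        return train_y[j]
    return F

random.seed(7)
# Hypothetical population of 200 units; y is observed only for the units in S
population_x = [random.uniform(0, 10) for _ in range(200)]
S = list(range(80))                          # units with known y
R = list(range(80, 200))                     # units with unknown y
y_S = {k: 3.0 * population_x[k] + random.gauss(0, 1) for k in S}

# Split S into a training set (to tune F) and a test set (to evaluate it)
shuffled = S[:]
random.shuffle(shuffled)
train, test = shuffled[:60], shuffled[60:]

F = nearest_neighbour_predictor(
    [population_x[k] for k in train], [y_S[k] for k in train]
)

# Cost function on the test set: mean squared prediction error
test_mse = sum((y_S[k] - F(population_x[k])) ** 2 for k in test) / len(test)

# Algorithmic estimator: observed y over S plus predicted y over R
Y_alg = sum(y_S[k] for k in S) + sum(F(population_x[k]) for k in R)
print(test_mse, Y_alg)
```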
Eurostat
Models of inference
Algorithm-based inference: examples of application in
official statistics

▪ Central Statistics Office of Ireland: automatic coding

system for Classification of Individual Consumption by
Purpose (COICOP) assignment for their Household Budget
Survey, using previously coded records as training data
▪ Statistics New Zealand: Support Vector Machines (SVM)
to improve coding of variables Occupation and Post-school
Qualification, using two disjoint sets of observations, each
of size 10,000, from Census 2013 data for training and
testing (50% correctness).
▪ Statistics Portugal: classification trees (a type of decision
trees whose response variables are categorical) for error
detection in foreign trade transaction data.
71
Eurostat
Models of inference
Algorithm-based inference: examples of application in
official statistics (2)

▪ US Department of Agriculture: hierarchical clustering to

reduce the number of Quarterly Agriculture Survey (QAS)
questionnaire versions (states x crops).
▪ Italian National Institute of Statistics: substituting
(fully or partially) ICT in Enterprises surveys by collecting
data via website scraping and extracting information using
machine learning methods.
▪ Statistics Canada: use of satellite imaging data to assist
with estimation of crop yields. Field surveyors were sent to
corresponding actual locations to ascertain crop types and
yields; these were used as response variables. Probabilistic
image processing algorithms were used to learn and predict
the field observations based on the satellite data.
Eurostat
References
▪ J.H. Stock and M.W. Watson (2003). Introduction to Econometrics. Addison Wesley
▪ W.H. Greene (2003). Econometric Analysis. Prentice Hall
▪ J. van den Brakel and J. Bethlehem (2008). Model-based estimation for official statistics. Statistics Netherlands, discussion paper (08002)
▪ K. Chu and Cl. Poirier, Statistics Canada (2015). Machine Learning Documentation Initiative. High-Level Group for the Modernisation of Statistical Production and Services, Modernisation Committee on Production and Methods
▪ B. Buelens, H.J. Boonstra, J. van den Brakel and P. Daas (2012). Shifting paradigms in official statistics. Statistics Netherlands, discussion paper (201218)
▪ CROS Portal on MEMOBUST:
  ▪ Generalised Regression Estimator (Method)
  ▪ Calibration (Method)
▪ OECD, Eurostat, ILO, IMF, The World Bank, UNECE (2013). Handbook on Residential Property Price Indices (RPPIs)
▪ Peter Hein van Mulligen (2003). Quality aspects in price indices and international comparisons: Applications of the hedonic method
▪ C. Goytia and G. Dorna (Universidad Torcuato Di Tella) (2016). Big data and a Spatial Hedonic Approach: Addressing the land market information gap and estimating land prices determinants in metropolitan regions from developing countries

7 pages
Unit-1 Correlation and Regression
No ratings yet
Unit-1 Correlation and Regression
46 pages
Data Dreamer
No ratings yet
Data Dreamer
3 pages
Immediate download (Ebook) Data Analytics for the Social Sciences: Applications in R by Garson, G. David ISBN 9780367624293, 9780367624279, 9781003109396, 036762429X, 0367624273, 100310939X ebooks 2024
100% (5)
Immediate download (Ebook) Data Analytics for the Social Sciences: Applications in R by Garson, G. David ISBN 9780367624293, 9780367624279, 9781003109396, 036762429X, 0367624273, 100310939X ebooks 2024
81 pages
Examples of Business Intelligence: Data Visualization
No ratings yet
Examples of Business Intelligence: Data Visualization
2 pages
Rail Training Conderence - Development of International Training in Asia
No ratings yet
Rail Training Conderence - Development of International Training in Asia
6 pages
The Relationship of Product Quality and Discount Toward Purchasing Decisions On Manufacturing Brand Startup The Case in Jabodetabek
No ratings yet
The Relationship of Product Quality and Discount Toward Purchasing Decisions On Manufacturing Brand Startup The Case in Jabodetabek
6 pages
21ai402 Data Analytics Unit-3
No ratings yet
21ai402 Data Analytics Unit-3
150 pages
next_level_data_science_sample chapter
No ratings yet
next_level_data_science_sample chapter
37 pages
Qualitative Vs Quantitative Risk Analysis-3
No ratings yet
Qualitative Vs Quantitative Risk Analysis-3
6 pages
NP000418 CT127 3 2 Pfda
No ratings yet
NP000418 CT127 3 2 Pfda
28 pages
RESEARCH PROCESS - Research Execution - 29th October 2024
No ratings yet
RESEARCH PROCESS - Research Execution - 29th October 2024
12 pages
MINITAB For The Calculus: Confidence and Prediction Intervals For The Data in Table 12.1
No ratings yet
MINITAB For The Calculus: Confidence and Prediction Intervals For The Data in Table 12.1
3 pages
Bangalore House Price Prediction Using The Best Machine Learning Model Submitted by Rukzana Vadakkekudy Rassak P2682221
No ratings yet
Bangalore House Price Prediction Using The Best Machine Learning Model Submitted by Rukzana Vadakkekudy Rassak P2682221
9 pages
Module 2
No ratings yet
Module 2
21 pages
Bài Tập Slides
No ratings yet
Bài Tập Slides
26 pages
Secondary Data in Mixed Methods Research (Daphne C. Watkins) (Z-Library)
No ratings yet
Secondary Data in Mixed Methods Research (Daphne C. Watkins) (Z-Library)
265 pages
Community Based Tourism
No ratings yet
Community Based Tourism
12 pages
S.R.M.Alloys Private Limited: Project Study Report
No ratings yet
S.R.M.Alloys Private Limited: Project Study Report
57 pages
Faculty of Education and Languages: TH TH RD
No ratings yet
Faculty of Education and Languages: TH TH RD
4 pages
Lecture 5 - Spring 2024
No ratings yet
Lecture 5 - Spring 2024
30 pages
Adigrat University: Collage of Business and Economics
No ratings yet
Adigrat University: Collage of Business and Economics
10 pages
Lesson 04
No ratings yet
Lesson 04
5 pages