Forecasting Models - PPT
• Medium-term forecasts
• Long-term forecasts
Forecasting data and methods
• Qualitative forecasting
• Delphi Method
• Forecasting by analogy
• Scenario forecasting
• Quantitative forecasting
• Time-series forecasting
• Predictor variables and time series forecasting
❖ Given a time series of data $X_t$, the ARMA model is a tool for understanding and predicting
future values in this series.
❖ This acronym is descriptive, capturing the key aspects of the model itself. Briefly, they are:
AR: Autoregression. A model that uses the dependent relationship between an observation
and some number of lagged observations.
MA: Moving Average. A model that uses the dependency between an observation and a
residual error from a moving average model applied to lagged observations.
ARIMA Models
• Role of Models
[Figure: time plot of Series1 over periods 1 to 20; vertical axis from 0.00 to 12.00]
D(t ) = D + e(t )
where
D is the constant mean of the series and
$e(t)$ is normally distributed with zero mean and some unknown
variance $\sigma^2$
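• Also
$$E[\varepsilon(t+1)] = \frac{1}{N}\sum_{i=0}^{N-1} E[D(t-i)] - E[D(t+1)] = \frac{1}{N}(ND) - D = 0$$
$$\mathrm{Var}[\varepsilon(t+1)] = \frac{1}{N^2}\sum_{i=0}^{N-1} \mathrm{Var}[D(t-i)] + \mathrm{Var}[D(t+1)] = \frac{1}{N^2} N\sigma^2 + \sigma^2 = \sigma^2\left(1 + \frac{1}{N}\right)$$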
Forecasting error (cont.)
• $\varepsilon(t+1)$ is normally distributed with the mean and variance computed
in the previous slide.
[Figure: the original data series with the MA(5) and MA(10) predictions over periods 1 to 40; vertical axis from 0.00 to 20.00]
• blue series: the original data series.
• magenta series: the predictions of the MA(5) forecasting model.
• yellow series: the predictions of the MA(10) forecasting model.
• Remark: the MA(5) model adjusts faster to the experienced jump of the data mean value, but the mean estimates that it
provides under stationary operation are less accurate than those provided by the MA(10) model.
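The trade-off in the remark can be reproduced with a minimal sketch. The function name `ma_forecast`, the synthetic series (mean jumping from 10 to 15 after period 20) and the noise level are assumptions made for illustration; they are not the data behind the plot.

```python
import random

def ma_forecast(data, n):
    """Return one-step-ahead MA(n) forecasts.

    forecasts[t] is the mean of the n observations before index t,
    so it is defined only for t >= n (None elsewhere).
    """
    forecasts = [None] * len(data)
    for t in range(n, len(data)):
        forecasts[t] = sum(data[t - n:t]) / n
    return forecasts

random.seed(0)  # fixed seed so the run is repeatable
# Hypothetical series: mean 10 for 20 periods, then a jump to mean 15.
series = [10 + random.gauss(0, 1) for _ in range(20)] + \
         [15 + random.gauss(0, 1) for _ in range(20)]

ma5 = ma_forecast(series, 5)
ma10 = ma_forecast(series, 10)

# Print observation, MA(5) forecast and MA(10) forecast after the jump.
for t in (22, 25, 30, 39):
    print(t, round(series[t], 2), round(ma5[t], 2), round(ma10[t], 2))
```

Shortly after the jump the MA(5) column should sit closer to the new mean, while late in the series the MA(10) column should fluctuate less, since it averages twice as many observations.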
Exponential Smoothing
With exponential smoothing the idea is that the most recent observations
will usually provide the best guide as to the future, so we want a weighting
scheme that has decreasing weights as the observations get older.
Forecasting constant mean series:
The Simple Exponential Smoothing model
• The presumed demand model:
$D(t) = D + e(t)$
where $D$ is an unknown constant and $e(t)$ is normally distributed with zero mean
and an unknown variance $\sigma^2$.
• The forecast $\hat{D}(t)$, at the end of period t:
$$\hat{D}(t) = aD(t) + (1-a)\hat{D}(t-1) = \hat{D}(t-1) + a[D(t) - \hat{D}(t-1)]$$
where $a \in (0,1)$ is known as the “smoothing constant”.
• Remark: The updating equation constitutes a correction of the previous
estimate in the direction suggested by the forecasting error, $D(t) - \hat{D}(t-1)$.
Expanding the Model Recursion
$$\hat{D}(t) = aD(t) + (1-a)\hat{D}(t-1)$$
$$= aD(t) + a(1-a)D(t-1) + (1-a)^2\hat{D}(t-2) = \ldots$$
$$= a\sum_{i=0}^{t-1}(1-a)^i D(t-i) + (1-a)^t\hat{D}(0)$$
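The expansion shows that the forecast is a weighted combination of all past observations and the initial estimate. As a short check (a worked step added here, not part of the original slide), the weights sum to one:

```latex
\sum_{i=0}^{t-1} a(1-a)^i + (1-a)^t
  = a \cdot \frac{1-(1-a)^t}{1-(1-a)} + (1-a)^t
  = 1 - (1-a)^t + (1-a)^t
  = 1
```

The weight $(1-a)^t$ carried by the initial estimate decays geometrically; for instance, with $a = 0.2$ and $t = 20$ it is $0.8^{20} \approx 0.0115$.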
The impact of $a$ and of $\hat{D}(0)$ on
the model performance
[Figure: the original data series with three exponential smoothing predictions (Series1 to Series4) over periods 1 to 40; vertical axis from 0.00 to 25.00]
• Dark blue series: the original data series.
• magenta series: the predictions of the ES(0.2) model initialized at the value of 10.0.
• yellow series: the predictions of the ES(0.2) model initialized at 0.0.
• light blue series: the predictions of the ES(0.8) model initialized at 10.0.
• Remark: the ES(0.8) model adjusts faster to the jump of the series mean value, but the estimates that it provides under
stationary operation are less accurate than those provided by the ES(0.2) model. Also, notice that the effect of the initial
value is only transient.
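A minimal sketch of the three models in the legend follows. The helper name `ses` and the noise-free step series are assumptions chosen so that the effects are easy to read; the smoothing constants and initial values are the ones named above.

```python
def ses(data, a, initial):
    """Simple exponential smoothing.

    Starts from D_hat(0) = initial and applies
    D_hat(t) = D_hat(t-1) + a * (D(t) - D_hat(t-1)),
    returning the list of estimates, one per observation.
    """
    estimates = []
    prev = initial
    for d in data:
        prev = prev + a * (d - prev)
        estimates.append(prev)
    return estimates

# Hypothetical noise-free series: mean 10, jumping to 15 halfway.
series = [10.0] * 20 + [15.0] * 20

slow = ses(series, 0.2, 10.0)
slow_bad_init = ses(series, 0.2, 0.0)
fast = ses(series, 0.8, 10.0)

# Estimates right after the jump for a = 0.2 and a = 0.8.
for t in (20, 22, 25):
    print(t, round(slow[t], 3), round(fast[t], 3))

# Remaining effect of the wrong initial value after 20 updates.
print(round(slow[19] - slow_bad_init[19], 4))
```

The last printed value equals the initial gap of 10 scaled by $(1-a)^{20}$, which is why the effect of the initial value is only transient.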
The inadequacy of SES and MA models for
data with linear trends
[Figure: the series Dt with the SES(0.5) and SES(1.0) predictions over periods 1 to 10; vertical axis from 0 to 12]
• blue series: a deterministic data series increasing linearly with a slope of 1.0.
• magenta series: the predictions obtained from the SES(0.5) model initialized at the exact value of 1.0.
• yellow series: the predictions obtained from the SES(1.0) model initialized at the exact value of 1.0.
• Remark: Both models underestimate the actual values, and the more inert model, SES(0.5), underestimates
them the most. This is to be expected, since both of these models (as well as any MA model)
essentially average the past observations. Therefore, neither the MA nor the SES model is appropriate for
forecasting a data series with a linear trend in it.
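The lag described in the remark can be checked numerically. This is a sketch under the assumption that the forecast for period t+1 is the smoothed estimate available at the end of period t; the helper name `ses_forecasts` is hypothetical.

```python
def ses_forecasts(data, a, initial):
    """One-step-ahead SES forecasts: the forecast for each period is the
    estimate held before that period's observation arrives."""
    forecasts = []
    estimate = initial
    for d in data:
        forecasts.append(estimate)               # forecast made before seeing d
        estimate = estimate + a * (d - estimate)
    return forecasts

trend = [float(t) for t in range(1, 11)]  # D(t) = t, slope 1.0

for a in (0.5, 1.0):
    f = ses_forecasts(trend, a, initial=1.0)
    gaps = [d - fc for d, fc in zip(trend, f)]   # actual minus forecast
    print(a, [round(g, 3) for g in gaps])
```

Under these assumptions the gap follows gap(t+1) = (1 - a) * gap(t) + slope, so it settles at slope / a: about 2 for SES(0.5) and exactly 1 for SES(1.0). Neither model closes it.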
Forecasting series with linear trend:
The Double Exponential Smoothing Model
• The presumed demand model:
$D(t) = I + T \cdot t + e(t)$
where
$I$ is the model intercept, i.e., the unknown mean value for t=0,
$T$ is the model trend, i.e., the mean increase per unit of time, and
$e(t)$ is normally distributed with zero mean and some unknown
variance $\sigma^2$
The Double Exponential Smoothing Model (cont.)
[Figure: the series Dt with the DES(T0=1) and DES(T0=0) predictions over periods 1 to 10; vertical axis from 0 to 10]
• blue series: a deterministic data series increasing linearly with a slope of 1.0.
• magenta series: the predictions obtained from the DES(0.5;0.2) model initialized at the exact value of 1.0.
• yellow series: the predictions obtained from the DES(0.5;0.2) model initialized at the value of 0.0.
• Remark: In the absence of variability in the original data, the first model is completely accurate (the blue and
the magenta series overlap completely), while the second model overcomes the deficiency of the wrong initial
estimate and eventually converges to the correct values.
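The behavior in the remark can be sketched as follows. The update equations of the model are not reproduced in this section, so the code assumes the standard Holt form of double exponential smoothing, with 0.5 smoothing the level and 0.2 smoothing the trend. Reading "initialized at 1.0" and "at 0.0" as the initial trend estimate follows the plot labels DES(T0=1) and DES(T0=0) and is also an assumption.

```python
def des_forecasts(data, a, b, level0, trend0):
    """One-step-ahead forecasts from Holt's linear (double) smoothing.

    a smooths the level and b smooths the trend; the forecast for the
    next period is level + trend.
    """
    level, trend = level0, trend0
    forecasts = []
    for d in data:
        forecasts.append(level + trend)          # forecast before seeing d
        new_level = a * d + (1 - a) * (level + trend)
        trend = b * (new_level - level) + (1 - b) * trend
        level = new_level
    return forecasts

data = [float(t) for t in range(1, 11)]  # D(t) = t, slope 1.0

exact = des_forecasts(data, 0.5, 0.2, level0=0.0, trend0=1.0)
wrong = des_forecasts(data, 0.5, 0.2, level0=0.0, trend0=0.0)

# Actual value, forecast with exact initial trend, forecast with trend 0.
for d, e, w in zip(data, exact, wrong):
    print(d, round(e, 3), round(w, 3))
```

With the exact initial trend every forecast matches the data; with the wrong one the error first grows for a few periods and then shrinks as the trend estimate is corrected.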
Time Series-based Forecasting:
Accommodating seasonal behavior
The data demonstrate a periodic behavior (and maybe some
additional linear trend).
[Figure: time plot of Series1 showing periodic demand; horizontal axis from 0 to 14, vertical axis from 0 to 350]
Remarks:
• At each cycle, the demand of a particular season is a fairly stable percentage of the total demand over the cycle.
• Hence, the ratio of a seasonal demand to the average seasonal demand of the corresponding cycle will be fairly
constant.
• This ratio is characterized as the corresponding seasonal index.
A forecasting methodology
Forecasts for the seasonal demand for subsequent years can be obtained by:
i. estimating the seasonal indices corresponding to the various seasons in the
cycle;
ii. estimating the average seasonal demand for the considered cycle (using, for
instance, a forecasting model for a series with constant mean or linear trend,
depending on the situation);
iii. adjusting the average seasonal demand by multiplying it with the
corresponding seasonal index.
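The three steps above can be sketched as follows. The quarterly figures are made up for illustration, and the linear extrapolation used in step ii is only a stand-in for whichever constant-mean or linear-trend model suits the data.

```python
def seasonal_indices(cycles):
    """Average, over cycles, of each season's demand divided by that
    cycle's mean seasonal demand."""
    n_seasons = len(cycles[0])
    ratios = []
    for cycle in cycles:
        mean = sum(cycle) / n_seasons
        ratios.append([d / mean for d in cycle])
    return [sum(r[s] for r in ratios) / len(cycles) for s in range(n_seasons)]

# Hypothetical quarterly demand for three past years (illustrative numbers).
history = [
    [80, 120, 160, 40],
    [100, 150, 200, 50],
    [120, 180, 240, 60],
]

indices = seasonal_indices(history)              # step i

# Step ii: average seasonal demand of the next cycle, here a plain linear
# extrapolation of the yearly means (an assumed stand-in model).
yearly_means = [sum(c) / len(c) for c in history]
growth = (yearly_means[-1] - yearly_means[0]) / (len(history) - 1)
next_mean = yearly_means[-1] + growth

forecast = [next_mean * idx for idx in indices]  # step iii
print(indices)
print(forecast)
```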
Example (cont.):
Car ownership: 1648 1665 1627 1791 1797
$\{ D_j;\ X_{1j}, X_{2j}, \ldots, X_{kj},\ j = 1, \ldots, n \}$
Estimating the parameters $b_i$
• The observed data satisfy the following equation:
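$$\begin{bmatrix} D_1 \\ D_2 \\ \vdots \\ D_n \end{bmatrix} = \begin{bmatrix} 1 & X_{11} & \cdots & X_{k1} \\ 1 & X_{12} & \cdots & X_{k2} \\ \vdots & \vdots & & \vdots \\ 1 & X_{1n} & \cdots & X_{kn} \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_k \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$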
or in a more concise form
$d = Xb + e$
• The vector
$e = d - Xb$
denotes the difference between the actual observations and the corresponding mean
values, and therefore, $\hat{b}$ is selected such that it minimizes the Euclidean norm of the
resulting vector $\hat{e} = d - X\hat{b}$.
• The minimizing value for $\hat{b}$ is equal to $\hat{b} = (X^T X)^{-1} X^T d$.
• The necessary and sufficient condition for the existence of $(X^T X)^{-1}$ is that the columns of
matrix $X$ are linearly independent.
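The estimate of b can be computed by solving the normal equations $(X^T X) b = X^T d$ directly. The sketch below uses plain Python and Gaussian elimination to stay self-contained; the function names and the small data set are hypothetical, and a numerical library would normally be preferred in practice.

```python
def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular: columns of X are dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                   # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def least_squares(rows, d):
    """Return b_hat solving (X^T X) b = X^T d, where each row of X
    is [1, x1, ..., xk]."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xtd = [sum(x[i] * dj for x, dj in zip(X, d)) for i in range(p)]
    return solve(xtx, xtd)

# Hypothetical data generated exactly from D = 2 + 3*X1 - 1*X2.
rows = [(1, 0), (0, 1), (2, 1), (3, 5), (4, 2)]
d = [2 + 3 * x1 - x2 for x1, x2 in rows]
print(least_squares(rows, d))
```

When the columns of X are linearly dependent (for example, one predictor is a multiple of another), the elimination meets a zero pivot and the function raises an error, which mirrors the existence condition stated above.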
Characterizing the model variance
• An unbiased estimate of $\sigma^2$ is given by
$$MSE = \frac{SSE}{n-k-1}$$ (Mean Squared Error)
where $SSE = \hat{e}^T \hat{e}$ is the sum of squared errors.
• The quantity $SSE/\sigma^2$ follows a Chi-square distribution with $n-k-1$ degrees of freedom.
• The random variable $\hat{D}(x_0)$ can function also as an estimator for any single observation
$D(x_0)$. The resulting error $\hat{D}(x_0) - D(x_0)$ will have zero mean and variance $\sigma^2[1 + x_0^T (X^T X)^{-1} x_0]$.
Assessing the goodness of fit
• A rigorous characterization of the quality of the resulting approximation can be obtained
through Analysis of Variance, which is covered in any introductory book on statistics.
$$R^2 = \frac{SSR}{SYY}$$
where
$$SSR = \hat{b}^T (X^T d) - n\bar{d}^2 = \sum_{j=1}^{n} (\hat{D}_j - \bar{d})^2$$
and
$$\bar{d} = \frac{1}{n}\sum_{j=1}^{n} D_j$$
• When the only explanatory variable is just the time variable t, the resulting multiple linear
regression model essentially supports time-series analysis.
• The above approach for time-series analysis enables the study of more complex
dependencies on time than those addressed by the moving average and exponential
smoothing models.
• Integrating a new observation into a multiple linear regression model is computationally much
more demanding than the updating performed by the moving average and exponential smoothing
models (although there is an incremental linear regression model that alleviates this
problem).
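The case where time is the only explanatory variable can be sketched with the closed-form least-squares solution for one predictor (k = 1). The demand figures are hypothetical, and SYY is assumed to be the usual total sum of squares, $\sum_j (D_j - \bar{d})^2$, since its definition does not appear in this section.

```python
def fit_time_trend(d):
    """Least-squares fit of D(t) = b0 + b1 * t for t = 1..n.

    Returns (b0, b1, r_squared, mse) with r_squared = SSR / SYY and
    mse = SSE / (n - k - 1), where k = 1.
    """
    n = len(d)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    d_bar = sum(d) / n
    sxy = sum((ti - t_bar) * (di - d_bar) for ti, di in zip(t, d))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    b1 = sxy / sxx
    b0 = d_bar - b1 * t_bar
    fitted = [b0 + b1 * ti for ti in t]
    ssr = sum((f - d_bar) ** 2 for f in fitted)
    syy = sum((di - d_bar) ** 2 for di in d)     # assumed total sum of squares
    sse = sum((di - f) ** 2 for di, f in zip(d, fitted))
    return b0, b1, ssr / syy, sse / (n - 2)

# Hypothetical demand observations for periods 1..8 (illustrative numbers).
demand = [12.1, 13.8, 16.2, 17.9, 20.3, 21.7, 24.2, 25.8]
b0, b1, r2, mse = fit_time_trend(demand)
print(round(b0, 3), round(b1, 3), round(r2, 4), round(mse, 4))
print("forecast for period 9:", round(b0 + b1 * 9, 2))
```

Unlike the moving average and exponential smoothing updates, this fit is recomputed from all observations each time a new one arrives, which illustrates the cost noted in the last bullet.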
Confidence Intervals
• Given a random variable $X$ and $p \in (0,1)$, a $p \cdot 100\%$ confidence
interval (CI) for it is an interval $[a,b]$ such that
$P(a \le X \le b) = p$
• Confidence intervals are used in:
i. monitoring the performance of the applied forecasting model;
ii. adjusting an obtained forecast in order to achieve a certain
performance level
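As an illustration of the first use, the sketch below builds a symmetric interval for the one-step forecast error of an MA(N) model, using the normal error distribution with variance $\sigma^2(1 + 1/N)$ from the forecasting error slides. It assumes $\sigma$ is known, whereas in practice it would have to be estimated; the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ma_error_interval(sigma, n, p):
    """Symmetric p*100% interval for the MA(n) one-step forecast error,
    assumed to be N(0, sigma^2 * (1 + 1/n))."""
    std = sigma * sqrt(1 + 1 / n)
    z = NormalDist().inv_cdf((1 + p) / 2)   # two-sided standard normal quantile
    return -z * std, z * std

low, high = ma_error_interval(sigma=2.0, n=5, p=0.95)
print(round(low, 3), round(high, 3))
```

A forecast error falling outside this interval more often than a fraction 1 - p of the time would suggest that the assumed model no longer describes the data.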
Whereas the simple moving average weights past observations equally,
exponential smoothing uses exponential functions to assign exponentially decreasing
weights over time.
Here, P is the traffic volume in PCU/day and GDP is the gross domestic