Forecasting epidemic:
Time series modelling
Dr Cho-Min-Naing
Medical Officer (Malaria/DHF)
The National Vector Borne Diseases Control
Project, Yangon, Insein PO, Myanmar
Learning objectives:
At the end of this session,
1. the participant should
understand forecasting methods.
2. the participant should know
concepts behind forecasting models.
Performance objectives:
1. the participant should be able
to develop a time series model for
forecasting epidemics.
I. Background:
Unaided, subjective judgements to warn
of forthcoming events and changes are not as
accurate and effective as systematic, explicit
approaches to forecasting.
This does not mean that forecasts can be error
free. It does mean that explicit, systematic
forecasting approaches can provide
substantial benefits when used properly, since
forecasting techniques of many types and forms are
available to suit the existing data.
II. Forecasting methods:
There are three major categories
as stated below.
1. Judgmental method
2. Quantitative method
3. Technological method
1. Judgmental method:
Forecasts are made as
individual judgements or by
committee agreement or
decisions.
2. Quantitative method:
Aims to tell what will happen, but not why
it happens.
There are three subcategories of this method.
2.1 Time series methods: Seek to identify
historical patterns (using time as a reference)
and then forecast using a time-based
extrapolation of those patterns.
3. Technological method:
Address long-term issues of a
technological, societal, political or
economic nature.
3.1 Extrapolative methods: using historical
patterns and relationships as a basis for forecasts
3.2 Analogy-based methods: using historical
and other analogies to make forecasts
3.3 Expert-based methods:
3.4 Normative-based methods: using
objectives, goals, and desired outcomes as a
basis for forecasting, thereby influencing future
events.
III. Selection points for the appropriate
forecasting techniques.
When forecasting is applied to decision making, it is
worth reiterating the importance of selecting the appropriate
forecasting technique. In this context, there are six
points that play an important role in determining the
requirements for an appropriate technique.
1. Time horizon: Generally, the time horizon can be divided
into immediate term (less than 1 month), short term (1 to 3
months), medium term (3 months to 2 years), and
long term (2 years or more). The exact length of time
used to classify these four categories varies
by organization and situation.
2. Level of aggregate detail: In general, the greater the
level of detail (and frequency) that is required, the greater
the need for an automated forecasting procedure, and vice
versa.
3. Number of items: The larger the number of items
involved (all other things being equal), the more accurate
the forecasts.
4. Control versus planning: In control, management by
exception is the general procedure. Thus, a forecasting method
in such a situation should be able to recognize changes in
basic patterns or relationships at an early stage. On the
planning side, it is generally assumed that the existing
patterns will continue in the future, so the major emphasis is on
identifying those patterns and extrapolating them into the future.
5. Constancy: Forecasting a situation that is constant
over time is very different from forecasting one that is in a
state of flux. In a stable situation, a quantitative
forecasting method can be adopted and checked
periodically to reconfirm its appropriateness. In changing
circumstances, what is needed is a method that can adapt
continually to reflect the most recent results and the latest
information.
6. Existing planning procedure: The greater the
competition (all other things being equal), the more difficult
it is to forecast. There is built-in resistance to change in any
organization, so changes based on the outcomes of forecasting
models are better made in a stepwise manner, rather
than all at once.
IV. Concepts behind
time series analysis:
In man, the conflict is what is desired
and what should be desired. In the
animal, the conflict is what is and
what is desired.
Time series forecasting treats the system as
a black box and makes no attempt to discover
the factors affecting its behavior. It explains
only what will happen, but not why something
happens. The general formula for the time
series model is
Actual = pattern + randomness
The common goal in the application of
forecasting techniques is to minimize these
deviations or errors in the forecast. The errors
are defined as the differences between the
actual value and what was predicted.
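The error definition above can be sketched in a few lines of Python. This is a minimal illustration, not the session's own procedure: the function names and the three sample values are hypothetical, and the three summary measures follow the usual definitions of the MAPE, MAD and MSD figures that appear in the MINITAB output later in this session.

```python
def forecast_errors(actual, predicted):
    """Error = actual value minus predicted value, period by period."""
    return [a - p for a, p in zip(actual, predicted)]

def mad(actual, predicted):
    """Mean absolute deviation: average size of the errors."""
    errors = forecast_errors(actual, predicted)
    return sum(abs(e) for e in errors) / len(errors)

def msd(actual, predicted):
    """Mean squared deviation: penalizes large errors more heavily."""
    errors = forecast_errors(actual, predicted)
    return sum(e * e for e in errors) / len(errors)

def mape(actual, predicted):
    """Mean absolute percentage error; every actual value must be non-zero."""
    errors = forecast_errors(actual, predicted)
    return 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / len(errors)

# Hypothetical values for illustration only (not the malaria data).
actual = [100, 200, 400]
predicted = [110, 180, 400]

print(forecast_errors(actual, predicted))
print(mad(actual, predicted), msd(actual, predicted), mape(actual, predicted))
```

Smaller values of all three measures indicate a closer fit, which is why they can be used to compare competing forecasting models on the same data.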
V. The decomposition method
We will selectively present the decomposition method,
assuming that the data can be broken down into the
various components and a forecast obtained for each
component.
Advantages:
1. The simplicity of the procedures
2. Ease of computation
3. The minimal start-up time
4. Accuracy, especially for short-term forecasting
Disadvantage: No sound statistical theory
behind the method
Time series models can basically be
classified into two types: the additive model and
the multiplicative model.
The forecast for Y in time period t is
generally written as
$\hat{Y}_t = f(Tr_t, Sn_t, Cl_t, \varepsilon_t)$
$\hat{Y}$ = forecast of Y
f = function
Tr = trend
Sn = seasonal variation
Cl = cycle
$\varepsilon$ = error
t = the time period being
examined (t = 1, 2, ...).
Additive model
1. We assume that the data is the sum of the
time series components.
$Y_t = Tr_t + Sn_t + Cl_t + \varepsilon_t$
2. If the data do not contain one of the
components (e.g., cycle) the value
for that missing component is zero.
Suppose there is no cycle, then
$Y_t = Tr_t + Sn_t + \varepsilon_t$
3. The seasonal component is independent of
trend, and thus magnitude of the seasonal
swing is constant over time.
Multiplicative model
1. We assume that the data is
the product of the various components.
$Y_t = Tr_t \times Sn_t \times Cl_t \times \varepsilon_t$
2. If trend, seasonal variation, or cycle is missing,
its value is assumed to be 1.
Suppose there is no cycle, then
$Y_t = Tr_t \times Sn_t \times \varepsilon_t$
3. The seasonal factor of multiplicative model is a
proportion (ratio) to the trends, and thus the
magnitude of the seasonal swing increases or
decreases according to the behaviour of trend.
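The contrast between the two models can be sketched with made-up components. In the Python example below, the trend and the seasonal values are hypothetical (they are not the malaria data), the cycle is left out, and the error term is set to its neutral value (0 in the additive model, 1 in the multiplicative model) so that only the seasonal behaviour is visible.

```python
# Hypothetical components for two years of quarterly data.
trend = [100 + 10 * t for t in range(1, 9)]   # Tr_t: 110, 120, ..., 180
add_season = [20, -30, -10, 20]                # additive Sn: sums to 0 over a year
mult_season = [1.2, 0.7, 0.9, 1.2]             # multiplicative Sn: averages 1 over a year

# No cycle; error term at its neutral value in each model.
additive = [tr + add_season[i % 4] for i, tr in enumerate(trend)]
multiplicative = [tr * mult_season[i % 4] for i, tr in enumerate(trend)]

def seasonal_swing(series, trend, year):
    """Largest minus smallest departure from trend within one year (year is 0-based)."""
    departures = [series[i] - trend[i] for i in range(4 * year, 4 * year + 4)]
    return max(departures) - min(departures)

for year in (0, 1):
    print("year", year + 1,
          "additive swing:", seasonal_swing(additive, trend, year),
          "multiplicative swing:", seasonal_swing(multiplicative, trend, year))
```

With a rising trend, the additive swing stays the same from one year to the next, while the multiplicative swing grows in step with the trend, which is the distinction drawn in point 3 of each model.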
VI. Case study:
Quarterly malaria cases of a township in
Myanmar between 1984 and 1992 are shown in
Table 1. Using the multiplicative
decomposition method,
a) calculate the centered moving average for
the time series data.
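One common way to carry out this step for quarterly data is sketched below in Python. It is an illustration under stated assumptions: the series is made up (Table 1 is not reproduced here), the function names are hypothetical, and the seasonal indices are taken as the mean ratio of actual value to centered moving average for each quarter, which may differ in detail from the procedure MINITAB applies.

```python
def centered_moving_average(y, period=4):
    """Centered moving average for an even period (4 = quarterly).

    The first and last observations in each window get half weight,
    so the average is centered on an actual time period.
    """
    half = period // 2
    cma = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half:t + half + 1]
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        cma[t] = total / period
    return cma

def seasonal_indices(y, period=4):
    """Mean ratio of actual to centered moving average, rescaled to average 1."""
    cma = centered_moving_average(y, period)
    ratios = {q: [] for q in range(period)}
    for t, (value, average) in enumerate(zip(y, cma)):
        if average is not None:
            ratios[t % period].append(value / average)
    raw = [sum(r) / len(r) for r in ratios.values()]
    scale = period / sum(raw)
    return [index * scale for index in raw]

# Hypothetical quarterly counts for two years (not the malaria data).
cases = [10, 20, 30, 40, 10, 20, 30, 40]

print(centered_moving_average(cases))
print(seasonal_indices(cases))
```

The first two and last two quarters have no centered moving average because a full window is not available there.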
Objectives of modelling :
1) to monitor the malaria situation
in the study area and forecast with
modelling;
Methods:
1. This is a documentary study using time series
data covering 1984 to 1992.
2. The dependent variable was the incidence of
malaria occurring during a given time including both
out-patient and in-patient malaria cases.
3. For a starting point, we demonstrated a simple,
two-variable regression model using the
Results:
The output of the MINITAB program illustrating
seasonal indices and centred moving average.
Time series (multiplicative decomposition
method)
Seasonal Indices
Period  Index
1       1.18483
2       0.309150
3       0.738706
4       1.76732
Accuracy of Model
MAPE:   494
MAD:    234
MSD:    101789
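To read the seasonal indices reported above, the short Python sketch below checks that they average to about 1 and applies one of them to a trend value. The trend value of 400 cases is hypothetical, the function name is illustrative, and periods 1 to 4 are assumed to correspond to the four quarters of the year.

```python
# Seasonal indices as reported in the MINITAB output.
seasonal_index = {1: 1.18483, 2: 0.309150, 3: 0.738706, 4: 1.76732}

# In a multiplicative model the indices should average to about 1.
mean_index = sum(seasonal_index.values()) / len(seasonal_index)

def seasonalize(trend_value, period):
    """Multiplicative model without cycle or error: forecast = trend x seasonal index."""
    return trend_value * seasonal_index[period]

hypothetical_trend = 400.0   # illustrative trend level, not taken from the data

for period in sorted(seasonal_index):
    percent_of_trend = 100 * seasonal_index[period]
    print(period, round(percent_of_trend, 1), round(seasonalize(hypothetical_trend, period), 1))
print("mean index:", round(mean_index, 4))
```

An index above 1 marks a period in which cases run above the trend level, and an index below 1 marks a period in which they run below it.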
Fig 1. The multiplicative decomposition
method: Actual versus forecast values for
malaria cases, 1984-92
[Plot of cases against quarterly time periods; series: Actual, Predicted, Forecast. MAPE: 249.4; MAD: 206.1; MSD: 89899.9]
Fig 2. Moving average model
[Plot of cases against quarterly time periods; series: Actual, Predicted. Moving average length: not shown; MAPE: 499; MAD: 256; MSD: 132728]
Linear regression model (simple,
two-variable model) in MINITAB
The regression equation is
Y = 21 + 10.9 X
Predictor   Coef     StDev    T      P
Constant    21.1     110.5    0.19   0.850
X           10.932   5.209    2.10   0.043
S = 324.7   R-Sq = 11.5%   R-Sq(adj) = 8.9%
Dependent variable: quarterly malaria cases
Independent variable: time
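As a worked reading of this output, the Python sketch below uses the fitted equation to project one period ahead and recomputes the T values as coefficient divided by standard deviation. It assumes that X counts quarters 1 to 36 for 1984 to 1992, so that X = 37 is the first quarter after the data; that coding is an assumption, and the function name is illustrative.

```python
# Coefficients as reported with the fitted line (Y = 21.0603 + 10.9322X).
INTERCEPT = 21.0603
SLOPE = 10.9322

def predict_cases(x):
    """Point forecast from the simple two-variable regression on time."""
    return INTERCEPT + SLOPE * x

# T = Coef / StDev, using the rounded values in the MINITAB table.
t_constant = 21.1 / 110.5
t_slope = 10.932 / 5.209

next_quarter = 37   # assumption: X = 1..36 covers the 36 quarters of 1984-1992

print(round(t_constant, 2), round(t_slope, 2))
print(round(predict_cases(next_quarter), 1))
```

With R-Sq at 11.5%, time alone accounts for only a small share of the variation in quarterly cases, so a point forecast of this kind carries a wide margin of uncertainty.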
Fig 3. The linear regression model for malaria cases, 1984-92
Y = 21.0603 + 10.9322X
R-Sq = 0.115
[Plot of cases against quarterly time periods; regression line with 95% CI and 95% PI bands]
Evaluating the model:
Before completing the analysis, diagnostic tests
taking account of statistical pathology are to be
investigated.
Graphing the actual values with the predicted
values: how small are the differences?
[see Figure 1]
Graphing the residual:
Discussion:
1. Epidemic of malaria: What?
1.1 A periodic, rapid and great increase in
malaria morbidity and perhaps mortality, reaching
levels above local average endemicity.
1.2 A rapid increase in malaria morbidity and
mortality in a given population (independent of
Points to ponder:
1. Among diverse factors, the selection of
independent variables should be judiciously
based on theoretical considerations.
2. It is worth emphasizing that the simple,
two-variable regression model is limited in
information.
3. The preferred approach is to perform
A cautionary note:
For real progress, the
mathematical modeller as well as
the epidemiologist must have
mud on his boots. [Bradley, 1982].
References:
1. Armitage P, Berry G. Automatic selection
procedures and collinearity. In: Statistical Methods
in Medical Research. 3rd ed. Blackwell Scientific
Publications, Oxford. 1994; 321-323.
4. Foster DP, Stine RA, Waterman RP. Summary
regression case. In: Business Analysis Using
Regression: A Case Book. Springer-Verlag New York,
Inc. 1998; 227.
5. Gujarati DN. Test of specification errors. In: Basic
Econometrics. 3rd ed. McGraw-Hill, Inc. Singapore.