Lecture 9
Time Series
Dr. Amr El-Wakeel
Lane Department of Computer Science and Electrical Engineering
Spring 24
Time Series
Quantitative Forecasting
Components of a time series: Trend, Cyclical, Seasonal, Irregular
[Figures: the four components illustrated: trend component over time; cyclical component (sales vs. time, cycles spanning years 99-01); seasonal component (within-year seasonal variations); irregular variation]
Multiplicative Time-Series Model
Yi = Ti × Ci × Ii
where Ti = Trend, Ci = Cyclical, Ii = Irregular
• For quarterly or monthly data, a seasonal component Si is added:
Yi = Ti × Si × Ci × Ii
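As a minimal sketch (not from the slides), a multiplicative decomposition can be estimated with statsmodels' seasonal_decompose; it returns trend, seasonal, and residual parts, so the cyclical and irregular components above are lumped together in its residual. The monthly sales series is made up for illustration.

```python
# Multiplicative decomposition sketch: sales ~ trend * seasonal * residual
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Made-up monthly series: upward trend times a 12-month seasonal pattern
idx = pd.date_range("1994-01-01", periods=48, freq="MS")
trend = np.linspace(100, 160, 48)
seasonal = 1 + 0.2 * np.sin(2 * np.pi * np.arange(48) / 12)
noise = np.random.default_rng(0).normal(1.0, 0.02, 48)
sales = pd.Series(trend * seasonal * noise, index=idx)

parts = seasonal_decompose(sales, model="multiplicative", period=12)
print(parts.seasonal.head(12))     # estimated seasonal factors Si
print(parts.trend.dropna().head())
```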
Time Series Forecasting
• Smoothing methods: Moving Average, Exponential Smoothing
• Trend models: Linear, Quadratic, Exponential, Auto-Regressive
Plotting Time Series Data
Moving Average Method
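The worked details of this slide are not in the extract. As a minimal sketch, a k-period moving average smooths the series and forecasts the next value as the mean of the last k observations; k = 3 is an arbitrary choice here, and the sales values reuse the annual data from the trend slides below.

```python
# Moving-average sketch: smooth with a k-period window, forecast with
# the mean of the last k observations (k = 3 is an assumption).
import pandas as pd

sales = pd.Series([2, 5, 2, 2, 7, 6])      # annual sales, coded years 0-5
smoothed = sales.rolling(window=3).mean()  # 3-period moving average
forecast = sales.tail(3).mean()            # next-period forecast
print(smoothed.tolist())
print(forecast)                            # (2 + 7 + 6) / 3 = 5.0
```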
Exponential Smoothing
Ei = W·Yi + (1 − W)·Ei−1
where W is the exponential smoothing constant: W·Yi weights the current observation and (1 − W)·Ei−1 carries forward the prior smoothed value.
Exponential Weight: Example
Ei = W·Yi + (1 − W)·Ei−1
Exponential Weight: Example Graph
[Figure: actual sales data vs. exponentially smoothed series, 1994-1999]
Forecast Effect of Smoothing Coefficient (W)

Ŷi+1 = W·Yi + W·(1 − W)·Yi−1 + W·(1 − W)²·Yi−2 + ...

W        Prior Period   2 Periods Ago   3 Periods Ago
         W              W(1 − W)        W(1 − W)²
0.10     10%            9%              8.1%
0.90     90%            9%              0.9%
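A short sketch of the recursion and of the weights W(1 − W)^k in the table above; starting the recursion at E1 = Y1 is a common convention rather than something stated on the slides, and the sales values reuse the trend example.

```python
# Exponential smoothing E_i = W*Y_i + (1 - W)*E_{i-1}
def exp_smooth(y, w):
    e = [y[0]]                    # assumed start: E_1 = Y_1
    for obs in y[1:]:
        e.append(w * obs + (1 - w) * e[-1])
    return e

y = [2, 5, 2, 2, 7, 6]
print(exp_smooth(y, 0.10))        # heavy smoothing
print(exp_smooth(y, 0.90))        # tracks the data closely

# Implied weights on the prior 1, 2, 3 periods (matches the table)
for w in (0.10, 0.90):
    print([round(w * (1 - w) ** k, 3) for k in range(3)])
```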
Linear Time-Series Forecasting Model
Ŷi = b0 + b1·Xi
[Figure: Y vs. Time X1; the trend line slopes upward when b1 > 0 and downward when b1 < 0]
The Linear Trend Model

Year   Coded X   Sales
94     0         2
95     1         5
96     2         2
97     3         2
98     4         7
99     5         6

Output:
Intercept      2.14285714
X Variable 1   0.74285714

Ŷi = 2.143 + 0.743·Xi
[Figure: annual sales 1993-2000 with the fitted trend line projected to year 2000]
Computing a and b

b = (n Σt·y − Σt·Σy) / (n Σt² − (Σt)²)

a = (Σy − b Σt) / n

where the sums run over t = 1, ..., n.
Linear Model Seems Reasonable
A computed linear relationship fits the sample data:

X:  7   2   6   4  14  15  16  12  14  20  15   7
Y: 15  10  13  15  25  27  24  20  27  44  34  17

[Figure: scatter plot of Y vs. X with the computed line]
Another Example
Variables: Weeks and Sales

Week t   t²   Sales y   t·y
1         1   150       150
2         4   157       314
3         9   162       486
4        16   166       664
5        25   177       885

Σt = 15   Σt² = 55   Σy = 812   Σt·y = 2499   (Σt)² = 225
Linear Trend Calculation

b = (n Σt·y − Σt·Σy) / (n Σt² − (Σt)²) = (5(2499) − 15(812)) / (5(55) − 225) = 315 / 50 = 6.3

a = (Σy − b Σt) / n = (812 − 6.3(15)) / 5 = 143.5

y = 143.5 + 6.3t
Sales in week t = 143.5 + 6.3t
Linear Trend Calculation
y = 143.5 + 6.3t
When t = 0, the value of y is 143.5 and the slope of the line is 6.3, meaning that the value of y increases by 6.3 units for each time period. If t = 10, the forecast is 143.5 + 6.3(10) = 206.5.
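The calculation above can be checked directly with the normal-equation formulas; nothing here goes beyond the values on the slides.

```python
# Verify a and b for the weeks/sales example
n = 5
t = [1, 2, 3, 4, 5]
y = [150, 157, 162, 166, 177]

sum_t = sum(t)                                  # 15
sum_t2 = sum(ti * ti for ti in t)               # 55
sum_y = sum(y)                                  # 812
sum_ty = sum(ti * yi for ti, yi in zip(t, y))   # 2499

b = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)  # 6.3
a = (sum_y - b * sum_t) / n                                   # 143.5
print(a, b)          # 143.5 6.3
print(a + b * 10)    # forecast for t = 10 -> 206.5
```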
Quadratic Time-Series Model
• Used for forecasting trend
• Relationship between response variable Y &
time X is a quadratic function
• Quadratic model
Ŷi = b0 + b1·Xi + b2·Xi²
The Quadratic Trend Model
Ŷi = b0 + b1·Xi + b2·Xi²

Year   Coded X   Sales
94     0         2
95     1         5
96     2         2
97     3         2
98     4         7
99     5         6

Output:
Intercept      2.85714286
X Variable 1   −0.3285714
X Variable 2   0.21428571

Ŷi = 2.857 − 0.33·Xi + 0.214·Xi²
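The slide's coefficients can be reproduced with numpy's polynomial fit (a check, not the method the slide's Excel output used):

```python
# Refit the quadratic trend on the slide's data (coded years 0..5)
import numpy as np

x = np.arange(6)
y = np.array([2, 5, 2, 2, 7, 6])
b2, b1, b0 = np.polyfit(x, y, 2)   # polyfit returns highest power first
print(b0, b1, b2)                  # ~2.8571, -0.3286, 0.2143
```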
Exponential Time-Series Model
[Figure: Y vs. Year X1; exponential growth when b1 > 1, decay when 0 < b1 < 1]
Exponential Trend Model
Ŷi = b0·b1^Xi   or equivalently   log Ŷi = log b0 + Xi·log b1
• (Weakly) stationary
– The covariance is independent of t for each h:
γX(h) = E[(Xt − μ)(Xt−h − μ)]
– The mean is independent of t: E(Xt) = μ
Why Stationary Time Series?
∇d Xt = Xt − Xt−d = (1 − B^d) Xt, where B is the backshift operator
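A quick numeric illustration of differencing (the example series is made up):

```python
# Differencing (1 - B^d) X_t with numpy
import numpy as np

x = np.array([1, 3, 6, 10, 15, 21])   # series with a quadratic trend
print(np.diff(x))                     # d = 1: [2 3 4 5 6]
print(np.diff(x, n=2))                # second differences: [1 1 1 1], constant
d = 2
print(x[d:] - x[:-d])                 # lag-2 difference (1 - B^2) X_t
```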
E.g. AR: Stationary Models
• AR (AutoRegressive)
Xt = φ1·Xt−1 + ⋯ + φp·Xt−p + Zt,   Zt ~ WN(0, σ²)
• AR's predictor
Pn Xn+1 = φ1·Xn + ⋯ + φp·Xn−p+1
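A numpy-only sketch: simulate an AR(2) with assumed coefficients φ = (0.6, 0.2), recover them by least squares, and form the one-step predictor above.

```python
# Simulate and fit an AR(2); the coefficients are illustrative assumptions
import numpy as np

rng = np.random.default_rng(1)
n, phi = 2000, np.array([0.6, 0.2])
x = np.zeros(n)
for t in range(2, n):
    x[t] = phi[0] * x[t - 1] + phi[1] * x[t - 2] + rng.normal()

# Regress X_t on (X_{t-1}, X_{t-2})
A = np.column_stack([x[1:-1], x[:-2]])
phi_hat, *_ = np.linalg.lstsq(A, x[2:], rcond=None)
print(phi_hat)                # roughly [0.6, 0.2]
print(phi_hat @ x[[-1, -2]])  # one-step predictor P_n X_{n+1}
```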
ARMA and ARIMA
• ARMA
– Reduces large autocovariance functions
– A transformed linear predictor is used
ARMA = AR + MA
Autoregressive process of order p:

Xt = c + φ1·Xt−1 + ⋯ + φp·Xt−p + εt

where:
• p is the order
• c is a constant
• εt is the noise term

Also, a moving average process of order q is defined as:

Xt = c + εt + θ1·εt−1 + ⋯ + θq·εt−q

where:
• q is the order
• c is a constant
• εt is the noise term
ARMA
ARMA(p, q) is simply the combination of both models into a single equation:

Xt = c + φ1·Xt−1 + ⋯ + φp·Xt−p + εt + θ1·εt−1 + ⋯ + θq·εt−q

• A value of 0 can be used for a parameter, which indicates not to use that element of the model.
• In other words, an ARIMA model can be configured to perform the function of an ARMA model, or even a simple AR, I, or MA model.

ARIMA(p, d, q) applies ARMA to the d-times differenced series, and the equation is expressed as:

(1 − φ1·B − ⋯ − φp·B^p)(1 − B)^d Xt = c + (1 + θ1·B + ⋯ + θq·B^q)·εt
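A hedged sketch of fitting an ARIMA with statsmodels; the order (1, 1, 1) and the simulated series are illustrative assumptions, not values from the slides.

```python
# Fit an ARIMA(p, d, q); d = 0 would reduce it to an ARMA(p, q)
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(0.5, 1.0, 200))   # trending series, so d = 1

fit = ARIMA(y, order=(1, 1, 1)).fit()
print(fit.params)
print(fit.forecast(steps=3))               # three steps ahead
```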
ACF and PACF
• ACF is the (complete) auto-correlation function, which gives the auto-correlation of a series with its lagged values.
• PACF is the partial auto-correlation function. Instead of correlating the present value with its lags directly, as ACF does, it correlates the residuals (what remains after removing the effects already explained by earlier lags) with the next lag value; hence 'partial' rather than 'complete', since already-explained variation is removed before each new correlation is computed.
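A small sketch computing ACF and PACF values with statsmodels on a simulated AR(1)-like series (the 0.7 coefficient is an assumption); statsmodels also provides plot_acf and plot_pacf for the corresponding plots.

```python
# ACF tails off and PACF cuts off after lag p for an AR(p) process
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(0)
x = rng.normal(size=500)
for t in range(1, 500):
    x[t] += 0.7 * x[t - 1]     # inject AR(1) structure

print(acf(x, nlags=10))        # decays gradually
print(pacf(x, nlags=10))       # roughly zero beyond lag 1
```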
ARMA and ARIMA
• ACF and PACF can be used for determining the ARIMA model hyperparameters p and q.
• Multivariate Cointegration
• SARIMA
• FARIMA
• …
Ensemble methods
We need to make sure the individual models do not all just learn the same thing.
Bagging
How?
Bootstrap
• Construct B (hundreds) of trees (no pruning)
• Learn a classifier for each bootstrap sample and
average them
• Very effective
Bagging
• Easy to parallelize
Bagging decision trees
Hastie et al., "The Elements of Statistical Learning: Data Mining, Inference, and Prediction", Springer (2009)
No Overfitting
[Figure: decision boundaries of bagged decision trees in the (X1, X2) plane]
Random Forest: Time-Series Prediction
https://otexts.com/fpp2/residuals.html
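A minimal sketch of random-forest time-series prediction with scikit-learn, framing the forecast as regression on lagged values; the series, lag count, and tree count are assumptions for illustration.

```python
# Random forest on lagged features: predict y_t from the previous 3 values
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
y = np.sin(np.arange(300) / 10) + rng.normal(0, 0.1, 300)  # toy series

lags = 3
X = np.column_stack([y[i:len(y) - lags + i] for i in range(lags)])
target = y[lags:]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:-50], target[:-50])                # hold out the last 50 points
print(model.score(X[-50:], target[-50:]))       # holdout R^2
print(model.predict(y[-lags:].reshape(1, -1)))  # one-step-ahead forecast
```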
Residual Analysis
[Figure: four residual plots, e vs. T: random errors; cyclical effects not accounted for; trend not accounted for; seasonal effects not accounted for]
Measuring Errors