Probabilistic Forecast
Rob J Hyndman
3 July 2024
Time series anomaly detection paradigms
1 Identify anomalies within a time series in real time: use one-step forecast distributions
2 Identify anomalies within a time series in historical data: use residual distributions from a smoothing method
3 Identify an anomalous time series in a collection of time series: use a feature-based approach
Australian PBS data
pbs
[Figure: Scripts for ATC group A12 (Mineral supplements); y-axis: Scripts]
Anomaly score distribution
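One-step forecast distribution: $N(\mu_t, \sigma^2)$
$$f(y_t \mid y_1, \dots, y_{t-1}) = \frac{1}{\sigma}\,\phi\!\left(\frac{y_t - \mu_t}{\sigma}\right) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(y_t - \mu_t)^2}{2\sigma^2}\right\}$$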
For each $t$:
Estimate one-step forecast density: $f(y_t \mid y_1, \dots, y_{t-1})$.
Anomaly score: $s_t = -\log \hat{f}(y_t \mid y_1, \dots, y_{t-1})$.
A high anomaly score indicates a potential anomaly.
Fit a Generalized Pareto Distribution (GPD) to the top 10% of anomaly scores seen so far.
$y_t$ is an anomaly if $P(S > s_t) < 0.05$ under the GPD.
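The steps above can be sketched in base R. This is only an illustration under stated assumptions, not the implementation used in the talk. The function names `anomaly_scores`, `gpd_nll` and `gpd_tail_prob` are hypothetical. The GPD is fitted by a simple maximum-likelihood search with `optim()`, the fit uses all scores at once rather than only those "seen so far", and scores at or below the 90% threshold are given probability 1 by convention.

```r
# Anomaly score: negative log density of y under a Normal one-step forecast
anomaly_scores <- function(y, mu, sigma) {
  -dnorm(y, mean = mu, sd = sigma, log = TRUE)
}

# Negative log-likelihood of a GPD (scale beta > 0, shape xi) for exceedances z > 0.
# The scale is parameterised on the log scale so the optimiser keeps it positive.
gpd_nll <- function(par, z) {
  beta <- exp(par[1])
  xi <- par[2]
  w <- 1 + xi * z / beta
  if (any(w <= 0)) return(1e10)  # outside the GPD support
  if (abs(xi) < 1e-8) return(length(z) * log(beta) + sum(z) / beta)
  length(z) * log(beta) + (1 / xi + 1) * sum(log(w))
}

# Fit a GPD to the top `top` fraction of scores and return, for each score,
# the fitted GPD survival probability of its exceedance over the threshold.
gpd_tail_prob <- function(s, top = 0.10) {
  u <- quantile(s, 1 - top, names = FALSE)  # threshold: 90% quantile by default
  z <- s[s > u] - u                         # exceedances over the threshold
  fit <- optim(c(log(mean(z)), 0.1), gpd_nll, z = z)
  beta <- exp(fit$par[1])
  xi <- fit$par[2]
  excess <- pmax(s - u, 0)
  w <- pmax(1 + xi * excess / beta, 0)
  surv <- if (abs(xi) < 1e-8) exp(-excess / beta) else w^(-1 / xi)
  ifelse(s > u, surv, 1)                    # assumed convention below the threshold
}

# Usage: 500 standard Normal observations with one planted outlier
set.seed(1)
y <- rnorm(500)
y[250] <- 8
s <- anomaly_scores(y, mu = 0, sigma = 1)
p <- gpd_tail_prob(s)
which(p < 0.05)  # indices flagged as anomalies
```

Observations whose probability falls below 0.05 are flagged, which mirrors the decision rule on the slide.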
Example
a12 <- pbs |> filter(ATC2 == "A12", Month <= yearmonth("2006 Jan"))
a12plus <- pbs |> filter(ATC2 == "A12", Month <= yearmonth("2006 Feb"))
fc <- a12 |> model(ets = ETS(Scripts)) |> forecast(h = 1)
fc |> autoplot(a12)
[Figure: output of fc |> autoplot(a12); x-axis: Month, y-axis: Scripts (thousands)]
fc |> autoplot(a12plus)
[Figure: output of fc |> autoplot(a12plus); x-axis: Month]
Rolling origin forecasts
[Figure: rolling forecast origin diagram, h = 1; x-axis: time]
pbs_stretch <- stretch_tsibble(pbs, .step = 1, .init = 36)
pbs_fit <- pbs_stretch |> model(ets = ETS(Scripts))
# A mable: 14,076 x 3
# Key: .id, ATC2 [14,076]
.id ATC2 ets
<int> <chr> <model>
1 1 A01 <ETS(M,N,A)>
2 1 A02 <ETS(M,A,M)>
3 1 A03 <ETS(M,A,M)>
4 1 A04 <ETS(M,N,A)>
5 1 A05 <ETS(A,Ad,N)>
6 1 A06 <ETS(M,A,M)>
7 1 A07 <ETS(M,N,M)>
8 1 A09 <ETS(M,A,M)>
9 1 A10 <ETS(M,A,M)>
10 1 A11 <ETS(M,A,M)>
# i 14,066 more rows
pbs_fc <- forecast(pbs_fit, h = 1)
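The rolling-origin scheme (stretch the data, refit at each origin, forecast one step ahead) can be illustrated without any packages. The sketch below is an assumption-laden stand-in: it swaps ETS for a naive random-walk forecast with a Normal forecast distribution, and the function name `rolling_naive_scores` is hypothetical. It only shows the expanding-window mechanics, starting from 36 observations as in `.init = 36`.

```r
# For each origin t = init, ..., n - 1: train on y[1:t], forecast y[t + 1],
# and score the actual value under the Normal one-step forecast distribution.
rolling_naive_scores <- function(y, init = 36) {
  stopifnot(length(y) > init)
  origins <- init:(length(y) - 1)
  out <- lapply(origins, function(t) {
    train <- y[1:t]
    mu <- train[t]             # naive forecast: last observed value
    sigma <- sd(diff(train))   # spread of past one-step changes
    actual <- y[t + 1]
    data.frame(
      origin = t, mu = mu, sigma = sigma, actual = actual,
      s = -dnorm(actual, mean = mu, sd = sigma, log = TRUE)
    )
  })
  do.call(rbind, out)
}

# Usage: a random walk with a spike planted at time 50
set.seed(42)
y <- cumsum(rnorm(60))
y[50] <- y[50] + 15
res <- rolling_naive_scores(y, init = 36)
res[which.max(res$s), ]  # the forecast made at origin 49 for time 50
```

Each row uses only data available at its origin, which is what makes the scores usable for real-time detection.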
PBS anomalies
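pbs_scores <- pbs_fc |>
  left_join(pbs |> rename(actual = Scripts), by = c("ATC2", "Month")) |>
  mutate(
    # anomaly score: negative log density of the actual value under the forecast distribution
    s = -log_likelihood(Scripts, actual),
    prob = lookout(density_scores = s, threshold_probability = 0.9)
  )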
# A fable: 67 x 7 [1M]
# Key: .id, ATC2 [67]
.id ATC2 Month Scripts actual s prob
<int> <chr> <mth> <dist> <dbl> <dbl> <dbl>
1 11 P03 1995 May N(2.3, 0.045) 3.83 26.1 0.0278
2 13 D05 1995 Jul N(0.33, 0.0039) 0.781 24.3 0.0307
3 18 A11 1995 Dec N(46, 6.6) 25.1 34.4 0.0192
4 18 C05 1995 Dec N(33, 4.9) 2.46 98.9 0.00510
5 18 D02 1995 Dec N(43, 5.9) 10.0 97.2 0.00522
6 18 D06 1995 Dec N(6.7, 0.17) 4.24 18.5 0.0455
7 18 D08 1995 Dec N(5.4, 0.11) 1.40 71.4 0.00759
8 18 G04 1995 Dec N(54, 8.4) 9.67 121. 0.00399
9 18 L03 1995 Dec N(0.33, 0.00054) 1.23 756. 0.000463
10 19 D02 1996 Jan N(38, 26) 8.07 19.8 0.0412
# i 57 more rows
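As a rough check on how the `s` column relates to the forecast distribution, assume the printed `N(a, b)` denotes a Normal with mean `a` and variance `b`. The score is then the negative Normal log density, shown here for row 3 (A11, 1995 Dec) using the rounded values from the table:

```latex
s_t = -\log f(y_t \mid y_1, \dots, y_{t-1})
    = \tfrac{1}{2}\log\!\left(2\pi\sigma^2\right) + \frac{(y_t - \mu_t)^2}{2\sigma^2}

% Row 3: \mu_t \approx 46, \sigma^2 \approx 6.6, y_t = 25.1
s_t \approx \tfrac{1}{2}\log(2\pi \times 6.6) + \frac{(25.1 - 46)^2}{2 \times 6.6}
    \approx 1.86 + 33.09 \approx 34.95
```

This is close to the reported 34.4. The small gap is presumably due to rounding in the printed mean and variance.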
[Figure: Scripts for ATC group L03; y-axis: Scripts]
[Figure: Scripts for ATC group R06; x-axis: Month (1992 to 2008), y-axis: Scripts]
[Figure: Scripts for ATC group N07; x-axis: Month (1992 to 2008), y-axis: Scripts]
Example: French mortality
fr_mortality
[Figure: French mortality rates on a log scale, coloured by Age (0 to 80); panels: Female, Male; y-axis: Mortality]
fr_fit
# A mable: 172 x 3
# Key: Age, Sex [172]
Age Sex stl
<int> <chr> <model>
1 0 Female <STL>
2 0 Male <STL>
3 1 Female <STL>
4 1 Male <STL>
5 2 Female <STL>
6 2 Male <STL>
7 3 Female <STL>
8 3 Male <STL>
9 4 Female <STL>
10 4 Male <STL>
augment(fr_fit)
fr_sigma <- augment(fr_fit) |>
  as_tibble() |>  # drop the tsibble structure so summarise() gives one row per Age/Sex, as in the 172-row output
  group_by(Age, Sex) |>
  summarise(sigma = IQR(.innov) / 1.349, .groups = "drop")
# A tibble: 172 x 3
Age Sex sigma
<int> <chr> <dbl>
1 0 Female 0.0643
2 0 Male 0.0616
3 1 Female 0.0894
4 1 Male 0.0788
5 2 Female 0.0900
6 2 Male 0.0931
7 3 Female 0.0925
8 3 Male 0.0864
9 4 Female 0.0963
10 4 Male 0.0931
# i 162 more rows
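The divisor 1.349 is the interquartile range of a standard Normal distribution, so `IQR(x) / 1.349` estimates the standard deviation while being far less sensitive to outliers than `sd(x)`. A small base-R sketch with simulated data (the data and variable names are illustrative only):

```r
# IQR of a standard Normal: qnorm(0.75) - qnorm(0.25), about 1.349
iqr_factor <- qnorm(0.75) - qnorm(0.25)

# Simulated residuals with true sd 0.1, then contaminated with 10 large outliers
set.seed(1)
x <- rnorm(1000, sd = 0.1)
x_out <- c(x, rep(5, 10))

# sd() is inflated by the outliers; the IQR-based estimate stays near 0.1
c(sd = sd(x_out), robust = IQR(x_out) / 1.349)
```

A robust scale estimate matters here because the anomalies being searched for would otherwise inflate the scale used to judge them.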
fr_scores <- augment(fr_fit) |>
  left_join(fr_sigma, by = c("Age", "Sex")) |>
  mutate(
    # anomaly score: negative log standard Normal density of the scaled residual
    s = -log(dnorm(.innov / sigma)),
    prob = lookout(density_scores = s, threshold_probability = 0.9)
  )
# A tibble: 31,648 x 7
Age Sex Year Mortality .innov s prob
<int> <chr> <int> <dbl> <dbl> <dbl> <dbl>
1 0 Female 1816 0.187 -0.0342 1.06 1
2 0 Female 1817 0.182 -0.0580 1.32 1
3 0 Female 1818 0.186 -0.0316 1.04 1
4 0 Female 1819 0.197 0.0311 1.04 1
5 0 Female 1820 0.181 -0.0483 1.20 1
6 0 Female 1821 0.182 -0.0385 1.10 1
7 0 Female 1822 0.207 0.0973 2.06 1
8 0 Female 1823 0.192 0.0263 1.00 1
9 0 Female 1824 0.199 0.0639 1.41 1
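The scores in the table above can be reproduced by hand. The sketch below takes the `.innov` values for 1816 and 1822 and the `sigma` printed earlier for Age 0, Female (0.0643), and compares `-log(dnorm(z))` with its closed form, z^2/2 + log(2*pi)/2. The inputs are the rounded values shown in the output, so agreement is to about two decimals.

```r
innov <- c(-0.0342, 0.0973)  # .innov for Age 0, Female, 1816 and 1822
sigma <- 0.0643              # robust scale for Age 0, Female

z <- innov / sigma                           # scaled residuals
s <- -dnorm(z, log = TRUE)                   # same as -log(dnorm(z))
s_closed <- 0.5 * z^2 + 0.5 * log(2 * pi)    # closed form for the Normal

round(s, 2)  # 1.06 2.06, matching the s column for those rows
```

Because the residual is scaled before `dnorm()` is applied, the score is a function of the standardized residual only, so it is comparable across Age and Sex groups with different scales.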
fr_scores |> arrange(prob)
# A tibble: 31,648 x 7
Age Sex Year Mortality .innov s prob
<int> <chr> <int> <dbl> <dbl> <dbl> <dbl>
1 28 Female 1944 0.0170 1.45 373. 0.00737
2 25 Female 1944 0.0191 1.59 331. 0.00831
3 26 Female 1944 0.0176 1.50 266. 0.0104
4 24 Female 1944 0.0150 1.40 259. 0.0106
5 27 Female 1944 0.0178 1.50 228. 0.0121
6 25 Male 1944 0.0432 1.89 170. 0.0163
7 18 Male 1914 0.0798 2.06 170. 0.0163
8 21 Female 1944 0.0120 1.29 168. 0.0165
9 27 Male 1944 0.0388 1.78 168. 0.0165
10 23 Female 1944 0.0134 1.29 167. 0.0166
# i 31,638 more rows
[Figure: French mortality anomalies; panels: Female, Male; x-axis: Year (1880 to 1940), y-axis: Age (0 to 80). Labelled years: 1871, 1914, 1915, 1916, 1918, 1940, 1943, 1944, 1945. Annotations: Franco-Prussian war and repression of the 'Commune de Paris'; 1914–1918: World War I]
[Figure: Mortality at Age 25 by Sex (Female, Male); x-axis: Year (1850 to 2000), y-axis: Mortality]
More information
Slides: robjhyndman.com/isf2024
Incomplete book: OTexts.com/weird
fable: fable.tidyverts.org
weird: pkg.robjhyndman.com/weird