
Time series with python

This document serves as an introduction to time series analysis, covering definitions, components, and decomposition models. It discusses the statistical and dynamical system perspectives of time series, along with applications in various fields such as economics and anomaly detection. The document also outlines methods for decomposing time series data into trend, seasonality, and residuals, and includes exercises for practical understanding.

Uploaded by

kone moustapha

Introduction to time series analysis — Time series analysis with Python https://filippomb.github.io/python-time-series-handbook/notebooks/01...

Introduction to time series analysis
Contents
Introduction
Basics
Time series components
Decomposition Models
Time Series Decomposition
Identify the dominant period/frequency
Summary
Exercises

1 of 20 3/15/2025, 2:21 PM
Introduction
In this lecture we will cover the following topics:

Definition of time series data.


Introduction to time series analysis and application examples.
The main components of a time series.
Time series decomposition.

Basics

What is a time series?


A time series is a sequence of data points organized in time order.
Usually, the time signal is sampled at equally spaced points in time.
These can be represented as the sequence of the sampled values.

Irregularly sampled time signals can still be represented as a time series.
In that case, the sampling times must be encoded in an additional data structure.
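
As a minimal sketch of this idea (all values and timestamps below are made up for illustration), a regularly sampled series needs only its values, while an irregularly sampled one is stored together with an array of sampling times:

```python
import numpy as np

# Regularly sampled series: the values alone are enough,
# because sample k is implicitly taken at time k * dt.
dt = 1.0  # hypothetical sampling interval
regular_values = np.array([2.1, 2.4, 2.2, 2.9, 3.1])
regular_times = np.arange(len(regular_values)) * dt  # can always be rebuilt

# Irregularly sampled series: the timestamps carry extra information,
# so they are stored in a second array aligned with the values.
irregular_times = np.array([0.0, 0.7, 1.9, 2.0, 4.5])
irregular_values = np.array([2.1, 2.3, 2.8, 2.7, 3.4])

# The gaps between consecutive samples are no longer constant.
gaps = np.diff(irregular_times)
print(gaps)
```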

What data are represented as time series?


Time series are found in a myriad of natural phenomena, industrial and engineering
applications, business, human activities, and so on.


Other examples include data from:


Finance: stock prices, asset prices, macroeconomic factors.
E-Commerce: page views, new users, searches.
Business: transactions, revenue, inventory levels.
Natural language: machine translation, chatbots.

Time series analysis


The main purposes of time series analysis are:

1. To understand and characterize the underlying process that generates the observed
data.
2. To forecast the evolution of the process, i.e., predict the next observed values.


There are two main perspectives from which to look at a time series.
Each perspective leads to different time series analysis approaches.

Statistics perspective
A time series is a sequence of random variables that have some correlation or other
distributional relationship between them.
The sequence is a realization (observed values) of a stochastic process.
Statistical time series approaches focus on finding the parameters of the stochastic
process that most likely produced the observed time series.
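
To make the idea of "finding the parameters of the stochastic process" concrete, here is a small sketch. It assumes, purely for illustration, that the data come from a first-order autoregressive process AR(1) with a hypothetical coefficient `true_phi`; the model and its value are not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a realization of an AR(1) process: x[t] = phi * x[t-1] + noise.
true_phi = 0.8  # hypothetical parameter of the generating process
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = true_phi * x[t - 1] + rng.normal()

# Least-squares estimate of phi from the observed series alone
# (for Gaussian noise this matches the conditional maximum-likelihood estimate).
phi_hat = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
print(f"true phi = {true_phi}, estimated phi = {phi_hat:.3f}")
```

The estimate uses only the observed values, which is the situation the statistical perspective describes: the process is unknown and its parameters are inferred from one realization.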

Dynamical system perspective


This perspective assumes that there is a system governed by unknown variables
$\{x_1, x_2, \dots, x_N\}$.
Generally, we only observe one time series $y$ generated by the system.
What can $y$ be?
One of the system variables.
A function $f$ of system variables.
The objective of the analysis is to reconstruct the dynamics of the entire system from
$y$.


Applications
Time series analysis is applied in many real-world applications, including:

Economic forecasting
Stock market analysis
Demand planning and forecasting
Anomaly detection
… And much more

Economic Forecasting

Time series analysis is used in macroeconomic predictions.


The World Trade Organization uses time series forecasting to predict levels of
international trade [source].
The Federal Reserve uses time series forecasts of the economy to set interest rates
[source].

Demand forecasting

Time series analysis is used to predict demand at different levels of granularity.


Amazon and other e-commerce companies use time series modeling to predict
demand at a product-geography level [source].
This helps meet customer needs (fast shipping) and reduce inventory waste.

Anomaly detection

Used to detect anomalous behaviors in the underlying system by looking at unusual
patterns in the time series.
Widely used in manufacturing to detect defects and target preventive maintenance
[source].
With new IoT devices, anomaly detection is being used in machinery-heavy industries,
such as petroleum and gas [source].
Time series components
A time series is often assumed to be composed of three components:
Trend: the long-term direction.
Seasonality: the periodic behavior.
Residuals: the irregular fluctuations.

Trend
Trend captures the general direction of the time series.
For example, increasing number of passengers over the years despite seasonal
fluctuations.
Trend can be increasing, decreasing, or constant.
It can increase/decrease in different ways over time (linearly, exponentially, etc…).

Let’s create a trend from scratch to understand what it looks like.

import numpy as np

time = np.arange(144)
trend = time * 2.65 + 100
Seasonality
Periodic fluctuations in time series data that occur at regular intervals due to seasonal
factors.
It is characterized by consistent and predictable patterns over a specific period (e.g.,
daily, monthly, quarterly, yearly).

It can be driven by many factors.

Naturally occurring events such as weather fluctuations caused by time of year.


Business or administrative procedures, such as start and end of a school year.
Social or cultural behavior, e.g., holidays or religious observances.

Let’s generate the seasonal component.

seasonal = 20 + np.sin( time * 0.5) * 20

Residuals
Residuals are the random fluctuations left over after trend and seasonality are
removed from the original time series.
One should not see a trend or seasonal pattern in the residuals.
They represent short-term, rather unpredictable fluctuations.
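
A residual component can be simulated in the same spirit as the trend and seasonal components. The sketch below assumes the residuals are independent zero-mean Gaussian noise; the standard deviation of 3 and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded for reproducibility
time = np.arange(144)

# Assumption: residuals are independent Gaussian noise with zero mean.
# The standard deviation (3) is an arbitrary choice for this sketch.
residuals = rng.normal(loc=0.0, scale=3.0, size=len(time))
print(residuals.mean(), residuals.std())
```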
Decomposition Models
Time series components can be decomposed with the following models:
1. Additive decomposition
2. Multiplicative decomposition
3. Pseudoadditive decomposition

Additive model
Additive models assume that the observed time series is the sum of its components:

$X(t) = T(t) + S(t) + R(t)$

where
$X(t)$ is the time series
$T(t)$ is the trend
$S(t)$ is the seasonality
$R(t)$ is the residual
Additive models are used when the magnitudes of the seasonal and residual values
do not depend on the level of the trend.
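
A minimal sketch of an additive series, reusing the trend and seasonal definitions from the previous section. The residual term is an assumption (Gaussian noise with an arbitrary scale), and the name `additive` matches the variable used in the decomposition code later in the lecture.

```python
import numpy as np

rng = np.random.default_rng(42)
time = np.arange(144)

trend = time * 2.65 + 100                    # linear trend, as defined earlier
seasonal = 20 + np.sin(time * 0.5) * 20      # seasonal component, as defined earlier
residuals = rng.normal(0.0, 3.0, len(time))  # assumed Gaussian noise

# Additive model: series = trend + seasonality + residual
additive = trend + seasonal + residuals

# The seasonal swing has roughly the same size at the start and at the end
# of the series, regardless of how large the trend has become.
print(np.ptp(seasonal[:36]), np.ptp(seasonal[-36:]))
```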
Multiplicative Model
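Assumes that the observed time series $X(t)$ is the product of its trend $T(t)$, seasonality $S(t)$, and residual $R(t)$:

$X(t) = T(t) \times S(t) \times R(t)$

It is possible to transform a multiplicative model to an additive one by applying a log
transformation:

$\log(T(t) \times S(t) \times R(t)) = \log(T(t)) + \log(S(t)) + \log(R(t))$

Multiplicative models are used when the magnitudes of the seasonal and residual values
depend on the trend.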

multiplicative = trend * seasonal # we do not include the residuals to make the pattern m
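
The log transformation mentioned above can be checked numerically. The sketch below uses hypothetical, strictly positive components, since the logarithm is undefined for zero or negative values; these are not the components used elsewhere in the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
time = np.arange(144)

# Hypothetical strictly positive components (logs require values > 0).
trend = time * 2.65 + 100
seasonal = 1.0 + 0.3 * np.sin(2 * np.pi * time / 12)   # multiplicative factor around 1
residuals = np.exp(rng.normal(0.0, 0.05, len(time)))   # positive noise factor around 1

series = trend * seasonal * residuals

# The log of the product equals the sum of the logs: an additive model in log space.
log_sum = np.log(trend) + np.log(seasonal) + np.log(residuals)
print(np.allclose(np.log(series), log_sum))
```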
Pseudoadditive Model
Pseudoadditive models combine elements of the additive and multiplicative models.
Useful when:
Time series values are close to or equal to zero. Multiplicative models struggle
with zero values, but you still need to model multiplicative seasonality.
Some features are multiplicative (e.g., seasonal effects) and others are additive
(e.g., residuals).
Complex seasonal patterns or data that do not completely align with additive or
multiplicative models.

For example, this model is particularly relevant for modeling series that:
are extremely weather-dependent,
have sharply pronounced seasonal fluctuations and trend-cycle movements.
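Formulation, with $X(t)$ the time series, $T(t)$ the trend, $S(t)$ the seasonality, and $R(t)$ the residual:

$X(t) = T(t) \times (S(t) + R(t) - 1)$
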
pseudoadditive = trend * (seasonal + residuals - 1)

Time Series Decomposition


Now let’s reverse directions.
We have additive and multiplicative data.
Let’s decompose them into their three components.
A very simple, yet often useful, approach is to estimate a linear trend.
A detrended time series is obtained by subtracting the linear trend from the data.
The linear trend is computed as a 1st order polynomial.

slope, intercept = np.polyfit(np.arange(len(additive)), additive, 1) # estimate line coef


trend = np.arange(len(additive)) * slope + intercept # linear trend
detrended = additive - trend # remove the trend

Next, we will use seasonal_decompose (more information here) to isolate the main
time series components.
This is a simple method that requires us to specify the type of model (additive or
multiplicative) and the main period.

Additive Decomposition
We need to specify an integer that represents the main seasonality of the data.
By looking at the seasonal component, we see that the period is approximately 12
time steps long.
So, we set period=12.

The blue line in each plot represents the decomposition.
There is a legend in the upper left corner of each plot to let you know what each plot
represents.
You can see the decomposition is not perfect with regard to seasonality and
residuals, but it’s pretty close.

You may notice both trend and residuals are missing data towards the beginning and
end.
This has to do with how trend is calculated (beyond the scope of this lesson).
The residuals are missing simply because the residual is the series minus trend and
seasonality, $R(t) = X(t) - T(t) - S(t)$, so missing trend values
mean missing residual values as well.
In other words, there is nothing wrong with these graphs.
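
For readers curious about where the missing values come from, here is a rough sketch. It assumes the trend is estimated with a centered moving average over one period, which is the classical approach; the exact filter used by a given library may differ.

```python
import numpy as np

period = 12
x = np.arange(144) * 2.65 + 100  # any series of length 144 works here

# Centered moving average for an even period: 13 weights, with half weight
# on the two end points so that the window is symmetric around each time step.
weights = np.ones(period + 1) / period
weights[0] = weights[-1] = 0.5 / period

# Only positions with a full window on both sides get a value.
valid = np.convolve(x, weights, mode="valid")

trend_estimate = np.full(len(x), np.nan)
half = period // 2
trend_estimate[half:len(x) - half] = valid

print(np.isnan(trend_estimate).sum())  # number of missing trend values
```

The first and last `period // 2` time steps have no complete window around them, so no trend value can be computed there.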

Multiplicative Decomposition
We use the same function as before, but on the multiplicative time series.
Since we know this is a multiplicative time series, we declare
model='multiplicative' in seasonal_decompose .


multiplicative_decomposition = seasonal_decompose(x=multiplicative, model='multiplicative', period=12)


seas_decomp_plots(multiplicative, multiplicative_decomposition)

Again, the decomposition does a relatively good job picking up the overall trend and
seasonality.
We can see the shapes follow the patterns we expect.

Locally estimated scatterplot smoothing (LOESS)


Next, we try a second method called STL (Seasonal and Trend decomposition using
LOESS).
We start with the additive model.

stl_decomposition = STL(endog=additive, period=12, robust=True).fit()


seas_decomp_plots(additive, stl_decomposition)


The STL decomposition does a very good job on the additive time series.
Next, we try with the multiplicative one.

stl_decomposition = STL(endog=multiplicative, period=12, robust=True).fit()


seas_decomp_plots(multiplicative, stl_decomposition)

This decomposition is not as good as the previous one.

Which method to use?


Use seasonal_decompose when:

Your time series data has a clear and stable seasonal pattern and trend.
You prefer a simpler model with fewer parameters to adjust.
The seasonal amplitude is constant over time (suggesting an additive model) or
varies proportionally with the trend (suggesting a multiplicative model).

Use STL when:

Your time series exhibits complex seasonality that may change over time.
You need to handle outliers effectively without them distorting the trend and
seasonal components.
You are dealing with non-linear trends and seasonality, and you need more control
over the decomposition process.

Identify the dominant period/frequency


seasonal_decompose expects the dominant period as a parameter.

In this example, we generated the seasonal component by hand as follows:

seasonal = 20 + np.sin( time * 0.5) * 20

We said that the period was approximately 12.

But, in general, how do we find it?

You can use one of the following techniques:


Plot the data and try to figure out after how many steps the cycle repeats.
Do an Autocorrelation Plot (more on this later).
Use the Fast Fourier Transform on a signal without trend.
We will look more into FFT later on.
For now, you can use the following function to compute the dominant period in the
data.
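
The function itself is not reproduced here, so the following is only a sketch of how such a computation can be done with NumPy. The helper name `dominant_period` is hypothetical and is not the `fft_analysis` function from the lecture; it removes a linear trend and then picks the strongest non-zero frequency of the spectrum.

```python
import numpy as np

def dominant_period(signal):
    """Return (dominant frequency, dominant period) of a 1-D signal.

    `dominant_period` is a hypothetical helper written for this sketch.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)

    # Remove a linear trend first, otherwise low frequencies dominate the spectrum.
    steps = np.arange(n)
    slope, intercept = np.polyfit(steps, signal, 1)
    detrended = signal - (slope * steps + intercept)

    # Magnitude spectrum for the non-negative frequencies (cycles per time step).
    magnitudes = np.abs(np.fft.rfft(detrended))
    frequencies = np.fft.rfftfreq(n)

    # Skip the zero-frequency bin and pick the strongest remaining frequency.
    peak = np.argmax(magnitudes[1:]) + 1
    frequency = frequencies[peak]
    return frequency, 1.0 / frequency

time = np.arange(144)
seasonal = 20 + np.sin(time * 0.5) * 20
frequency, period = dominant_period(seasonal)
print(f"Dominant Frequency: {frequency:.3f}")
print(f"Dominant Period: {period:.2f} time units")
```

The frequency resolution is limited to multiples of 1/n, so the reported period is the closest value on that grid rather than the exact period of the signal.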

It turns out that the main seasonality was not exactly 12: np.sin(time * 0.5) repeats every 2π/0.5 ≈ 12.57 time steps.

If we want to generate a periodic signal with seasonality 12, we have to do as follows.
seasonal_12 = 20 + np.sin(2*np.pi*time/12)*20

fft_analysis(seasonal_12);

Dominant Frequency: 0.083


Dominant Period: 12.00 time units

Summary
In this lecture we covered the following topics.

The definition of a time series and examples of time series from the real world.
The definition of time series analysis and examples of its application in different
fields.
A practical understanding of the three components of time series data.
The additive, multiplicative, and pseudo-additive models.
Standard approaches to decompose a time series into its constituent parts.

Exercises

Exercise 1
Consider as the seasonal component the periodic signal with period 12

Use seasonal_12 and the trend and residual components below to define and
plot the additive and the multiplicative models

Perform the seasonal decomposition with seasonal_decompose and STL on the new
signals and compare the results with the ones obtained in class, where we used an
approximate period.

Exercise 2
Load the two different time series as follows.

import statsmodels.api as sm
ts_A = sm.datasets.get_rdataset("AirPassengers", "datasets").data["value"].values
print(len(ts_A))
ts_B = sm.datasets.get_rdataset("co2", "datasets").data["value"].values
print(len(ts_B))

Plot the two time series.


Determine whether each time series looks additive or multiplicative.
Determine the main period of the seasonal component in the two time series.

Exercise 3
Decompose ts_A and ts_B using seasonal_decompose and STL .
Comment on the results you obtain.

from statsmodels.tsa.seasonal import seasonal_decompose, STL
