
Time Series Analysis

Time Series Analysis and Forecasting
• Time series analysis is a way of analyzing a sequence of data points collected over an interval of time.
• Two basic tasks: identifying patterns in the data, and discovering relationships among variables of interest.
• A common method of identifying patterns in historical data is to decompose the series into components: trend, seasonal, cyclic, and random.
Time Series Analysis and Forecasting
• Components of a time series:
Trend-cycle component:
The trend represents the long-term movement or direction of the data. It filters out short-term noise fluctuations.
Seasonal component:
It accounts for regular, recurring patterns that manifest at fixed time intervals.
Remainder component:
Random variability; variations not accounted for by the other components.
• In a business context, the seasonal component reflects that certain products are bought in particular periods (for example, umbrellas during the rainy season).
Time Series Analysis and Forecasting
• Cyclic component: Unlike the seasonal component, cyclic fluctuations generally do not have a fixed frequency.
• Example: the economic landscape has ups and downs; markets rise and fall, but not at a fixed frequency.
• The seasonal component, in contrast, is tied to the calendar (a day, a season, a month, etc.).
• Seasonality often has a shorter period than cyclicity.
Need for Time Series Analysis and Forecasting
• The capacity to accurately predict future values of time series data is paramount.
• It underpins decision-making processes across various sectors, including finance, economics, and environmental sciences.
• Example: in Industry 4.0 (smart manufacturing), real-time sensor values are observed periodically, and real-time decision-making necessitates automation (for instance, a compressor has to shut down when the temperature reading from a sensor is not within the expected range).
Moving Average
• It is usually represented as MA(q)
• Forecasts are based on the series mean plus a weighted combination of past forecast errors.
• The model is
$$x_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$$
where:
$\mu$ is the mean.
$x_t$ is the time series at time t.
$\varepsilon_t$ is white noise with zero mean and variance $\sigma^2$.
$q$ is the order of the MA model (number of past errors used).
$\theta_1, \theta_2, \dots, \theta_q$ are the MA coefficients to be estimated.
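As a minimal sketch of what this equation describes, the snippet below simulates an MA(2) process; the values of $\mu$, $\theta_1$, and $\theta_2$ are illustrative assumptions, not estimates from any dataset.

import random

random.seed(0)
mu, theta1, theta2 = 10.0, 0.5, 0.2                  # assumed illustrative values
eps = [random.gauss(0, 1) for _ in range(100)]       # white noise: zero mean, unit variance
# x_t = mu + eps_t + theta1 * eps_{t-1} + theta2 * eps_{t-2}
x = [mu + eps[t] + theta1 * eps[t - 1] + theta2 * eps[t - 2] for t in range(2, 100)]
print(x[:5])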
Exponential Smoothing
• Let $F_t$ be the forecast for period t and $x_t$ the actual data in period t.
• The model:
$$F_{t+1} = \alpha x_t + (1 - \alpha) F_t, \qquad 0 < \alpha \le 1$$
• There are different approaches to determine the initial forecast $F_1$ (the first data value, the last data value, the average, etc.).
• The initial few forecasts are not going to be precise; for a larger dataset, the impact of $F_1$ is smaller.
• A smaller $\alpha$ value is preferred when the data fluctuates more.
• It is called smoothing because it reduces the fluctuations while still exhibiting the trend.
alpha = 0.8

Time  Temp   Exponential Smoothing
1     104.3  104.3
2     103.8  104.3
3     105.1  103.9
4     104.9  104.9
5     105.6  104.9
6     105.9  105.5
7     106.1  105.8
8     106.3  106.0

• For a larger alpha value (0.8), the predictions stay close to the data, as there are only small variations in the sample.
alpha = 0.2

Time  Temp   Exponential Smoothing
1     104.3  104.3
2     103.8  104.3
3     105.1  104.2
4     104.9  104.4
5     105.6  104.5
6     105.9  104.7
7     106.1  104.9
8     106.3  105.2

• For a smaller alpha value (0.2), the predictions respond slowly and do not stay as close to the data, even though variations in the sample are small.
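A minimal sketch that reproduces both forecast columns above, assuming (as in the tables) that the forecast is initialized with the first observation:

def exp_smooth(data, alpha):
    forecasts = [data[0]]                      # F_1 initialized to the first data value
    for t in range(1, len(data)):
        # F_{t+1} = alpha * x_t + (1 - alpha) * F_t
        forecasts.append(alpha * data[t - 1] + (1 - alpha) * forecasts[-1])
    return forecasts

temps = [104.3, 103.8, 105.1, 104.9, 105.6, 105.9, 106.1, 106.3]
print([round(f, 1) for f in exp_smooth(temps, 0.8)])   # alpha = 0.8 column
print([round(f, 1) for f in exp_smooth(temps, 0.2)])   # alpha = 0.2 column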
Auto Regression (AR)
• It is usually represented as AR(p).
• The autoregressive model specifies that the output variable depends linearly on its own previous values:
$$x_t = c + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + \varepsilon_t$$
where:
$x_t$ is the time series at time t.
$c$ is a constant and $\phi_1, \dots, \phi_p$ are the model parameters.
$\varepsilon_t$ is white noise.
Auto Regression (AR)
• During training, the model learns the optimal
parameters by minimising a loss function such as
the Mean Squared Error (MSE):
$$\text{MSE} = \frac{1}{N} \sum_{t=1}^{N} \left( x_t - \hat{x}_t \right)^2$$
where $x_t$ is the actual value and $\hat{x}_t$ is the predicted value.
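A minimal NumPy sketch of this idea: fitting AR(1) by ordinary least squares (the approach the AutoReg model below uses), regressing $x_t$ on $x_{t-1}$ plus a constant to minimise the MSE. The series is the one used in the Python examples that follow.

import numpy as np

y = np.array([1.99, 2.29, 2.38, 2.58, 2.49, 2.63, 2.83, 2.92, 3.13, 3.26, 3.27])
X = np.column_stack([np.ones(len(y) - 1), y[:-1]])    # regressors: constant and x_{t-1}
c, phi1 = np.linalg.lstsq(X, y[1:], rcond=None)[0]    # OLS estimates minimise the MSE
mse = np.mean((y[1:] - X @ np.array([c, phi1])) ** 2)
print(c, phi1, mse)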
ARMA (AR + MA)
• It is usually represented as ARMA(p,q):
$$x_t = c + \phi_1 x_{t-1} + \dots + \phi_p x_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$$
How are MA coefficients estimated? Numerical methods are used to estimate the parameters.
Step 1: Initialize Model Parameters
Before estimating, we assume:
• Choose the order q (say 2).
• The initial error terms are unknown, so we start with an initial guess (often zero).
• $\mu$ is estimated as the mean of the observed series.
ARMA (AR + MA)
Step 2: Compute Residuals Iteratively
To compute the error terms $\varepsilon_t$, we assume an initial guess (e.g., $\varepsilon_1 = 0$, $\varepsilon_2 = 0$). Then, we rearrange the MA(2) equation to express $\varepsilon_t$ and estimate the residuals using:
$$\varepsilon_t = x_t - \mu - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2}$$
• Since $\theta_1$, $\theta_2$ are unknown, we start with an initial guess (e.g., $\theta_1 = 0.5$, $\theta_2 = 0.2$) and iteratively improve.
Step 3: Use Maximum Likelihood Estimation (MLE)
• The optimization process iteratively updates $\theta_1$, $\theta_2$ to maximize the log-likelihood.
• MLE finds the parameter values that maximize the likelihood of the observed data.
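A minimal sketch of Step 2, using the guessed coefficients from the slide ($\theta_1 = 0.5$, $\theta_2 = 0.2$) and zero initial errors; an MLE routine would then repeatedly adjust the coefficients to maximize the likelihood implied by these residuals.

def ma2_residuals(x, theta1, theta2):
    mu = sum(x) / len(x)          # Step 1: mu estimated as the mean of the series
    eps = [0.0, 0.0]              # initial guess: eps_1 = eps_2 = 0
    for t in range(2, len(x)):
        # eps_t = x_t - mu - theta1 * eps_{t-1} - theta2 * eps_{t-2}
        eps.append(x[t] - mu - theta1 * eps[t - 1] - theta2 * eps[t - 2])
    return eps

y = [1.99, 2.29, 2.38, 2.58, 2.49, 2.63, 2.83, 2.92, 3.13, 3.26, 3.27]
print(ma2_residuals(y, 0.5, 0.2))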
Python Implementation – Auto Regression
• Library: statsmodels
• For Auto Regression: AutoReg model
• It uses Ordinary Least Squares approach (OLS)
Python AR(1)

from statsmodels.tsa.ar_model import AutoReg

y = [1.99, 2.29, 2.38, 2.58, 2.49, 2.63, 2.83, 2.92, 3.13, 3.26, 3.27]
mod = AutoReg(y, 1)        # AR(1): regress on one lag
res = mod.fit()            # fitted by OLS
print(res.summary())
Python Implementation – MA, ARMA, ARIMA
• Library: statsmodels
• For MA, ARMA, ARIMA: ARIMA
• It uses Maximum likelihood estimation
Python MA(1)

from statsmodels.tsa.arima.model import ARIMA

y = [1.99, 2.29, 2.38, 2.58, 2.49, 2.63, 2.83, 2.92, 3.13, 3.26, 3.27]
ma_mod = ARIMA(y, order=(0, 0, 1))   # (p, d, q) = (0, 0, 1): pure MA(1)
ma_res = ma_mod.fit()                # fitted by maximum likelihood
print(ma_res.summary())
Python MA(1)
• For the MA(1) model, the estimated parameters are:
1. Constant term (c)
2. Moving Average (MA) coefficient: $\theta_1$
3. Error term variance ($\sigma^2$)
Python ARMA(1,2)

from statsmodels.tsa.arima.model import ARIMA

y = [1.99, 2.29, 2.38, 2.58, 2.49, 2.63, 2.83, 2.92, 3.13, 3.26, 3.27]
arma_mod = ARIMA(y, order=(1, 0, 2))   # (p, d, q) = (1, 0, 2): ARMA(1,2)
arma_res = arma_mod.fit()
print(arma_res.summary())
Python ARIMA(1,1,2)

from statsmodels.tsa.arima.model import ARIMA

y = [1.99, 2.29, 2.38, 2.58, 2.49, 2.63, 2.83, 2.92, 3.13, 3.26, 3.27]
arima_mod = ARIMA(y, order=(1, 1, 2))   # (p, d, q) = (1, 1, 2): ARIMA with first differencing
arima_res = arima_mod.fit()
print(arima_res.summary())
Akaike Information Criterion (AIC)
AIC is a metric used for model selection that balances goodness of fit and model complexity. It penalizes models with too many parameters to avoid overfitting.
$$\text{AIC} = -2\ln(L) + 2k$$
where:
L = maximized likelihood of the model
k = number of estimated parameters (including the error variance $\sigma^2$)
AIC is better when you need better forecasting accuracy (it allows more parameters).
Bayesian Information Criterion (BIC)
BIC is similar to AIC but applies a stronger penalty for the number of parameters. It is used to select the simplest model that fits the data well.
$$\text{BIC} = -2\ln(L) + k\ln(T)$$
where:
L = maximized likelihood of the model
k = number of estimated parameters
T = number of observations in the dataset
BIC is better when you want a simpler model (it applies a stronger penalty for complexity).
AIC and BIC for the MA model
Log-likelihood for MA(1) = 2.961
No. of parameters for MA(1) = 3
AIC = -2(2.961) + 2 * 3 = 0.078
BIC = -2(2.961) + 3 * ln(11) = 1.272

Log-likelihood for MA(2) = 4.149
No. of parameters for MA(2) = 4
AIC = -2(4.149) + 2 * 4 = -0.298
BIC = -2(4.149) + 4 * ln(11) = 1.294

(Lower values of AIC and BIC mean the model better fits the data.)
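These quantities are also reported directly on the fitted statsmodels results object, so the hand computation above can be cross-checked with a short sketch (using the same series as the earlier examples):

from statsmodels.tsa.arima.model import ARIMA

y = [1.99, 2.29, 2.38, 2.58, 2.49, 2.63, 2.83, 2.92, 3.13, 3.26, 3.27]
for q in (1, 2):
    res = ARIMA(y, order=(0, 0, q)).fit()   # MA(q) with a constant term
    print(q, res.llf, res.aic, res.bic)     # log-likelihood, AIC, BIC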
