AUTOREGRESSIVE MOVING AVERAGE (ARMA)
PROF. AJAYA KUMAR PANDA
AR and MA
Y_t = α + β1·Y_{t−1} + ε_t                                      (AR(1); a random walk when β1 = 1)
Y_t = α + β1·Y_{t−1} + β2·Y_{t−2} + β3·Y_{t−3} + ε_t            (AR(3))
Y_t = α + ε_t + γ1·ε_{t−1} + γ2·ε_{t−2} + γ3·ε_{t−3}            (MA(3))
Y_t = α + β1·Y_{t−1} + β2·Y_{t−2} + β3·Y_{t−3} + γ1·ε_{t−1} + ε_t   (ARMA(3,1))
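The four specifications above can be generated directly. A minimal numpy sketch, assuming numpy is available; the coefficient values and random seed are illustrative, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_arma(n, alpha=0.0, betas=(), gammas=(), burn=200):
    """Simulate Y_t = alpha + sum_i beta_i*Y_{t-i} + sum_j gamma_j*e_{t-j} + e_t.

    `betas` are the AR coefficients, `gammas` the MA coefficients.
    A burn-in period is discarded so the zero start-up values do not matter.
    """
    p, q = len(betas), len(gammas)
    e = rng.standard_normal(n + burn)       # white-noise shocks e_t
    y = np.zeros(n + burn)
    for t in range(max(p, q), n + burn):
        ar_part = sum(b * y[t - i - 1] for i, b in enumerate(betas))
        ma_part = sum(g * e[t - j - 1] for j, g in enumerate(gammas))
        y[t] = alpha + ar_part + ma_part + e[t]
    return y[burn:]

ar3  = simulate_arma(500, betas=(0.5, 0.2, 0.1))                  # AR(3)
ma3  = simulate_arma(500, gammas=(0.4, 0.3, 0.2))                 # MA(3)
arma = simulate_arma(500, betas=(0.5, 0.2, 0.1), gammas=(0.4,))   # ARMA(3,1)
```

Setting `betas=(1.0,)` and `alpha=0` would give the random-walk special case of the first equation.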
TESTING STATIONARITY USING THE CORRELOGRAM
H0: The variable is stationary (i.e., there is no trend)
H1: The variable is not stationary (i.e., there is a trend)
➢ If the probability of the Q-statistic of the ACF and PACF is less than 0.05, reject H0.
➢ In a correlogram we wish not to reject H0, so we expect the probability values to be
greater than 0.05.
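The Q-statistic referred to here is the Ljung-Box statistic, which can be computed by hand. A numpy sketch, assuming numpy is available; the series, seed, and lag count are illustrative, and 18.31 is the standard 5% chi-square critical value for 10 degrees of freedom:

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelations r_1 .. r_nlags."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[k:], x[:-k]) / denom for k in range(1, nlags + 1)])

def ljung_box_q(x, nlags=10):
    """Ljung-Box Q = n(n+2) * sum_k r_k^2 / (n - k).

    A large Q (small p-value) means significant autocorrelation,
    which in this framework leads us to reject H0."""
    n = len(x)
    r = acf(x, nlags)
    return n * (n + 2) * np.sum(r**2 / (n - np.arange(1, nlags + 1)))

CHI2_95_DF10 = 18.31   # 5% critical value, chi-square with 10 d.f.

rng = np.random.default_rng(1)
white = rng.standard_normal(1000)              # stationary: Q should be small
trend = np.cumsum(rng.standard_normal(1000))   # random walk: Q should be huge
```

Comparing `ljung_box_q(series)` against the critical value reproduces the decision rule above: a Q beyond the critical value corresponds to prob. < 0.05.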
ACF: The Auto-Correlation Function (ACF) represents the correlation between the
observation at the current time 't' and its lag (t − i). For example, if we assume that
stock prices are correlated with their past values up to day 'i', we can calculate the ACF
to see how strongly today's stock price is correlated with its past.
PACF: The Partial Auto-Correlation Function (PACF) represents the correlation
between observations at two points in time, given that both observations are also
correlated with the observations at other points in time. In other words, the PACF
measures the correlation between time-series observations after controlling for the
correlations at intermediate lags. Hence the PACF measures the true marginal effect of
each significant lag.
For example: the current stock price HDFC_t may be correlated with HDFC_{t−1} only, or with
HDFC_{t−1}, HDFC_{t−2} and HDFC_{t−3}. The PACF of HDFC_t with HDFC_{t−1} then measures
the correlation between t and t−1 after taking out the influence of t−2 and t−3 on t. Hence the PACF
measures the real correlation between two time periods after removing the influence of the other time
periods. Therefore, in practice, we use the PACF to select AR models, because the PACF captures the
true correlation among the selected AR(p) orders after taking out the influence of the higher orders on
the current observation (t).
Similarly, we use the ACF to select MA(q) models, since it measures the correlation between the current
observation and its past observations up to a significant lag.
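The distinction can be seen numerically. A sketch, assuming numpy is available, that computes the sample ACF and the PACF via the Durbin-Levinson recursion; the AR(1) coefficient 0.7 and the seed are illustrative:

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelations r_0 .. r_nlags (r_0 = 1)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([1.0] + [np.dot(x[k:], x[:-k]) / denom
                             for k in range(1, nlags + 1)])

def pacf(x, nlags):
    """Partial autocorrelations via the Durbin-Levinson recursion:
    the lag-k value is the correlation between t and t-k after
    controlling for the intermediate lags 1 .. k-1."""
    r = acf(x, nlags)
    phi = np.zeros((nlags + 1, nlags + 1))
    pac = np.zeros(nlags + 1)
    pac[0] = 1.0
    phi[1, 1] = pac[1] = r[1]
    for k in range(2, nlags + 1):
        num = r[k] - np.dot(phi[k - 1, 1:k], r[k - 1:0:-1])
        den = 1.0 - np.dot(phi[k - 1, 1:k], r[1:k])
        phi[k, k] = num / den
        phi[k, 1:k] = phi[k - 1, 1:k] - phi[k, k] * phi[k - 1, k - 1:0:-1]
        pac[k] = phi[k, k]
    return pac

# For an AR(1) process the ACF decays geometrically, while the PACF
# is large at lag 1 and near zero at all higher lags.
rng = np.random.default_rng(2)
y = np.zeros(2000)
for t in range(1, 2000):
    y[t] = 0.7 * y[t - 1] + rng.standard_normal()
```

Here `pacf(y, 10)[1]` lands near the true coefficient 0.7, while the higher-lag partial autocorrelations are close to zero, which is exactly why the PACF identifies the AR order.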
Rules of using ACF & PACF
➢ If we have a time-series data set, the first step is to check for any obvious
trends/time trends. If there is a trend, the data set violates the basic assumption of
stationarity. The immediate job is to de-trend or take the log-difference to make it
stationary.
➢ After de-trending or making the data stationary, we must decide which model to
use: AR or MA.
➢ As noted, we use the PACF to decide the order of the AR model: choose the lags with
significant PACF values. For example, if the PACF at lags t−1 and t−2 is significant
(i.e., crosses the CI bound) and the remaining t−i lags are insignificant (they stay
within the CI), then we should choose an AR(2), i.e., a 2nd-order autoregressive model.
➢ Then use the ACF to determine the MA terms. As before, we find the significant
orders of the ACF to identify the optimum order of MA(q).
➢ Similarly, looking at PACF and ACF together, we will find the order of ARMA (p, q).
Note: The dotted lines represent the significance thresholds. The bars/lines represent the
ACF and PACF values at each time lag. Only the bars/lines that cross the significance
threshold lines (the confidence interval) are called significant.
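Those dotted lines are the usual ±1.96/√n band. A small sketch, assuming numpy is available, that flags which lags cross the band; the random-walk example and seed are illustrative:

```python
import numpy as np

def significant_lags(series, nlags=10):
    """Return the lags whose sample ACF falls outside the +/-1.96/sqrt(n)
    band -- the 'dotted lines' of the correlogram."""
    x = np.asarray(series, dtype=float) - np.mean(series)
    denom = np.dot(x, x)
    n = len(x)
    bound = 1.96 / np.sqrt(n)            # 95% significance threshold
    r = [np.dot(x[k:], x[:-k]) / denom for k in range(1, nlags + 1)]
    return [k + 1 for k, rk in enumerate(r) if abs(rk) > bound]

rng = np.random.default_rng(3)
walk = np.cumsum(rng.standard_normal(500))   # trending (non-stationary) series
sig = significant_lags(walk)                 # many lags cross the band
```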
Pattern of Non-Stationary Data
H0: Variable is stationary or there is no trend.
➢ This is the common pattern of a non-stationary time series.
➢ The ACF lies outside the CI lines and declines slowly, showing high autocorrelation
in the errors.
➢ The PACF drops immediately after the 1st lag, but most of the PACF lags have
prob. < 0.05 even while lying within the CI (the dotted lines).
Pattern of Stationary Data
Note:
➢ This is the common pattern of a stationary series.
➢ The ACF and PACF are within the CI and the prob. values are greater than 0.05.
Hence, we do not reject H0.
Let's summarize the patterns of the ACF and PACF in ARMA modelling.
Analysis of ACF and PACF
➢ Figure (1) shows the data is not stationary, and we reject H0. The variable is
non-stationary because the probability of the ACF and PACF is less than 0.05.
Further, the autocorrelation plot exceeds the dotted lines, implying that the error
terms indicated by the ACF are autocorrelated. At the same time, the PACF is
significant at the 1st lag only.
➢ The pattern of the ACF and PACF also matches point 1 of the summary table.
➢ Figure (2) shows the ACF and PACF of the data in 1st difference. The ACF and
PACF lines are within the dotted confidence interval, and the prob. values of the
lags are also greater than 0.05. Hence, we cannot reject the null hypothesis.
Model       | ACF Pattern                             | PACF Pattern
AR(p)       | Exponential decay (or) damped sine wave | Significant spikes through lag p
MA(q)       | Significant spikes through lag q        | Exponential decay (or) damped sine wave
ARMA(1,1)   | Exponential decay from lag 1            | Exponential decay from lag 1
ARMA(p, q)  | Exponential decay                       | Exponential decay
Steps to model AR, MA, ARMA:
Step 1: Identification. Analyze the time-series plot to visualize stationarity, trend,
seasonality, etc.
Step 2: Undertake unit-root testing through the DF, ADF, PP, KPSS tests, etc.
Step 3: Analyze the ACF & PACF of the data both at level and at 1st difference.
Step 4: Decide the orders of AR and MA and finalize the possible ARMA (p, q)
models.
Step 5: Estimate the models.
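The steps above can be sketched end to end. A minimal numpy version, in which the unit-root and ACF/PACF inspection steps are replaced by first-differencing and an AIC-based order search over AR candidates; the AR(2) coefficients, seed, and candidate range are illustrative:

```python
import numpy as np

def fit_ar_ols(y, p):
    """Fit an AR(p) model by ordinary least squares; return (coeffs, rss)."""
    Y = y[p:]
    X = np.column_stack([np.ones(len(Y))] +
                        [y[p - i:-i] for i in range(1, p + 1)])  # lag matrix
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    rss = np.sum((Y - X @ beta) ** 2)
    return beta, rss

def aic(rss, nobs, nparams):
    """Gaussian AIC from the residual sum of squares."""
    return nobs * np.log(rss / nobs) + 2 * nparams

# Steps 1-2: suppose inspection/unit-root tests say the level series is I(1).
rng = np.random.default_rng(4)
e = rng.standard_normal(1200)
z = np.zeros(1200)
for t in range(2, 1200):                       # stationary AR(2) driver
    z[t] = 0.5 * z[t - 1] + 0.3 * z[t - 2] + e[t]
y = np.cumsum(z)                               # observed integrated series

# Step 3-4: difference once, then search candidate AR orders by AIC.
dy = np.diff(y)
scores = {p: aic(fit_ar_ols(dy, p)[1], len(dy) - p, p + 1) for p in range(1, 6)}

# Step 5: the order with the lowest AIC is the estimated model.
best_p = min(scores, key=scores.get)
```

In practice Step 2 would use a proper ADF/KPSS implementation and Step 4 would also scan MA and ARMA orders; this sketch only shows the shape of the workflow.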
Parsimonious models
➢ In ARIMA, Box and Jenkins suggest building parsimonious models. Parsimonious
models are small, with fewer parameters; hence they give better forecasts than over-
parameterized models. Based on Box-Jenkins parsimonious modelling, choose the
ARMA model that is small, with fewer parameters, but gives better estimates.
➢ Identify the best suitable ARMA:
➢ Estimate all the possible combinations of ARMA, i.e., AR (1), AR (2), ARMA (1,1),
ARMA (2,1), ... and so on. It is always better to select (p, q) from the ACF & PACF
plots, i.e., select the lag orders where the ACF & PACF cross the CI line. Then compare
the estimated statistics of all the models to choose the best one.
Identifying the Best Model:
Select the model with the maximum number of significant coefficients based on the
characteristics below:
➢ The model with the least volatility (sigma)
➢ The model with the lowest AIC/SBC value
➢ The model with the highest Adj. R²
Probable Models
Criterion  | AR(2)   | AR(3)   | ARMA(1,0,1) | ARMA(2,0,1) | ARMA(3,0,1) | ARMA(1,0,2) | ARMA(2,0,3)
Sig. Coeff | 2       | 1       | 2           | 3           | 2           | 2           | 3
Sigma      | 0.69    | 0.83    | 0.34        | 0.38        | 0.53        | 0.64        | 0.91
Sigma²     | 0.47    | 0.69    | 0.11        | 0.15        | 0.28        | 0.40        | 0.82
R²         | 0.80    | 0.83    | 0.90        | 0.94        | 0.91        | 0.89        | 0.90
Adj. R²    | 0.78    | 0.81    | 0.88        | 0.92        | 0.89        | 0.84        | 0.88
AIC        | 1323.14 | 1113.14 | 1033.14     | 1003.11     | 1123.14     | 1163.14     | 1128.87
SBIC       | 1142.49 | 1145.65 | 1152.94     | 1129.14     | 1121.44     | 1132.44     | 1141.09
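This comparison can be mechanized once each candidate's residual sum of squares is known. A sketch, assuming numpy is available; the candidate names and (rss, parameter-count) pairs below are hypothetical, not the table's values:

```python
import numpy as np

def aic_sbic(rss, nobs, k):
    """Gaussian AIC and SBIC from a model's residual sum of squares.
    SBIC penalizes extra parameters more heavily than AIC
    (log n > 2 whenever n > 7), which favours parsimony."""
    s2 = rss / nobs
    return (nobs * np.log(s2) + 2 * k,
            nobs * np.log(s2) + k * np.log(nobs))

# Hypothetical candidates: (rss, number of estimated parameters), n = 500 obs.
candidates = {"AR(2)":     (470.0, 3),
              "ARMA(1,1)": (110.0, 3),
              "ARMA(2,1)": (108.0, 4)}
scores = {m: aic_sbic(rss, 500, k) for m, (rss, k) in candidates.items()}
best_by_aic = min(scores, key=lambda m: scores[m][0])
```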
Diagnostic test
➢ To cross-verify the ideal ARMA model that you have selected on the basis of the
above selection criteria, re-estimate the selected model.
➢ Estimate (predict) the error distribution of the final ARMA model.
➢ Then compute the correlogram of the estimated errors.
➢ If the ACF and PACF of the estimated residuals are within the 95% CI (dotted lines)
and the prob. values are greater than 0.05, we conclude that the ARMA model is the best.
➢ If any of the ACF or PACF values cross the lines and prob. < 0.05, we may
conclude that the selected model could not capture some relevant information, which
is left in the random errors.
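The residual check can be sketched numerically. Assuming numpy is available, the example below fits an AR(1) to simulated AR(1) data by OLS and inspects the residual correlogram against the ±1.96/√n band; the coefficient 0.6 and the seed are illustrative:

```python
import numpy as np

def resid_acf(res, nlags=10):
    """Sample autocorrelations of the residuals at lags 1 .. nlags."""
    r = np.asarray(res, dtype=float) - np.mean(res)
    d = np.dot(r, r)
    return np.array([np.dot(r[k:], r[:-k]) / d for k in range(1, nlags + 1)])

# Simulate AR(1) data and re-estimate the same model.
rng = np.random.default_rng(5)
y = np.zeros(1000)
for t in range(1, 1000):
    y[t] = 0.6 * y[t - 1] + rng.standard_normal()

X = np.column_stack([np.ones(999), y[:-1]])
beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
resid = y[1:] - X @ beta             # estimated errors of the final model

bound = 1.96 / np.sqrt(len(resid))   # the 95% CI ('dotted lines')
max_r = np.abs(resid_acf(resid)).max()
# For a correctly specified model, max_r stays near or inside `bound`:
# the residuals are white noise, so we keep the model.
```

If instead the model were under-specified (e.g., an AR(1) fitted to AR(2) data), the residual ACF would cross the band at the omitted lag, which is the failure case described in the last bullet above.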
Thank You