AMFE Module 4 - Box and Jenkins Methodology
Financial Econometrics
Box-Jenkins Method
Course Instructor:
Dr. Devasmita Jena
Box-Jenkins (BJ) Methodology
The estimation and forecasting of univariate time series (TS) models are carried out using the Box-Jenkins (B-J) methodology
The methodology has four broad steps:
Identification
o Determine whether the TS is stationary; if it is non-stationary, make it stationary
o Once the time series is stationary, identify the orders of the AR and MA terms
o Shortlist the competing models
Estimation
o Parameters of the identified models are estimated
Diagnostic Checks
o Having chosen and estimated the ARIMA model, check whether it fits the data reasonably well
Forecasting
o Forecast future values based on the estimated model
Identification
Raw plot: to spot any trend or other TS pattern => preliminary idea about stationarity
Unit root test to confirm whether the series is stationary
If the TS is non-stationary, make it stationary by differencing
Check whether the transformed, differenced series is stationary
Keep differencing until you arrive at an I(0) series
Caution: Avoid differencing more than once, where possible
o Each round of differencing loses a data point
o Recovering the original series (in levels) for forecasting becomes harder
Try transforming the series (e.g., taking logs) before resorting to differencing
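The two costs of differencing listed above can be made concrete with a small sketch. The helper names `difference` and `undifference` and the price series are hypothetical, chosen only for illustration. The point is that each round of differencing drops one observation, and that getting back to levels needs the starting value that differencing discarded.

```python
import math

def difference(series):
    """First difference y[t] - y[t-1]; the result is one observation shorter."""
    return [b - a for a, b in zip(series, series[1:])]

def undifference(first_value, diffs):
    """Invert first differencing by cumulating from a known starting level."""
    levels = [first_value]
    for d in diffs:
        levels.append(levels[-1] + d)
    return levels

# Hypothetical price series, used only for illustration.
prices = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0]

# Transform first (logs), then difference: the result is the log return.
log_prices = [math.log(p) for p in prices]
log_returns = difference(log_prices)

# Differencing twice loses two observations.
second_diff = difference(difference(prices))

# Getting back to levels needs the first log price as an anchor.
recovered = [math.exp(v) for v in undifference(log_prices[0], log_returns)]

print(len(prices), len(log_returns), len(second_diff))
```

With a series differenced twice, two starting values would be needed, which is one reason the slide cautions against over-differencing.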
ACF/PACF plot of transformed series
Process     | ACF                              | PACF
White Noise | No significant spikes            | No significant spikes
AR(p)       | Spikes damp out gradually        | Spikes cut off at the pth lag
MA(q)       | Spikes cut off at the qth lag    | Spikes damp out gradually
ARMA(p,q)   | Spikes damp out gradually        | Spikes damp out gradually
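The AR(p) row of the table can be checked numerically. The sketch below is a plain-Python illustration, not the course's own code: it computes the sample ACF, obtains the PACF through the Durbin-Levinson recursion, and flags spikes outside the usual approximate 95% band of ±1.96/√T. The simulated AR(1) series, its coefficient 0.6 and the function names are assumptions made for the example.

```python
import math
import random

def sample_acf(y, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    gamma0 = sum(d * d for d in dev) / n
    acf = []
    for k in range(max_lag + 1):
        gamma_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        acf.append(gamma_k / gamma0)
    return acf

def sample_pacf(acf):
    """Partial autocorrelations from autocorrelations (Durbin-Levinson)."""
    pacf = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit
    for k in range(1, len(acf)):
        num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# Simulated AR(1) with coefficient 0.6 (illustrative values).
random.seed(0)
phi, n = 0.6, 500
y = [0.0]
for _ in range(n - 1):
    y.append(phi * y[-1] + random.gauss(0.0, 1.0))

acf = sample_acf(y, 10)
pacf = sample_pacf(acf)
band = 1.96 / math.sqrt(n)  # approximate 95% significance band

for k in range(1, 11):
    print(k, round(acf[k], 3), round(pacf[k], 3), abs(pacf[k]) > band)
```

For an AR(1) process the ACF should decay gradually while the PACF should be significant at lag 1 only, apart from the occasional spurious spike that sampling noise produces.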
Identification of probable model: An example
ACF plot:
• Significant spikes at lags 1, 5, 8, 11, 15, 20, 31
PACF plot:
• Significant spikes at lags 1, 5, 11, 12, 20, 31
Competing models for the data, assuming we differenced the data only once: ARIMA(1,1,1), ARIMA(1,1,5), ARIMA(1,1,8), ARIMA(1,1,11), ARIMA(5,1,1), ARIMA(5,1,5), ARIMA(5,1,8), ARIMA(5,1,11), ARIMA(11,1,1), ARIMA(11,1,5), ARIMA(11,1,8), ARIMA(11,1,11)
Lags higher than 11 are ignored for the sake of parsimony
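The shortlist can be generated mechanically from the spike lags: PACF spikes suggest candidate AR orders p, ACF spikes suggest candidate MA orders q, and lags above 11 are dropped. A minimal sketch (the variable names are illustrative):

```python
from itertools import product

acf_spikes = [1, 5, 8, 11, 15, 20, 31]   # suggest candidate MA orders q
pacf_spikes = [1, 5, 11, 12, 20, 31]     # suggest candidate AR orders p
d = 1                                     # the data were differenced once
max_lag = 11                              # parsimony cut-off used in the example

p_candidates = [lag for lag in pacf_spikes if lag <= max_lag]
q_candidates = [lag for lag in acf_spikes if lag <= max_lag]

# Every (p, d, q) combination of the retained lags is a competing model.
candidates = [(p, d, q) for p, q in product(p_candidates, q_candidates)]

for p, d_, q in candidates:
    print(f"ARIMA({p},{d_},{q})")
```

This reproduces the twelve models listed: three AR orders (1, 5, 11) times four MA orders (1, 5, 8, 11).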
Which of the above represents the data most closely?
• Use information criteria
Identification: Information Criteria
Information criteria (IC): probabilistic statistical measures that assess model fit while penalising model complexity
Computed from the log-likelihood of the estimated model
Advantage: do not require a hold-out test set
Limitation: do not take the uncertainty of the models into account and may end up selecting models that are too simple
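IC:
• Akaike information criterion (AIC): $\mathrm{AIC} = \ln(\mathrm{RSS}/T) + 2k/T$
• Schwarz information criterion (SIC): $\mathrm{SIC} = \ln(\mathrm{RSS}/T) + (k/T)\ln T$
• Hannan-Quinn information criterion (HQIC): $\mathrm{HQIC} = \ln(\mathrm{RSS}/T) + (2k/T)\ln(\ln T)$
Here, k is the number of regressors (= p + q + 1), T is the number of observations, and RSS is the residual sum of squares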
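As a small worked illustration, the sketch below computes the three criteria in one common textbook form, a fit term ln(RSS/T) plus a penalty that grows with k. The RSS values and the two models compared are made-up numbers, not results from the course data, and the function names are illustrative.

```python
import math

def aic(rss, k, T):
    """Akaike: ln(RSS/T) + 2k/T."""
    return math.log(rss / T) + 2.0 * k / T

def sic(rss, k, T):
    """Schwarz: ln(RSS/T) + (k/T) ln T."""
    return math.log(rss / T) + k * math.log(T) / T

def hqic(rss, k, T):
    """Hannan-Quinn: ln(RSS/T) + (2k/T) ln(ln T)."""
    return math.log(rss / T) + 2.0 * k * math.log(math.log(T)) / T

T = 100
# Hypothetical fits: the larger model has a lower RSS but more parameters.
models = {
    "ARIMA(1,1,1)": {"rss": 52.0, "k": 1 + 1 + 1},
    "ARIMA(5,1,5)": {"rss": 48.0, "k": 5 + 5 + 1},
}

for name, m in models.items():
    print(name,
          round(aic(m["rss"], m["k"], T), 4),
          round(sic(m["rss"], m["k"], T), 4),
          round(hqic(m["rss"], m["k"], T), 4))
```

With these illustrative numbers the smaller model has the lower value on all three criteria: the drop in RSS from the extra eight parameters does not offset the penalty.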
The model with the smallest SIC or AIC is chosen
• Adding lags reduces the RSS, but each criterion adds a penalty for the extra parameters
Choice between the information criteria:
• SIC and HQIC penalise the loss of degrees of freedom more heavily than AIC
Example
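One illustrative sketch of order selection, on simulated data rather than the course's own example: an AR(2) series is generated, AR(p) models for p = 0..6 are fitted, and each criterion picks its preferred order. To stay within the standard library, the fits use the Yule-Walker equations (solved by the Levinson-Durbin recursion) instead of maximum likelihood, and the resulting innovation variance stands in for RSS/T. All names and parameter values are assumptions.

```python
import math
import random

def autocov(y, max_lag):
    """Sample autocovariances gamma_0 .. gamma_max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]

def ar_innovation_variances(gamma, max_p):
    """Innovation variance of Yule-Walker AR(p) fits for p = 0..max_p."""
    variances = [gamma[0]]
    phi = []
    for k in range(1, max_p + 1):
        num = gamma[k] - sum(phi[j] * gamma[k - 1 - j] for j in range(k - 1))
        phi_kk = num / variances[-1]
        phi = [phi[j] - phi_kk * phi[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        variances.append(variances[-1] * (1.0 - phi_kk ** 2))
    return variances

# Simulated AR(2): y_t = 0.5 y_{t-1} + 0.3 y_{t-2} + e_t (illustrative).
random.seed(1)
T = 400
y = [0.0, 0.0]
for _ in range(T - 2):
    y.append(0.5 * y[-1] + 0.3 * y[-2] + random.gauss(0.0, 1.0))

max_p = 6
sigma2 = ar_innovation_variances(autocov(y, max_p), max_p)

def criteria(p):
    k = p + 1                      # AR terms plus intercept (q = 0 here)
    fit = math.log(sigma2[p])      # sigma2[p] plays the role of RSS / T
    return (fit + 2 * k / T,                            # AIC
            fit + k * math.log(T) / T,                  # SIC
            fit + 2 * k * math.log(math.log(T)) / T)    # HQIC

table = {p: criteria(p) for p in range(max_p + 1)}
best_aic = min(table, key=lambda p: table[p][0])
best_sic = min(table, key=lambda p: table[p][1])
best_hqic = min(table, key=lambda p: table[p][2])
print("AIC picks p =", best_aic, "| HQIC picks p =", best_hqic,
      "| SIC picks p =", best_sic)
```

Because the SIC penalty per parameter exceeds the HQIC penalty, which in turn exceeds the AIC penalty at this sample size, the order chosen by SIC can never be larger than the one chosen by HQIC, nor HQIC's larger than AIC's.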
Estimation
OLS cannot, in general, be applied directly to ARIMA models
Estimation of AR models
• The regressors are lagged values of the dependent variable, so OLS is biased in small samples, although it remains consistent
• By assuming normality of the error terms, the maximum likelihood (ML) method can be used to estimate the parameters
• Under normality, OLS estimates of the coefficients coincide with the (conditional) ML estimates
• ML estimators are asymptotically normal and asymptotically efficient
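The coincidence of OLS and ML estimates under normality can be illustrated for an AR(1) model: maximising the Gaussian likelihood conditional on the first observation amounts to minimising the sum of squared errors from regressing y[t] on a constant and y[t-1]. The sketch below is a minimal illustration on simulated data; the function name and the parameter values (c = 1.0, phi = 0.6) are assumptions.

```python
import random

def fit_ar1_ols(y):
    """Regress y[t] on a constant and y[t-1]; returns (intercept, phi)."""
    x, z = y[:-1], y[1:]          # regressor (lagged) and regressand
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    phi = sxz / sxx
    return mean_z - phi * mean_x, phi

# Simulated AR(1): y_t = 1.0 + 0.6 y_{t-1} + e_t, e_t ~ N(0, 1).
random.seed(5)
c_true, phi_true, T = 1.0, 0.6, 2000
y = [c_true / (1.0 - phi_true)]   # start at the unconditional mean
for _ in range(T - 1):
    y.append(c_true + phi_true * y[-1] + random.gauss(0.0, 1.0))

c_hat, phi_hat = fit_ar1_ols(y)
print("intercept:", round(c_hat, 3), "phi:", round(phi_hat, 3))
```

Note that one observation is lost to the lag, so the regression uses T - 1 data points.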
Estimation of MA models
• Since most of the parameters (except the intercept) enter through the error component, which is unobservable, OLS cannot be applied directly
• Method of moments or MLE is used
• Since the moments are non-linear functions of the unknown coefficients, closed-form estimation is ruled out
• Iterative methods are used to estimate the parameters (software does this in the blink of an eye!)
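To see why a numerical search is needed, consider an MA(1) model y_t = mu + e_t + theta e_{t-1}. The errors are not observed, but for any trial theta they can be rebuilt recursively, and the sum of their squares can be minimised. The sketch below uses the conditional sum of squares (an approximation to Gaussian ML, with the pre-sample error set to zero) and a crude grid search standing in for the iterative optimisers used by statistical software. The data are simulated and all names and values are assumptions.

```python
import random

def conditional_ss(y, mu, theta):
    """Sum of squared MA(1) errors, e[t] = y[t] - mu - theta * e[t-1]."""
    e_prev, total = 0.0, 0.0      # pre-sample error assumed to be zero
    for v in y:
        e = v - mu - theta * e_prev
        total += e * e
        e_prev = e
    return total

def fit_ma1(y):
    """Grid search over the invertible region -1 < theta < 1."""
    mu = sum(y) / len(y)          # the mean is estimated by the sample mean
    grid = [i / 100.0 for i in range(-99, 100)]
    theta = min(grid, key=lambda th: conditional_ss(y, mu, th))
    return mu, theta

# Simulated MA(1) with mu = 2.0 and theta = 0.5 (illustrative values).
random.seed(2)
mu_true, theta_true, T = 2.0, 0.5, 2000
shocks = [random.gauss(0.0, 1.0) for _ in range(T + 1)]
y = [mu_true + shocks[t] + theta_true * shocks[t - 1] for t in range(1, T + 1)]

mu_hat, theta_hat = fit_ma1(y)
print("mu:", round(mu_hat, 3), "theta:", round(theta_hat, 2))
```

The objective is non-linear in theta because each rebuilt error depends on all earlier ones, which is what rules out a closed-form solution.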
Estimation of ARMA models
• Yule-Walker method
• Method of moments
• MLE method
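The Yule-Walker idea is a method of moments: the autocorrelations are expressed in terms of the AR coefficients and the equations are solved with sample autocorrelations plugged in. The sketch below shows it for a pure AR(2), chosen for simplicity; in an ARMA model the same equations apply to the AR part at lags beyond q. The function name and the coefficient values are illustrative.

```python
def yule_walker_ar2(r1, r2):
    """Solve r1 = phi1 + phi2 * r1 and r2 = phi1 * r1 + phi2 for (phi1, phi2)."""
    denom = 1.0 - r1 ** 2
    phi1 = r1 * (1.0 - r2) / denom
    phi2 = (r2 - r1 ** 2) / denom
    return phi1, phi2

# Theoretical autocorrelations implied by phi1 = 0.5 and phi2 = 0.3.
true_phi1, true_phi2 = 0.5, 0.3
r1 = true_phi1 / (1.0 - true_phi2)
r2 = true_phi1 * r1 + true_phi2

phi1_hat, phi2_hat = yule_walker_ar2(r1, r2)
print(round(phi1_hat, 6), round(phi2_hat, 6))
```

Feeding in the theoretical autocorrelations recovers the coefficients that generated them; in practice r1 and r2 would be sample autocorrelations, so the estimates carry sampling error.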
Diagnostic Checks
Comparison of forecasts from the in-sample fit with out-of-sample observations
• Issue: holding back observations means loss of information, which is severe if data points are scarce
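A minimal sketch of this check, assuming an AR(1) model fitted by OLS on simulated data: the last observations are held out, the model is estimated on the rest, forecasts are built recursively, and the root mean squared error (RMSE) against the held-out values is reported. The function names, the hold-out size of 30 and the choice of RMSE as the accuracy measure are assumptions made for the example.

```python
import math
import random

def fit_ar1_ols(y):
    """Regress y[t] on a constant and y[t-1]; returns (intercept, phi)."""
    x, z = y[:-1], y[1:]
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    phi = sxz / sxx
    return mean_z - phi * mean_x, phi

def holdout_rmse(y, n_test):
    """Fit on all but the last n_test points, forecast them, return RMSE."""
    train, test = y[:-n_test], y[-n_test:]
    c, phi = fit_ar1_ols(train)
    forecasts, last = [], train[-1]
    for _ in test:
        last = c + phi * last     # each forecast feeds the next one
        forecasts.append(last)
    mse = sum((f - a) ** 2 for f, a in zip(forecasts, test)) / n_test
    return forecasts, math.sqrt(mse)

# Simulated AR(1) series (illustrative values).
random.seed(4)
y = [2.5]
for _ in range(299):
    y.append(1.0 + 0.6 * y[-1] + random.gauss(0.0, 1.0))

forecasts, rmse = holdout_rmse(y, n_test=30)
print(len(y) - 30, "observations used for estimation; hold-out RMSE:",
      round(rmse, 3))
```

The cost flagged on the slide is visible here: 30 of the 300 observations play no part in estimation.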
Residual Analysis
• A simple plot of residuals against time gives an idea of whether the residuals are white noise
• ACF and PACF plots of residuals: there should be no significant spikes
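The "no significant spikes" check can be sketched numerically by counting residual autocorrelations that fall outside the approximate 95% band ±1.96/√T. The sketch covers the ACF only, and the two residual series are simulated stand-ins (one white noise, one with leftover AR(1) structure, as a misspecified model might leave behind); the function names are illustrative.

```python
import math
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    gamma0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n / gamma0
            for k in range(max_lag + 1)]

def significant_lags(residuals, max_lag=20):
    """Lags whose residual autocorrelation lies outside +/- 1.96 / sqrt(T)."""
    band = 1.96 / math.sqrt(len(residuals))
    r = sample_acf(residuals, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > band]

random.seed(3)
# Stand-in for residuals from an adequate model: white noise.
white = [random.gauss(0.0, 1.0) for _ in range(500)]

# Stand-in for residuals from an inadequate model: AR(1) structure remains.
leftover = [0.0]
for _ in range(499):
    leftover.append(0.7 * leftover[-1] + random.gauss(0.0, 1.0))

print("white noise, significant lags:", significant_lags(white))
print("leftover structure, significant lags:", significant_lags(leftover))
```

Even for true white noise, roughly one lag in twenty is expected to breach a 95% band by chance, so an isolated spike at a high lag is not by itself evidence of misspecification.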