Rajendra Ladda Time Series Forecasting Project Report
1. Read the data as an appropriate Time Series data and plot the data.
The Rose wine time series has 2 missing values, which I imputed by interpolation.
The time series plot after imputation follows.
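The imputation step can be sketched as follows. This is a minimal illustration, not the exact code behind the report: the series values and dates are made up, and linear interpolation is an assumption, since the report does not say which interpolation variant was used.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series with two gaps (made-up values, not the Rose data).
idx = pd.date_range("2000-01-01", periods=6, freq="MS")
rose = pd.Series([44.0, 45.0, np.nan, np.nan, 46.0, 51.0], index=idx, name="Rose")

# Count the missing values before imputation.
n_missing = int(rose.isna().sum())
print("Missing values:", n_missing)

# Linear interpolation (an assumption) fills each gap from the known
# neighbours on either side, treating the observations as equally spaced.
rose_filled = rose.interpolate(method="linear")
print(rose_filled)
```

Linear interpolation keeps the filled points on the straight line between the surrounding observations, so it does not introduce a jump into the plot.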
The Sparkling wine time series data has no missing values. Following is the time series plot.
2. Perform appropriate Exploratory Data Analysis to understand the data and also
perform decomposition.
Data Description:
ROSE WINE
count mean std min 25% 50% 75% max
Decomposition
From the above decompositions, we can say that both wine time series show an
upward trend. The wine sales are unstable, and they clearly show seasonality.
3. Split the data into training and test. The test data should start in 1991.
After splitting both the time series datasets, following are the data sizes.
Rose Wine Data: Training Data 132 rows, Test Data 55 rows
Sparkling Wine Data: Training Data 132 rows, Test Data 55 rows
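The split can be sketched as below. The monthly index running from January 1980 to July 1995 is an assumption chosen so that the row counts match the sizes reported above; the values are placeholders.

```python
import numpy as np
import pandas as pd

# Assumed monthly index; the values are placeholders, not the wine data.
idx = pd.date_range("1980-01-01", "1995-07-01", freq="MS")
sales = pd.Series(np.arange(len(idx), dtype=float), index=idx, name="Sales")

# Everything before 1991 is training data; the test data starts in 1991.
train = sales[sales.index.year < 1991]
test = sales[sales.index.year >= 1991]

print(len(train), len(test))  # 132 55
```

Splitting on the calendar year rather than on a random sample keeps the time order intact, which is what a forecasting evaluation needs.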
Following are the plots of training and test data in both Rose and Sparkling wine data.
4. Build all the exponential smoothing models on the training data and evaluate the
models using RMSE on the test data. Other models, such as regression, naïve
forecast, and simple average models, should also be built on the training
data, and their performance checked on the test data using RMSE.
The RMSE comparisons for all the methods used follow. Below them are the charts
of the predictions from each method.
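As an illustration of how the RMSE figures are computed, here is a small sketch for two of the baseline methods, the naïve forecast and the simple average. The numbers are toy values and the helper name `rmse` is my own, so this is not the report's actual code.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Toy training and test values (not the wine data).
train = np.array([10.0, 12.0, 14.0, 16.0])
test = np.array([18.0, 20.0])

# Naive forecast: repeat the last training observation.
naive_forecast = np.full(len(test), train[-1])
# Simple average forecast: repeat the mean of the training data.
average_forecast = np.full(len(test), train.mean())

rmse_naive = rmse(test, naive_forecast)
rmse_average = rmse(test, average_forecast)
print(round(rmse_naive, 4), round(rmse_average, 4))  # 3.1623 6.0828
```

The lower the RMSE on the test data, the closer the forecasts are to the actual sales, which is the basis for ranking the models.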
The best model for the Sparkling data is the Triple Exponential Smoothing model (additive seasonality).
5. Check for the stationarity of the data on which the model is being built using
appropriate statistical tests and also mention the hypothesis for the statistical
test. If the data is found to be non-stationary, take appropriate steps to make it
stationary. Check the new data for stationarity and comment. Note: Stationarity
should be checked at alpha = 0.05.
Hence we conclude that the Rose wine data becomes stationary after differencing at lag 1.
Sparkling Data
DF test statistic is -1.798
DF test p-value is 0.7056
Since p > 0.05, we fail to reject the null hypothesis of non-stationarity. Hence the data is not stationary.