Forecasting Electricity Consumption Using ARIMA Model
Abstract— Autoregressive integrated moving average (ARIMA) is a popular technique for fitting time series data for prediction and forecasting. This paper proposes ARIMA models with different sets of parameters for forecasting electricity consumption. Three ARIMA models are investigated to find a reliable model that forecasts electricity consumption with the required level of performance. The best-fitted model, the most effective and reliable approach, and the network structure are determined according to prediction performance. For this purpose, we use a synthetic dataset and electricity consumption data from industries in Guangdong province, China. The experimental results show that ARIMA(1,1,1) gives high precision and stable predictions and is suitable for predicting electricity consumption. The forecasting results are essential for managing the required electricity demand in various kinds of industries and other sectors.
Keywords—Auto Correlation Function, Akaike Information Criterion, Partial Auto Correlation Function, ARIMA
I. INTRODUCTION
Electricity is a fundamental necessity in daily life. Energy is a core component of a country's social and economic development. Electric power storage is quite impractical, and demand can change dramatically in space and time across different sectors. Forecasting electricity consumption is therefore an essential issue for utility owners, power system operators, energy planners and system managers. Prediction methods are chosen by considering factors including the size of the time series, the prediction interval and the prediction period [1]. Over the last several decades, various methods have been used to predict future electricity consumption accurately. Time series data has four components: trend (long-term direction), seasonal effect (systematic, calendar-related movements), cyclical effect, and irregular effect (unsystematic, short-term fluctuations) [2]. ARIMA is a core forecasting technique for predicting the future electric power production needed to meet future energy demand. The predicted figure helps to determine the budget and how much electricity should be produced for various sectors, including agricultural, transportation, residential and commercial. Forecasting predicts future information by considering previous and present data and analyzing their trends. ARIMA models form an important class of models that can be applied to many real applications, and they are derived from the autoregressive moving average (ARMA) model. Forecasting electricity consumption using different ARIMA models on a real dataset and comparing them to determine the best model gives highly accurate and stable predictions. Electricity forecasting is a challenging task: consumption cannot be predicted with 100% accuracy, because forecasting depends on factors that vary across sectors, areas, industries, etc. For the electricity consumption of any sector, there are many attributes that can be chosen for detecting and predicting the consumption of an area. The ARIMA model is more accurate than traditional forecasting techniques. It is a statistical model for analyzing and forecasting time series data. In particular, the ARIMA model has also been applied to detect patterns and analyze trends in household electricity consumption (daily, weekly, monthly and quarterly) [3].
In this paper, we apply the ARIMA model to forecast electricity consumption. Raw electricity consumption data from different manufacturing factories at Guangzhou in China in 2012 were collected for prediction [4]. The ultimate goal is to produce highly accurate predictions by estimating a reliable ARIMA model.
II. RELATED WORKS
Increasing electricity demand is a key issue nowadays. Prediction of electricity consumption is one of the vital elements for minimizing the waste of electricity. Various prediction approaches have been introduced for electricity consumption. In this section we briefly explain the existing prediction procedures. Early electricity consumption forecasting techniques include exponential smoothing, moving average and autoregressive models. Forecasting methods fall into three categories: grey prediction models, statistical analysis models and non-linear intelligent models. Non-linear models include the Support Vector Machine (SVM), Markov Chain and Artificial Neural Network (ANN) [5].
For precise prediction, GM(1,1) solutions are statistically comprehensive, but they do not provide satisfactory results in volatile applications [6].
Authorized licensed use limited to: Universidad de chile. Downloaded on May 21,2024 at 03:35:45 UTC from IEEE Xplore. Restrictions apply.
An ANN can still generate output when some of its training information is lost, and its performance then depends on the importance of the missing data. A suitable network structure is obtained through trial, experience and error, since no rule is specified for determining the structure of an ANN. Artificial neural networks need modification before they are used on time series data [7].
The hierarchical multi-matrices Markov model (HMM) is used when the direction of the next observed point is to be stated rather than forecast [8].
SVM, a supervised learning method, has been widely used in time series prediction problems, though it has not been broadly explored in seasonal time series forecasting. The standard SVM formulation solves only binary classification problems, so its output variables are limited to binary values [9].
III. METHODOLOGY
ARIMA is a model commonly used to forecast future values of time series data. Different settings of the ARIMA model are used as complementary methods for non-stationary data analysis.
In this paper, we use three ARIMA models with different sets of parameters to forecast electricity data. We define the ARIMA model with parameters (p, d, q), where p, d and q represent the number of autoregressive terms, the number of non-seasonal differences, and the number of lagged error values in the prediction, respectively.
The forecasting of electricity consumption consists of the following steps:
1. Visualize the time series data: It is important to visualize the electricity consumption data to understand the trend, seasonality or random behavior before developing a time series model.
2. Test stationary property by Augmented Dickey Fuller Test: The ARIMA(p, d, q) model works on stationary data. Therefore, after visualizing the electricity consumption data, the stationary property is tested with the Augmented Dickey Fuller (ADF) test. The ADF test is a unit root test whose null hypothesis is that a unit root is present in an autoregressive model. The existence of unit roots leads to unwanted results in time series analysis, which can cause inaccurate forecasting. The ADF test can handle more complex models than the traditional Dickey-Fuller test.
3. Stationarize the time series data: The dataset should be stationarized if the time series is not stationary. Three methods are widely used to make a time series stationary: detrending, seasonal adjustment and differencing.
Detrending is performed by regressing the series on a time-related trend and keeping the residuals.
Seasonality is a component, linear or nonlinear, that changes and repeats over time.
Differencing is generally used for transforming and stationarizing data. We use the differencing function to stationarize the electricity consumption data. Let consecutive consumption values be observed at time units t and (t-1). The function is expressed as
$$x(t) - x(t-1) = \mathrm{ARMA}(p, q) \qquad (1)$$
The differencing in equation 1 is called the Integration part in AR(I)MA. The three parameters are thus obtained: p for AR, d for I and q for MA.
4. Visualize stationary time series with ACF/PACF: We plot the ACF and PACF before estimating the ARIMA parameters. The Auto Correlation Function (ACF) shows lagged correlation, which is the correlation between the series and a copy of itself shifted in time. It is computed by shifting the processed series by one lag and computing the correlation, then shifting by one more lag and computing the correlation again, and so on. If the dataset is strongly seasonal, the peaks coincide with the seasonality period. Plotting the ACF helps to guide the selection of the moving average lags. This is a popular approach to visualize the trend of time series data. The partial autocorrelation function (PACF), a regression of the time series against its past lags, helps to find a likely order for the AR term. As in a standard linear regression, each term can be treated as the contribution of a change in that particular lag while holding the others constant. As a rule of thumb, the ACF confirms the trend and suggests possible values of the moving average parameters, and the PACF is used for the autoregressive part.
5. Estimate parameters for ARIMA model: Parameters need to be estimated to develop ARIMA models. The p, d and q values define the order of the ARIMA model. The ARIMA(p, d, q) model integrates the AR(p) and MA(q) models, where the PACF cuts off after lag p, the ACF cuts off after lag q, and d is the number of times the time series must be differenced.
The AR model depends on the lagged values of the data. AR(p) is an autoregressive model with p lags, in which particular lagged values of $y_t$ are the predictor variables.
The AR(p) model is defined by the equation:
$$y_t = \delta + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \epsilon_t \qquad (2)$$
Where
• $y_{t-1}, y_{t-2}, \dots, y_{t-p}$ are the past series values (lags)
• $\epsilon_t$ is white noise (i.e. randomness)
and $\delta$ is defined by the following equation:
$$\delta = \mu \left(1 - \sum_{i=1}^{p} \phi_i\right) \qquad (2.1)$$
where $\mu$ is the process mean.
A moving average model depends on the errors (residuals) of the previous forecasts. It uses past prediction errors in a regression-like model, and it is common for the parameters to be written with a negative sign. MA(q) is a moving average model defined by the equation:
$$y_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q} \qquad (3)$$
Where
• q is the order of the moving average model
• $\epsilon_{t-1}, \epsilon_{t-2}, \dots, \epsilon_{t-q}$ are the errors at previous time periods
• $\epsilon_t$ is white noise (i.e. randomness)
An ARMA model describes a weakly stationary stochastic time series in terms of two polynomials. The first and second of these polynomials are for the AR and the MA parts respectively. This model is stated as the ARMA(p, q) model.
Here,
• p denotes the order of the AR polynomial,
• q denotes the order of the MA polynomial.
The ARMA(p, q) model is defined by the equation:
$$X_t = c + \epsilon_t + \sum_{i=1}^{p} \phi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \epsilon_{t-i} \qquad (4)$$
Where
• φ = the autoregressive model’s parameters,
• θ = the moving average model’s parameters.
• c = a constant,
• ε = error terms (white noise)
The ARMA(p, q) and ARIMA(p, d, q) models have many resemblances: their AR and MA components are alike, combining a general autoregressive model AR(p) and a general moving average model MA(q). AR(p) uses previous values of the dependent variable to predict future information. MA(q), on the other hand, uses the series mean and previous errors to make predictions.
$$\Delta y_t = \sum_{i=1}^{p} a_i \Delta y_{t-i} + \sum_{i=1}^{q} b_i \epsilon_{t-i} \qquad (5)$$
6. Calculate AIC value: The Akaike Information Criterion (AIC) is broadly used to assess a statistical model. We compute the AIC to estimate the goodness of fit of a model. The model with the lower AIC is the better one.
7. Select best ARIMA model: Visualizing the ARIMA models is the most effective way to compare them and determine the best model. When multiple models have almost similar or only slightly different AIC values, plotting the ARIMA models reduces the confusion in selecting the best model.
By comparing the AIC values and the visualizations of the ARIMA models based on forecasting performance, the best ARIMA model is obtained.
8. Forecast time series data with the best model: The best ARIMA model with estimated parameters is used to forecast the future behavior of the time series data.
The forecasting process with the best ARIMA model is presented in Fig. 1.
[Flowchart boxes: Electricity consumption data; Stationary (No/Yes); Power Transformation; Differencing; ACF & PACF; Parameter Estimation for ARIMA; Forecasting consumption data; Selection of best Model]
Fig. 1. Model selection process by forecasting.
In this segment, experimental results of ARIMA(1,1,2), ARIMA(1,1,7) and ARIMA(1,1,1) are presented for both synthetic and real-world datasets. The datasets demonstrate the performance of these models in forecasting electricity consumption. We also show the ACF/PACF plots, AIC values, coefficient values and ARIMA model plots for both types of data. Finally, the experimental results are analyzed and discussed.
A. Datasets: In this experiment, we used an artificially generated (synthetic) dataset and a real-world electricity consumption dataset. We generated 250 random variates with a Gaussian distribution as artificial consumption values, which form the synthetic dataset. A sample of the synthetic dataset is presented in Table I.
The real dataset, electricity consumption data, contains the power consumption values of 21330 manufacturing factories in Guangdong province, China [1]. The electricity consumption values were taken every fifteen minutes from smart meters.
We used 96 electricity consumption records as load profile data where each load profile contains 500 consumption values as instances in January 2012. A sample of load profile data is presented in Table II.
TABLE I. A SAMPLE OF SYNTHETIC DATASETS
SL NO.   V1
1.       20.000000
2.       20.594485
3.       19.446299
4.       18.950653
5.       18.577397
6.       18.584372
7.       18.633751
8.       17.915348
9.       16.738053
10.      15.514705
TABLE II. A SAMPLE OF REAL DATASETS
SL No.   V2           V5
1.       2012-01-01   19.09
2.       2012-01-01   21.74
3.       2012-01-01   21.93
4.       2012-01-01   24.86
5.       2012-01-01   22.07
6.       2012-01-01   26.68
8.       2012-01-01   21.27
9.       2012-01-01   53.48
10.      2012-01-01   55.68
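A series with the general shape of the synthetic sample in Table I could be generated along the following lines. This is only a sketch: the paper states that 250 Gaussian random variates were used, but the random-walk construction, the starting level of 20, the step standard deviation, the seed and the function name `synthetic_consumption` are assumptions made for illustration. The paper's experiments were run in R; Python's standard library is used here to keep the example self-contained.

```python
import random


def synthetic_consumption(n=250, start=20.0, step_sd=1.0, seed=0):
    """Return n values: a start level followed by cumulative Gaussian steps.

    The random-walk form and all default values are illustrative assumptions,
    not the generator used in the paper.
    """
    rng = random.Random(seed)          # fixed seed so the series is reproducible
    values = [start]
    for _ in range(n - 1):
        values.append(values[-1] + rng.gauss(0.0, step_sd))
    return values


series = synthetic_consumption()
print(len(series), series[0])
```

A random walk of this kind is non-stationary by construction, which is consistent with the need to difference the synthetic data once before fitting.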
B. Experiment Setting: Three sets of experiments were conducted on both the synthetic and the real-world dataset. One forecast with the ARIMA(1,1,1) model and the other two with the ARIMA(1,1,2) and ARIMA(1,1,7) models. R and RStudio [13], [14] were used to construct the models [15], [17], [19]. We compared the forecasting plots produced by these three models.
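Alongside the forecast plots, candidate orders are ranked by AIC, with the lower value preferred. The sketch below illustrates that ranking on deliberately simpler models than the three used in the paper: a mean-only model and an AR(1) model, both fitted by least squares to an already differenced series (roughly ARIMA(0,1,0) versus ARIMA(1,1,0)). The least-squares form of AIC, the simulated input series and all function names are assumptions for illustration. R's own fitting routine uses likelihood-based estimation and will give different absolute AIC values.

```python
import math
import random


def aic_least_squares(rss, n, k):
    """AIC of a Gaussian model fitted by least squares, up to an additive constant."""
    return n * math.log(rss / n) + 2 * k


def compare_orders(w):
    """Fit a mean-only and an AR(1) model to the differenced series w; return their AICs."""
    x, y = w[:-1], w[1:]               # predictor w[t-1] and response w[t]
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Mean-only model: y[t] = c + e[t]. Parameters: c and the noise variance.
    rss_mean = sum((yi - mean_y) ** 2 for yi in y)
    # AR(1) model: y[t] = c + phi * y[t-1] + e[t]. Parameters: c, phi, variance.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    rss_ar1 = sum((yi - c - phi * xi) ** 2 for xi, yi in zip(x, y))
    return {
        "mean_only": aic_least_squares(rss_mean, n, 2),
        "ar1": aic_least_squares(rss_ar1, n, 3),
        "phi": phi,
    }


# Hypothetical differenced series with AR(1) structure (not the paper's data).
rng = random.Random(1)
w = [0.0]
for _ in range(249):
    w.append(0.6 * w[-1] + rng.gauss(0.0, 1.0))

result = compare_orders(w)
best = min(("mean_only", "ar1"), key=lambda name: result[name])
print(best, round(result["phi"], 2))
```

The penalty term 2k is what keeps a larger model from winning automatically: the extra coefficient must reduce the residual sum of squares enough to offset it.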
ARIMA modeling needs stationary datasets.
We plotted the synthetic time series and the real dataset in R to see whether they were already stationary.
From Figs. 2 and 3 we can see that neither the synthetic data nor the electricity consumption data is stationary.
So, we differenced the datasets to make them stationary before applying the ARIMA model. After differencing, both datasets became approximately stationary, which gives d = 1.
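The effect of first differencing can be checked numerically with the sample autocorrelation. The following sketch applies it to a simulated random walk standing in for the consumption data. The input series, the seed and the helper names `difference` and `acf` are assumptions for illustration, and the check is an informal one rather than the ADF test used in the paper.

```python
import random


def difference(series, d=1):
    """Apply first-order differencing d times: x(t) - x(t-1)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def acf(series, max_lag):
    """Sample autocorrelation of a series at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical random-walk series standing in for the consumption data.
rng = random.Random(0)
levels = [20.0]
for _ in range(249):
    levels.append(levels[-1] + rng.gauss(0.0, 1.0))

diffed = difference(levels, d=1)
# Lag-1 autocorrelation before and after differencing.
print(round(acf(levels, 1)[1], 2), round(acf(diffed, 1)[1], 2))
```

A lag-1 autocorrelation close to 1 that decays slowly is typical of a non-stationary series, while values near zero after differencing suggest that d = 1 is sufficient.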
stationary data, which is an easy and acceptable process. We developed ARIMA(1,1,7), ARIMA(1,1,2) and ARIMA(1,1,1) models, which are used to predict the future behavior of the synthetic and real electricity consumption data.
REFERENCES
[1] Bruce L. Bowerman, Richard T. O’Connell, and Anne B. Koehler, “Forecasting time series, and regression: an applied approach,” 4th ed. The United States of America: Thomson Brooks, 2005.
[2] S. Makridakis, S.C. Wheelwright and R.J. Hyndman, “Forecasting: Methods and Applications,” 3rd ed. Wiley, Inc., New York, 1998.
[3] UCI repository of machine learning database [Online].
[4] Md Abdul Masud, Joshua Zhexue Huang, Ming Zhong, and Xianghua Fu, “Cluster Survival Model of Concept Drift in Load Profile Data,” IEEE Access, vol. 6, 2018.
[5] Ning Xu, Yaoguo Dang, Yande Gong, “Novel grey prediction model with nonlinear optimized time response method for forecasting of electricity consumption in China,” Energy, 2016.
[6] Song Ding, Keith W. Hipel, Yao-guo Dang, “Forecasting China’s electricity consumption using a new grey prediction model,” Energy, 2018.
[7] Aowabin Rahman, Vivek Srikumar, Amanda D. Smith, “Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks,” Applied Energy, 2017.
[8] Yunyou Huang, Jianfeng Zhan, Chunjie Luo, Lei Wang, Nana Wang, Daoyi Zheng, Fanda Fan, Rui Ren, “An electricity consumption model for synthesizing scalable electricity load curves,” Energy, 2018.
[9] Gamze Oğcu, Omer F. Demirel, Selim Zaim, “Forecasting Electricity Consumption with Neural Networks and Support Vector Regression,” Procedia, 2012.
[10] Antonio J. Conejo, Miguel A. Plazas, Rosa Espinola, and Ana B. Molina, “Day-ahead electricity price forecasting using the wavelet transform and ARIMA models,” IEEE Trans. Power Syst., vol. 20, no. 2, 2005, pp. 1035-1042.
[11] G.E.P. Box and G. Jenkins, “Time Series Analysis Forecasting and Control,” Holden-Day, San Francisco, CA, 1976.
[12] Samer Saab, Elie Badr and George Nasr, “Univariate modeling and forecasting of energy consumption: the case of electricity in Lebanon,” Energy, vol. 26, 2001, pp. 1-14.
[13] R-project [Online].
[14] R-studio [Online].
[15] Jonathan D. Cryer and Kung-Sik Chan, “Time series analysis: with applications in R,” 2nd ed. New York: Springer, 2008.
[16] Volkan S. Ediger, Sertac Akar, “ARIMA forecasting of primary energy demand by fuel in Turkey,” Energy Policy, vol. 35, 2007, pp. 1701-1708.
[17] Robert H. Shumway and David S. Stoffer, “Time Series Analysis and Its Applications with R Examples,” 3rd ed. New York: Springer, 2011.
[18] Qing Zhu, Yujing Guo, Genfu Feng, “Household energy consumption in China forecasting with BVAR model up to 2015,” 2012 Fifth International Joint Conference on Computational Sciences and Optimization, 2012.
[19] Oleg Nenadic, Walter Zucchini, “Statistical Analysis with R – a quick start -,” Retrieved November 10, 2012.
[20] Javier Contreras, Rosario Espinola, Francisco J. Nogales, and Antonio J. Conejo, “ARIMA models to predict next-day electricity prices,” IEEE Transactions on Power Systems, vol. 18, no. 3, 2003, pp. 1014-1020.