Forecasting Electricity Consumption Using ARIMA Model

This document discusses using ARIMA models to forecast electricity consumption. It introduces ARIMA models and compares different ARIMA models fitted on real electricity consumption data to determine the best model for accurate prediction. The experimental results show that the ARIMA(1,1,1) model provides high precision and stable predictions, making it suitable for predicting electricity consumption.


2019 International Conference on Sustainable Technologies for Industry 4.0 (STI), 24-25 December, Dhaka

Forecasting Electricity Consumption using ARIMA Model
Farjana Mahia†, Arpita Rani Dey*, Md Abdul Masud‡, and Mohammad Sultan Mahmud‡
Faculty of Computer Science and Engineering
Patuakhali Science and Technology University, Patuakhali, Bangladesh
[email protected]†, [email protected]*, [email protected]‡ and [email protected]

Abstract— Autoregressive integrated moving average (ARIMA) is a popular technique for fitting time series data for prediction and forecasting. This paper proposes ARIMA models with different sets of parameters for forecasting electricity consumption. Three ARIMA models are investigated to find a robust and reliable model that provides the required level of forecasting performance. The best fitted model, an effective and reliable approach, and the network structure are determined according to the prediction performance. For this purpose, we use a synthetic dataset and electricity consumption data from industries in Guangdong province, China. The experimental results show that ARIMA(1,1,1) gives high-precision, stable predictions and is suitable for predicting electricity consumption. The forecasting results are essential for managing the required electricity demand in various kinds of industries and other sectors.

Keywords—Auto Correlation Function, Akaike Information Criterion, Partial Auto Correlation Function, ARIMA

I. INTRODUCTION
Electricity is a fundamental necessity in daily life. Energy has become a core component of a country's social and economic development. Storing electric power is largely impractical, and demand can change dramatically in space and time across different sectors. Forecasting electricity consumption is therefore an essential issue for utility owners, power system operators, energy planners and system managers. Prediction methods are chosen by considering factors including the size of the time series, the prediction interval, and the prediction period [1]. Over the last several decades, various methods have been used to predict future electricity consumption accurately. Time series data has four components: trend (long-term direction), seasonal effect (systematic, calendar-related movements), cyclical effect, and irregular effect (unsystematic, short-term fluctuations) [2]. ARIMA is a core forecasting technique for predicting the future electric power production needed to meet future energy demand. The predicted figure helps to determine the budget and how much electricity should be produced for various sectors, including agricultural, transportation, residential and commercial. Forecasting predicts future information by considering previous and present data and analyzing their trends. ARIMA models form an important class of models that can be applied to many real applications, and they are derived from the autoregressive moving average (ARMA) model. Fitting different ARIMA models to a real dataset and comparing them to determine the best one gives highly accurate and stable predictions. Electricity forecasting is a challenging task: consumption cannot be predicted with 100% accuracy, because it depends on factors that vary across sectors, areas, industries, etc. For any sector, many attributes can be chosen for detecting and predicting the consumption of an area. The ARIMA model is more accurate than traditional forecasting techniques. It is a statistical model for analyzing and forecasting time series data. In particular, the ARIMA model has also been applied to detect patterns and analyze trends in household electricity consumption (daily, weekly, monthly and quarterly) [3].

In this paper, we apply the ARIMA model to forecast electricity consumption. Raw electricity consumption data from different manufacturing factories at Guangzhou, China, in 2012 were collected for prediction [4]. The ultimate goal is to obtain highly accurate predictions by estimating a reliable ARIMA model.

II. RELATED WORKS
Increasing electricity demand is a key issue nowadays. Prediction of electricity consumption is one of the vital elements in minimizing the waste of electricity, and various prediction approaches have been introduced for it. In this section we briefly explain the existing prediction procedures. Early methods of electricity consumption forecasting include exponential smoothing models, moving average models, autoregressive models, etc. Forecasting methods fall into three categories: grey prediction models, statistical analysis models and non-linear intelligent models. Non-linear models include the Support Vector Machine (SVM), Markov Chain and Artificial Neural Network (ANN) [5].
For precise prediction, GM(1,1) solutions are statistically comprehensive, but they do not give satisfactory results in applications with highly volatile data [6].

978-1-7281-6099-3/19/$31.00 ©2019 IEEE

Authorized licensed use limited to: Universidad de chile. Downloaded on May 21,2024 at 03:35:45 UTC from IEEE Xplore. Restrictions apply.
The data in ANN training may generate output even with lost information, and the performance level depends on the importance of the incomplete data. An apposite network structure is obtained through trial, experience and error, since no rule is specified for determining the structure of an ANN. Artificial neural networks need modification before they are used on time series data [7].
The hierarchical multi-matrices Markov (HMM) model is used when the direction of the next observed point is to be stated rather than forecast [8].
SVM, a supervised learning method, has been widely used in time series prediction problems, though it has not been broadly explored in seasonal time series forecasting. The standard SVM formulation solves only binary classification problems, and its output variables are limited to binary values [9].

III. METHODOLOGY
ARIMA is a model commonly used to forecast future information from time series data. Different settings of the ARIMA model are used as complementary methods for non-stationary data analysis. In this paper, we use three ARIMA models with different sets of parameters to forecast electricity data. We define the ARIMA model with parameters (p, d, q), where p, d and q represent the number of autoregressive terms, the number of non-seasonal differences, and the number of lagged error values in the prediction, respectively.
The forecasting of electricity consumption consists of the following steps:
1. Visualize the time series data: It is important to visualize the electricity consumption data to understand its trends, seasonality or random behavior before developing a time series model.
2. Test stationary property by Augmented Dickey Fuller Test: The ARIMA(p, d, q) model works on stationary data. Therefore, after visualizing the electricity consumption data, the stationary property is tested with the Augmented Dickey Fuller (ADF) test. The ADF test is a statistical test whose null hypothesis is that a unit root is present in an autoregressive model. The existence of unit roots leads to unwanted results in time series analysis, which can cause inaccurate forecasting. The ADF test can test the stationary property and handle more complex models than the traditional Dickey-Fuller test.
3. Stationarize the time series data: The dataset should be stationarized if the time series is not stationary. Three methods are widely used to make a time series stationary: detrending, seasonality and differencing.
Detrending is performed by using regression analysis on a time-related trend and identifying the residuals.
Seasonality concerns a linear or nonlinear component that changes and repeats over time.
Differencing is generally used for transforming and stationarizing data. We use a differencing function to stationarize the electricity consumption data. Let the consecutive consumption values be denoted at time units t and (t-1). This function is expressed as
$$x(t) - x(t-1) = \mathrm{ARMA}(p, q) \qquad (1)$$
The difference in equation (1) is called the Integration part in AR(I)MA. The three parameters are thus obtained: p for AR, d for I and q for MA.
4. Visualize stationary time series with ACF/PACF: We plot the ACF and PACF before estimating the ARIMA parameters. The Auto Correlation Function (ACF) shows lagged correlation, which is the correlation between the series and a lagged copy of itself. It is computed by shifting the processed series back by one lag and computing the correlation, shifting by one more lag and computing the correlation again, and so on. If the dataset is strongly seasonal, peaks coincide with the seasonality period. Plotting the ACF helps to guide the selection of moving average lags, and it is a popular way to visualize the trend of time series data. The partial autocorrelation function (PACF), a regression of the time series against its past lags, helps to find a likely order for the AR term. As in a standard linear regression, each term can be treated as the contribution of a change in that particular lag while holding the others constant. As a rule of thumb, the ACF confirms trend and suggests possible values of the moving average parameters, and the PACF does the same for the autoregressive part.
5. Estimate parameters for ARIMA model: Parameters need to be estimated to develop ARIMA models. The p, d and q values define the order of the ARIMA model. The ARIMA(p, d, q) model integrates the AR(p) and MA(q) models, where the PACF cuts off after lag p, the ACF cuts off after lag q, and d is the number of times the time series must be differenced.
The AR model depends on the lagged values of the data. We define AR(p) as an autoregressive model with p lags, in which particular lagged values of $y_t$ are the predictor variables.
The AR(p) model is defined by the equation:
$$y_t = \delta + \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \dots + \varphi_p y_{t-p} + \epsilon_t \qquad (2)$$
Where
• $y_{t-1}, y_{t-2}, \dots, y_{t-p}$ are the past series values (lags)
• $\epsilon_t$ is white noise (i.e. randomness)
and $\delta$ is defined by the following equation:
$$\delta = \mu \left(1 - \sum_{i=1}^{p} \varphi_i\right) \qquad (2.1)$$
where $\mu$ is the process mean.
A moving average model depends on the errors (residuals) of the previous forecasts. It uses past prediction errors in a regression-like model, and it is common for the parameters to have a negative sign. MA(q) is a moving average model defined by the equation:
$$y_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q} \qquad (3)$$
Where
• q is the order of the moving average
• $\epsilon_{t-1}, \epsilon_{t-2}, \dots, \epsilon_{t-q}$ are the errors at previous time periods
• $\epsilon_t$ is white noise (i.e. randomness)
An ARMA model describes a weakly stationary stochastic time series in terms of two polynomials. The first and second of these polynomials are for the AR and the MA parts respectively. This model is stated as the ARMA(p, q) model.
Here,
• p denotes the order of the AR polynomial,
• q denotes the order of the MA polynomial.
The ARMA(p, q) model is defined by the equation:
$$X_t = c + \epsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \epsilon_{t-i} \qquad (4)$$
where $\varphi_i$ are the autoregressive model's parameters and $\theta_i$ are the moving average model's parameters.
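As a small numerical check of equations (2) and (2.1) and of the ACF described in step 4, the following sketch simulates an AR(1) series and computes its sample autocorrelation. It is only an illustration in plain Python (the paper's own experiments were run in R); the function names, the seed, and the values of phi, mu and sigma are assumptions chosen for the example, not settings from the paper.

```python
import random


def simulate_ar1(phi, mu, n, sigma=1.0, seed=0):
    """Simulate y_t = delta + phi * y_{t-1} + eps_t with delta = mu * (1 - phi)."""
    rng = random.Random(seed)          # fixed seed: an assumption, for repeatability
    delta = mu * (1.0 - phi)           # equation (2.1) with p = 1
    y = [mu]                           # start the series at the process mean
    for _ in range(n - 1):
        eps = rng.gauss(0.0, sigma)    # white-noise term
        y.append(delta + phi * y[-1] + eps)
    return y


def sample_acf(series, max_lag):
    """Sample autocorrelation of a series with itself at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)       # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


y = simulate_ar1(phi=0.6, mu=20.0, n=5000)
acf = sample_acf(y, max_lag=3)
print(round(sum(y) / len(y), 2))       # sample mean, expected to be close to mu
print([round(r, 2) for r in acf])      # lag 0 is 1; lag 1 should be close to phi
```

With this choice of delta the simulated series fluctuates around the process mean, and for an AR(1) process the lag-1 autocorrelation approximates phi, which is the kind of pattern the ACF plot is used to read off.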

In equation (4), $c$ is a constant and $\epsilon$ denotes the error terms (white noise).
ARMA(p, q) and ARIMA(p, d, q) models have many resemblances: their AR and MA components are alike, combining a general autoregressive model AR(p) and a general moving average model MA(q). AR(p) uses previous values of the dependent variable to predict future information. On the other hand, MA(q) uses the series mean and previous errors to make predictions.
$$\Delta y_t = \sum_{i=1}^{p} a_i \Delta y_{t-i} + \sum_{i=1}^{q} b_i \epsilon_{t-i} + \epsilon_t \qquad (5)$$
6. Calculate AIC value: The Akaike Information Criterion (AIC) is broadly used to assess a statistical model. We compute the AIC to estimate the goodness of fit of a model. The model with the lower AIC is better than the others.
7. Select best ARIMA model: Visualizing the ARIMA models is the most effective way to compare them and determine the best one. When multiple models have almost similar or only slightly different AIC values, plotting the ARIMA models reduces the confusion in selecting the best model.
By comparing the AIC values and the visualizations of the ARIMA models based on forecasting performance, the best ARIMA model is obtained.
8. Forecast time series data with the best model: The best ARIMA model with estimated parameters is used to forecast the future behavior of the time series data.
The forecasting process with the best ARIMA model is presented in Fig. 1.

[Flowchart: electricity consumption data; stationarity check; if not stationary, power transformation and differencing, then check again; if stationary, ACF and PACF; parameter estimation for ARIMA; forecasting consumption data; selection of best model]
Fig. 1. Model selection process by forecasting.

IV. EXPERIMENTS AND RESULT DISCUSSION

In this section, experimental results of ARIMA(1,1,2), ARIMA(1,1,7) and ARIMA(1,1,1) are presented for both synthetic and real-world datasets. The datasets demonstrate the performance of these models in forecasting electricity consumption. We also show the ACF/PACF plots, AIC values, coefficient values and ARIMA model plots for both types of data. Finally, the experimental results are analyzed and discussed.

A. Datasets: In this experiment, we used an artificially generated (synthetic) dataset and a real-world electricity consumption dataset. We generated 250 random variates with a Gaussian distribution as artificial consumption values, which we treat as the synthetic dataset. A sample from the synthetic dataset is presented in Table I.
The real dataset contains the power consumption values of 21330 manufacturing factories in Guangdong province, China [4]. The electricity consumption values were taken every fifteen minutes from smart meters.
We used 96 electricity consumption records as load profile data, where each load profile contains 500 consumption values as instances in January 2012. A sample of the load profile data is presented in Table II.

TABLE I. A SAMPLE OF SYNTHETIC DATASETS

SL No.   V1
1.       20.000000
2.       20.594485
3.       19.446299
4.       18.950653
5.       18.577397
6.       18.584372
7.       18.633751
8.       17.915348
9.       16.738053
10.      15.514705

TABLE II. A SAMPLE OF REAL DATASETS

SL No.   V2           V5
1.       2012-01-01   19.09
2.       2012-01-01   21.74
3.       2012-01-01   21.93
4.       2012-01-01   24.86
5.       2012-01-01   22.07
6.       2012-01-01   26.68
7.       2012-01-01   19.96
8.       2012-01-01   21.27
9.       2012-01-01   53.48
10.      2012-01-01   55.68
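A synthetic series of the kind sampled in Table I could be generated along the following lines. This is a hedged sketch only: the paper states that 250 Gaussian random variates were generated but does not give the procedure, so the accumulation of Gaussian steps from a starting value of 20.0 (suggested by the drifting values in Table I), the step scale, the seed and the function name are all assumptions.

```python
import random


def make_synthetic_series(n=250, start=20.0, sigma=1.0, seed=42):
    """Build n artificial consumption values from Gaussian steps.

    Assumption: values drift from `start` by accumulating N(0, sigma) steps;
    the paper does not specify how its variates were turned into a series.
    """
    rng = random.Random(seed)              # fixed seed so the series is repeatable
    series = [start]
    while len(series) < n:
        step = rng.gauss(0.0, sigma)       # one Gaussian random variate
        series.append(series[-1] + step)   # accumulate into a drifting level
    return series


synthetic = make_synthetic_series()
print(len(synthetic), synthetic[0])
```

A series built this way is non-stationary by construction, which makes it a convenient test input for the differencing step.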

B. Experiment Setting: Three sets of experiments were conducted on both the synthetic and the real-world datasets. One was to forecast with the ARIMA(1,1,1) model and the other two with the ARIMA(1,1,2) and ARIMA(1,1,7) models. R and RStudio [13], [14] were used to construct the models [15], [17], [19]. We compared the forecasting plots produced by these three models.
ARIMA modeling needs stationary datasets. We plotted the synthetic time series and the real dataset in R to see whether each dataset was already stationary.
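Alongside a plot, a rough numerical hint of non-stationarity is a mean that shifts between the first and second half of a series, and first differencing as in equation (1) removes a linear drift. The sketch below illustrates this in plain Python rather than R; the helper names and the toy trending series are hypothetical, and this informal comparison is not a substitute for the ADF test.

```python
import statistics


def difference(series, d=1):
    """Apply first differencing x(t) - x(t-1) to the series d times."""
    out = list(series)
    for _ in range(d):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


def half_means(series):
    """Mean of the first and second half; a large gap hints at a trend."""
    mid = len(series) // 2
    return statistics.fmean(series[:mid]), statistics.fmean(series[mid:])


# Toy series with a linear trend (an assumption, not the paper's data).
trend = [20.0 + 0.5 * t for t in range(100)]
print(half_means(trend))               # the two halves have different means
print(half_means(difference(trend)))   # after differencing the means agree
```

Each round of differencing shortens the series by one value, so a series of length n differenced d times has n - d values.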

Fig. 2. A time series presentation of synthetic data.

Fig. 3. A time series presentation of electricity consumption data

From Figs. 2 and 3 we can see that the synthetic and the electricity consumption data are not stationary. So, we differenced the datasets to make them stationary before applying the ARIMA model. After differencing, both datasets became reasonably stationary, which gives d = 1.

Fig. 4. A time series presentation of differenced synthetic data

Fig. 5. A time series presentation of differenced electricity consumption data

The ARIMA(p, d, q) model integrates the AR(p) and MA(q) models, where the PACF and ACF cut off after lag p and lag q respectively. ACF and PACF plots are needed to estimate the parameters p and q of the ARIMA model. In Figs. 6 and 7, the significance thresholds are represented by horizontal blue dashed lines, and the vertical lines that exceed them are considered significant.

Fig. 6. ACF of synthetic data

Fig. 7. PACF of synthetic data

C. Experimental Results Discussion and Analysis: First, we present the forecasting performance of the three models for both the synthetic and the real electricity consumption datasets in the plots. Then we present the comparison of the three ARIMA models for those datasets.
The AIC value is shown because it combines the goodness of fit and the simplicity of the model into a single statistic. The model with the lower AIC is generally considered the "better" forecasting model. We obtain nearly similar AIC values with ARIMA(1,1,2), ARIMA(1,1,7) and ARIMA(1,1,1) on the synthetic dataset. Similarly, we obtain AIC values of 3944.35, 3946.5 and 3945.14 with ARIMA(1,1,7), ARIMA(1,1,1) and ARIMA(1,1,2) respectively on the electricity consumption data. Because these values are so close, the AIC alone does not separate the models clearly.
Therefore, we consider the visualization approach for selecting the best model. Visualization is used to forecast the stationary data, which is an easy and acceptable process. We develop ARIMA(1,1,7), ARIMA(1,1,2) and ARIMA(1,1,1) models, which are used to predict the future behavior of the synthetic and real electricity consumption data.
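The AIC comparison for the electricity consumption data can be checked with a few lines of arithmetic. The sketch below uses the standard definition AIC = 2k - 2 ln(L) and the three values reported in the text; the function and variable names are illustrative, and the log-likelihoods and parameter counts behind the reported values are not given in the paper, so they are not reconstructed here.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L), lower is better."""
    return 2 * n_params - 2 * log_likelihood


# AIC values as reported in the text for the electricity consumption data.
reported_aic = {
    "ARIMA(1,1,7)": 3944.35,
    "ARIMA(1,1,1)": 3946.5,
    "ARIMA(1,1,2)": 3945.14,
}

best = min(reported_aic, key=reported_aic.get)   # model with the lowest AIC
spread = max(reported_aic.values()) - min(reported_aic.values())
print(best, round(spread, 2))
```

On these numbers the lowest AIC belongs to ARIMA(1,1,7), but the whole spread is only about 2.15 on values near 3945, which is consistent with the decision to compare the forecast plots rather than rely on AIC alone.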

Fig. 8. Forecasting with ARIMA(1,1,7) on synthetic data

Fig. 9. Forecasting with ARIMA(1,1,2) on synthetic data

Fig. 10. Forecasting with ARIMA(1,1,1) on synthetic data

From Figs. 8, 9 and 10 we observe that the ARIMA(1,1,1) forecast aligns very well with the true values (blue line) and performs better than the others on the synthetic dataset.

Fig. 11. Forecasting with ARIMA(1,1,7) on electricity consumption data

Fig. 12. Forecasting with ARIMA(1,1,2) on electricity consumption data

Fig. 13. Forecasting with ARIMA(1,1,1) on electricity consumption data

Similarly, from Figs. 11, 12 and 13 we see that the ARIMA(1,1,1) model forecasts electricity consumption more accurately than the other models on the real-world electricity consumption dataset.

We notice that the forecasting curve generated by the ARIMA(1,1,1) model is close to the curve of the original data on both the synthetic and the real datasets. Therefore, the forecast of electricity consumption provided by the ARIMA(1,1,1) model is more accurate than the others.
V. RELATED TO INDUSTRY 4.0
Industry 4.0 represents the fourth industrial revolution and a significant transformation in the manufacturing sector. After the first industrial revolution, electricity (the second revolution) brought significant change, and the current transformation involves adapting machine learning approaches in autonomous systems. The volume and diversity of electricity consumption data make it big data. Industry sectors want to forecast electricity consumption for their future productivity, and power providers need to predict the consumption demand of their clients. Forecasting helps them respond to unique cases, execute changes today, prepare for the future, and improve how they manage demand. Therefore, forecasting electricity consumption time series data can contribute to sustainable technologies for Industry 4.0.
VI. CONCLUSION AND FUTURE WORKS
This paper focuses on forecasting time series data with several settings of ARIMA models. The best model, with estimated parameters, is selected based on prediction performance. The prediction result indicates the electricity demand of consumers and offers power providers an opportunity to manage their electric power in different industrial sectors.
The current work can be extended by developing robust and reliable models for complex time series datasets.

REFERENCES

[1] Bruce L. Bowerman, Richard T. O'Connell, & Anne B. Koehler, "Forecasting time series, and regression: an applied approach," 4th ed. The United States of America: Thomson Brooks, 2005.
[2] Makridakis, S., S.C. Wheelwright and R.J. Hyndman, "Forecasting: Methods and Applications," 3rd ed. Wiley, Inc., New York, 1998.
[3] UCI repository of machine learning database [Online].
[4] Md Abdul Masud, Joshua Zhexue Huang, Ming Zhong, and Xianghua Fu, "Cluster Survival Model of Concept Drift in Load Profile Data," IEEE Access, vol. 6, 2018.
[5] Ning Xu, Yaoguo Yande Gong. Novel grey prediction model with nonlinear optimized time response method for forecasting of electricity consumption in China. Energy 2016.
[6] Song Ding, Keith W. Hipel, Yao-guo Dang. Forecasting China's electricity consumption using a new grey prediction model. Energy 2018.
[7] Aowabin Rahman, Vivek Srikumar, Amanda D. Smith. Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks. Applied Energy 2017.
[8] Yunyou Huang, Jianfeng Zhan, Chunjie Luo, Lei Wang, Nana Wang, Daoyi Zheng, Fanda Fan, Rui Ren. An electricity consumption model for synthesizing scalable electricity load curves. Energy 2018.
[9] Gamze Oğcu, Omer F. Demirel, Selim Zaim. Forecasting Electricity Consumption with Neural Networks and Support Vector Regression. Procedia 2012.
[10] Antonio J. Conejo, Miguel A. Plazas, Rosa Espinola, and Ana B. Molina, "Day-ahead electricity price forecasting using the wavelet transform and ARIMA models," IEEE Trans. Power Syst., vol. 20, no. 2, 2005, pp. 1035-1042.
[11] Box, G.E.P. and G. Jenkins, "Time Series Analysis Forecasting and Control," Holden-Day, San Francisco, CA, 1976.
[12] Samer Saab, Elie Badr and George Nasr, "Univariate modeling and forecasting of energy consumption: the case of electricity in Lebanon," Energy, vol. 26, 2001, pp. 1-14.
[13] R-project [Online].
[14] R-studio [Online].
[15] Jonathan D. Cryer & Kung-Sik Chan, "Time series analysis: with applications in R," 2nd ed. New York: Springer, 2008.
[16] Volkan S. Ediger, Sertac Aktar, "ARIMA forecasting of primary energy demand by fuel in Turkey," Energy Policy, vol. 35, 2007, pp. 1701-1708.
[17] Robert H. Shumway & David S. Stoffer, "Time Series Analysis and Its Applications with R Examples," 3rd ed. New York: Springer, 2011.
[18] Qing Zhu, Yujing Guo, Genfu Feng, "Household energy consumption in China forecasting with BVAR model up to 2015," 2012 Fifth International Joint Conference on Computational Sciences and Optimization, 2012.
[19] Oleg Nenadic, Walter Zucchini, "Statistical Analysis with R – a quick start -," Retrieved November 10, 2012.
[20] Javier Contreras, Rosario Espinola, Francisco J. Nogales, and Antonio J. Conejo, "ARIMA models to predict next-day electricity prices," IEEE Transactions on Power Systems, 2003, vol. 18, no. 3, pp. 1014-1020.
