
© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Building Energy Load Forecasting using Deep Neural Networks
Daniel L. Marino, Kasun Amarasinghe, Milos Manic
Department of Computer Science
Virginia Commonwealth University
Richmond, Virginia
[email protected], [email protected], [email protected]

Abstract—Ensuring sustainability demands more efficient energy management with minimized energy wastage. Therefore, the power grid of the future should provide an unprecedented level of flexibility in energy management. To that end, intelligent decision making requires accurate predictions of future energy demand/load, both at aggregate and individual site level. Thus, energy load forecasting has received increased attention in the recent past; however, it has proven to be a difficult problem. This paper presents a novel energy load forecasting methodology based on Deep Neural Networks, specifically Long Short Term Memory (LSTM) algorithms. The presented work investigates two variants of the LSTM: 1) standard LSTM and 2) LSTM-based Sequence to Sequence (S2S) architecture. Both methods were implemented on a benchmark data set of electricity consumption data from one residential customer. Both architectures were trained and tested on one-hour and one-minute time-step resolution datasets. Experimental results showed that the standard LSTM failed at one-minute resolution data while performing well on one-hour resolution data. It was shown that the S2S architecture performed well on both datasets. Further, it was shown that the presented methods produced results comparable to the other deep learning methods for energy forecasting in the literature.

Keywords—Deep Learning; Deep Neural Networks; Long Short Term Memory; LSTM; Energy; Building Energy; Energy Load Forecasting

I. INTRODUCTION

Buildings are identified as a major energy consumer worldwide, accounting for 20%-40% of the total energy production [1]-[3]. In addition to being a major energy consumer, buildings are shown to account for a significant portion of energy wastage as well [4]. As energy wastage poses a threat to sustainability, making buildings energy efficient is extremely crucial. Therefore, in making building energy consumption more efficient, it is necessary to have accurate predictions of its future energy consumption.

At the grid level, to minimize energy wastage and make power generation and distribution more efficient, the power grid is moving to a new paradigm of smart grids [5], [6]. Smart grids promise unprecedented flexibility in energy generation and distribution [7]. In order to provide that flexibility, the power grid has to be able to dynamically adapt to changes in demand and efficiently distribute the energy generated from various sources such as renewables [8]. Therefore, intelligent control decisions should be made continuously at the aggregate level as well as at the modular level in the grid. In achieving that goal and ensuring the reliability of the grid, the ability to forecast future demand is important [6], [9].

Further, demand or load forecasting is crucial for mitigating the uncertainties of the future [6]. In that, individual building level demand forecasting is as crucial as forecasting aggregate loads. In terms of demand response, building level forecasting helps carry out demand response locally, since smart grids incorporate distributed energy generation [6]. The advent of smart meters has made the acquisition of energy consumption data at building and individual site level feasible, making data driven and statistical forecasting models possible [7].

Aggregate level and building level load forecasting can be viewed in three different categories: 1) short-term, 2) medium-term and 3) long-term [6]. It has been determined that load forecasting is a hard problem, and that individual building level load forecasting is even harder than aggregate load forecasting [6], [10]. Thus, it has received increased attention from researchers. In the literature, two main methods can be found for performing energy load forecasting: 1) physics principles based models and 2) statistical and machine learning based models. The focus of the presented work is on the second category of statistical load forecasting. In [7], the authors used Artificial Neural Network (ANN) ensembles to perform building level load forecasting. ANNs have been explored in detail for all three categories of load forecasting [9], [11]-[13]. In [14], the authors use a support vector machine based regression model coupled with empirical mode decomposition for long-term load forecasting. In [15], electricity demand is forecast using a kernel based multi-task learning methodology. In [10], the authors model individual household electricity loads using sparse coding to perform medium-term load forecasting. In the interest of brevity, not all methods in the literature are introduced in this paper. For surveys of the different techniques used for load forecasting, readers are referred to [16], [17] and [8]. Despite the extensive research carried out in the area, individual site level load forecasting remains a difficult problem.

Therefore, the work presented in this paper investigates a deep learning based methodology for performing individual building level load forecasting. Deep learning allows models composed of multiple layers to learn representations in data. The use of multiple layers allows the learning process to be carried out with multiple levels of abstraction. A comprehensive overview and review of deep learning methodologies can be found in [18].
Accepted version of the paper appearing in Proceedings of the 42nd Annual Conference of the IEEE Industrial Electronics Society (IECON), 2016.
Fig. 1. (a) LSTM cell, (b) multilayer LSTM architecture, (c) architecture used for load forecasting

In previous work on load forecasting using deep learning, the authors of [6] explore Conditional Restricted Boltzmann Machines (CRBM) [19] and Factored Conditional Restricted Boltzmann Machines (FCRBM) [20] for building level load forecasting. The authors compare the two methods to several traditional methods including Support Vector Machines and Artificial Neural Networks, and conclude that the FCRBM method outperforms the other tested methodologies.

In this work, the effectiveness of a different deep learning technique is explored for performing building level forecasting. The presented methodology uses the Long Short Term Memory (LSTM) algorithm. The presented work investigates two variations of the LSTM: 1) load forecasting using standard LSTM and 2) load forecasting using an LSTM based Sequence to Sequence (S2S) architecture. Both methodologies are tested on a benchmark dataset containing electricity consumption data for a single residential customer at one-minute and one-hour time resolutions. In order to compare the results, the same dataset used in [6] is used. Experimental results show that the LSTM based S2S architecture performs well on both types of datasets, while the standard LSTM fails to perform well on the one-minute resolution data. Further, the LSTM based algorithms manage to produce results comparable to the FCRBM and the CRBM in [6].

The rest of the paper is organized as follows. Section II provides background on the Long Short Term Memory algorithm. Section III elaborates load forecasting using the standard LSTM and the LSTM based S2S architecture. Section IV describes the dataset and the experimental results. Finally, Section V concludes the paper.

II. LONG SHORT TERM MEMORY

This section provides a brief introduction to the Long Short Term Memory (LSTM) algorithm.

Recurrent Neural Networks (RNN) are usually trained using either the Back-propagation Through Time [21] or the Real-Time Recurrent Learning [22] algorithm. Training with these methods often fails because of vanishing/exploding gradients. LSTM [23] is a recurrent neural network that was specifically designed to overcome the vanishing gradient problem, providing a model that is able to store information for long periods of time.

An LSTM network is comprised of memory cells with self-loops, as shown in Fig. 1(a). The self-loop allows the cell to store temporal information encoded in the cell's state. The flow of information through the network is handled by writing, erasing and reading from the cell's memory state. These operations are handled by three gates respectively: 1) the input gate, 2) the forget gate and 3) the output gate. Equations (1.a) through (1.f) express a single LSTM cell's operation:

i_g = sigm(i[t] W_ix + o[t-1] W_im + b_i)    (1.a)
f_g = sigm(i[t] W_fx + o[t-1] W_fm + b_f)    (1.b)
o_g = sigm(i[t] W_ox + o[t-1] W_om + b_o)    (1.c)
u = tanh(i[t] W_ux + o[t-1] W_um + b_u)      (1.d)
x[t] = f_g ∘ x[t-1] + i_g ∘ u                (1.e)
o[t] = o_g ∘ tanh(x[t])                      (1.f)

where i_g corresponds to the input gate, f_g to the forget gate and o_g to the output gate. x[t] is the value of the state at time step t, o[t] is the output of the cell and u is the update signal.

The input gate decides whether the update signal should modify the memory state or not. This is done by using a sigmoid function as a "soft" switch, whose on/off state depends on the current input and the previous output (Eq. 1.a). If the value of the input gate (i_g) is close to zero, the update signal is multiplied by zero, and therefore the state will not be affected by the update (Eq. 1.e). The forget and output gates work in a similar manner.
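To make the gating operations concrete, the following is a minimal NumPy sketch of a single cell step implementing Eqs. (1.a) through (1.f); the weight/bias layout and the helper names are illustrative assumptions, not the implementation used in the paper.

```python
import numpy as np

def sigm(z):
    # logistic sigmoid, used as the "soft" gate switch
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(i_t, o_prev, x_prev, W, b):
    """Single LSTM cell step implementing Eqs. (1.a)-(1.f).

    i_t    : input vector at time t
    o_prev : previous cell output o[t-1]
    x_prev : previous cell state  x[t-1]
    W, b   : dicts of weight matrices / bias vectors per gate (assumed layout)
    """
    ig = sigm(i_t @ W["ix"] + o_prev @ W["im"] + b["i"])    # input gate   (1.a)
    fg = sigm(i_t @ W["fx"] + o_prev @ W["fm"] + b["f"])    # forget gate  (1.b)
    og = sigm(i_t @ W["ox"] + o_prev @ W["om"] + b["o"])    # output gate  (1.c)
    u = np.tanh(i_t @ W["ux"] + o_prev @ W["um"] + b["u"])  # update       (1.d)
    x_t = fg * x_prev + ig * u                              # new state    (1.e)
    o_t = og * np.tanh(x_t)                                 # cell output  (1.f)
    return o_t, x_t
```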
LSTM cells can be stacked in a multi-layer architecture to construct a network similar to the one shown in Fig. 1(b). The architecture shown in Fig. 1(b) is generally used to predict an outcome ŷ[t] at time t ∈ ℕ, given the set of all previous inputs {i[0], i[1], ..., i[t]}.

III. LOAD FORECASTING USING DEEP NEURAL NETWORKS

This section elaborates the presented methodologies of Deep Neural Networks based load forecasting. The presented work investigates two variants of the LSTM algorithm for load forecasting. This section first discusses the standard LSTM based load forecasting methodology and then elaborates the LSTM based S2S architecture.
Fig. 2. (a) Use of the LSTM network to make predictions for an arbitrary number of future time steps, (b) unrolled LSTM network for training through BPTT

A. Load Forecasting using Standard LSTM

The objective of the presented methodology is to accurately estimate the electricity load (active power) for one or multiple time steps in the future, given historical electricity load data. That is, having M load measurements available, which can be expressed as:

y = {y[0], y[1], ..., y[M-1]},    (2)

where y[t] is the actual load measurement for time step t, the load for the following T − M time steps should be predicted. The predicted load values can be expressed as:

ŷ = {ŷ[M], ŷ[M+1], ..., ŷ[T]}    (3)

where ŷ[t] is the predicted load for time step t.

As the first technique, the standard LSTM algorithm was investigated. The first model that was tested is illustrated in Fig. 1(c). The active power of the previous time step, and the date and time of the desired prediction, are used as inputs for the model. The input vector can be expressed as:

i[t] = [y[t-1]  day[t]  day_week[t]  hour[t]]
f[t] = [day[t]  day_week[t]  hour[t]]    (4)

where f[t] groups the date and time features on their own.

The output of the network, ŷ[t] ∈ ℝ, is an estimation of the active power for the next time step. With this model, the electricity load for the next time step is predicted given a set of past load measurements. To predict further into the future, the predictions made by the model can be used as additional inputs for the next time step. The input vector for the next time step can then be expressed as:

i[t+1] = [ŷ[t]  day[t+1]  day_week[t+1]  hour[t+1]]    (5)

Fig. 2(a) illustrates this process.
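As a sketch of this recursive process, the loop below feeds each prediction back as the next input, following Eqs. (4) and (5); the model callable and the calendar-feature encoding are assumptions made for illustration.

```python
import numpy as np

def recursive_forecast(model, last_load, calendar, horizon):
    """Forecast `horizon` steps by feeding predictions back as inputs.

    model     : callable mapping an input vector (Eq. 4) to y_hat[t]
    last_load : most recent measured load y[t-1]
    calendar  : iterable of (day, day_week, hour) for the future steps
    """
    y_prev, forecast = last_load, []
    for day, day_week, hour in list(calendar)[:horizon]:
        i_t = np.array([y_prev, day, day_week, hour])  # input vector (Eq. 4)
        y_hat = float(model(i_t))   # one-step-ahead prediction
        forecast.append(y_hat)
        y_prev = y_hat              # prediction becomes next input (Eq. 5)
    return forecast
```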
To train the model, back-propagation through time (BPTT) is used. The network is unrolled for a fixed number of time steps, as shown in Fig. 2(b). The resultant network can be seen as a very deep standard feedforward network with shared parameters. Therefore, standard backpropagation can be applied to train the network using a gradient based method such as Stochastic Gradient Descent (SGD).

The objective function that is minimized can be expressed as:

L = Σ_{t=1}^{M} (y[t] − ŷ[t])²    (6)

During the minimization process, a method called norm clipping [24] is used to alleviate the exploding gradient problem. For training, the ADAM [25] algorithm is used as the gradient based optimizer instead of SGD, as ADAM outperformed SGD in terms of faster convergence and lower error ratios. The unrolling was implemented with 50 steps (M = 50).
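The paper does not specify the framework used; as an illustration, the following PyTorch sketch trains an unrolled LSTM with the summed squared-error objective (Eq. 6), norm clipping [24] and ADAM [25]. The layer sizes, learning rate and dummy data are assumptions.

```python
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    """One-step-ahead forecaster mapping the input vector of Eq. (4)
    to a load estimate at every unrolled step."""
    def __init__(self, n_in=4, n_hidden=50):
        super().__init__()
        self.lstm = nn.LSTM(n_in, n_hidden, batch_first=True)
        self.head = nn.Linear(n_hidden, 1)

    def forward(self, x):                # x: (batch, M, n_in), M = 50
        h, _ = self.lstm(x)              # hidden outputs at every step
        return self.head(h).squeeze(-1)  # (batch, M) load estimates

model = LSTMForecaster()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(8, 50, 4)  # dummy batch of unrolled input windows
y = torch.randn(8, 50)     # corresponding target loads

for _ in range(100):
    optimizer.zero_grad()
    loss = ((y - model(x)) ** 2).sum()  # squared-error objective (Eq. 6)
    loss.backward()                     # BPTT through the unrolled steps
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)  # norm clipping [24]
    optimizer.step()
```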
Fig. 3: S2S LSTM-based architecture for load forecasting (backpropagation signals shown in dashed arrows)

B. Load Forecasting using LSTM Based Sequence to Sequence Architecture

In order to further improve the flexibility of the load forecasting methodology, a different architecture based on LSTM, called sequence to sequence (S2S), is explored. S2S is an architecture that was proposed to map sequences of different lengths [26].

Fig. 3 shows the S2S architecture that is employed for load forecasting. The architecture consists of two LSTM networks: an encoder and a decoder. The task of the encoder is to convert input sequences of variable length into a fixed length vector, which is then used as the input state for the decoder. The decoder then generates an output sequence of length n. In this instance, that output sequence is the energy load forecast for the next n steps.

The main advantage of this architecture is that it allows inputs of arbitrary length. That is, an arbitrary number of available load measurements of previous time steps can be used as inputs to predict the load for an arbitrary number of future time steps.

To perform the prediction, y (Eq. 2) is used as the input for the encoder together with the corresponding date and time, while for the decoder only the date and time features are used as input.

For training, the encoder network is pre-trained to minimize the following error:

L_E = Σ_{t=1}^{M} (y[t] − ŷ[t])²    (7)

Then the encoder is plugged into the decoder network and the two networks are trained jointly to reduce the objective function:

L_D = Σ_{t=M+1}^{T} (y[t] − ŷ[t])²    (8)

Fig. 3 also shows the path that back-propagation signals follow during training. Back-propagation signals are allowed to flow from the decoder to the encoder. Therefore, the weights of both the encoder and the decoder are updated in order to minimize the objective function expressed in Eq. 8. Both are updated because the pre-training of the encoder alone is insufficient to achieve good performance.
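A minimal sketch of this encoder-decoder arrangement is given below, assuming PyTorch and the input layout of Eq. (4): load plus calendar features for the encoder, calendar features only for the decoder. The sizes and names are illustrative, and the pre-training stage of Eq. (7) is omitted.

```python
import torch
import torch.nn as nn

class S2SForecaster(nn.Module):
    """Encoder-decoder sketch: the encoder consumes the M available
    measurements with their calendar features; its final state seeds a
    decoder that sees only date/time features for the next n steps."""
    def __init__(self, n_hidden=50):
        super().__init__()
        self.encoder = nn.LSTM(4, n_hidden, batch_first=True)  # [y, day, day_week, hour]
        self.decoder = nn.LSTM(3, n_hidden, batch_first=True)  # [day, day_week, hour]
        self.head = nn.Linear(n_hidden, 1)

    def forward(self, past, future_features):
        # past: (batch, M, 4); future_features: (batch, n, 3)
        _, state = self.encoder(past)      # fixed-length summary of the past
        out, _ = self.decoder(future_features, state)
        return self.head(out).squeeze(-1)  # (batch, n) load forecast

model = S2SForecaster()
forecast = model(torch.randn(2, 50, 4), torch.randn(2, 60, 3))  # 60-step forecast
```

Because the decoder never sees a load value, it cannot simply copy its input to its output, which is exactly the failure mode observed with the standard LSTM on one-minute data.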
IV. DATASET AND EXPERIMENTAL RESULTS

This section first introduces the dataset used for testing and then elaborates the experimental results obtained for the two models investigated.

A. Dataset

The presented methods were implemented on a benchmark dataset of electricity consumption for a single residential customer, named "Individual household electric power consumption" [27]. The dataset contains power consumption measurements gathered between December 2006 and November 2010 with one-minute resolution. It contains the aggregate active power load for the whole house as well as three sub-meterings for three sections of the house. In this paper, only the aggregate active load values for the whole house are used.

The dataset contains 2,075,259 measurements. Two types of the dataset were tested: 1) one-minute resolution data (the original dataset) and 2) one-hour resolution data. The hourly resolution data were obtained by averaging the one-minute resolution data. Both architectures were tested on both the one-minute and the one-hour resolution data. As in [6], the first three years were used to train the model and the last year was used as testing data. These ranges were chosen to be comparable with the work in [6].
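As an illustration of this down-sampling step, the following pandas sketch builds the hourly series from the published UCI file [27]; the file name and column handling reflect that dataset, but the exact preprocessing used by the authors is not specified.

```python
import pandas as pd

# Load the one-minute measurements; '?' marks missing values in this file.
df = pd.read_csv("household_power_consumption.txt", sep=";",
                 na_values="?", low_memory=False)
stamp = pd.to_datetime(df["Date"] + " " + df["Time"],
                       format="%d/%m/%Y %H:%M:%S")
minute = pd.Series(df["Global_active_power"].astype(float).values, index=stamp)

# One-hour resolution series, obtained by averaging the minute data.
hourly = minute.resample("1H").mean()
```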
B. Experimental Results using Standard LSTM

As the first experiment, predicting one step ahead with the standard LSTM network was attempted. This proved to be an easy task for the standard LSTM architecture, providing low error ratios on the test dataset. However, the model failed to provide an accurate forecast when it was used to predict further into the future using its own predictions as inputs, as mentioned in Section III.A (see Fig. 2(a)).

Fig. 4 illustrates the performance of the model. For the first 60 hours, the actual load measurements of the previous time steps were used as inputs to perform the prediction. Starting at hour 60, the predictions were introduced as inputs to generate a forecast for the next 60 hours. The figure shows how the model is incapable of providing an accurate forecast for the last 60 hours, even though the forecast one time step ahead is very accurate.

Fig. 4: Load forecasting for 60 hours using the standard LSTM architecture
It can be assumed that the reason behind this behavior is that predicting the next step can be achieved with low error by simply bypassing the input from the current step straight to the output, since consecutive measurements are very similar (especially when using one-minute resolution data). The network therefore predicts that the load for the next time step is the same as the load at the current time step. In other words, the neural network learns a naïve mapping in which it generates an output equal to the input.

Two approaches were investigated to solve this problem. The first was to introduce measurements from further in the past as inputs, for example 5 steps back, as opposed to inputting the load from the previous time step. This was done so that the input and output would be different enough for the network to learn a useful representation of the data. Fig. 5 shows the prediction made by the neural network after introducing the first 60 hours of load measurements and forecasting the next 60 hours. It can be seen that the architecture provides an estimation that follows the general trend of the future load. This method produced accurate results when used with hourly data, but failed to perform well with one-minute resolution data.

Given that delaying the input was not sufficient to provide a useful model for one-minute time steps, the second approach that was tested was to experiment with the S2S, LSTM-based architecture. The results of this approach are elaborated in the next subsection.

C. Experimental Results using LSTM Based S2S Architecture

As mentioned, the LSTM-based S2S architecture was proposed to alleviate the problems encountered in the previous section. Fig. 3 illustrates how the available load measurements are only introduced to the encoder, while the decoder's inputs are only the date and time. This architecture prevents the decoder from learning the naïve mapping of passing the input straight to the output, as explained in the previous section.

Table I shows the Root Mean Square Error (RMSE) on the training and testing datasets for different numbers of layers and units, using the S2S architecture on the one-hour resolution data. The table shows the value of the errors at the end of training and, in parentheses, the lowest error obtained on the testing dataset during training.
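For reference, the reported metric is the standard root mean square error; a minimal sketch (variable names are illustrative):

```python
import numpy as np

def rmse(y_true, y_pred):
    # root mean square error between actual and predicted loads
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```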
TABLE I. ERRORS FOR S2S ARCHITECTURE (ONE-HOUR TIME STEP, 60-HOUR FORECAST)

Layers  Units  RMSE (Training)  RMSE (Testing)
1       5      0.713            0.640
1       20     0.662 (0.677)    0.657 (0.631)
1       50     0.606 (0.697)    0.686 (0.634)
1       100    0.527 (0.697)    0.729 (0.634)
2       5      0.678 (0.688)    0.642 (0.642)
2       20     0.604 (0.7)      0.675 (0.639)
2       50     0.543 (0.689)    0.727 (0.634)
3       20     0.633 (0.696)    0.665 (0.642)

It can be seen that the proposed architecture is able to produce very low errors on the training dataset. Further, it was noticed that increasing the capacity of the network, by increasing the number of layers and units, only improves the error on the training dataset. Fig. 6 shows an example of how well the model performs on the training dataset using a 2-layer network with 50 units in each layer. However, increasing the capacity of the network did not improve performance on the testing data. In order to improve accuracy on the testing data, Dropout [28] was used as the regularization methodology. Table II shows the errors obtained using Dropout for the one-minute and the one-hour datasets. The predictions were made for 60 time steps into the future. The results shown in Table II are comparable to the results obtained in [6] using the FCRBM. Fig. 7 shows an example of the prediction on the testing dataset.

TABLE II. ERRORS FOR S2S ARCHITECTURE (SUMMARY)

Configuration                                        RMSE (Training)  RMSE (Testing)
60 hours, one-hour resolution (2 layers, 10 units)   0.701            0.625
60 min, one-minute resolution (2 layers, 50 units)   0.742            0.667
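As a sketch of this regularization, stacked LSTM layers with dropout applied between them can be configured as follows in PyTorch; the dropout probability is an assumption, while the layer and unit counts follow the one-minute configuration in Table II.

```python
import torch.nn as nn

# Two stacked LSTM layers with dropout between them as a regularizer [28];
# p=0.5 is an assumed value, 2 layers x 50 units follows Table II.
encoder = nn.LSTM(input_size=4, hidden_size=50, num_layers=2,
                  dropout=0.5, batch_first=True)
```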

Fig. 5: Load forecast for 60 hours using standard LSTM and delayed input
Fig. 6: Prediction results for the training dataset using the S2S model
Fig. 7: Prediction results for the testing dataset using the S2S model

V. CONCLUSIONS

The goal of the presented work was to investigate the effectiveness of using LSTM based neural networks for building level energy load forecasting. This paper presented two LSTM based neural network architectures for load forecasting. Both were trained and tested on one-hour and one-minute time-step resolution data. While the standard LSTM architecture was unable to accurately forecast loads at one-minute resolution, the S2S LSTM-based architecture performed well on both datasets. Further, the S2S architecture provides a flexible model that is able to receive an arbitrary number of previously available load measurements as input to estimate the load for an arbitrary number of future time steps. The presented S2S model was able to produce results comparable to the FCRBM based results presented in [6] for the same dataset. However, to compare the effectiveness of these algorithms, both need to be tested on different real-world datasets as future work. Further, we plan on investigating other deep learning algorithms as well as other regularization approaches to improve the generalization of the models.

REFERENCES

[1] L. P. Lombard, J. Ortiz, C. Pout, "A review on buildings energy consumption information," Energy and Buildings, vol. 40, pp. 394-398, 2008.
[2] K. Amarasinghe, D. Wijayasekara, H. Carey, M. Manic, D. He, W. Chen, "Artificial Neural Networks based Thermal Energy Storage Control for Buildings," in Proc. 41st Annual Conference of the IEEE Industrial Electronics Society (IECON 2015), Yokohama, Japan, Nov. 9-12, 2015.
[3] D. Wijayasekara, M. Manic, "Data-Fusion for Increasing Temporal Resolution of Building Energy Management System Data," in Proc. 41st Annual Conference of the IEEE Industrial Electronics Society (IECON 2015), Yokohama, Japan, Nov. 9-12, 2015.
[4] S. Naji, A. Keivani, S. Shamshirband, U. J. Alengaram, M. Z. Jumaat, Z. Mansor, M. Lee, "Estimating building energy consumption using extreme learning machine method," Energy, vol. 97, pp. 506-516, Feb. 2016.
[5] C. Clastres, "Smart grids: Another step towards competition, energy security and climate change objectives," Energy Policy, vol. 39, no. 9, pp. 5399-5408, Sep. 2011.
[6] E. Mocanu, P. H. Nguyen, M. Gibescu, W. L. Kling, "Deep learning for estimating building energy consumption," Sustainable Energy, Grids and Networks, vol. 6, pp. 91-99, June 2016.
[7] J. G. Jetcheva, M. Majidpour, W.-P. Chen, "Neural network model ensembles for building-level electricity load forecasts," Energy and Buildings, vol. 84, pp. 214-223, Dec. 2014.
[8] P. Siano, "Demand response and smart grids—A survey," Renewable and Sustainable Energy Reviews, vol. 30, pp. 461-478, Feb. 2014.
[9] C. Roldán-Blay, G. Escrivá-Escrivá, C. Álvarez-Bel, C. Roldán-Porta, J. Rodríguez-García, "Upgrade of an artificial neural network prediction method for electrical consumption forecasting using an hourly temperature curve model," Energy and Buildings, vol. 60, pp. 38-46, May 2013.
[10] C. N. Yu, P. Mirowski, T. K. Ho, "A Sparse Coding Approach to Household Electricity Demand Forecasting in Smart Grids," IEEE Transactions on Smart Grid, vol. PP, no. 99, pp. 1-11.
[11] M. Q. Raza, Z. Baharudin, "A review on short term load forecasting using hybrid neural network techniques," in Proc. 2012 IEEE International Conference on Power and Energy (PECon), Kota Kinabalu, 2012, pp. 846-851.
[12] M. De Felice, X. Yao, "Short-Term Load Forecasting with Neural Network Ensembles: A Comparative Study [Application Notes]," IEEE Computational Intelligence Magazine, vol. 6, no. 3, pp. 47-56, Aug. 2011.
[13] R. Kumar, R. K. Aggarwal, J. D. Sharma, "Energy analysis of a building using artificial neural network: A review," Energy and Buildings, vol. 65, pp. 352-358, Oct. 2013.
[14] L. Ghelardoni, A. Ghio, D. Anguita, "Energy Load Forecasting Using Empirical Mode Decomposition and Support Vector Regression," IEEE Transactions on Smart Grid, vol. 4, no. 1, pp. 549-556, March 2013.
[15] J. B. Fiot, F. Dinuzzo, "Electricity Demand Forecasting by Multi-Task Learning," IEEE Transactions on Smart Grid, vol. PP, no. 99, pp. 1-1.
[16] A. K. Singh, Ibraheem, S. Khatoon, M. Muazzam, D. K. Chaturvedi, "Load forecasting techniques and methodologies: A review," in Proc. 2012 2nd International Conference on Power, Control and Embedded Systems (ICPCES), Allahabad, 2012, pp. 1-10.
[17] L. Hernandez et al., "A Survey on Electric Power Demand Forecasting: Future Trends in Smart Grids, Microgrids and Smart Buildings," IEEE Communications Surveys & Tutorials, vol. 16, no. 3, pp. 1460-1495, Third Quarter 2014.
[18] Y. LeCun, Y. Bengio, G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015.
[19] V. Mnih, H. Larochelle, G. Hinton, "Conditional restricted Boltzmann machines for structured output prediction," in Proc. International Conference on Uncertainty in Artificial Intelligence, 2011.
[20] G. W. Taylor, G. E. Hinton, S. T. Roweis, "Two distributed-state models for generating high-dimensional time series," J. Mach. Learn. Res., vol. 12, pp. 1025-1068, 2011.
[21] P. J. Werbos, "Backpropagation through time: what it does and how to do it," Proceedings of the IEEE, vol. 78, no. 10, pp. 1550-1560, 1990.
[22] R. J. Williams, D. Zipser, "Gradient-Based Learning Algorithms for Recurrent Networks and Their Computational Complexity," in Backpropagation: Theory, Architectures, and Applications, 1995, pp. 433-486.
[23] S. Hochreiter, J. Schmidhuber, "Long Short-Term Memory," Neural Computation, vol. 9, no. 8, 1997.
[24] R. Pascanu, T. Mikolov, Y. Bengio, "On the Difficulty of Training Recurrent Neural Networks," in Proc. 30th International Conference on Machine Learning, Atlanta, 2013.
[25] D. P. Kingma, J. L. Ba, "ADAM: A Method for Stochastic Optimization," in ICLR, San Diego, 2015.
[26] I. Sutskever, O. Vinyals, Q. V. Le, "Sequence to Sequence Learning with Neural Networks," in NIPS 2014, 2014.
[27] K. Bache, M. Lichman, "UCI machine learning repository," 2013.
[28] V. Pham, T. Bluche, C. Kermorvant, J. Louradour, "Dropout Improves Recurrent Neural Networks for Handwriting Recognition," in Proc. 14th International Conference on Frontiers in Handwriting Recognition (ICFHR), Heraklion, 2014.
