Article
Energy and AI
journal homepage: www.sciencedirect.com/journal/energy-and-ai
A R T I C L E  I N F O

Keywords:
Energy efficiency prediction
Transfer learning
Petrochemical process
Measurement reliability
Fault detection and identification

A B S T R A C T

Energy efficiency is an important aspect of increasing production capacity, minimizing environmental impact, and reducing energy usage in the petrochemical industries. However, in practice, data quality can be degraded by measurement malfunction throughout the operation, leading to unreliable and inaccurate prediction results. Therefore, this paper presents a transfer learning fault detection and identification-energy efficiency predictor (TFDI-EEP) model formulated using long short-term memory. The model aims to predict the energy efficiency of the petrochemical process under uncertainty by using the knowledge gained from the uncertainty detection task to improve prediction performance. The transfer procedure resolves weight initialization by applying partial layer freezing before fine-tuning the additional part of the model. The performance of the proposed model is verified on a wide range of fault variations to thoroughly examine the maximum contribution of faults that the model can tolerate. The results indicate that the TFDI-EEP achieved the highest r-squared and lowest error in the testing step for both the 10% and 20% fault variation datasets compared to other conventional methods. Furthermore, the revelation of interconnection between domains shows that the proposed model can also identify strong fault-correlated features, enhancing monitoring ability and strengthening the robustness and reliability of the model observed by the number of outliers. The transfer parameter improves the prediction performance by 9.86% based on detection accuracy and achieves an r-squared greater than 0.95 on the 40% testing fault variation.
* Corresponding author.
E-mail address: [email protected] (C. Panjapornpon).
https://doi.org/10.1016/j.egyai.2022.100224
C. Panjapornpon et al. Energy and AI 12 (2023) 100224
Table 2
Comparison of advantages and disadvantages of each deep network structure. Columns: Model; Advantages; Disadvantages.

without any cycle or loop [24]. The hidden layers are constructed using fully connected layers that sum each channel of the previous layer output and transform it into the final output using the linear activation of the regression layers. RNN and CNN are the more complex deep learning models. RNN is structured based on fully connected and regression layers similar to FFNN. RNN, on the contrary, contains cyclic connections that allow state variables to capture the temporal dynamic behavior of data and recall the state variables at previous time steps [25], where this dynamic recurrent feature feeds the current state to the following observations. CNN consists of various convolutional and pooling layers. When dealing with time-series data (input variables, observations, and time steps), the convolution layer convolves a significant part of the inputs over the time dimension using a specified filter to create a feature mapping [26]. The pooling layers then carry out downsampling by reducing the size of the information and sending it to the subsequent layers (a fully connected layer to adjust the size and a regression layer). CNN is highly capable of performing strong feature extraction and multidimensional feature condensing. The EEP is an advanced deep learning model for energy efficiency prediction based on time-series data; it is composed of LSTM computational layers, a fully connected layer, and a regression layer. Furthermore, FDI-EEP uses the result of fault classification as an additional predictor, while TFDI-EEP uses the partial transfer knowledge of the FDI task concatenated with the predictor to perform energy efficiency prediction. Details of the LSTM-based and proposed models will be discussed in the subsequent section. The comparison of advantages and disadvantages for each type of deep neural network is shown in Table 2.

2.2. LSTM cell structure

In the LSTM-based models, the LSTM layer was deployed to improve the ability of information capturing. The LSTM uses state variables to extract temporal behavior and manipulate long-term dependency by a gating mechanism. Additionally, LSTM prevents gradient problems by storing the memory in sigmoid ranges, which is suitable for future backpropagation. The input gate of LSTM contains two types of activation functions, which are the sigmoidal and hyperbolic tangent functions. First, the hyperbolic tangent generates a cell candidate, which is updated by the gating vector from the sigmoid activation function. Then, the forget gate decides which part of the information needs to be considered and which may be disregarded. Finally, the output gate calculates the output vector from the current and previous inputs. Based on the updated cell state, the output gate decides which part of the current cell state will be carried out as the final output from the LSTM layers. The LSTM information processing steps are visualized in Fig. 2.

2.3. Output task

The difference in tasks between the source and target domains requires different types of output layers. In the source domain, the classification output layer is deployed for the measurement fault detection-identification task. This layer computes the cross-entropy loss (Loss_C) for weighted classification tasks with mutually exclusive classes. The classification layer usually follows a fully connected layer and a softmax layer. A fully connected layer summarizes all the vectors into a size equal to the output size, while the softmax activation normalizes it into a probability distribution. The calculation of the fully connected value, softmax activation, and cross-entropy loss for single-output classification are given in the equations below.

o_t^2 = w_fc · h_t + b_fc    (1)

y_nk = o_t^3 = Softmax(o_t^2) = e^{o_t^2} / Σ_{t=1..n} e^{o_t^2}    (2)

Loss_C = −(1/N) · Σ_{n=1..N} Σ_{k=1..K} w_k · t_nk · ln(y_nk)    (3)

where w_fc and b_fc are the array of weighting factors and the bias of the fully connected layer, o_t^2 is the output vector of the fully connected layer, N is the number of samples, K is the number of classes, w_k is the penalty weight for class k, t_nk is the indicator that the n-th sample belongs to the k-th class, and y_nk is the output class for sample n for the k-th class.

For the target domain, a regression layer determines the half-mean-squared-error loss function (Loss_R) of the predicted responses at each time step in sequence-to-sequence regression networks. The loss function in Eq. (4) is utilized to tune hyperparameters and update the weighting factors as well as the bias in each layer of a deep network structure.

Loss_R = (1/(2N)) · Σ_{t=1..N} (y_actual,t − o_t^2)²    (4)

where y_actual,t is the actual class values and N is the number of samples.
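As a concrete illustration of Eqs. (1)–(4), the sketch below computes the weighted cross-entropy and half-mean-squared-error losses in numpy for a single time step. The shapes, random data, and variable names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Eq. (1): fully connected layer applied to an LSTM hidden state
N, H, K = 4, 6, 3                        # samples, hidden units, fault classes
h = rng.standard_normal((N, H))          # hidden state per sample
W_fc = rng.standard_normal((K, H))
b_fc = rng.standard_normal(K)
o2 = h @ W_fc.T + b_fc                   # output vector of the FC layer, shape (N, K)

# Eq. (2): softmax normalizes the FC output into a probability distribution
def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

y = softmax(o2)

# Eq. (3): weighted cross-entropy over mutually exclusive classes
t = np.eye(K)[rng.integers(0, K, size=N)]   # one-hot indicators t_nk
w_class = np.ones(K)                        # per-class penalty weights w_k
loss_c = -np.mean(np.sum(w_class * t * np.log(y), axis=1))

# Eq. (4): half-mean-squared-error for the regression (target) task
y_actual = rng.standard_normal(N)
y_pred = rng.standard_normal(N)
loss_r = np.sum((y_actual - y_pred) ** 2) / (2 * N)
```

Each softmax row sums to one, and both losses are non-negative, matching the roles the two output layers play in the source and target domains.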
To prevent the overfitting problem in this work, regularization is applied to the target tasks by adding a penalty term, which minimizes the weighting factor in each layer. The final error function in the target task can be calculated by Eqs. (5) and (6).

E_R = Loss_R + λ·Ω(w)    (5)

Ω(w) = (1/2) · wᵀ·w    (6)

where w is the weighting vector and λ is the regularization factor.

2.4. Transfer learning modeling

The transfer learning problem allows users to transfer knowledge and adopt the model across domains differently based on the source and target tasks [27]. In this study, model parameter-based transfer learning is deployed by presupposing that the source and target tasks overlap in some parameters or subsequent distributions of the hyperparameters [28]. The knowledge gained from the source task can solve the weight initialization problem and increase the prediction reliability of the target task [29]. Partial layer freezing is also implemented to prevent overwriting the pre-trained weights and biases of the learnable activations and to maintain the monitoring ability in fault detection-identification of the source task. The learnable activation information of the source and target tasks is summarized in Fig. 3, and the structure of the proposed TFDI-EEP is visualized in Fig. 4.

3. Data processing steps

This section provides data processing steps and modeling procedures for conventional neural networks and transfer learning across the domains. The information in this section is based on the data perspective, which includes data normalization, hyperparameter tuning, model validation, and model performance indicator calculation, as illustrated in Fig. 5.

3.1. Data normalization

The scale of the input variables is one of the most important aspects of learning stability. Relatively different input variables not only conduct imbalanced weighting factors and bias in prediction but also cause the learning process to fail with an exploding gradient. In this study, the input variables consist of large-scale inputs, i.e., flow rate and utility consumption, and small-scale inputs, including positive and negative measurable temperature. Hence, this study applied z-score normalization on the sequence input layers to adjust the input variables to the same scale and ensure that the network can operate on the information using Eqs. (7)–(9).

z_i^j = (x_i^j − μ^j) / σ^j    (7)

μ^j = (1/n) · Σ_{i=1..n} x_i^j    (8)

σ^j = √( Σ_{i=1..n} (x_i^j − μ^j)² / (n − 1) )    (9)

where x_i^j is the original i-th data point of the j-th input feature, μ^j is the mean value of the j-th input feature, and σ^j is the standard deviation of the j-th input feature.
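The partial layer freezing described in Section 2.4 can be sketched with a toy parameter dictionary: the transferred layers keep their source-task values while only the newly added regression head is fine-tuned. The layer names, shapes, and plain-SGD update below are illustrative assumptions, not the authors' network.

```python
import numpy as np

rng = np.random.default_rng(1)

# Pre-trained source-task parameters (stand-ins for the FDI layers)
source_params = {
    "lstm_W": rng.standard_normal((8, 8)),
    "fc_W": rng.standard_normal((3, 8)),
}

# Target model: transferred layers plus a newly added regression head
target_params = {k: v.copy() for k, v in source_params.items()}
target_params["reg_W"] = rng.standard_normal((1, 8))

frozen = {"lstm_W", "fc_W"}  # partial layer freezing keeps the FDI knowledge

def sgd_step(params, grads, lr=0.01):
    """Apply a gradient step only to the unfrozen (fine-tuned) parameters."""
    for name, grad in grads.items():
        if name not in frozen:
            params[name] = params[name] - lr * grad

reg_before = target_params["reg_W"].copy()
grads = {k: np.ones_like(v) for k, v in target_params.items()}
sgd_step(target_params, grads)

# Frozen layers are untouched; only the new head moved
frozen_unchanged = all(np.array_equal(target_params[k], source_params[k]) for k in frozen)
head_updated = not np.array_equal(target_params["reg_W"], reg_before)
```

This mirrors the stated motivation: freezing prevents fine-tuning from overwriting the pre-trained weights and biases, so the source-task monitoring ability is preserved while the added part of the model adapts to the target task.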
3.2. Hyperparameter tuning and model validation

The grid-search method explores the optimal set of hyperparameters under the searching domain specified in Table 3. To evaluate the general performance of the model under fault scenarios and prevent the model from overfitting problems, k-fold cross-validation is applied with the number of folds equal to five in each iteration of hyperparameter exploration. Cross-validation is a resampling method that uses different parts of the information to validate the model by dividing the entire dataset into five groups. Then, four folds are used as the training dataset, while the remaining one is the testing dataset. The procedure is repeated until every fold has been used as a testing dataset. The average performance indicator is calculated from every fold result and reported as the validation performance. Finally, the hyperparameter set that gives the best validation performance is selected.

The model performance indicators are calculated by Eqs. (11) and (12).

RMSE = √( (1/n) · Σ_{i=1..n} (y_i − ŷ_i)² )    (11)

MAPE = (1/n) · Σ_{i=1..n} |(y_i − ŷ_i) / y_i| × 100%    (12)

where y_i is the actual output of the i-th sample, ŷ_i is the predicted output value of the i-th sample, and ȳ is the mean value of the original output value.
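The five-fold procedure together with the normalization of Eqs. (7)–(9) and the indicators of Eqs. (11)–(12) can be sketched end to end in numpy. The synthetic data and the least-squares stand-in model below are illustrative assumptions, not the process dataset or the deep network.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in data: 100 samples, 3 features, linear response + noise
X = rng.normal(50, 10, size=(100, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(0, 0.5, 100) + 20

def zscore(train, other):
    # Eqs. (7)-(9): standardize with the training-fold mean and (n-1) std
    mu = train.mean(axis=0)
    sigma = train.std(axis=0, ddof=1)
    return (train - mu) / sigma, (other - mu) / sigma

def rmse(y_true, y_pred):   # Eq. (11)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):   # Eq. (12)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# Five-fold cross-validation: four folds train, one fold tests, rotate
folds = np.array_split(rng.permutation(len(y)), 5)
scores = []
for i in range(5):
    test_idx = folds[i]
    train_idx = np.concatenate([folds[j] for j in range(5) if j != i])
    Xtr, Xte = zscore(X[train_idx], X[test_idx])
    # least-squares regression as a stand-in for the deep model
    w, *_ = np.linalg.lstsq(np.c_[Xtr, np.ones(len(Xtr))], y[train_idx], rcond=None)
    y_pred = np.c_[Xte, np.ones(len(Xte))] @ w
    scores.append((rmse(y[test_idx], y_pred), mape(y[test_idx], y_pred)))

# Average indicators over the five folds, reported as validation performance
avg_rmse = float(np.mean([s[0] for s in scores]))
avg_mape = float(np.mean([s[1] for s in scores]))
```

Note that the fold statistics are computed on the training folds only and reused for the testing fold, so no information from the held-out fold leaks into the normalization.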
4. Descriptions of process and data generation

The studied vinyl chloride monomer (VCM) production process consists of five sections: chlorination, oxychlorination, ethylene dichloride (EDC) purification, EDC cracking, and VCM purification, as illustrated in Fig. 6. According to the energy distribution in Fig. 7, the EDC cracking and VCM purification sections consume a high energy load because they consist of the cracking furnace, multiple energy-interaction units, and a series of quenchers and distillation columns [30]. Therefore, the EDC cracking and VCM purification sections are selected as the case study for this energy efficiency prediction.

4.1. EDC cracking and VCM purification sections

In the EDC cracking section, vaporized EDC is fed into the thermal furnace, in which it is pyrolyzed into VCM and other byproducts. A series of quenchers and condensers quickly cools the hot gas mixture before it is transferred to the VCM purification section. The first column refines HCl to the top before recycling it to the oxychlorination section, while the second column separates VCM from unreacted EDC. Fig. 8 depicts the process flow diagram of the EDC cracking and VCM purification sections, including the notation of the 40 measurable variables used to estimate energy efficiency. These input variables are supplied directly to the input layer of the model, without any data preprocessing techniques or feature selection approaches employed.

4.2. Aberrant measurement signal

An aberrant measurement signal can be defined as a temporary or permanent fluctuation in the output signal of sensors under normal conditions [31]. The period of these variations depends on the types of anomalies that cause the faulty behavior. It can be present in the form of outliers, measurement faults [32], and process uncertainty. The number of fault classes of the studied VCM process is 41, which corresponds to the number of sensors included in the normal operation of the measuring system.

The data used in this study were simulated to gather process information with UniSim Design Suite and to develop aberrant process signals using MATLAB through a co-simulation approach. MATLAB delivered the requested operational condition information to the UniSim Design Suite, which utilized it to produce normal samples and construct datasets of defective signals. As a result, two datasets of over 2000 datapoints with different amplitudes of fault variation were generated: 10% fault variation (randomly between 10% and 15%) and 20% fault variation (randomly between 20% and 25%). The whole dataset is divided into training, validation, and testing sets with percentages of 60%, 20%, and 20%, respectively. Fig. 9 illustrates examples of the input variables under 10% fault variation, characterized by temperature and flow rate, respectively.

4.3. Energy efficiency and specific energy consumption

Energy efficiency can be interpreted in various forms, according to the aims of the study. The specific energy consumption (SEC), which refers to the ratio of the energy supplied as an input to the quantity of product produced [33], is adopted as the monitoring indicator in this work. The SEC represents the energy intensity and productivity of the process and is calculated by Eq. (13)

SEC = E_in / V    (13)

where E_in is the energy supplied to the process and V is the VCM production rate.

Since the SEC is easily understood and enables direct reporting of the amount of energy consumed per unit of product, it is beneficial for managing, monitoring, and comparing the energy usage of the petrochemical industry. Additionally, it can be utilized for benchmarking at multiple scales, including process, national, and international benchmarking. Likewise, the SEC calculations also have the potential to determine operating profit margins per product and machine safety.

5. Results and discussion

5.1. Hyperparameter tuning result

The optimal hyperparameters tuned by the grid-search method with the maximum cross-validation r-squared for the conventional and LSTM-based methods are shown in Tables 4 and 5, respectively. Before being transferred to the target domain to predict energy efficiency, the classification models of the FDI-EEP and TFDI-EEP must first be optimized in the training step for the number of hidden layers and hidden nodes until the maximum detection accuracy is achieved.

Fig. 10 visualizes the activation nodes and their values along the training sequence. The activation visualization by an activation map [34] improves interpretability and transparency for valuable insights from a data analytics perspective.
Fig. 8. Process flow diagram of the EDC cracking and VCM purification sections.
Fig. 9. Examples of a) flow and b) temperature input variables under 10% fault variation.
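The fault-variation datasets of Section 4.2 can be mimicked with a simple injection scheme. The sketch below is an illustrative assumption about the mechanism (a random subset of sensor readings perturbed by a relative 10–15% amplitude), not the authors' MATLAB/UniSim co-simulation code.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in normal operating data: 2000 samples x 40 measurable variables
normal = rng.normal(100, 5, size=(2000, 40))

def inject_faults(data, low=0.10, high=0.15, fault_frac=0.05, rng=rng):
    """Perturb a random subset of readings by a relative amplitude in [low, high)."""
    faulty = data.copy()
    mask = rng.random(data.shape) < fault_frac        # which readings fail
    amp = rng.uniform(low, high, size=data.shape)     # 10-15% fault amplitude
    sign = rng.choice([-1.0, 1.0], size=data.shape)   # drift up or down
    faulty[mask] *= (1.0 + sign[mask] * amp[mask])
    return faulty, mask

faulty, mask = inject_faults(normal)
max_rel_err = np.max(np.abs(faulty - normal) / normal)
```

The unmasked readings remain bit-identical to the normal data, and the largest relative deviation stays below the 15% cap, matching the "randomly between 10% and 15%" description of the 10% fault-variation dataset.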
Furthermore, it shows the values of the LSTM over time to evaluate the involvement of the input variables in the source task. There are 41 fault cases, but only 26 features were used, since the activation steps demonstrate that the LSTM has feature selection capability and not all the information from the process features is necessary to detect and identify all the cases with measurement faults.

Table 4
Optimized hyperparameters of the conventional models.

                           10% Variation          20% Variation
Hyperparameter             FFNN   RNN   CNN       FFNN   RNN   CNN
FFNN hidden node           5      –     –         25     –     –
FFNN hidden layers         1      –     –         1      –     –
RNN hidden node            –      120   –         –      130   –
Recurrent delay            –      2     –         –      1     –
CNN filter size            –      –     3         –      –     3
Number of CNN filters      –      –     30        –      –     30
Number of CNN layers       –      –     1         –      –     2
CNN padding                –      –     Same      –      –     Same
Learning rate              0.1    1     0.1       0.1    1     0.1
Regularization factor      1      0     0         0.5    0     0
Weight update optimizer    Adam   SGD   Adam      Adam   SGD   Adam

Table 5
Optimized hyperparameters of the LSTM-based models.

                                       10% Variation              20% Variation
Hyperparameter                         EEP    FDI-EEP  TFDI-EEP   EEP    FDI-EEP  TFDI-EEP
LSTM layers for classification         –      2        2          –      1        1
Hidden node of the first LSTM layer    –      40       40         –      40       40
Hidden node of the second LSTM layer   –      40       40         –      –        –
LSTM layers for regression             1      1        1          1      1        1
Hidden node of LSTM layers             10     60       5          40     65       5
Learning rate                          0.45   0.75     0.4        1      0.9      0.4
Regularization factor                  0.6    0.3      0.2        0.3    0.4      0.3
Weight optimizer                       Adam   Adam     Adam       Adam   Adam     Adam

5.2. Energy efficiency prediction

Table 6 summarizes the energy efficiency prediction performance under the 10% and 20% fault variations. All models were averaged over 15 iterations, and the training epoch was set to 500 to evaluate the general performance of the networks. The results show that TFDI-EEP performs energy efficiency prediction outstandingly compared with the other models. In addition, the transferred parameters assist the target prediction task, providing the lowest RMSE and MAPE and the highest r-squared on both fault variations. Table 7 shows the speed indicators of each model, including training time, prediction speed, and execution time. These indicators refer to the time required in the training and model implementation steps. The training time of TFDI-EEP is 4 to 5 times longer than that of the regular LSTM-based models such as EEP, but this is compensated by higher prediction accuracy. The execution time and prediction speed of every model in this study are less than a second, making them applicable for real-time implementation.

Fig. 11 shows the comparative results of the predicted and actual energy efficiency values. One important point to note from the figure is that most models encounter bias/variance problems. As the fault variation increases, the error in the predicted SEC inadvertently increases. However, as can be seen from the prediction residuals about the diagonal line, the model parameter-based transfer learning method helps the model discover an excellent tradeoff between bias and variance.
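The SEC of Eq. (13) that the models predict is itself a direct ratio; a minimal sketch (the unit convention and the numbers are illustrative assumptions, not plant data):

```python
# Eq. (13): specific energy consumption = energy input / production rate
def specific_energy_consumption(e_in_gj: float, vcm_rate_t: float) -> float:
    """SEC in GJ per tonne of VCM produced; units are an illustrative choice."""
    if vcm_rate_t <= 0:
        raise ValueError("production rate must be positive")
    return e_in_gj / vcm_rate_t

# A lower SEC at the same production rate means better energy efficiency.
sec = specific_energy_consumption(1200.0, 240.0)   # -> 5.0 GJ/t
```

Because the indicator is a per-unit-of-product quantity, it supports the benchmarking uses described in Section 4.3 without any model-specific post-processing.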
Table 6
Model performance for energy efficiency prediction. Columns: Dataset; Model; R-squared, MAPE, and RMSE under the 10% and 20% fault variations.
when the classification accuracy drops. Moreover, the number of predictor variables for the two domains must be identical because the size of the transferred weights and biases can be inconsistent with the dimension of the designed network. We plan to use the non-identical features of transfer learning by implementing alternative transfer procedures in the future.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The data that has been used is confidential.

Acknowledgments

The authors would like to acknowledge the support of the Faculty of Engineering, Kasetsart University (Grant No. 65/10/CHEM/M.Eng), the Kasetsart University Research and Development Institute, and Kasetsart University. In addition, the authors would like to acknowledge Honeywell and GRD Tech Co., Ltd for providing the UniSim Design Suite simulation software used in this study.

References

[1] Golmohamadi H. Demand-side management in industrial sector: a review of heavy industries. Renew Sustain Energy Rev 2022;156:111963. https://doi.org/10.1016/j.rser.2021.111963.
[2] Hassani H, Silva ES, Al Kaabi AM. The role of innovation and technology in sustaining the petroleum and petrochemical industry. Technol Forecast Soc Change 2017;119:1–17. https://doi.org/10.1016/j.techfore.2017.03.003.
[3] Machalek D, Tuttle J, Andersson K, Powell KM. Dynamic energy system modeling using hybrid physics-based and machine learning encoder–decoder models. Energy AI 2022;9:100172. https://doi.org/10.1016/j.egyai.2022.100172.
[4] Chen C-Y, Chai KK, Lau E. AI-Assisted approach for building energy and carbon footprint modeling. Energy AI 2021;5:100091. https://doi.org/10.1016/j.egyai.2021.100091.
[5] Sharifian S, Sotudeh-Gharebagh R, Zarghami R, Tanguy P, Mostoufi N. Uncertainty in chemical process systems engineering: a critical review. Rev Chem Eng 2021;37:687–714. https://doi.org/10.1515/revce-2018-0067.
[6] Jan SU, Lee YD, Koo IS. A distributed sensor-fault detection and diagnosis framework using machine learning. Inf Sci (Ny) 2021;547:777–96. https://doi.org/10.1016/j.ins.2020.08.068.
[7] Xu C, Zhao S, Liu F. Sensor fault detection and diagnosis in the presence of outliers. Neurocomputing 2019;349:156–63. https://doi.org/10.1016/j.neucom.2019.01.025.
[8] Yoo M. A resilience measure formulation that considers sensor faults. Reliab Eng Syst Saf 2020;7. https://doi.org/10.1016/j.ress.2019.02.025.
[9] Beisheim B, Rahimi-Adli K, Krämer S, Engell S. Energy performance analysis of continuous processes using surrogate models. Energy 2019;183:776–87. https://doi.org/10.1016/j.energy.2019.05.176.
[10] Moghadasi M, Ozgoli HA, Farhani F. Steam consumption prediction of a gas sweetening process with methyldiethanolamine solvent using machine learning approaches. Int J Energy Res 2021;45:879–93. https://doi.org/10.1002/er.5979.
[11] Geng Z, Zhang Y, Li C, Han Y, Cui Y, Yu B. Energy optimization and prediction modeling of petrochemical industries: an improved convolutional neural network based on cross-feature. Energy 2020;194:116851. https://doi.org/10.1016/j.energy.2019.116851.
[12] Miele ES, Bonacina F, Corsini A. Deep anomaly detection in horizontal axis wind turbines using graph convolutional autoencoders for multivariate time series. Energy AI 2022;8:100145. https://doi.org/10.1016/j.egyai.2022.100145.
[13] Roelofs CMA, Lutz M-A, Faulstich S, Vogt S. Autoencoder-based anomaly root cause analysis for wind turbines. Energy AI 2021;4:100065. https://doi.org/10.1016/j.egyai.2021.100065.
[14] Fang X, Gong G, Li G, Chun L, Li W, Peng P. A hybrid deep transfer learning strategy for short term cross-building energy prediction. Energy 2021;215:119208. https://doi.org/10.1016/j.energy.2020.119208.
[15] Shi Z, Chehade A. A dual-LSTM framework combining change point detection and remaining useful life prediction. Reliab Eng Syst Saf 2021;205:107257. https://doi.org/10.1016/j.ress.2020.107257.
[16] Tien PW, Wei S, Darkwa J, Wood C, Calautit JK. Machine learning and deep learning methods for enhancing building energy efficiency and indoor environmental quality – a review. Energy AI 2022;10:100198. https://doi.org/10.1016/j.egyai.2022.100198.
[17] Panjapornpon C, Bardeeniz S, Hussain MA. Improving energy efficiency prediction under aberrant measurement using deep compensation networks: a case study of petrochemical process. Energy 2023;263:125837. https://doi.org/10.1016/j.energy.2022.125837.
[18] Westermann P, Evins R. Using Bayesian deep learning approaches for uncertainty-aware building energy surrogate models. Energy AI 2021;3:100039. https://doi.org/10.1016/j.egyai.2020.100039.
[19] Panjapornpon C, Bardeeniz S, Hussain MA. Deep learning approach for energy efficiency prediction with signal monitoring reliability for a vinyl chloride monomer process. Reliab Eng Syst Saf 2022:109008. https://doi.org/10.1016/j.ress.2022.109008.
[20] Oyewole I, Chehade A, Kim Y. A controllable deep transfer learning network with multiple domain adaptation for battery state-of-charge estimation. Appl Energy 2022;312:118726. https://doi.org/10.1016/j.apenergy.2022.118726.
[21] Yang D, Peng X, Ye Z, Lu Y, Zhong W. Domain adaptation network with uncertainty modeling and its application to the online energy consumption prediction of ethylene distillation processes. Appl Energy 2021;303:117610. https://doi.org/10.1016/j.apenergy.2021.117610.
[22] Wang C, Chen D, Chen J, Lai X, He T. Deep regression adaptation networks with model-based transfer learning for dynamic load identification in the frequency domain. Eng Appl Artif Intell 2021;102:104244. https://doi.org/10.1016/j.engappai.2021.104244.
[23] Mondal S, Chattopadhyay A, Mukhopadhyay A, Ray A. Transfer learning of deep neural networks for predicting thermoacoustic instabilities in combustion systems. Energy AI 2021;5:100085. https://doi.org/10.1016/j.egyai.2021.100085.
[24] Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw 2015;61:85–117. https://doi.org/10.1016/j.neunet.2014.09.003.
[25] Wang H, Lei Z, Zhang X, Zhou B, Peng J. A review of deep learning for renewable energy forecasting. Energy Convers Manage 2019;198:111799. https://doi.org/10.1016/j.enconman.2019.111799.
[26] Walser T, Sauer A. Typical load profile-supported convolutional neural network for short-term load forecasting in the industrial sector. Energy AI 2021;5:100104. https://doi.org/10.1016/j.egyai.2021.100104.
[27] Peirelinck T, Kazmi H, Mbuwir BV, Hermans C, Spiessens F, Suykens J, et al. Transfer learning in demand response: a review of algorithms for data-efficient modelling and control. Energy AI 2022;7:100126. https://doi.org/10.1016/j.egyai.2021.100126.
[28] Pinto G, Wang Z, Roy A, Hong T, Capozzoli A. Transfer learning for smart buildings: a critical review of algorithms, applications, and future perspectives. Adv Appl Energy 2022;5:100084. https://doi.org/10.1016/j.adapen.2022.100084.
[29] Fan C, Sun Y, Xiao F, Ma J, Lee D, Wang J, et al. Statistical investigations of transfer learning-based methodology for short-term building energy predictions. Appl Energy 2020;262:114499. https://doi.org/10.1016/j.apenergy.2020.114499.
[30] Chinprasit J, Panjapornpon C. Model predictive control of vinyl chloride monomer process by Aspen Plus Dynamics and MATLAB/Simulink co-simulation approach. IOP Conf Ser: Mater Sci Eng 2020;778:012080. https://doi.org/10.1088/1757-899X/778/1/012080.
[31] Saeed U, Lee Y-D, Jan SU, Koo I. CAFD: context-aware fault diagnostic scheme towards sensor faults utilizing machine learning. Sensors 2021;21:617. https://doi.org/10.3390/s21020617.
[32] Zhang Z, Mehmood A, Shu L, Huo Z, Zhang Y, Mukherjee M. A survey on fault diagnosis in wireless sensor networks. IEEE Access 2018;6:11349–64. https://doi.org/10.1109/ACCESS.2018.2794519.
[33] Lawrence A, Thollander P, Andrei M, Karlsson M. Specific energy consumption/use (SEC) in energy management for improving energy efficiency in industry: meaning, usage and differences. Energies 2019;12:247. https://doi.org/10.3390/en12020247.
[34] Salahuddin Z, Woodruff HC, Chatterjee A, Lambin P. Transparency of deep neural networks for medical image analysis: a review of interpretability methods. Comput Biol Med 2022;140:105111. https://doi.org/10.1016/j.compbiomed.2021.105111.
[35] Gundersen OE, Shamsaliei S, Isdahl RJ. Do machine learning platforms provide out-of-the-box reproducibility? Future Gener Comput Syst 2022;126:34–47. https://doi.org/10.1016/j.future.2021.06.014.
[36] Luo X, Oyedele LO. A self-adaptive deep learning model for building electricity load prediction with moving horizon. Mach Learn Appl 2022;7:100257. https://doi.org/10.1016/j.mlwa.2022.100257.