Article
Deep Learning-Based Non-Intrusive Commercial
Load Monitoring
Mengran Zhou , Shuai Shao * , Xu Wang , Ziwei Zhu and Feng Hu
School of Electrical and Information Engineering, Anhui University of Science and Technology,
Huainan 232001, China; [email protected] (M.Z.); [email protected] (X.W.); [email protected] (Z.Z.);
[email protected] (F.H.)
* Correspondence: [email protected]
Abstract: Commercial load is an essential demand-side resource. Monitoring commercial loads not
only helps commercial customers understand their energy usage and improve energy efficiency, but
also helps electric utilities develop demand-side management strategies to ensure stable operation
of the power system. However, existing non-intrusive methods cannot monitor multiple commercial
loads simultaneously and do not consider the high correlation and severe imbalance among
commercial loads. Therefore, this paper proposes a deep learning-based non-intrusive commercial
load monitoring method to solve these problems. The method takes the total power signal of the
commercial building as input and directly determines the state and power consumption of several
specific appliances. The key elements of the method are a new neural network structure called
TTRNet and a new loss function called MLFL. TTRNet is a multi-label classification model that can
autonomously learn correlation information through its unique network structure. MLFL is a loss
function specifically designed for multi-label classification tasks, which solves the imbalance problem
and improves the monitoring accuracy for challenging loads. To validate the proposed method,
experiments are performed separately in seen and unseen scenarios using a public dataset. In the
seen scenario, the method achieves an average F1 score of 0.957, which is 7.77% better than existing
multi-label classification methods. In the unseen scenario, the average F1 score is 0.904, which is
1.92% better than existing methods. The experimental results show that the method proposed in this
paper is both effective and practical.
Keywords: non-intrusive load monitoring; commercial load; deep learning; multi-label classification;
correlation; imbalance
Citation: Zhou, M.; Shao, S.; Wang, X.; Zhu, Z.; Hu, F. Deep Learning-Based Non-Intrusive Commercial Load Monitoring. Sensors 2022, 22, 5250. https://doi.org/10.3390/s22145250
Academic Editor: Anastasios Doulamis
Compared with intrusive load monitoring, non-intrusive load monitoring (NILM) has the advantages of low cost and easy
implementation, and is more cost-effective for commercial users [7]. Therefore, proposing
an effective method to realize non-intrusive commercial load monitoring is crucial.
Existing NILM methods are mainly used for residential loads [7]. Commercial loads
differ from residential loads in terms of energy consumption and load characteristics, which
is the main reason why existing NILM solutions are not directly applicable to commercial
loads [8]. The earliest NILM study specifically for commercial loads was conducted by
Norford et al. [9], who identified loads by matching steady-state and transient changes to known
patterns. Their work sparked interest in high-power loads such as heating, ventilation and
air conditioning (HVAC) in commercial buildings. Subsequently, Ji et al. [10] proposed a
NILM method based on a Fourier series model to determine the hourly end-use of HVAC
in commercial buildings. In recent years, there have been attempts to use machine learning
algorithms for NILM of commercial loads. Ling et al. [11] and Xiao et al. [12] each
proposed an approach based on random forest, the former for disaggregating the energy
consumption of building subsystems and the latter for disaggregating the cooling load of
buildings. In addition, generative models have also been applied to NILM, such as EnerGAN [13]
and EnerGAN++ [14]. Henriet et al. [15] were the first to apply generative models to
non-intrusive commercial load monitoring. The input signal can also be processed using a
graph-based method [16]. Beyond models and algorithms, the practicality
of NILM methods has also attracted increasing attention. In practical applications,
NILM is usually performed at a lower sampling frequency [17]. Rafsanjani et al. [18] use
density-based spatial clustering of applications with noise and quadratic discriminant
analysis to perform non-intrusive commercial load monitoring at the occupant level rather
than on specific HVAC equipment or systems. Modern commercial buildings generally
have building automation systems (BAS), so Zaeri et al. [19] proposed a disaggregation
method for the end-use of commercial buildings based on BAS data and multiple linear
regression models.
To sum up, traditional classification and regression algorithms still dominate non-
intrusive commercial load monitoring. These methods typically require manual feature
extraction using domain expert knowledge and are less transferable [20]. Additionally, non-
intrusive commercial load monitoring has the following problems: 1. Multiple electrical
devices cannot be identified at the same time. Demand response often requires acquiring
multiple loads simultaneously. However, due to the large number and variety of devices
used by commercial customers, the approach of training a separate model for each load
is no longer applicable [21]. 2. The potential correlation between electrical equipment
cannot be considered. For example, a high correlation between commercial air conditioning
units may lead to simultaneous startup or shutdown events, violating the one-at-a-time
assumption [8]. 3. The imbalance among the operating states of electrical equipment cannot
be considered. Under different power consumption scenarios, different types of devices
have different startup or shutdown times. For example, some devices are turned off for a
long time and only turned on for a short time, which affects the monitoring accuracy [21].
In response to the above challenges, this paper proposes a deep learning method for
non-intrusive commercial load monitoring, which can directly obtain the operating status
and power consumption of multiple internal electrical devices from the overall power
consumption of commercial buildings without complex feature engineering. The method includes
a novel deep learning framework called Transformer-Temporal Pooling-RethinkNet
(TTRNet) and a novel loss function called Multi-Label Focal Loss (MLFL). Considering the
need for commercial loads to participate in demand response and the power consumption
logic of commercial customers, two different NILM evaluation scenarios, called “seen” and
“unseen”, were created and tested to demonstrate that the proposed method can achieve
high performance with good transferability in terms of equipment state identification and
energy decomposition. The following are the main contributions of this work:
1. The proposed deep learning method for non-intrusive commercial load monitoring
is based on a multi-label classification task, which can simultaneously identify the
operating status of multiple commercial electrical devices and decompose the power
consumption of multiple devices, reducing the time cost of existing commercial
NILM methods. Moreover, the label correlation and class imbalance problems are
solved from the model framework and training method, respectively, to improve
monitoring accuracy.
2. Unlike existing models, the encoder part of TTRNet stacks
multiple identical blocks, each composed of a Transformer
encoder and max-pooling, to automatically extract features from the total
input power sequence. A Temporal Pooling block is added between the encoder and
decoder to provide more detailed features by adding contextual information when
identifying the activation state. Finally, RethinkNet is introduced in the decoder part
to enhance the learning of interrelationships between each power-consuming device
and improve the accuracy of multi-label classification.
3. By improving the existing single-label Focal Loss for multi-label classification tasks,
not only the load imbalance problem in NILM is solved, but also the accuracy of
hard-to-identify loads is improved.
The rest of this paper is organized as follows: Section 2 presents and compares the
related work. Section 3 describes the proposed deep learning method in detail. Section 4
describes the experimental steps in the public dataset. Section 5 presents the experimental
results, and Section 6 presents the analysis. Section 7 concludes and proposes some possible
further work.
2. Related Work
Recurrent Neural Networks (RNN) and their variants have been the dominant method
for solving problems with sequence data. However, the structure of RNN leads to weak-
nesses in parallel computing. In addition, one of the biggest drawbacks of RNN is the
problem of vanishing gradients and exploding gradients when the sequence is too long.
To overcome these limitations, the attention mechanism is introduced as a solution [22].
On this basis, Vaswani et al. [23] proposed a new simple network architecture, namely a
Transformer. Transformers have been successful in fields such as natural language processing,
computer vision, and time series forecasting, sparking great interest in the NILM
community. Lin et al. [24] were the first to apply a Transformer to NILM and simultane-
ously proposed two different networks, one containing only multiple encoder blocks and
the other keeping the original encoder-decoder structure. Experiments demonstrate that a
Transformer improves NILM accuracy, robustness, and training cost compared to existing
RNNs and Convolutional Neural Networks (CNN). Yue et al. [25] proposed an improved
objective function specially designed for NILM learning and a Transformer representation
architecture based on bidirectional encoders and also achieved better results than the exist-
ing methods. In addition to using off-the-shelf Transformers, the architecture and training
methods can be improved to better fit the NILM task. Yue et al. [26] further replaced the
original self-attention with local attention to improve the poor performance of capturing
local signal patterns, while Sykiotis et al. [27] proposed the use of unsupervised pre-training
and downstream task fine-tuning to improve prediction accuracy and reduce training time,
both with better results. Several NILM methods mentioned above are compared in Table 1.
The comparison shows that existing Transformer-based methods either use encoders or
improve them. These methods have only been tested for residential loads and have not
been evaluated for commercial loads. Furthermore, these methods are mainly used for
regression or single-label classification tasks rather than multi-label classification tasks.
Therefore, this paper pioneers the use of Transformers for non-intrusive commercial load
monitoring and multi-label classification tasks.
Reference | Name | Publication Date | Dataset | Framework: Task | Framework: Main Components | Framework: Loss
Lin et al. [24] | MA-net and MAED-net | August 2020 | REDD | r and c | Encoders and Decoders | MSE
Yue et al. [25] | BERT4NILM | November 2020 | REDD and UK-DALE | r and c | Encoders | MSE + KL divergence + soft-margin
Yue et al. [26] | ELTransformer | March 2022 | REDD and UK-DALE | r and c | Encoders | MSE
Sykiotis et al. [27] | ELECTRIcity | April 2022 | REDD, UK-DALE and Refit | r and c | Encoders | MSE + KL divergence
3. Methodology
The method proposed in this paper has two main parts: a neural network architecture,
TTRNet, and a loss function, MLFL. TTRNet consists of four components: Input
Embedding, Transformer, Temporal Pooling, and RethinkNet. The overall structure is shown
in Figure 1. Each part is described separately below.
The positional encoding is used to generate the relative and absolute position informa-
tion for the input sequence.
$$E_p(p, 2i) = \sin\left(\frac{p}{10000^{2i/d}}\right), \qquad E_p(p, 2i+1) = \cos\left(\frac{p}{10000^{2i/d}}\right) \tag{2}$$
Finally, the feature vector of the value embedding transformation is added to the
position vector generated by the positional encoding to add position information to the
input sequence.
$$E_i(x) = \alpha E_v(x) + E_p(p, \cdot) \tag{3}$$
where α is a factor that balances the value embedding and the positional encoding. The input
sequence will be normalized during data preprocessing later, so here α = 1.
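As an illustration, Equations (2) and (3) can be sketched in plain Python as follows. This is a minimal sketch, not the authors' implementation. The function names `positional_encoding` and `input_embedding` are hypothetical, d is assumed to be the embedding dimension, and the value embedding is assumed to be computed elsewhere and passed in as a list of rows.

```python
import math


def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position codes (Eq. 2)."""
    table = [[0.0] * d_model for _ in range(seq_len)]
    for p in range(seq_len):
        for j in range(d_model):
            # Columns 2i and 2i+1 share the same frequency 1 / 10000^(2i/d).
            two_i = j - (j % 2)
            angle = p / (10000 ** (two_i / d_model))
            table[p][j] = math.sin(angle) if j % 2 == 0 else math.cos(angle)
    return table


def input_embedding(value_embedding, alpha=1.0):
    """Add position codes to a precomputed value embedding (Eq. 3)."""
    seq_len, d_model = len(value_embedding), len(value_embedding[0])
    pos = positional_encoding(seq_len, d_model)
    return [[alpha * v + e for v, e in zip(v_row, p_row)]
            for v_row, p_row in zip(value_embedding, pos)]
```

With α = 1, as chosen in the text, the position code is simply added element-wise to each embedded time step.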
3.2. Transformer
The Transformer part of TTRNet, i.e., the encoder part of TTRNet, is used to extract
input features. This part uses the Transformer’s encoder structure. It consists of a stack of
N identical modified encoder blocks, each of which includes a Transformer encoder layer
and a max-pooling layer, whose structure is shown in Figure 2.
Each Transformer encoder layer consists of two sublayers. The first layer is multi-head
attention, and the second layer is a feedforward network. Each sublayer performs skip
connections followed by layer normalization. Multi-head attention consists of several
parallel self-attentions. Self-attention operates on the input Q, K, and V matrices. First, Q
and K are multiplied, then divided by the square root of the hidden size; softmax is used to
generate soft attention and multiplied by V to get the final weighted value matrix.
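$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \tag{4}$$
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)W^{O} \tag{5}$$
where $\mathrm{head}_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V})$.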
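For illustration, the single-head attention of Equation (4) can be written as the minimal sketch below. It is not the authors' code. The helper names are hypothetical, matrices are plain lists of rows, and the learned projections and head concatenation of Equation (5) are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]          # K^T
    scores = matmul(q, k_t)                       # Q K^T
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)                     # weighted sum of value rows
```

Each output row is a convex combination of the rows of V, so a query that matches no key in particular returns their plain average.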
As shown in the figure above, the Temporal Pooling module has five channels, one
of which does nothing and keeps the original features unchanged. The remaining four
channels perform Temporal Pooling operations. In these four channels, feature vectors are
first passed through average pooling layers with kernel sizes of 5, 10, 20, and 30 to reduce
the temporal resolution. Then, each channel performs a 1D convolution to reduce the
feature dimension to a quarter of the original dimension. The Temporal Pooling operation
for each channel can be expressed as:
3.4. RethinkNet
The main role of the RethinkNet module is to decode the previously extracted feature
vector into an output sequence with multiple labels. Figure 4 illustrates the design scheme
of this part, which consists mainly of a transposed convolution, multiple recurrent network
layers, and finally a fully connected layer.
The output of the Temporal Pooling module is the input to the transposed convolution,
which uses a kernel size and stride of 8 to increase the temporal resolution of features while
reducing the number of features. After layer normalization, these features are fed into
multiple recurrent networks. The recurrent network layers are arranged in parallel:
the input of each layer comes from the transposed convolution, and the hidden state of each
layer comes from the previous recurrent network layer.
RethinkNet is a model that memorizes label correlations across recurrent network layers.
It forms initial guesses in the first recurrent layer, stores them in memory,
and iteratively corrects them using label correlation. Each recurrent layer corresponds to
one iteration, so the network “rethinks” its predictions over multiple iterations.
$$\hat{y}^{(t)} = \sigma\left(W_1 x + W_2 \hat{y}^{(t-1)}\right) \tag{7}$$
where W1 x is the feature term, which comes from transposed convolution, and W2 ŷ(t−1) is
the memory term that converts the previous predictions into the current label vector space.
The label correlation information is stored in global memory, so the order of the labels does
not affect classification results.
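The recurrence in Equation (7) can be illustrated with the following minimal sketch. It implements only the plain recurrence, not the LSTM-based layers used in this study. The function name `rethink`, the all-zero initial memory, and the toy weights in the usage note are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rethink(x, w1, w2, iterations):
    """Iterate y(t) = sigmoid(W1 x + W2 y(t-1)) and return every prediction."""
    n_labels = len(w1)
    # Feature term W1 x is the same in every iteration.
    feature_term = [sum(w * xi for w, xi in zip(row, x)) for row in w1]
    y = [0.0] * n_labels  # assumed empty memory before the first guess
    history = []
    for _ in range(iterations):
        # Memory term W2 y(t-1) carries the previous label predictions.
        memory_term = [sum(w * yj for w, yj in zip(row, y)) for row in w2]
        y = [sigmoid(f + m) for f, m in zip(feature_term, memory_term)]
        history.append(y)
    return history
```

For example, with `w1 = [[1.0], [0.0]]`, `w2 = [[0.0, 0.0], [2.0, 0.0]]` and `x = [2.0]`, the second label starts at 0.5 and rises in the second iteration because the first label was predicted as active, which is the kind of label correlation the memory term is meant to capture.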
In this study, a long short-term memory network (LSTM) was selected as the recurrent
network layer. This is because LSTMs can guarantee good results when dealing with long
sequences of data no matter how many iterations are made. In this study, four LSTM
layers were used. The output of LSTM is linearly transformed by the fully connected layer,
and finally activated by the sigmoid function to realize multi-label classification.
3.5. MLFL
In practical applications, existing methods often perform poorly because the
operating states of commercial customers' electrical equipment occur in very different
proportions. This can be addressed by changing the loss function. Larger weights
are assigned to smaller proportions of samples, and smaller weights are assigned to larger
proportions of samples. Increasing the proportion of small-scale samples in the overall
loss function guides network training to favor small-scale samples, thereby improving the
classification accuracy of small-scale samples.
In addition to this, in the later stages of training, the most identifiable loads are
correctly classified, while only a few challenging loads are misclassified. Likewise, the clas-
sification accuracy of hard-to-identify loads can be improved by weighting.
Focal Loss is an excellent solution to this problem, but the original Focal Loss was
applied to single-label classification tasks. In this study, Focal Loss is improved by designing
weights for each label separately, making it usable for multi-label classification tasks.
The improved MLFL is shown below:
$$L_{MLFL} = -\frac{1}{N}\sum_{n=1}^{N}\sum_{l=1}^{L}\left[\alpha_l\, y_{nl}\,(1-p_{nl})^{\gamma}\log(p_{nl}) + (1-y_{nl})\,p_{nl}^{\gamma}\log(1-p_{nl})\right] \tag{8}$$
where αl represents the proportion of negative to positive samples for the lth label. γ is the
weight in the loss function for challenging loads that can take on values between 0 and 5.
In this work, γ is set to 2.
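A direct transcription of Equation (8) into plain Python might look like the sketch below. It is illustrative only and is not the authors' training code. The function names, the clamping constant `eps`, and the assumption that every label has at least one positive sample are additions made here.

```python
import math


def negative_to_positive_ratio(y_true):
    """Per-label alpha_l: number of negative samples over positive samples."""
    ratios = []
    for l in range(len(y_true[0])):
        pos = sum(row[l] for row in y_true)
        neg = len(y_true) - pos
        ratios.append(neg / pos)  # assumes each label has a positive sample
    return ratios


def multi_label_focal_loss(y_true, p_pred, alpha, gamma=2.0, eps=1e-12):
    """Eq. (8): focal loss summed over labels and averaged over N samples."""
    total = 0.0
    for y_row, p_row in zip(y_true, p_pred):
        for y, p, a in zip(y_row, p_row, alpha):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite (assumption)
            # Positive term, weighted by alpha_l and down-weighted when p is high.
            total += a * y * (1.0 - p) ** gamma * math.log(p)
            # Negative term, down-weighted when p is already low.
            total += (1.0 - y) * p ** gamma * math.log(1.0 - p)
    return -total / len(y_true)
```

Setting `gamma=0` and all `alpha` values to 1 reduces the expression to the ordinary binary cross-entropy summed over labels, which makes the two modifications easy to check in isolation.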
4. Experiment
This section focuses on the experiments performed to verify the effectiveness of the
proposed method. Section 4.1 describes the dataset used. Section 4.2 describes the data
preprocessing process. Section 4.3 details the experiments conducted in this paper for two
different evaluation scenarios (seen and unseen). Section 4.4 presents evaluation metrics
describing the experimental results.
4.1. Dataset
In this study, the proposed method is evaluated on the Commercial Building
Energy Dataset (COMBED) [8]. This dataset includes real-world data collected from smart
meters deployed in different buildings and subsystems of IIITD. These real-world figures
include total electricity consumption per building, high energy loads, and total electricity
consumption per floor. The Academic Block can be thought of as a commercial building
similar to an IT office. Like other commercial buildings, it mainly includes air conditioning
load, lighting load, elevator load, etc. Since the air conditioning load is often used for
demand response, this paper takes the air handling unit (AHU) of the Academic Block as
the research object. AHU0, AHU1, AHU2, and AHU5 are four loads in different power
consumption scenarios. The smart meter installation location of the Academic Block is
shown in Figure 5, and the collected data information is shown in Table 2.
Table 2. Attribute information for electrical load data collected from the Academic Block.
In the field of NILM, it is common to use not only real data but also synthetic data
in order to improve the generalization performance of the model [37]. Here, data such as
lighting and elevator loads collected from sub-meters are used as noise. The main meter
data for the new commercial building is formed by subtracting this noise from the actual
main meter data. The synthetic data retain the original characteristics of the AHU loads,
together with some of the noise and other factors found only in real data, so each synthetic
series can be treated as a regular commercial building. Compared with training on real data only, a mixture of
real and synthetic data can improve the transferability of the proposed method to
different types of commercial buildings. The ratio of generated synthetic data to real data
is 4:1, and the main meter attributes of the resulting dataset are shown in Table 3.
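The subtraction step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `synthesize_mains` is hypothetical, the readings are made-up values in watts, and which sub-meters are subtracted for each synthetic building is not specified in the text.

```python
def synthesize_mains(real_mains, noise_submeters):
    """Subtract the summed noise sub-meter readings from the real main meter."""
    synthetic = []
    for t, total in enumerate(real_mains):
        noise = sum(series[t] for series in noise_submeters)
        synthetic.append(total - noise)
    return synthetic


# Hypothetical readings in watts, aligned sample by sample.
real_mains = [5000.0, 5200.0, 4100.0]
lighting = [300.0, 320.0, 310.0]
elevator = [0.0, 450.0, 0.0]

# One synthetic building removes both noise sources, another only lighting.
building_a = synthesize_mains(real_mains, [lighting, elevator])
building_b = synthesize_mains(real_mains, [lighting])
```

Because the AHU sub-meter readings are untouched, the state labels derived from them remain valid for every synthetic main meter series.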
Table 3. The main meter data information of commercial buildings in the synthetic dataset.
Table 4. Parameter information for obtaining the activation state of the AHU device.
In the seen scenario, we train on the first part of Building 1 (real data) and the first
part of Buildings 2, 3, 4, and 5 (synthetic data). We validate on the second part of Building 1.
During validation, the network parameters are saved whenever the loss reaches
a new minimum over the training period. Finally, we test on the third part of Building 1.
Training and testing the model in this setting evaluates the model’s ability
to identify and decompose the load when the composition of the commercial load is known.
In the unseen scenario, we train on the first part of Buildings 1 and 5, validate on the second
part of Buildings 1 and 5, and again save the network weights whenever the loss function reaches
a new minimum. Building 2 simulates an unknown building, and all of its data are used for testing.
Training and testing the model in this setting evaluates its generalization
performance. Since the original purpose of NILM research is application to
unknown buildings in real life, the results in this scenario have greater practical value.
All experiments are performed on a Linux host with the following specifications, CPU:
15-core AMD EPYC 7543 32-core processor 30 GB; Graphics: RTX A5000*1 24 GB. In these
two different evaluation scenarios, the network parameters are optimized using the Adam
optimization method, which uses the gradient descent technique with a learning rate of
$10^{-5}$ and a batch size of 32. The above hyperparameter combinations do not reflect the
maximum accuracy of the test case, as the purpose of this experiment is mainly to verify
the effectiveness of the proposed technique.
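$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{9}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{10}$$
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
$$F1 = 2\,\frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{12}$$
$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{13}$$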
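As a worked illustration, Equations (9) to (13) can be computed for one load from binary state sequences as in the sketch below. The function name `detection_metrics` and the convention of returning 0 when a denominator is zero are assumptions made here, not part of the paper.

```python
import math


def detection_metrics(y_true, y_pred):
    """Compute Eqs. (9)-(13) from two equal-length sequences of 0/1 states."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Assumed convention: an undefined ratio is reported as 0.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    mcc = safe_div(tp * tn - fp * fn,
                   math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1, "mcc": mcc}
```

For instance, with true states `[1, 1, 1, 0, 0, 0, 0, 1]` and predicted states `[1, 1, 0, 0, 0, 1, 0, 1]`, the counts are TP = 3, TN = 3, FP = 1 and FN = 1, so precision, recall, accuracy and F1 are all 0.75 while MCC is 0.5.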
The EE metric estimates the precise amount of energy consumed by the device. This
paper uses mean absolute error (MAE) (14) and signal aggregation error (SAE) (15) to
measure the accuracy of active power estimates for individual devices. MAE measures
the average deviation of estimated power relative to actual power at each instant. At the
same time, SAE measures the relative error of the power estimates used throughout the
evaluation period.
$$\mathrm{MAE} = \frac{1}{N}\sum_{t}\left|\hat{y}(t) - y(t)\right| \tag{14}$$
$$\mathrm{SAE} = \frac{\sum_{t}\hat{y}(t) - \sum_{t} y(t)}{\sum_{t} y(t)} \tag{15}$$
where $y(t)$ denotes the true value of the power and $\hat{y}(t)$ denotes the estimated value of
the power.
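Equations (14) and (15) translate into the short sketch below. The function names and the sample readings are hypothetical. Note that SAE as defined here is signed, so a negative value means the total energy is under-estimated.

```python
def mae(y_true, y_est):
    """Eq. (14): mean absolute error between estimated and true power."""
    return sum(abs(e - t) for t, e in zip(y_true, y_est)) / len(y_true)


def sae(y_true, y_est):
    """Eq. (15): signed relative error of the total estimated energy."""
    return (sum(y_est) - sum(y_true)) / sum(y_true)


# Hypothetical power readings in watts.
actual = [100.0, 200.0, 300.0]
estimated = [110.0, 190.0, 270.0]
print(mae(actual, estimated), sae(actual, estimated))
```

In this example the per-sample errors are 10, 10 and 30 W, while the totals are 570 W against 600 W, so a method can have a noticeable MAE and still a small SAE when its errors partly cancel over the evaluation period.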
5. Results
This section describes and explains the experimental results for two different NILM
evaluation scenarios, the seen and unseen scenarios, respectively.
5.1. Seen
A total of twenty experiments were performed, and the results were averaged. The 90%
interval for multiple results is also reported to show the stability of the method. Table 6
shows the performance of the method for identifying states, estimating instantaneous
power consumption and total power consumption in the seen scenario. As can be seen
from the table, all five ED indicators, apart from a few individual values, were above 0.9, indicating that the
method effectively identified the activation states of the four AHUs. The average F1 score
is 0.957, the average precision is 0.988, the average recall is 0.931, the average accuracy is
0.976, and the average MCC is 0.943, all with good stability. For the estimated instantaneous
power consumption, the average MAE of the four AHUs is 197.13. Among them, AHU5
has the worst result because AHU5 has the highest operating power. For the estimated
total power consumption, the average SAE of the four AHUs is −0.049.
Table 6. Performance of the method in the seen scenario, including the average and 90% interval of
each load over twenty experiments and the average of all loads.
Then, the output states are compared with the real states, respectively. Figure 6
shows the identification of the states of the four AHUs in this scenario. It can be seen that
there are errors in the identification of several activations of AHU0 and AHU1, while the
identification of both AHU2 and AHU5 is perfect, which may be due to the large differences
in the operation of AHU0 and AHU1. Overall, the method is effective in capturing the
changes in the operating state of each AHU.
[Figure 6: four panels (AHU0, AHU1, AHU2, AHU5), each plotting the Real State and the Output State (State, 0 to 1) against T (0 to 12000).]
Figure 6. Comparison of the output state and the real state in the seen scenario.
5.2. Unseen
Twenty experiments were also conducted in the unseen scenario. Table 7 shows the
method’s performance in identifying the state and estimating the instantaneous and total
power consumption in this scenario. The average F1 score is 0.904, the average precision
is 0.932, the average recall is 0.878, the average accuracy is 0.919, and the average MCC
is 0.834. For estimating the instantaneous power consumption, the average MAE of the four
AHUs is 433.24. For estimating the total power consumption, the average SAE of the four
AHUs is −0.058. With an average F1 score of 0.904, the accuracy of AHU2 and
AHU5 remains at 0.940 and 0.983, respectively, indicating that the method is still effective in
identifying load states in the unseen scenario.
Table 7. Performance of the method in the unseen scenario, including the average and 90% interval
of each load over twenty experiments and the average of all loads.
As can be seen from the above table, the method performs relatively poorly in estimat-
ing the instantaneous power in the unseen scenario. Figure 7 shows the power estimates
of the method for the four AHUs. The poor MAE was obtained because these AHUs had
high operating power. However, the total power estimate for the whole monitoring period
is still excellent, as evidenced by the average SAE of −0.058.
[Figure 7: plot of Power (0 to 8000) against T (0 to 80000).]
Figure 7. Comparison of the instantaneous power estimates and the actual power in the
unseen scenario.
6. Discussion
This study aims to enable non-intrusive monitoring of multiple commercial loads
simultaneously. In the seen scenario, the average F1 score is 0.957 and the average SAE
is −0.049; in the unseen scenario, the average F1 score
is 0.904 and the average SAE is −0.058. The above results show that the
method proposed in this paper has achieved good results in the state identification and
power decomposition of all four commercial loads, indicating that the method can achieve
the preliminary purpose of the experiment.
Furthermore, the basic idea of this study is to improve monitoring accuracy by consid-
ering the correlation and imbalance of commercial loads. In this approach, correlations are
solved by RethinkNet and imbalances are solved by MLFL. To this end, extensive ablation
studies were performed to verify the role of each component, and the results are shown
in Table 8.
Model | Transformer | Temporal Pooling | RethinkNet | Focal Loss | AHU0 | AHU1 | AHU2 | AHU5 | Avg
Baseline | − | X | − | − | 0.926 | 0.772 | 0.923 | 0.930 | 0.888
ModelA | − | − | − | − | 0.871 | 0.683 | 0.871 | 0.880 | 0.826
ModelB | X | X | − | − | 0.898 | 0.840 | 0.901 | 0.922 | 0.890
ModelC | − | X | X | − | 0.943 | 0.831 | 0.899 | 0.890 | 0.891
ModelD | − | X | X | X | 0.904 | 0.897 | 0.921 | 0.931 | 0.913
ModelE | X | X | X | − | 0.924 | 0.844 | 0.964 | 0.977 | 0.927
TTRNet | X | X | X | X | 0.892 | 0.957 | 0.992 | 0.991 | 0.957
TP-NILM [28] was used as the baseline. In the baseline, the F1 score of AHU1 is particularly
poor compared with the other three loads, indicating that AHU1 is a load that is difficult to
identify. Model A removes the Temporal Pooling module from the baseline and becomes a
structure composed entirely of deep CNN layers. The result is a significant drop in F1 scores for
all four loads, showing that the contextual information does improve the
results. Model B uses a Transformer as the encoder; it trades off performance across individual
loads and raises the average F1 score from 0.888 in the baseline to 0.890. This result suggests that a Transformer
can extract load features better than a CNN in NILM. Model C adds RethinkNet to the
baseline and improves the average result, indicating the importance of considering label
correlation in multi-label classification. The scores of some loads decrease, which may
be related to differences in correlation. Model D was trained using MLFL on top of
Model C and further improved the average F1 score, indicating that addressing the load
imbalance problem can effectively improve the classification accuracy. For AHU1 in particular,
the F1 score rises from 0.772 in the baseline to 0.897, indicating that MLFL can improve the monitoring
performance of challenging loads. Model E adds RethinkNet to Model B, and the increase
in average F1 score shows that the Transformer and RethinkNet work better together. Finally,
the complete TTRNet model is obtained by training Model E with MLFL. The average F1
score is improved from 0.888 in the baseline to 0.957, which is an improvement of 7.77%.
After adding MLFL, the result for AHU0 is slightly reduced, which can also be seen
in the comparison of Model C and Model D. This may be caused by the current MLFL
weight parameter combination. The current setting is enough to demonstrate the effectiveness of
this method, and the parameters will be tuned in future work.
This method is also thoroughly compared with existing NILM methods based on
multi-label classification. Here, the existing methods are chosen from CNN and TP-NILM,
i.e., Model A and the baseline mentioned before. Tables 9 and 10 indicate the performance
comparison of different methods in the seen and unseen scenarios, respectively.
A comparison of the seven performance metrics shows that TTRNet outperforms the other
methods by a wide margin on most metrics, indicating that the proposed method has the best
overall performance. However, the present method
still has some shortcomings. It is slightly less effective in identifying AHU0,
for the reasons explained above. The most significant advantage of this
method is its good power estimation performance while ensuring the correct identification
of the load state, especially for high power loads. The MAE of AHU5 in the seen scenario
is 380.09, which is 123.08% better than the 847.91 of TP-NILM, and the MAE of AHU5
in the unseen scenario is 592.14, which is 78.18% better than the 1055.06 of TP-NILM.
The evaluation of the unseen scenario is more meaningful because it is consistent with
the application of NILM in real life. In this scenario, the average F1 score of TTRNet is
0.904, which is 1.92% better than the 0.887 of TP-NILM. The improvement of the proposed
method over existing methods is not particularly large,
mainly because the data in the public dataset used are relatively ideal. Even in this case,
the advantages of this method are still apparent, and its benefit is expected to be more
pronounced when it is applied to more realistic data in the future. To sum up, the method
proposed in this paper has higher practical value than existing methods.
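The percentage improvements quoted in this section can be recomputed from the reported values with the short sketch below. The two conventions used here are inferred from the numbers rather than stated in the text: gains in F1 appear to be taken relative to the existing method, and gains in MAE relative to the proposed method's own (lower) error. The function names are hypothetical.

```python
def gain_higher_is_better(new, old):
    """Relative gain in percent for a score such as F1, relative to `old`."""
    return (new - old) / old * 100.0


def gain_lower_is_better(new, old):
    """Relative gain in percent for an error such as MAE, relative to `new`."""
    return (old - new) / new * 100.0


f1_seen = round(gain_higher_is_better(0.957, 0.888), 2)      # average F1, seen
f1_unseen = round(gain_higher_is_better(0.904, 0.887), 2)    # average F1, unseen
mae_seen = round(gain_lower_is_better(380.09, 847.91), 2)    # AHU5 MAE, seen
mae_unseen = round(gain_lower_is_better(592.14, 1055.06), 2)  # AHU5 MAE, unseen
print(f1_seen, f1_unseen, mae_seen, mae_unseen)
```

Under these conventions the four values come out as 7.77, 1.92, 123.08 and 78.18 percent, respectively.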
focus on lightweight models to speed up training and inference and investigate more
efficient optimization processes to improve the quality of monitoring for challenging loads.
Author Contributions: Conceptualization, M.Z. and S.S.; Methodology, S.S.; Project administration,
X.W. and Z.Z.; Software, S.S.; Supervision, F.H.; Validation, S.S.; Writing—original draft, S.S.; Writing—
review & editing, S.S. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the National Key Research and Development Program of
China, grant number 2018YFC0604503; the Energy Internet Joint Fund Project of Anhui Province,
grant number 2008085UD06; and the Major Science and Technology Program of Anhui Province,
grant number 201903a07020013.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Duan, W.; Khurshid, A.; Nazir, N.; Khan, K.; Calin, A.C. From gray to green: Energy crises and the role of CPEC. Renew. Energy
2022, 190, 188–207. [CrossRef]
2. Al-Shetwi, A.Q.; Hannan, M.; Jern, K.P.; Mansur, M.; Mahlia, T. Grid-connected renewable energy sources: Review of the recent
integration requirements and control methods. J. Clean. Prod. 2020, 253, 119831. [CrossRef]
3. Aghaei, J.; Alizadeh, M.I. Demand response in smart electricity grids equipped with renewable energy sources: A review. Renew.
Sustain. Energy Rev. 2013, 18, 64–72. [CrossRef]
4. Zheng, S.; Jin, X.; Huang, G.; Lai, A.C. Coordination of commercial prosumers with distributed demand-side flexibility in energy
sharing and management system. Energy 2022, 248, 123634. [CrossRef]
5. Gopinath, R.; Kumar, M.; Joshua, C.P.C.; Srinivas, K. Energy management using non-intrusive load monitoring techniques–State-
of-the-art and future research directions. Sustain. Cities Soc. 2020, 62, 102411. [CrossRef]
6. Hart, G.W. Nonintrusive appliance load monitoring. Proc. IEEE 1992, 80, 1870–1891. [CrossRef]
7. Angelis, G.F.; Timplalexis, C.; Krinidis, S.; Ioannidis, D.; Tzovaras, D. NILM Applications: Literature review of learning
approaches, recent developments and challenges. Energy Build. 2022, 261, 111951. [CrossRef]
8. Batra, N.; Parson, O.; Berges, M.; Singh, A.; Rogers, A. A comparison of non-intrusive load monitoring methods for commercial
and residential buildings. arXiv 2014, arXiv:1408.6595.
9. Norford, L.K.; Leeb, S.B. Non-intrusive electrical load monitoring in commercial buildings based on steady-state and transient
load-detection algorithms. Energy Build. 1996, 24, 51–64. [CrossRef]
10. Ji, Y.; Xu, P.; Ye, Y. HVAC terminal hourly end-use disaggregation in commercial buildings with Fourier series model. Energy
Build. 2015, 97, 33–46. [CrossRef]
11. Ling, Z.; Tao, Q.; Zheng, J.; Xiong, P.; Liu, M.; Xiao, Z.; Gang, W. A Nonintrusive Load Monitoring Method for Office Buildings
Based on Random Forest. Buildings 2021, 11, 449. [CrossRef]
12. Xiao, Z.; Gang, W.; Yuan, J.; Zhang, Y.; Fan, C. Cooling load disaggregation using a NILM method based on random forest for
smart buildings. Sustain. Cities Soc. 2021, 74, 103202. [CrossRef]
13. Kaselimi, M.; Voulodimos, A.; Protopapadakis, E.; Doulamis, N.; Doulamis, A. Energan: A generative adversarial network for
energy disaggregation. In Proceedings of the ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1578–1582.
14. Kaselimi, M.; Doulamis, N.; Voulodimos, A.; Doulamis, A.; Protopapadakis, E. EnerGAN++: A generative adversarial gated
recurrent network for robust energy disaggregation. IEEE Open J. Signal Process. 2020, 2, 1–16. [CrossRef]
15. Henriet, S.; Şimşekli, U.; Fuentes, B.; Richard, G. A generative model for non-intrusive load monitoring in commercial buildings.
Energy Build. 2018, 177, 268–278. [CrossRef]
16. Stankovic, V.; Liao, J.; Stankovic, L. A graph-based signal processing approach for low-rate energy disaggregation. In Proceedings
of the 2014 IEEE symposium on computational intelligence for engineering solutions (CIES), Orlando, FL, USA, 9–12 December
2014; IEEE: Piscataway, NJ, USA, 2014; pp. 81–87.
17. Zhao, B.; Ye, M.; Stankovic, L.; Stankovic, V. Non-intrusive load disaggregation solutions for very low-rate smart meter data.
Appl. Energy 2020, 268, 114949. [CrossRef]
18. Rafsanjani, H.N.; Ahn, C.R.; Chen, J. Linking building energy consumption with occupants’ energy-consuming behaviors in
commercial buildings: Non-intrusive occupant load monitoring (NIOLM). Energy Build. 2018, 172, 317–327. [CrossRef]
19. Zaeri, N.; Ashouri, A.; Gunay, H.B.; Abuimara, T. Disaggregation of electricity and heating consumption in commercial buildings
with building automation system data. Energy Build. 2022, 258, 111791. [CrossRef]
Sensors 2022, 22, 5250 18 of 18
20. Murray, D.; Stankovic, L.; Stankovic, V.; Lulic, S.; Sladojevic, S. Transferability of neural network approaches for low-rate
energy disaggregation. In Proceedings of the ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP), Brighton, UK, 12–17 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 8330–8334.
21. Meier, A.; Cautley, D. Practical limits to the use of non-intrusive load monitoring in commercial buildings. Energy Build. 2021,
251, 111308. [CrossRef]
22. Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473.
23. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need.
Adv. Neural Inf. Process. Syst. 2017, 30.
24. Lin, N.; Zhou, B.; Yang, G.; Ma, S. Multi-head attention networks for nonintrusive load monitoring. In Proceedings of the 2020
IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Macau, China, 21–24 August
2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–5.
25. Yue, Z.; Witzig, C.R.; Jorde, D.; Jacobsen, H.A. Bert4nilm: A bidirectional transformer model for non-intrusive load monitoring.
In Proceedings of the 5th International Workshop on Non-Intrusive Load Monitoring, New York, NY, USA, 18 November 2020;
pp. 89–93.
26. Yue, Z.; Zeng, H.; Kou, Z.; Shang, L.; Wang, D. Efficient Localness Transformer for Smart Sensor-Based Energy Disaggregation.
arXiv 2022, arXiv:2203.16537.
27. Sykiotis, S.; Kaselimi, M.; Doulamis, A.; Doulamis, N. ELECTRIcity: An Efficient Transformer for Non-Intrusive Load Monitoring.
Sensors 2022, 22, 2926. [CrossRef] [PubMed]
28. Massidda, L.; Marrocu, M.; Manca, S. Non-intrusive load disaggregation by convolutional neural network and multilabel
classification. Appl. Sci. 2020, 10, 1454. [CrossRef]
29. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid scene parsing network. In Proceedings of the IEEE conference on computer
vision and pattern recognition, Honolulu, HI, USA, 26 July 2017; pp. 2881–2890.
30. Kaselimi, M.; Doulamis, N.; Voulodimos, A.; Protopapadakis, E.; Doulamis, A. Context aware energy disaggregation using
adaptive bidirectional LSTM models. IEEE Trans. Smart Grid 2020, 11, 3054–3067. [CrossRef]
31. da Silva Nolasco, L.; Lazzaretti, A.E.; Mulinari, B.M. DeepDFML-NILM: A New CNN-Based Architecture for Detection, Feature
Extraction and Multi-Label Classification in NILM Signals. IEEE Sens. J. 2021, 22, 501–509. [CrossRef]
32. Verma, S.; Singh, S.; Majumdar, A. Multi-label LSTM autoencoder for non-intrusive appliance load monitoring. Electr. Power Syst.
Res. 2021, 199, 107414. [CrossRef]
33. Yang, P.; Sun, X.; Li, W.; Ma, S.; Wu, W.; Wang, H. SGM: Sequence generation model for multi-label classification. arXiv 2018,
arXiv:1806.04822.
34. Yang, Y.Y.; Lin, Y.A.; Chu, H.M.; Lin, H.T. Deep learning with a rethinking structure for multi-label classification. In Proceedings
of the Asian Conference on Machine Learning. PMLR, Nagoya, Japan, 17–19 November 2019; pp. 125–140.
35. Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International
Conference on Computer Vision, Venice, Italy, 22–29 October 2017; Volume 9, pp. 2980–2988.
36. Zhou, X.; Li, S.; Liu, C.; Zhu, H.; Dong, N.; Xiao, T. Non-intrusive load monitoring using a CNN-LSTM-RF model considering
label correlation and class-imbalance. IEEE Access 2021, 9, 84306–84315. [CrossRef]
37. Kelly, J.; Knottenbelt, W. Neural nilm: Deep neural networks applied to energy disaggregation. In Proceedings of the 2nd ACM
international conference on embedded systems for energy-efficient built environments, New York, NY, USA, 4–5 November 2015;
pp. 55–64.