Article
Reliable Industry 4.0 Based on Machine Learning and IoT for
Analyzing, Monitoring, and Securing Smart Meters
Mahmoud Elsisi 1,2 , Karar Mahmoud 3,4 , Matti Lehtonen 3 and Mohamed M. F. Darwish 2,3, *
1 Industry 4.0 Implementation Center, Center for Cyber–Physical System Innovation, National Taiwan
University of Science and Technology, Taipei 10607, Taiwan; [email protected] or
[email protected]
2 Department of Electrical Engineering, Faculty of Engineering at Shoubra, Benha University,
Cairo 11629, Egypt
3 Department of Electrical Engineering and Automation, Aalto University, FI-00076 Espoo, Finland;
[email protected] (K.M.); [email protected] (M.L.)
4 Department of Electrical Engineering, Faculty of Engineering, Aswan University, Aswan 81542, Egypt
* Correspondence: [email protected] or [email protected]
Abstract: The modern control infrastructure that manages and monitors the communication between
the smart machines represents the most effective way to increase the efficiency of the industrial
environment, such as smart grids. The cyber-physical systems utilize the embedded software and
internet to connect and control the smart machines that are addressed by the internet of things (IoT).
These cyber-physical systems are the basis of the fourth industrial revolution which is indexed by
industry 4.0. In particular, industry 4.0 relies heavily on the IoT and smart sensors such as smart
energy meters. The reliability and security represent the main challenges that face the industry
4.0 implementation. This paper introduces a new infrastructure based on machine learning to
analyze and monitor the output data of the smart meters to investigate if this data is real data or
fake. The fake data are due to the hacking and the inefficient meters. The industrial environment
affects the efficiency of the meters by temperature, humidity, and noise signals. Furthermore, the
proposed infrastructure validates the amount of data loss via communication channels and the
internet connection. The decision tree is utilized as an effective machine learning algorithm to carry
out both regression and classification for the meters’ data. The data monitoring is carried out based on
the industrial digital twins’ platform. The proposed infrastructure results provide a reliable and
effective industrial decision that enhances the investments in industry 4.0.
Keywords: smart systems; industry 4.0; internet of things; machine learning
Citation: Elsisi, M.; Mahmoud, K.; Lehtonen, M.; Darwish, M.M.F. Reliable Industry 4.0 Based on
Machine Learning and IoT for Analyzing, Monitoring, and Securing Smart Meters. Sensors 2021, 21, 487.
https://doi.org/10.3390/s21020487
energy efficiency, where it can be integrated into energy production, management, and
planning.
Electricity theft and energy meter billing are considered among the most important issues
of the energy distribution system [14], which makes continued reliance on manual systems
undesirable. Manual reading is prone to human mistakes at the utility companies, which
leave customers with wrong readings. In addition, it is less accurate, less reliable, and not
tamper-proof. Further, it is essential to enhance the accuracy of the billing system. To avoid
human involvement in the billing process, an automatic smart system for reading and
transferring meter data should be applied [15–17]. Presently, the metering technique in use
can only monitor and measure electricity consumption; it does not permit remote access.
The other problem with the current situation is that it needs a lot of manpower, which
makes it time-consuming and error-prone.
Smart energy meters can answer these problems by introducing various solutions
and services to the consumer through transferring messages, including the power
consumption (in kW), and by automatically alerting the consumer to recharge when the
credit is low. Moreover, other beneficial features such as tamper-proofing and fault
detection have been developed. The numbers of smart meters connected to power systems
in the UK, the US, and China were 2.9 million, 70 million, and 96 million, respectively, in
2016 [18]. The deployment of smart meters will assist in better energy management
evaluation and energy conservation, and will avoid the needless disturbances of wrong
billing, since the billing system is important in evaluating the pattern of consumption and
resolving any discrepancies between consumption and billing [19,20]. Smart meters and
communication networks establish the advanced metering infrastructure (AMI), which can
save the demand profiles and ease bi-directional data flow [21]. The smart energy meter has
various merits, such as detecting faults in the distribution network by examining the supply
status at a power transformer [22,23]. Indeed, these smart meters are considered the future
vision that has massive advantages for not only consumers but also producers/suppliers.
The Fourth Industrial Revolution (i.e., Industry 4.0) and industrial IoT technologies
are rapidly driving digitalization powered by data and software solutions in several fields.
Among the numerous advantages they offer is the infrastructure for utilizing big data,
machine learning, and cloud computing software platforms. Industry 4.0 is defined by the
continuing automation of conventional manufacturing and industrial procedures through
diverse modern intelligent technologies. Specifically, large-scale machine-to-machine (M2M)
communication and the IoT are combined for improved automation processes, superior
communication and self-monitoring, and the assembly of intelligent machines that can solve
issues without the requirement of human interference [24,25]. In this work, we do not
address the total industry 4.0 system; we focus on the security and data loss aspects of the
industry 4.0 infrastructure and apply this topology to smart meters, from which it can be
extended to other smart sensors in the future.
As shown above, the recent control infrastructure of smart grids represents the most
effective way to increase the efficiency of the grid. Specifically, it manages and monitors
the communication between the smart machines (i.e., smart meters). For these reasons,
this paper introduces an efficient infrastructure based on machine learning to analyze
and monitor the data of smart meters. Further, it can investigate whether this data is real
or fake, where fake data can be caused by the hacking of these meters. The merit of the
proposed infrastructure is its consideration of the cyber-physical systems that are the basis
of the fourth industrial revolution (labeled industry 4.0), which involves the IoT and smart
sensors (smart energy meters). Also, the proposed infrastructure can validate the amount
of data loss through communication channels and the internet connection. A remarkable
advantage is mitigating the impacts of the industrial environment (temperature, humidity,
and noise signals) on the efficiency of the meters. To do so, the decision tree is utilized in
the proposed infrastructure as an effective machine learning algorithm to carry out both
regression and classification for the meters’ data. Accordingly,
the proposed infrastructure can provide a reliable and effective industrial decision that
improves the investments in industry 4.0.
2. Architecture Description of the Proposed Infrastructure
Today, a lot of industrial vendors provide computing services including hardware
and software tools for IoT purposes. These services provide data analytics platforms for
different industrial applications, as shown in Figure 1 [26]. The data acquisition process
is performed by the measurements from the local sensors for different purposes such as
storage, visualization, and analysis. The data acquisition is carried out by utilizing
interfaces such as Modbus, Open Platform Communications (OPC) and different network
protocols like Hypertext Transfer Protocol (HTTP) and Message Queue Telemetry Transport
(MQTT). Currently, many IoT platforms are provided for data collection within edge
computing devices and cloud gateway. The selection of the data acquisition process
influences real-time monitoring. In the case of a high-latency manner, offline data analysis
is preferred. Data visualization, selecting datasets, and data filtering are the popular
common procedures for data pre-processing. In addition, the delay estimation, reduction
of dimension, and resampling can be included in the data pre-processing. Then, the
selection of the proper model including the training, testing, and validation data is an
important stage for data analysis. The data analysis model depends on the environmental
characteristics of each factory. The proposed system is carried out based on edge computing
devices as realistic and parallel computing architecture. Each device collects the data
from the machines and sends this data to the database server to overcome the different
communication protocols of the sensors. The gateway classifies the collected data based on
the Artificial Intelligence (AI) techniques and sends the final results via MQTT protocol to
monitor it in the dashboard of the industrial digital twins’ platform.
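As an illustration of the last step, the sketch below shows how a gateway might package one classified reading before handing it to an MQTT client. It is only a minimal sketch under stated assumptions: the topic layout, the field names, the `MeterResult` type, and the `build_message` helper are hypothetical and do not come from the proposed system, and the actual publish call of an MQTT client library is deliberately left out.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class MeterResult:
    """One classified smart-meter sample (all field names are hypothetical)."""
    meter_id: str       # identifier of the smart meter
    current: float      # measured current 'I'
    validation: str     # "Real" or "Fake", as decided by the classifier
    data_loss: bool     # True if samples were lost before reaching the gateway
    received_at: float  # receiving time in seconds


def build_message(result: MeterResult, site: str = "factory-1"):
    """Return the (topic, payload) pair that a gateway could pass to an MQTT client."""
    # Hypothetical topic layout: <site>/meters/<meter id>/validation
    topic = f"{site}/meters/{result.meter_id}/validation"
    # JSON text payload; sorted keys keep the message layout stable
    payload = json.dumps(asdict(result), sort_keys=True)
    return topic, payload


topic, payload = build_message(MeterResult("meter-01", 2.8, "Real", False, 1609459200.0))
print(topic)
print(payload)
```

A dashboard subscribed to such a topic can decode the payload with any JSON parser, which keeps the gateway independent of the sensors' own communication protocols.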
Figure 1. The schematic diagram of the infrastructure of Internet of Things (IoT) processing.
The benefit of integrating IoT into industry 4.0 is that it enables real-time data
interchange among a large variety of smart meters for different applications, such as
factories, homes, and hospitals, which is essential for smart grids. Indeed, the progress of
IoT-enabled smart meters is empowered via a distinct network connection, e.g., Bluetooth
and Global System for Mobiles (GSM), for examining and managing smart meters remotely.
With the development in industry 4.0, the smart meter industry has used technologies like
IoT and big data that allow incremental datasets from users and smart devices.
Consequently, several frameworks and platforms are dedicated to specific industries and
appliances using diverse technologies beneath the industry 4.0 umbrella. In this study, the
advantage of utilizing the IoT framework under industry 4.0 for smart meters is that it
enables an enterprise resource planning system, which can promote the electric industry in
several features, including forecasting, real-time visibility, remote monitoring,
cyber-physical security, and alerts and notifications.
Decision Tree
The decision tree is an algorithm that generates appropriate rules in order to approximate
discrete functions. This algorithm analyzes and classifies the input data, and then it can
make a decision for new data. The learning algorithm discovers the rules among the
input data to build the decision tree [27]. The main target of the decision tree algorithm is
a tree with high accuracy and small scale. The classification in the decision tree algorithm
is carried out as sets of “if–then” rules or as a conditional probability distribution over the
class and feature space. The decision tree algorithm has three stages: selection of features,
generation of the decision tree, and pruning. The processing starts from the root node by
testing a certain feature; the samples are then assigned to the next node according to the
result of this test, with each child node corresponding to a range of values of the tested
feature. The testing and assignment continue until a leaf node is reached. In the final stage,
the samples are assigned to the class of the leaf node. The decision tree applies an index
named information entropy to measure the uncertainty of the tested set and utilizes the
information gain as a measure of the reduction in uncertainty (i.e., the gain in purity).
Then, it splits the node based on the feature that has the largest information gain. The
entropy utilizes the expected information to detect if the set needs to be divided into
multiple classes or not [28]. The information index of symbol xi is formulated as follows:
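$$I(x_i) = -\log_2 P(x_i) \qquad (1)$$
and the entropy, as the expected information over all symbols, is formulated as follows:
$$E(X) = -\sum_{i=1}^{m} P(x_i)\,\log_2 P(x_i) \qquad (2)$$
where m is the total number of variables that require classification. The greater entropy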
indicates the greater uncertainty of the variable. After the calculation of the entropy
probability of the estimated data, the empirical entropy that corresponds to this probability
is formulated as follows:
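$$E(Y) = -\sum_{i=1}^{c} \frac{|n_i|}{|M|}\,\log_2 \frac{|n_i|}{|M|} \qquad (3)$$
where c is the number of classes in the dataset Y (its largest integer label), $n_i$ is the
number of samples of Y that belong to class i, and $M = \sum_{i=1}^{c} n_i$.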
Then, the uncertainty of variable Y under the information of known variable X can be
represented by the conditioned entropy as follows [28]:
$$E(Y \mid X) = \sum_{i=1}^{m} p_i\, E(Y \mid X = x_i) \qquad (4)$$
where $p_i = P(X = x_i)$ is the probability that X takes the value $x_i$.
The entropy is utilized to formulate the information gain. The information gain is the
relative gain of the feature and it is formulated as follows:
$$G = E(Y) - E(Y \mid X) \qquad (5)$$
where E(Y ) is the empirical entropy of the training dataset Y, and E(Y | X ) is the conditional
entropy of the feature ‘X’ in the dataset ‘Y’. The size of information gain changes according
to the training dataset. When the problem is difficult, the empirical entropy and the
information gain increase. On the contrary, the information gain is small when the problem
is simple.
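The quantities defined above can be illustrated with a short Python sketch. It assumes the standard definitions: the empirical entropy of the labels, the conditional entropy as the probability-weighted sum of the entropies of each feature value's subset, and the information gain as their difference. The function names and the toy data are illustrative only and are not taken from the proposed system.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Empirical entropy E(Y) = -sum_i (n_i / M) * log2(n_i / M)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def conditional_entropy(feature, labels):
    """E(Y|X) = sum_i p_i * E(Y | X = x_i), with p_i the share of samples having X = x_i."""
    total = len(labels)
    result = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        result += (len(subset) / total) * entropy(subset)
    return result


def information_gain(feature, labels):
    """G = E(Y) - E(Y|X)."""
    return entropy(labels) - conditional_entropy(feature, labels)


# Toy data (illustrative): labels 1 = Real, 2 = Fake
labels = [1, 1, 1, 2, 2, 2]
separating = ["small", "small", "small", "large", "large", "large"]
uninformative = ["a", "b", "a", "b", "a", "b"]

print(entropy(labels))                          # entropy of a balanced two-class set
print(information_gain(separating, labels))     # feature that separates the classes
print(information_gain(uninformative, labels))  # feature that barely helps
```

A feature that separates the two classes perfectly recovers the whole entropy of the labels as gain, whereas a feature that is almost independent of the labels yields a gain close to zero, which is why the tree splits on the feature with the largest gain.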
Figure 2. Smart grid with distributed power suppliers.
This paper introduces a new reliable intelligent system to check the shared data
based on the decision tree to classify the real data and fake data. Furthermore, the
algorithm can check the amount of data loss due to the low speed and discontinuity of the
internet. This intelligent system enhances the decision of the power grid about energy
management and control.
The novelty of this work is to introduce a new infrastructure based on the decision
tree technique to analyze, monitor, and secure the output data of the smart meters, thereby
investigating if this data is real data or fake (meaning hacking). This feature allows
compensating the effects of the industrial environment on the efficiency of the meters (e.g.,
temperature, humidity, and noise signals). Most importantly, the proposed infrastructure
validates the amount of data loss via communication channels and the internet connection,
providing an option to measure the internet speed and status and minimize the system
cost. The decision tree is used here as an effective machine learning tool to perform both
regression and classification for the meters’ data. The main advantage of the decision tree is
that it is simple to build and intuitive to understand. However, its limitation is that it requires
suitable depth to perform with good accuracy. Note that data monitoring is conducted
based on the industrial digital twins’ framework. The proposed infrastructure can provide
a reliable and effective industrial decision to enhance the investments in industry 4.0.
Current ‘I’    Change in Current ‘Delta I’    Validation    Label of Validation (Real ‘1’, Fake ‘2’)
5.7            0                              Real          1
2.7            −3                             Real          1
2.8            0.1                            Real          1
2.7            −0.1                           Real          1
2.8            0.1                            Real          1
2.8            0                              Real          1
−6.3           −9.1                           Fake          2
5.6            11.9                           Fake          2
−6.1           −11.7                          Fake          2
9.8            15.9                           Fake          2
6              −3.8                           Fake          2
−1.6           −7.6                           Fake          2
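The ‘Delta I’ values listed above are consistent with the difference between each current reading and the one before it, with the first change set to zero. The following sketch reproduces the column under that assumption; the function name `delta_current` is illustrative, and the rounding to one decimal place simply mirrors the precision of the tabulated readings.

```python
def delta_current(readings):
    """Return the change in current between consecutive readings.

    Assumption: 'Delta I' is the difference between a reading and the
    previous one, and the first sample has a change of zero.
    """
    if not readings:
        return []
    deltas = [0.0]
    for previous, current in zip(readings, readings[1:]):
        # Round to one decimal place, the precision of the tabulated readings
        deltas.append(round(current - previous, 1))
    return deltas


real_currents = [5.7, 2.7, 2.8, 2.7, 2.8, 2.8]  # the six 'Real' readings
print(delta_current(real_currents))  # [0.0, -3.0, 0.1, -0.1, 0.1, 0.0]
```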
Figure 3 shows all samples of the input training dataset, which is represented by the
current ‘I’ and its rate of change ‘Delta I’. Figure 4a shows all samples of the output training
dataset, which is represented by the labeled validation (Real ‘1’, Fake ‘2’). Figure 4b is a
zoomed-in image of Figure 4a for a few samples of the output training dataset. All samples
of the input testing dataset are shown in Figure 5, while all samples of the output testing
dataset are shown in Figure 6a. Further, Figure 6b shows a zoomed-in image for a few
samples of the output testing dataset from Figure 6a. The process of the decision tree is
described in Figure 7 and it includes the following components:
• The “root node” represents the entire population.
• “Splitting” is the step of dividing a node into two or more sub-nodes.
• A “decision node” is a sub-node that is split into further sub-nodes.
• The final node in a decision tree is called a “leaf/terminal node”.
• “Pruning” is the process of removing sub-nodes from a decision node; it is the
opposite of splitting.
• A branch is a subsection of the entire tree and is named a “sub-tree”.
• In a sub-tree, a node that splits into two nodes is named the parent (“A” in Figure 7),
and the new nodes are named children (“B” and “C” in Figure 7).
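The components listed above can be made concrete with a small pure-Python sketch that grows a binary tree on the twelve tabulated (‘I’, ‘Delta I’) samples by repeatedly choosing the threshold split with the largest information gain. This is an illustration under stated assumptions, not the trained model of this work: the helper names (`best_split`, `build`, `classify`) are hypothetical, candidate thresholds are taken as midpoints between neighbouring feature values, and no pruning is applied.

```python
from collections import Counter
from math import log2

# (I, Delta I, label) samples copied from the table above; 1 = Real, 2 = Fake
SAMPLES = [
    (5.7, 0.0, 1), (2.7, -3.0, 1), (2.8, 0.1, 1),
    (2.7, -0.1, 1), (2.8, 0.1, 1), (2.8, 0.0, 1),
    (-6.3, -9.1, 2), (5.6, 11.9, 2), (-6.1, -11.7, 2),
    (9.8, 15.9, 2), (6.0, -3.8, 2), (-1.6, -7.6, 2),
]


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_split(rows):
    """Return (gain, feature index, threshold) of the split with the largest gain."""
    parent = entropy([r[-1] for r in rows])
    best = (0.0, None, None)
    for f in range(len(rows[0]) - 1):
        values = sorted({r[f] for r in rows})
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2  # midpoint between neighbouring values
            left = [r[-1] for r in rows if r[f] <= t]
            right = [r[-1] for r in rows if r[f] > t]
            cond = (len(left) * entropy(left) + len(right) * entropy(right)) / len(rows)
            if parent - cond > best[0]:
                best = (parent - cond, f, t)
    return best


def build(rows):
    """Grow the tree: a dict is a decision node, a bare label is a leaf node."""
    _, f, t = best_split(rows)
    if f is None:  # pure node or no useful split left: make a leaf
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    left = [r for r in rows if r[f] <= t]    # child "B" of the parent
    right = [r for r in rows if r[f] > t]    # child "C" of the parent
    return {"feature": f, "threshold": t, "left": build(left), "right": build(right)}


def classify(tree, sample):
    """Walk from the root node down to a leaf node."""
    while isinstance(tree, dict):
        branch = "left" if sample[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


tree = build(SAMPLES)  # the root node holds the entire population
print(classify(tree, (2.8, 0.0)))  # 1, i.e. 'Real'
```

On these few samples the sketch needs only splits on ‘Delta I’, which suggests that a shallow tree can already separate them; a real deployment would train on the full dataset and tune the depth.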
Figure 3. The input training data of the decision tree.
Figure 4. The output training dataset of the decision tree: (a) all output of training dataset, (b) zoomed in on a few samples
from the output training dataset.
Figure 5. The input testing data of the decision tree.
Figure 6. The output testing dataset of the decision tree: (a) all output of testing dataset, (b)
zoomed in on a few samples from the output testing dataset.
Figure 7. Identification of smart meter reading into real and fake data by decision tree algorithm.
Table 2 shows how the decision tree algorithm determines the uncertainty, named
entropy, according to the following formula [29]:
$$E = -p \log_2 p - q \log_2 q$$
where p represents the probability of real data happening, and q represents the probability
of fake data happening. Reference [30] compares the accuracy of the decision tree with
four other machine learning techniques: (1) artificial neural network, (2) K-nearest neighbor,
(3) logistic regression, and (4) naïve Bayes. Notably, the decision tree achieved superior
accuracy in many works compared with these four machine learning algorithms. For this
reason, the decision tree has been adopted in this work rather than the others.
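The uncertainty of the two-class (real/fake) case can be checked numerically. The sketch below assumes the standard binary entropy, −p log2 p − q log2 q with q = 1 − p, and evaluates it for an equal-chance case and for an 80%/20% case; the function name `binary_entropy` is illustrative.

```python
from math import log2


def binary_entropy(p):
    """Overall uncertainty -p*log2(p) - q*log2(q), with q = 1 - p."""
    q = 1.0 - p
    # A probability of zero contributes no uncertainty, so it is skipped
    return sum(-x * log2(x) for x in (p, q) if x > 0)


for p in (0.5, 0.8):
    print(f"p = {p}: uncertainty = {binary_entropy(p):.4f}")
# p = 0.5: uncertainty = 1.0000
# p = 0.8: uncertainty = 0.7219
```

The uncertainty is largest when real and fake data are equally likely and shrinks as one class dominates, which is the behavior the split criterion exploits.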
Figure 8 shows the output classification regions of the decision tree after training,
while Figure 9 shows the locations of the testing data in the classified regions of real and
fake data.
Figure 8. The classified regions of the real and fake data based on the decision tree.
Figure 9. The locations of the testing dataset on the regions of the real and fake data.
Table 2. Sample of decision tree selection based on the probability of events.
Scenario                              Real Data Uncertainty       Fake Data Uncertainty       Overall Uncertainty
Equal chances of win                  −0.5 log2 (0.5) = 0.5       −0.5 log2 (0.5) = 0.5       0.5 + 0.5 = 1
80% chances of a win for Real data    −0.8 log2 (0.8) = 0.2575    −0.2 log2 (0.2) = 0.4644    0.2575 + 0.4644 = 0.7219

After training and testing, the created model of the decision tree is encrypted with the
IoT system to classify the online reading of the smart meter and publish it in the dashboard.
Furthermore, the validation of the data loss due to the variations of the internet speed is
performed during the online operation. The following pseudo-code (Algorithm 1) describes
the full operation of data acquisition, validation, and visualization.
Algorithm 1 The pseudo-code of data acquisition, validation, and visualization
Read data from the smart meter
Input the data to the Decision tree model
Classify the data by the Decision tree
if the output of the Decision tree == 1
    Publish that the data is ‘Real’
else
    Publish that the data is ‘Fake’
end if
Record the receiving time of data
Calculate the change of time
if the change of time ≤ sample rate of the smart meter
    Publish ‘No loss data’
else
    Publish ‘There is loss data’
end if
Publish the reading of the smart meter
The final results about the data validation and data loss due to the internet problems
will be recorded on the database server and presented on the dashboard of the IoT platform.
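The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the deployed implementation: the classifier and the publishing channel are passed in as plain callables, the names `process_reading`, `stand_in_classifier`, and `publish` are hypothetical, the “sample rate” of the algorithm is interpreted as the expected time between two samples, and the loss check is skipped for the very first sample because no previous receiving time exists.

```python
from typing import Callable, Optional, Tuple

Reading = Tuple[float, float]  # (I, Delta I)


def process_reading(
    reading: Reading,
    received_at: float,                     # receiving time of this sample (s)
    previous_received_at: Optional[float],  # receiving time of the previous sample
    sample_period_s: float,                 # expected time between two samples
    classify: Callable[[Reading], int],     # returns 1 (Real) or 2 (Fake)
    publish: Callable[[str, object], None],
) -> None:
    # Classify the reading and publish the validation result
    label = classify(reading)
    publish("validation", "Real" if label == 1 else "Fake")

    # Compare the change of time with the expected sample period
    if previous_received_at is not None:
        elapsed = received_at - previous_received_at
        lost = elapsed > sample_period_s
        publish("loss", "There is loss data" if lost else "No loss data")

    # Publish the reading of the smart meter itself
    publish("reading", reading[0])


messages = []


def publish(topic, value):
    """Stand-in for the real publishing channel: collect messages in a list."""
    messages.append((topic, value))


def stand_in_classifier(reading):
    """Placeholder rule (not the trained tree): large jumps in current are 'Fake'."""
    return 1 if abs(reading[1]) <= 3.0 else 2


process_reading((2.8, 0.1), 10.0, 9.0, 1.0, stand_in_classifier, publish)
process_reading((9.8, 15.9), 13.5, 10.0, 1.0, stand_in_classifier, publish)
for message in messages:
    print(message)
```

Passing the classifier and the publisher as arguments keeps the loop independent of a specific model or protocol, so the same logic can be exercised offline before connecting it to the dashboard.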
5.1. Scenario 1: Normal Case
In this scenario, the proposed system is tested when the data of the meter is real and
the internet network does not overload. Figure 10 shows the output of the IoT system due
to this test, that is presented in the dashboard of the IoT platform. As is clear in this figure,
the data is real and there is no loss. Furthermore, the operation condition is green, as is
clear in the traffic light, which means that the system is stable, and so no event and/or
alarms are noticed. This proves that the model of the decision tree works well without an
error.
Figure 10. The dashboard of the IoT platform results in the case of scenario 1.
5.2. Scenario 2: Testing Fake Data and No Loss
In this scenario, the proposed system is tested when the data of the meter is fake and
the internet network does not overload. As is clear in Figure 11, the data is fake and there
is no loss. This proves that the model of the decision tree can catch the fake data
to help the user to secure and check the smart meter. Furthermore, the fake data and the
corresponding time are recorded in the database for any future action and forecasting.
Besides, the traffic light changed to a red color to alert the user to the abnormal case of
fake data. Moreover, the dashboard shows that there is no loss in the entry data, which
proves that the internet network is stable and works effectively without overload.
Figure 11. The dashboard of the IoT platform results in a case of fake data and no loss testing.
5.3. Scenario 3: Testing Internet Capability
Internet speed represents a big challenge for IoT systems. So, the smart system
must check the data loss and the required internet speed for the network. In this scenario,
the proposed system is tested to check the data loss and publish the results on the
dashboard of the IoT platform. Figure 12 shows that there is data loss, as shown in the
dashboard of the IoT platform. This indicates that the network speed is not enough for
the smart system. In spite of the fact that the data is real and secured, as is clear in
the dashboard, there are not enough criteria to make a good decision due to the presence
of losses in the data. This data loss can lead to a high temperature, which can damage
the electrical machine.
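As a rough illustration of the data-loss check in this scenario, the sketch below assumes that every published reading carries an incrementing sequence number, which is not stated in this work. The function names `loss_report` and `required_bps`, the payload size, and the message rate are all hypothetical.

```python
def loss_report(received_seq, first, last):
    """Compare received sequence numbers against the expected range first..last."""
    expected = last - first + 1
    if expected <= 0:
        raise ValueError("last must not be smaller than first")
    # Duplicates and out-of-range numbers are ignored when counting arrivals.
    received = len({s for s in received_seq if first <= s <= last})
    lost = expected - received
    return {
        "expected": expected,
        "received": received,
        "lost": lost,
        "loss_ratio": lost / expected,
    }


def required_bps(payload_bytes, messages_per_second):
    """Bit rate needed to carry the readings, ignoring protocol overhead."""
    return payload_bytes * 8 * messages_per_second


# Hypothetical example: messages 3, 6 and 7 never arrived, message 5 arrived twice.
report = loss_report([1, 2, 4, 5, 5, 8], first=1, last=8)
print(report)
print(required_bps(payload_bytes=200, messages_per_second=10))
```

A nonzero `loss_ratio` corresponds to the data-loss indication on the dashboard, and comparing `required_bps` with the measured link speed gives a first estimate of whether the network is fast enough.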
Figure 12. The dashboard of the IoT platform results in the case of testing internet capability.
This case is carried out to check both the data type and internet capability together.
Figure 13 shows that the data is fake and there is data loss, as shown in the dashboard of
the IoT platform. This is the most abnormal case, and it indicates that the data is fake and
the network speed is not enough for the smart system. Furthermore, when the operation
condition is red, as shown in the traffic light, the system is unstable, resulting in
abnormal events and/or alarms.
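The combined check can be sketched as a small decision rule that merges the two results into one operation condition. This is only an illustration under stated assumptions: the names `Light` and `assess` and the `max_loss` tolerance are hypothetical, and the yellow color for the case of real data with data loss is an assumption, since the text specifies only the red color for fake data.

```python
from enum import Enum


class Light(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def assess(is_fake, loss_ratio, max_loss=0.0):
    """Combine the data-type check and the data-loss check into one status."""
    alarms = []
    if is_fake:
        alarms.append("fake data")
    if loss_ratio > max_loss:
        alarms.append("data loss")

    if is_fake:
        light = Light.RED  # red for fake data, as described in the text
    elif alarms:
        light = Light.YELLOW  # assumption: color for real data with data loss
    else:
        light = Light.GREEN  # assumption: color for the normal case
    return light, alarms


# The four combinations of real/fake data and loss/no loss.
for fake, ratio in [(False, 0.0), (True, 0.0), (False, 0.2), (True, 0.2)]:
    print(fake, ratio, assess(fake, ratio))
```

Returning the list of alarms alongside the color keeps the two causes distinguishable, so the user can tell whether the meter, the network, or both need attention.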
Figure 13. The dashboard of the IoT platform results in a case of testing the fake data and internet capability.
6. Conclusions
This work introduced a newly developed intelligent technique to check the reliability
of IoT smart systems. The data validation of the smart meter is accomplished by a
machine learning technique named decision tree. The decision tree technique performed
the regression and classification of the smart meter readings into real and fake types.
Furthermore, the developed algorithm can detect data loss due to unstable internet signals.
The online outputs of the system, such as data loss, real data, and fake data, are presented
in the dashboard of the IoT platform. The efficacy of the proposed infrastructure has been
evaluated by three scenarios. Scenario 1 proves that the data is real and there is no loss,
while the decision tree model works well without an error. Scenario 2 shows that the
decision tree model can catch the fake data and help the user to secure and check the
smart meter. Accordingly, the traffic light changes to red to alert the user to the
abnormal case of fake data. In the final scenario, the proposed infrastructure has been
tested to check the data loss and publish the results on the dashboard of the IoT platform,
where it is concluded that the network speed is not enough for the smart system. Generally,
the proposed method enhances the reliability of the smart IoT systems, which increases
the investments in industry 4.0. Besides, it can be applied to different kinds of sensors
and machines in future work.
Author Contributions: All authors have contributed to the preparation of this manuscript. M.E.
designed the idea strategy and studied the data. M.M.F.D. and K.M. wrote the manuscript and
designed some figures related to smart systems and IoT. Finally, M.L. and M.E. performed reviewing,
editing, and supporting different improvements for the manuscript. All authors have read and
agreed to the published version of the manuscript.
Funding: This work was supported in part by the Department of Electrical Engineering and
Automation, Aalto University, Espoo, Finland, and in part by the Center for Cyber-Physical System
Innovation from the Featured Areas Research Center Program in the Agenda of the Higher Education
Sprout Project, Taiwan.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the
corresponding author.
Acknowledgments: The authors acknowledge the CONTACT Elements for IoT platform for
supporting this work that was applied in industry 4.0.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Lin, J.C.-W.; Yeh, K.-H. Security and Privacy Techniques in IoT Environment. Sensors 2020, 21, 1. [CrossRef] [PubMed]
2. Lloret, J.; Tomas, J.; Canovas, A.; Parra, L. An Integrated IoT Architecture for Smart Metering. IEEE Commun. Mag. 2016, 54,
50–57. [CrossRef]
3. Alablani, I.; Alenazi, M. EDTD-SC: An IoT Sensor Deployment Strategy for Smart Cities. Sensors 2020, 20, 7191. [CrossRef]
[PubMed]
4. Bedi, G.; Venayagamoorthy, G.K.; Singh, R.; Brooks, R.R.; Wang, K.C. Review of Internet of Things (IoT) in Electric Power and
Energy Systems. IEEE Internet Things J. 2018, 5, 847–870. [CrossRef]
5. Zhao, L.; Brandao Machado Matsuo, I.; Zhou, Y.; Lee, W.J. Design of an Industrial IoT-Based Monitoring System for Power
Substations. IEEE Trans. Ind. Appl. 2019, 55, 5666–5674. [CrossRef]
6. Morello, R.; De Capua, C.; Fulco, G.; Mukhopadhyay, S.C. A smart power meter to monitor energy flow in smart grids: The role
of advanced sensing and iot in the electric grid of the future. IEEE Sens. J. 2017, 17, 7828–7837. [CrossRef]
7. Trakadas, P.; Simoens, P.; Gkonis, P.; Sarakis, L.; Angelopoulos, A.; Ramallo-González, A.P.; Skarmeta, A.; Trochoutsos, C.; Calvo,
D.; Pariente, T.; et al. An Artificial Intelligence-Based Collaboration Approach in Industrial IoT Manufacturing: Key Concepts,
Architectural Extensions and Potential Applications. Sensors 2020, 20, 5480. [CrossRef]
8. Petrillo, A.; Picariello, A.; Santini, S.; Scarciello, B.; Sperlí, G. Model-based vehicular prognostics framework using Big Data
architecture. Comput. Ind. 2020, 115, 103177. [CrossRef]
9. De santo, A.; Galli, A.; Gravina, M.; Moscato, V.; Sperli, G. Deep Learning for HDD health assessment: An application based on
LSTM. IEEE Trans. Comput. 2020, 1, 1. [CrossRef]
10. Chang, C.-Y.; Kuo, C.-H.; Chen, J.-C.; Wang, T.-C. Design and Implementation of an IoT Access Point for Smart Home. Appl. Sci.
2015, 5, 1882–1903. [CrossRef]
11. Abate, F.; Carratù, M.; Liguori, C.; Paciello, V. A low cost smart power meter for IoT. Meas. J. Int. Meas. Confed. 2019, 136, 59–66.
[CrossRef]
12. García-Magariño, I.; Nasralla, M.M.; Nazir, S. Real-Time Analysis of Online Sources for Supporting Business Intelligence
Illustrated with Bitcoin Investments and IoT Smart-Meter Sensors in Smart Cities. Electronics 2020, 9, 1101. [CrossRef]
13. Chen, Y.; Martínez, J.-F.; Castillejo, P.; López, L. An Anonymous Authentication and Key Establish Scheme for Smart Grid: FAuth.
Energies 2017, 10, 1354. [CrossRef]
14. Nabil, M.; Ismail, M.; Mahmoud, M.M.E.A.; Alasmary, W.; Serpedin, E. PPETD: Privacy-Preserving Electricity Theft Detection
Scheme with Load Monitoring and Billing for AMI Networks. IEEE Access 2019, 7, 96334–96348. [CrossRef]
15. Cunha, V.C.; Freitas, W.; Trindade, F.C.L.; Santoso, S. Automated Determination of Topology and Line Parameters in Low Voltage
Systems Using Smart Meters Measurements. IEEE Trans. Smart Grid 2020, 11, 5028–5038. [CrossRef]
16. Ferreira, T.S.D.; Trindade, F.C.L.; Vieira, J.C.M. Load Flow-Based Method for Nontechnical Electrical Loss Detection and Location
in Distribution Systems Using Smart Meters. IEEE Trans. Power Syst. 2020, 35, 3671–3681. [CrossRef]
17. Bu, F.; Dehghanpour, K.; Yuan, Y.; Wang, Z.; Zhang, Y. A Data-Driven Game-Theoretic Approach for Behind-the-Meter PV
Generation Disaggregation. IEEE Trans. Power Syst. 2020, 35, 3133–3144. [CrossRef]
18. Wang, Y.; Chen, Q.; Hong, T.; Kang, C. Review of Smart Meter Data Analytics: Applications, Methodologies, and Challenges.
IEEE Trans. Smart Grid 2019, 10, 3125–3148. [CrossRef]
19. Rahman, M.A.; Manshaei, M.H.; Al-Shaer, E.; Shehab, M. Secure and private data aggregation for energy consumption scheduling
in smart grids. IEEE Trans. Dependable Secur. Comput. 2017, 14, 221–234. [CrossRef]
20. Ibrahem, M.I.; Nabil, M.; Fouda, M.M.; Mahmoud, M.; Alasmary, W.; Alsolami, F. Efficient Privacy-Preserving Electricity Theft
Detection with Dynamic Billing and Load Monitoring for AMI Networks. arXiv 2020, arXiv:2005.13793.
21. Kumar, P.; Lin, Y.; Bai, G.; Paverd, A.; Dong, J.S.; Martin, A. Smart Grid Metering Networks: A Survey on Security, Privacy and
Open Research Issues. IEEE Commun. Surv. Tutor. 2019, 21, 2886–2927. [CrossRef]
22. Sun, Q.; Li, H.; Ma, Z.; Wang, C.; Campillo, J.; Zhang, Q.; Wallin, F.; Guo, J. A Comprehensive Review of Smart Energy Meters in
Intelligent Energy Networks. IEEE Internet Things J. 2016, 3, 464–479. [CrossRef]
23. Spanò, E.; Niccolini, L.; Di Pascoli, S.; Iannaccone, G. Last-meter smart grid embedded in an internet-of-things platform. IEEE
Trans. Smart Grid 2015, 6, 468–476. [CrossRef]
24. Kabugo, J.C.; Jämsä-Jounela, S.L.; Schiemann, R.; Binder, C. Industry 4.0 based process data analytics platform: A waste-to-energy
plant case study. Int. J. Electr. Power Energy Syst. 2020, 115, 105508. [CrossRef]
25. Aheleroff, S.; Xu, X.; Lu, Y.; Aristizabal, M.; Pablo Velásquez, J.; Joa, B.; Valencia, Y. IoT-enabled smart appliances under industry
4.0: A case study. Adv. Eng. Inform. 2020, 43, 101043. [CrossRef]
26. IoT Platform for Digital Business Models|CONTACT Software. Available online: https://www.contact-software.com/
en/products/iot-platform-for-digital-business-models/?fbclid=IwAR0oYDd4qHpCd0BEZaGrLHEAQGYoQ2BhBmDzbF35-
cyM6QrNHAkziWDC8yo (accessed on 22 December 2020).
27. Myles, A.J.; Feudale, R.N.; Liu, Y.; Woody, N.A.; Brown, S.D. An introduction to decision tree modeling. J. Chemom. 2004, 18,
275–285. [CrossRef]
28. Liu, S.; Yang, Z.; Li, Y.; Wang, S. Decision Tree-Based Sensitive Information Identification and Encrypted Transmission System.
Entropy 2020, 22, 192. [CrossRef]
Sensors 2021, 21, 487 16 of 16
29. Ayyadevara, V.K. Pro Machine Learning Algorithms; Apress: Berkeley, CA, USA, 2018.
30. Uddin, S.; Khan, A.; Hossain, M.E.; Moni, M.A. Comparing different supervised machine learning algorithms for disease
prediction. BMC Med. Inform. Decis. Mak. 2019, 19, 281. [CrossRef]