Federated Learning-Based Anomaly Detection For IoT Security Attacks
This article has been accepted for publication in a future issue of this journal, but has not been
fully edited. Content may change prior to final publication. Citation information:
DOI 10.1109/JIOT.2021.3077803, IEEE Internet of Things Journal
IEEE INTERNET OF THINGS JOURNAL, 2021
Abstract—The Internet of Things (IoT) is made up of billions of physical devices connected to the
Internet via networks that perform tasks independently with less human intervention. Such
brilliant automation of mundane tasks requires a considerable amount of user data in digital
format, which in turn makes IoT networks an open source of Personally Identifiable Information
data for malicious attackers to steal, manipulate and perform nefarious activities. Huge interest
has developed over the past years in applying machine learning (ML)-assisted approaches in the IoT
security space. However, the assumption in many current works is that big training data is widely
available and transferable to the main server because data is born at the edge and is generated
continuously by IoT devices. This is to say that classic ML works on the legacy set of entire data
located on a central server, which makes it the least preferred option for domains with privacy
concerns on user data. To address this issue, we propose a federated learning (FL)-based anomaly
detection approach to proactively recognize intrusion in IoT networks using decentralized
on-device data. Our approach uses federated training rounds on Gated Recurrent Unit (GRU) models
and keeps the data intact on local IoT devices by sharing only the learned weights with the
central server of the FL. Also, the approach's ensembler part aggregates the updates from multiple
sources to optimize the global ML model's accuracy. Our experimental results demonstrate that our
approach outperforms the classic/centralized machine learning (non-FL) versions in securing the
privacy of user data and provides an optimal accuracy rate in attack detection.

Index Terms—Internet of Things, Security, Federated Learning, Recurrent neural networks, Gated
Recurrent Units.

I. INTRODUCTION

IoT devices are providing improved smart solutions and enhancing services in various domains such
as healthcare [1], intelligent digital assistants [2], smart home [2], [3], and in the industrial
domain, also known as the Industrial Internet of Things (IIoT), to name a few.

IoT devices are proven to excel in delivering AI solutions but, on the downside, they rely on
sensitive user data to perform tasks. The requirement of IoT devices to function with optimal
energy consumption makes their micro-architecture style less suitable for deploying
computationally heavy security firewalls, making IoT devices more vulnerable to various attacks.
As discussed in [4], Mirai and other variations of malware bots can exploit the vulnerabilities in
IoT devices, take control of their AI functionality and, in turn, access other non-IoT devices
connected to them. This emphasizes the fact that unguarded IoT devices could turn into an open
threat to all other network devices interconnected with them. Network protocols in IoT networks
are a critical interface that connects physical devices with the digital world. Research work in
[5], [6] explores the vulnerabilities and feature-based security risks in IoT, and the authors in
[7] discuss various attacks and their impact on IIoT networks. This emphasizes the fact that Mirai
is just one of the attacks and that there has been exponential growth in malicious activities,
which are successful in exploiting the vulnerabilities of IoT networks [8]. The idea of
micro-devices delivering intelligent digital assistance has been tremendously appreciated and
proven to reduce manual work, which in turn created a high demand for the production of enormous
variants of IoT devices. The ea-
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: UNIVERSITY OF WESTERN ONTARIO. Downloaded on May 26,2021 at 10:17:15 UTC from IEEE Xplore. Restrictions apply.
3) Training huge volumes of data on a single server can be computationally expensive.

One of the promising and well adaptable approaches which can rectify these disadvantages in the
ML-based approach is Federated Learning (FL) [15], [16]. In FL, decentralized ML model training
keeps the data intact on the edge device and only the learned ML model weights are transferred to
a central server. This strategy of FL is proven to secure the privacy of user data [17], making it
the preferred approach in comparison to non-FL solutions.

In this paper, we propose a decentralized federated learning approach with an ensembler to enable
anomaly detection on IoT networks. The approach enables on-device training and helps to train the
anomaly detection ML model on IoT networks without the need to transfer network data to a
centralized server. To ensure optimal results in predicting intrusion in IoT networks, we use Long
Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) neural network models, which are improved
versions of basic Recurrent Neural Networks (RNNs), to efficiently train the ML model on the
Modbus network dataset [18]. Our experimental results demonstrate a minimized error rate in
predicting attacks and a reduced number of false alarms in comparison to the classic (centralized)
ML approach. Our contributions in this work can be summarized as follows:
• Enabling on-device ML training with federated learning to secure data privacy at end-devices.
• Achieving higher accuracy rates and minimized false alarms in attack detection compared to a
  centralized ML (non-FL) approach.
• Demonstrating the benefits of integrating federated learning with an ensembler to achieve
  optimal results.

The remainder of the paper is structured in the following manner. Section II gives the related
work. Section III presents the proposed approach and illustrates the underlying architecture with
implementation details. Section IV presents the dataset, metrics, and evaluation results and
summarizes our findings. Finally, Section V concludes the paper.

II. RELATED WORK

IoT is proven to be successful in delivering ML solutions in its micro-architecture design [11],
[19]. The growing popularity and usage of IoT have created many interesting research areas. One
such research path is the detection and classification of attacks in IoT networks, and there are
several research works proposed to secure IoT networks from malicious attacks [20]. This section
covers the recent studies that are proposed to improve the security of IoT networks using ML
techniques.

The authors in [21] propose an autonomous self-learning system named DÏoT, which is an FL-based
approach for detecting IoT devices infected with Mirai malware in IoT smart home networks. The FL
part is implemented using Python's flask and flask socketio. The TensorFlow¹ deep learning
framework is used to implement FL's global model. The architecture of DÏoT consists of a security
gateway and IoT security services. A security gateway is configured as an access point between IoT
devices and the Internet. The anomaly detection component is integrated with a security gateway,
which monitors the network for abnormal activity. IoT security services maintain a repository of
device-specific anomaly detection models and aggregate the model weight updates from IoT devices.
When new devices are included in the IoT network, the device-specific anomaly detection retrieves
existing anomaly detection models from the repository and enables monitoring of network traffic.
Due to the self-learning algorithm, labelling the attack is not mandatory, as DÏoT learns the
pattern in the attack category. As per the evaluation results, false alarms are minimized in
detecting attacks. However, the approach was limited to a single (Mirai) attack type and lacks
implementation of an FL-specific deep learning framework.

The authors in [22] propose an FL-based intrusion detection framework called Deepfed for
identifying threats in cyber-physical systems (CPSs). A combination of convolutional neural
networks (CNNs) and Gated Recurrent Units (GRUs) is leveraged for threat identification, and a
security protocol based on the Paillier cryptosystem is used to ensure the security of local and
global models during the FL training process. Research work in [23] proposes FedAGRU, an FL-based
Attention Gated Recurrent Unit. FedAGRU is an improved federated averaging algorithm that is
designed to identify poisoning attacks and eliminate minimally contributing updates for a highly
efficient global model with optimal communication costs. Evaluation results on three datasets
[24]–[26] show promising results supporting the proposed approach. Similarly, the authors in [27]
propose an FL-based approach for wireless intrusion detection (WID) with the AWID dataset². In
[28], the authors leverage a mimic learning strategy to implement federated learning and combine
it with ML-based IDS. Another FL-based approach, presented in [29], is an FL-based intrusion
detection system (IDS) using the TensorFlow Federated (TFF³) framework.

Among the proposed approaches, the ML-based intrusion detection presented in [13] is similar to
our work; it proposes a centralized version of anomaly detection using a TensorFlow-based deep
learning framework. Six LSTMs [30] of different layers are used as a threat detection algorithm.
Evaluation results confirm the efficiency of the model, but it is limited to a centralized version
of ML, and our approach enhances it much further by implementing FL and computationally
inexpensive GRUs with the PySyft [31] deep learning framework. Another ML-based anomaly detection
is proposed in [14], where distinct attributes in the dataset are used to identify anomalies in
smart home IoT devices. The authors proposed a basic artificial neural network (ANN)
classification algorithm and the logistic regression algorithm for identifying attacks. The focus
of the paper is limited to identifying the attack patterns in the data, and it uses a basic
classification algorithm that may not be adaptable to an evolving range of IoT devices. The
authors in [32] use attention-based convolutional neural network Long Short-Term Memory (LSTMs)
for identifying anomalies in time series data of industrial IoT (IIoT) devices. The PySyft and
PyTorch⁴ deep learning frameworks are used to implement FL, and a gradient compression technique
is proposed to improve communication efficiency.

To summarize, existing research work lags in implementing a decentralized, communication-efficient
framework for anomaly detection in IoT networks. In our approach, we considered those limitations
and propose an FL-based approach for IoT security attacks.

Acronym   Description
GRUs      Gated Recurrent Units
LSTMs     Long Short Term Memory Networks
RNNs      Recurrent Neural Networks
FL        Federated Learning
tanh      Hyperbolic tangent
rfc       Random forest classifier
PCAP      Packet capture

A. LSTMs and GRUs

In this part, we give an overview of the deep learning ML models we have used in our proposed
approach. Long Short Term Memory Networks (LSTMs) [30], [33], [34] and Gated Recurrent Units
(GRUs) [35] are variations of RNNs which are proposed to address the short-term memory/vanishing
gradients problem in the basic variant of recurrent neural networks (RNNs). The architecture of
LSTMs and GRUs consists of gates that monitor the information flow and control the learning
process, which enables the network to learn from long-term dependencies. Gates act as switches in
the network which help in retaining long and short-term information. Anomaly detection [36],
speech recognition, speech synthesis, and text generation [37]–[40] are a few real-time
implementations of LSTMs and GRUs. During the evaluation of our proposed approach, we experimented
with both GRUs and LSTMs; in the initial FL training rounds, the GRU models shown in Table II
outperformed LSTMs, achieving a higher accuracy rate while being computationally inexpensive [41].

Below are the details of the components of GRUs and LSTMs illustrated in Fig. 1, adopted from a
blog⁵.
• Sigmoid function: σ provides a way to decide whether any information needs to be retained or
  discarded. σ generates values ranging between 0 and 1, where a value near 0 will enable
  information in the network to be forgotten and 1 indicates information needs to be kept for
  future updates.
• Cell State: Represents the information retained throughout the memory block of the LSTM. $C_t$
  represents the current cell state or memory cell and $C_{t-1}$ represents the previous cell
  state.
• Gates of LSTMs: Based on the human ability to memorize rhyming patterns, LSTMs are proposed as a
  memory block where interconnected memory cells collect and retain information for long-term
  reference. Gates are used to control information retention, retrieval, and deletion through
  memory cells. LSTMs consist of an Input gate, a Forget gate, and an Output gate. Below are
  details of each gate.
  – Forget Gate: The information which does not contribute towards the learning of the LSTM
    network is discarded for the given cell, as shown in Equation 1.

    $$ f_t = \sigma(W_f [h_{t-1}, x_t] + b_f) \tag{1} $$

    where $f_t$ is the current value of the forget gate, which is the result of the sigmoid
    function, $x_t$ is the current input for the memory cell, $W_f$ is the weight matrix from
    forget gate to input, $b_f$ is the forget gate bias, and $h_{t-1}$ is the

¹ https://www.tensorflow.org
² https://icsdweb.aegean.gr/awid/
³ https://www.tensorflow.org/federated
⁴ https://pytorch.org
⁵ https://colah.github.io/posts/2015-08-Understanding-LSTMs/
  – Input Gate: Helps to determine whether the current information is useful enough to retain in
    the cell state for future reference. Equations 2, 3, and 4 represent the calculation of the
    input gate, where the current cell state value is determined by the sum of two products: the
    forget gate $f_t$ times the previous cell state $c_{t-1}$, and the input gate $i_t$ times the
    current state candidate value $\hat{c}_t$.

    $$ i_t = \sigma(W_i [h_{t-1}, x_t] + b_i) \tag{2} $$
    $$ \hat{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c) \tag{3} $$
    $$ c_t = f_t * c_{t-1} + i_t * \hat{c}_t \tag{4} $$

    where $i_t$ results from the sigmoid layer representing the input gate, $\hat{c}_t$ is the
    cell activation created by the tanh layer, $c_{t-1}$ is the cell state of the previous
    timestamp memory cell, and $c_t$ is the current cell value, which is the information
    predicted as important to save for future reference.

  – Output Gate: This gate decides the final output of the network. In Equations 5 and 6, $h_t$
    is calculated by running $c_t$ (derived from Equation 4) through the tanh activation function.

where it is the result of the sigmoid layer, $\hat{h}_t$ is the vector which is created by the
tanh layer, and $h_{t-1}$ is the previous cell state value.

We use seven different window sizes, and the input size for each LSTM/GRU varies with the selected
window size. Selection of the window size is crucial, as the amount of data differs for each
window size, which contributes towards the performance of the ML model. An increased window size
impacts the training time, as the information retained in each memory cell of the neural network
increases. As with the hyperparameters of the ML model, there is no ground rule which confirms the
relation between window size and performance of the model. However, there are a few research works
[34] which suggest that the impact of the window size and the layers of the GRU/LSTM is dependent
on the type and size of the dataset.

TABLE II: GRUs

GRUs Model Name   Number of Layers   Dropout   Hidden layer size
GRU-1             2                  0.01      256
GRU-2             1                  0.00      100
GRU-3             1                  0.00      200
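To make the gate equations of this subsection concrete, the following is a minimal single-step sketch in plain Python. Equations 1 to 4 are transcribed from the text; the output gate lines use the standard LSTM formulation ($o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$, $h_t = o_t * \tanh(c_t)$), which is an assumption here because Equations 5 and 6 are not reproduced in this text. The names `lstm_step`, `affine`, and the parameter dictionary keys are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function; output lies in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def affine(W: Matrix, v: Vector, b: Vector) -> Vector:
    """Compute W v + b for a row-major matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector,
              p: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One LSTM time step; returns (h_t, c_t)."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]      # Eq. 1
    i_t = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]      # Eq. 2
    c_hat = [math.tanh(a) for a in affine(p["W_c"], z, p["b_c"])]  # Eq. 3
    c_t = [f * c + i * g
           for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]         # Eq. 4
    # Assumed standard output gate (not taken from this text):
    o_t = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy example: hidden size 1, input size 1, all weights and biases zero,
# so every gate evaluates to sigmoid(0) = 0.5 and the candidate to tanh(0) = 0.
zero = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_c", "W_o")}
zero.update({name: [0.0] for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step(x_t=[1.0], h_prev=[0.0], c_prev=[2.0], p=zero)
print(c_t)  # [1.0]: half of the previous cell state is kept, nothing is added
```

With a forget gate value of 0.5 the cell keeps half of its previous state, which mirrors the description of σ values near 0 discarding information and values near 1 retaining it.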
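As a rough illustration of how a window size turns a time-ordered sequence of flow records into fixed-length inputs for an LSTM/GRU, the sketch below builds overlapping windows in plain Python. The function name `make_windows`, the stride of 1, and the choice to label each window with the label of its last record are assumptions made for illustration; the paper does not specify these details.

```python
from typing import List, Sequence, Tuple


def make_windows(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    window_size: int,
    stride: int = 1,
) -> Tuple[List[List[List[float]]], List[int]]:
    """Slice time-ordered feature vectors into overlapping fixed-length windows."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    windows: List[List[List[float]]] = []
    window_labels: List[int] = []
    for start in range(0, len(features) - window_size + 1, stride):
        end = start + window_size
        windows.append([list(row) for row in features[start:end]])
        window_labels.append(labels[end - 1])  # assumption: label of the last record
    return windows, window_labels


# Six toy flow records with two features each (values are made up).
flows = [[float(i), float(i) * 0.5] for i in range(6)]
flow_labels = [0, 0, 0, 1, 1, 0]

for size in (2, 4):
    xs, ys = make_windows(flows, flow_labels, window_size=size)
    print(size, len(xs), ys)
```

A larger window yields fewer but longer training samples from the same capture, which is one way to read the remark that the amount of data differs for each window size.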
        return m_wi
    EndFunction
    Function fl_average(m_wi):
        foreach w_i in W do
            M_wi = fl_average(m_wi)
        return M_wi
    EndFunction
    Function Ensembler(M_wi):
        networkdata    /* New flows in-network data */
        foreach w_i in W_i do
            lstmpredictions = m_wi(newnetworkdata)
        anomalydetectionflag = Ensembler(lstmpredictions)
    EndFunction
    m_wi = FL_Training(maxtrainingrounds)
    M_wi = fl_average(m_wi)
    while fl_i in FL do
        foreach w_i in W do
            m_wi = M_wi    /* replace local ML */
    AttackPrediction = Ensembler(M_wi)

Fig. 3: Illustration of Ensembler in proposed approach

IV. EVALUATION RESULTS

To evaluate the performance of our proposed approach, we compare it with a non-FL (classic ML)
version (using the same deep learning algorithms) based on the dataset and evaluation metrics
presented in this section. Our environment is set up on a Lambda GPU (Graphics Processing Unit)
server running the Ubuntu 18.0.0 LTS operating system. The deep learning framework we have used is
PySyft [31] for FL features, with GRUs as our ML neural network. For the non-FL/classic ML
approach, we have used the PyTorch deep learning framework. While FL implements Algorithm 1, the
non-FL-based GRU setup trains in a classic environment on centralized training data.

A. Dataset

For evaluation of our approach, we have used a Modbus-based network dataset [18]. Modbus is a
decades-old protocol, which provides an efficient way to communicate with physical devices that
lack an inbuilt communication protocol. Modbus is used to establish request-response-based
communication between devices and is a well-known protocol for many legacy industrial
applications. The automation strategies in the industrial domain aim to resolve the
interoperability issue using a combination of IoT devices and the Modbus protocol. Fig. 6
illustrates the message format in the Modbus RTU and Modbus TCP/IP protocols. We have used
CICFlowmeter⁹ [42] to extract the ML-readable CSV from captured network traffic data.

Several research works use the Modbus protocol for IoT, especially for IIoT. The authors in [45]
suggest that the combination of Modbus TCP with the IoT-specific message queuing telemetry
transport (MQTT) brings interoperability to industrial devices such as Internet-based monitoring
and industrial control systems. An IoT gateway is an interface connecting application layer
information to the back-end server; research work in [46] proposes an approach to adapt the Modbus
protocol for different IoT sensor devices. However, the Modbus protocol is vulnerable to many
attacks [47], of which the dataset we chose contains the attacks summarized below:
• Man-in-the-middle Attack: As the name suggests, during communication between two parties a
  third-party entity, the attacker, impersonates either the sender or the receiver and tries to
  steal information or to perform actions as the sender/receiver. Thus, the attacker gains access
  to control the traffic and creates fake transactions.
• Ping DDoS Flood Attack (Internet control message protocol - ICMP): The most common variant of a
  Distributed Denial of Service (DDoS) attack, where continuous pings from the attacker overwhelm
  the server, making it go offline and deny further connection requests.
• Modbus Query Flood Attack: A variant of DDoS attack in Modbus [48], where the attacker sends a
  flood of messages to overwhelm the end-device and make it unavailable to serve genuine message
  packets.
• SYN DDoS Attack: In a SYN attack, repeated SYN packets are sent to the server to initiate a
  connection handshake in an attempt to keep all the ports busy and prevent the server from
  offering more open connection ports to accept connections. A SYN DDoS attack is usually achieved
  using a bot that sends repeated connection requests by hiding the actual device internet
  protocol (IP) address and sends many requests with

Fig. 4: GRU-1: Model evaluation results of proposed FL approach and Non-FL approach
Fig. 5: GRU-2: Model evaluation results of proposed FL approach and Non-FL approach
Fig. 6: Modbus Message Format

B. Evaluation Metrics

In the ML space, the predictions of a trained model are compared with actual values and, based on
the comparison results, True Positives (TP), True Negatives (TN), False Positives (FP), and False
Negatives (FN) are calculated. TP and TN represent the number of instances in which the ML model
predictions match the real labels/actual values, while FP and FN count the number of instances
where the ML model has predicted incorrect values. We have evaluated our approach and compared the
results against the peer approach using the following metrics.
• Accuracy

  $$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} $$

• Recall

  $$ \text{Recall} = \frac{TP}{TP + FN} $$

• Precision

  $$ \text{Precision} = \frac{TP}{TP + FP} $$

⁹ https://github.com/ahlashkari/CICFlowMeter
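The three metrics above can be computed directly from binary labels, as in the following minimal sketch. It assumes that label 1 marks an attack and 0 marks normal traffic, and it returns 0.0 when a denominator is zero; both conventions, along with the function names `confusion_counts` and `evaluate`, are illustrative assumptions and not taken from the paper.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, TN, FP and FN for binary labels (1 = attack, 0 = normal)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["TP" if actual == 1 else "FP"] += 1
        else:
            counts["TN" if actual == 0 else "FN"] += 1
    return counts


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when the denominator is zero (assumption)."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, recall and precision as defined in the formulas above."""
    c = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_div(c["TP"] + c["TN"], sum(c.values())),
        "recall": safe_div(c["TP"], c["TP"] + c["FN"]),
        "precision": safe_div(c["TP"], c["TP"] + c["FP"]),
    }


# Made-up labels for eight flows: TP = 3, TN = 2, FP = 2, FN = 1.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
print(confusion_counts(y_true, y_pred))
print(evaluate(y_true, y_pred))  # accuracy 0.625, recall 0.75, precision 0.6
```

In this toy case the two false positives lower precision (0.6) more than the single false negative lowers recall (0.75), which is why false alarms are reported separately from overall accuracy.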
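The FL side of the comparison relies on two steps named in Algorithm 1: averaging the locally trained weights into a global model, and combining the predictions of several models in the ensembler. The sketch below illustrates both in plain Python under stated assumptions. The averaging is weighted by each client's sample count, in the style of standard federated averaging, and the ensembler is replaced by a simple majority vote; neither detail is confirmed by this text, and the names `federated_average`, `majority_vote`, and the weight keys are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence

Weights = Dict[str, List[float]]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Average per-parameter weights, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][k] * n for w, n in zip(client_weights, client_sizes)) / total
            for k in range(len(reference))
        ]
    return averaged


def majority_vote(flags: Sequence[int]) -> int:
    """Stand-in ensembler: return the most common anomaly flag among the models."""
    if not flags:
        raise ValueError("at least one prediction is required")
    return Counter(flags).most_common(1)[0][0]


# Two toy clients; only weights leave the device, never the raw traffic.
local_models = [
    {"gru.weight": [1.0, 2.0], "gru.bias": [0.0]},
    {"gru.weight": [3.0, 4.0], "gru.bias": [1.0]},
]
global_model = federated_average(local_models, client_sizes=[100, 300])
print(global_model)              # weights pulled towards the larger client
print(majority_vote([1, 0, 1]))  # two of three models flag an anomaly
```

The second client holds three times as many samples, so the averaged weights sit closer to its values; an unweighted mean would be the special case where all sample counts are equal.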
Fig. 7: GRU-3: Model evaluation results of proposed FL approach and Non-FL approach
Fig. 8: GRU-4: Model evaluation results of proposed FL approach and Non-FL approach
is the time taken for training ith data source and

¹¹ https://scikit-learn.org/
Fig. 9: Training Time: For Different Window Sizes for proposed FL approach and Non-FL approach
Fig. 10: Cross validation results: For Different Window Sizes of proposed FL approach
Fig. 11: Average Accuracy of proposed FL approach and Non-FL approach

is 99.5% making the anomaly detection rate high with a minimal number of false alarms. The overall
average accuracy of FL in comparison to non-FL is illustrated in Fig. 11.

To summarize, evaluation results emphasize the fact that our proposed approach outperforms the
non-FL implementation. For demonstration purposes, we have implemented FL with an ML model built
from scratch, but in a realistic production implementation of FL, the global model can be trained
with known attacks and a sample training dataset. This helps the global model to get pre-trained,
and the live data from IoT devices keeps the global model up-to-date and optimizes

D. Implications

This paper provides useful practical implications for adopters of FL. The proposed approach can
serve as a good starting point to design an architecture for migrating non-FL-based approaches to
FL. The results section gives insight into the performance of GRUs for different window sizes and
layer sizes, which can be further utilized to avoid cold start problems in FL. Our analysis in
this work can help to speed up the process of promoting an ML product to FL in a production
environment. Moreover, the proposed FL-based solution mitigates the disadvantages of non-FL-based
intrusion detection systems, while the decentralized training empowers end-device data security
and enables sharing of computational resources that might very well result in efficient and
greener ML-based products.

V. CONCLUSION AND FUTURE WORK

In this paper, we proposed a federated learning-based anomaly detection approach for accurate
identification and classification of attacks in IoT networks. The FL implementation part of our
proposed approach shares computational power with on-device training, and different layers of GRUs
ensure higher accuracy rates in classifying attacks. The performance of the approach is improved
further with the ensembler, which combines the predictions from different layers of GRUs. The FL
benefits of user data privacy add a secure layer in IoT networks, making IoT devices more
reliable. Our evaluation results demonstrate that our proposed approach outperforms the non-FL
version of intrusion detection algorithms. Our future work is to enhance the proposed approach
with a testbed of IoT devices and evaluate it with live data from device-specific datasets which
can classify all known and unknown vulnerabilities of IoT devices.

REFERENCES

[1] L. Catarinucci, D. De Donno, L. Mainetti, L. Palano, L. Patrono, M. L. Stefanizzi, and
    L. Tarricone, “An IoT-aware architecture for smart healthcare systems,” IEEE Internet of
    Things Journal, vol. 2, no. 6, pp. 515–526, 2015.
[2] T. Ammari, J. Kaye, J. Y. Tsai, and F. Bentley, “Music, search, and IoT: How people (really)
    use voice assistants,” ACM Trans. Comput. Hum. Interact., vol. 26, no. 3, pp. 17–1, 2019.
[3] H. Ghayvat, S. Mukhopadhyay, X. Gui, and N. Suryadevara, “WSN- and IoT-based smart homes and
    their extension to smart buildings,” Sensors, vol. 15, no. 5, pp. 10350–10379, 2015.
[4] C. Kolias, G. Kambourakis, A. Stavrou, and J. Voas, “DDoS in the IoT: Mirai and other
    botnets,” Computer, vol. 50, no. 7, pp. 80–84, 2017.
[5] N. Neshenko, E. Bou-Harb, J. Crichigno, G. Kaddoum, and N. Ghani, “Demystifying IoT security:
    An exhaustive survey on IoT vulnerabilities and a first empirical look on internet-scale IoT
    exploitations,” IEEE Communications Surveys Tutorials, vol. 21, no. 3, pp. 2702–2733, 2019.
[6] W. Zhou, Y. Jia, A. Peng, Y. Zhang, and P. Liu, “The effect of iot new features on security and privacy: New threats, existing solutions, and challenges yet to be solved,” IEEE Internet of Things Journal, vol. 6, no. 2, pp. 1606–1616, 2019.
[7] A. C. Panchal, V. M. Khadse, and P. N. Mahalle, “Security issues in iiot: A comprehensive survey of attacks on iiot and its countermeasures,” in 2018 IEEE Global Conference on Wireless Computing and Networking (GCWCN), 2018, pp. 124–130.
[8] W. Al Amiri, M. Baza, M. Mahmoud, K. Banawan, W. Alasmary, and K. Akkaya, “Privacy-preserving smart parking system using blockchain and private information retrieval,” Proc. of the IEEE International Conference on Smart Applications, Communications and Networking (SmartNets 2019), 2020.
[9] J. Wu, M. Dong, K. Ota, J. Li, and W. Yang, “Application-aware consensus management for software-defined intelligent blockchain in iot,” IEEE Network, vol. 34, no. 1, pp. 69–75, 2020.
[10] H. Liang, J. Wu, S. Mumtaz, J. Li, X. Lin, and M. Wen, “Mbid: Micro-blockchain-based geographical dynamic intrusion detection for v2x,” IEEE Communications Magazine, vol. 57, no. 10, pp. 77–83, 2019.
[11] H. HaddadPajouh, A. Dehghantanha, R. M. Parizi, M. Aledhari, and H. Karimipour, “A survey on internet of things security: Requirements, challenges, and solutions,” Internet of Things, p. 100129, 2019.
[12] M. Baza, A. Salazar, M. Mahmoud, M. Abdallah, and K. Akkaya, “On sharing models instead of data using mimic learning for smart health applications,” in 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT), 2020, pp. 231–236.
[13] M. Saharkhizan, A. Azmoodeh, A. Dehghantanha, K. K. R. Choo, and R. M. Parizi, “An ensemble of deep recurrent neural networks for detecting iot cyber attacks using network traffic,” IEEE Internet of Things Journal, vol. 7, no. 9, pp. 8852–8859, 2020.
[14] N. K. Sahu and I. Mukherjee, “Machine learning based anomaly detection for iot network: (anomaly detection in iot network),” in 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184), 2020, pp. 787–794.
[15] M. Aledhari, R. Razzak, R. M. Parizi, and F. Saeed, “Federated learning: A survey on enabling technologies, protocols, and applications,” IEEE Access, vol. 8, pp. 140 699–140 725, 2020.
[16] V. Mothukuri, R. M. Parizi, S. Pouriyeh, Y. Huang, A. Dehghantanha, and G. Srivastava, “A survey on security and privacy of federated learning,” Future Generation Computer Systems, 2020.
[17] A. Hard, K. Rao, R. Mathews, S. Ramaswamy, F. Beaufays, S. Augenstein, H. Eichner, C. Kiddon, and D. Ramage, “Federated learning for mobile keyboard prediction,” arXiv preprint arXiv:1811.03604, 2019.
[18] I. Frazão, P. H. Abreu, T. Cruz, H. Araújo, and P. Simões, “Denial of service attacks: detecting the frailties of machine learning algorithms in the classification process,” in International Conference on Critical Information Infrastructures Security. Springer, 2018, pp. 230–235.
[19] M. S. Mekala, A. Jolfaei, G. Srivastava, X. Zheng, A. Anvari-Moghaddam, and P. Viswanathan, “Resource offload consolidation based on deep-reinforcement learning approach in cyber-physical systems,” IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1–10, 2020.
[20] M. S. Mekala and V. Perumal, “Machine learning inspired phishing detection (pd) for efficient classification and secure storage distribution (ssd) for cloud-iot application,” in 2020 IEEE Symposium Series on Computational Intelligence (SSCI), 2020, pp. 202–210.
[21] T. D. Nguyen, S. Marchal, M. Miettinen, H. Fereidooni, N. Asokan, and A. Sadeghi, “DÏot: A federated self-learning anomaly detection system for iot,” in 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), 2019, pp. 756–767.
[22] B. Li, Y. Wu, J. Song, R. Lu, T. Li, and L. Zhao, “Deepfed: Federated deep learning for intrusion detection in industrial cyber-physical systems,” IEEE Transactions on Industrial Informatics, pp. 1–1, 2020.
[23] Z. Chen, N. Lv, P. Liu, Y. Fang, K. Chen, and W. Pan, “Intrusion detection for wireless edge networks based on federated learning,” IEEE Access, vol. 8, pp. 217 463–217 472, 2020.
[24] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the kdd cup 99 data set,” in 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, 2009, pp. 1–6.
[25] R. Panigrahi and S. Borah, “A detailed analysis of cicids2017 dataset for designing intrusion detection systems,” International Journal of Engineering & Technology, vol. 7, no. 3.24, pp. 479–482, 2018.
[26] I. Almomani, B. Kasasbeh, and M. AL-Akhras, “Wsn-ds: A dataset for intrusion detection systems in wireless sensor networks,” Journal of Sensors, vol. 2016, pp. 1–16, 01 2016.
[27] B. Cetin, A. Lazar, J. Kim, A. Sim, and K. Wu, “Federated wireless network intrusion detection,” in 2019 IEEE International Conference on Big Data (Big Data), 2019, pp. 6004–6006.
[28] N. A. Al-Athba Al-Marri, B. S. Ciftler, and M. M. Abdallah, “Federated mimic learning for privacy preserving intrusion detection,” in 2020 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom), 2020, pp. 1–6.
[29] S. A. Rahman, H. Tout, C. Talhi, and A. Mourad, “Internet of things intrusion detection: Centralized, on-device, or federated learning?” IEEE Network, vol. 34, no. 6, pp. 310–317, 2020.
[30] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[31] T. Ryffel, A. Trask, M. Dahl, B. Wagner, J. Mancuso, D. Rueckert, and J. Passerat-Palmbach, “A generic framework for privacy preserving deep learning,” arXiv preprint arXiv:1811.04017, 2018.
[32] Y. Liu, S. Garg, J. Nie, Y. Zhang, Z. Xiong, J. Kang, and M. S. Hossain, “Deep anomaly detection for time-series data in industrial iot: A communication-efficient on-device federated learning approach,” IEEE Internet of Things Journal, pp. 1–1, 2020.
[33] F. A. Gers and J. Schmidhuber, “Recurrent nets that time and count,” in Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium, vol. 3, 2000, pp. 189–194.
[34] H. Sak, A. W. Senior, and F. Beaufays, “Long short-term memory recurrent neural network architectures for large scale acoustic modeling,” Google, 2014.
[35] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using rnn encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.
[36] P. Malhotra, L. Vig, G. Shroff, and P. Agarwal, “Long short term memory networks for anomaly detection in time series,” in Proceedings, vol. 89. Presses universitaires de Louvain, 2015, pp. 89–94.
[37] E. H. Bahadur, A. K. M. Masum, A. Barua, M. G. R. Alam, M. A. U. Z. Chowdhury, and M. R. Alam, “Lstm based approach for diabetic symptomatic activity recognition using smartphone sensors,” in 2019 22nd International Conference on Computer and Information Technology (ICCIT). IEEE, 2019, pp. 1–6.
[38] S. Selvin, R. Vinayakumar, E. Gopalakrishnan, V. K. Menon, and K. Soman, “Stock price prediction using lstm, rnn and cnn-sliding window model,” in 2017 international conference on advances in computing, communications and informatics (icacci). IEEE, 2017, pp. 1643–1647.
[39] Y. Luan and S. Lin, “Research on text classification based on cnn and lstm,” in 2019 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA). IEEE, 2019, pp. 352–355.
[40] H. Xiao, M. A. Sotelo, Y. Ma, B. Cao, Y. Zhou, Y. Xu, R. Wang, and Z. Li, “An improved lstm model for behavior recognition of intelligent vehicles,” IEEE Access, 2020.
[41] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” 2014.
[42] G. Draper-Gil, A. H. Lashkari, M. S. I. Mamun, and A. A. Ghorbani, “Characterization of encrypted and vpn traffic using time-related,” in Proceedings of the 2nd international conference on information systems security and privacy (ICISSP), 2016, pp. 407–414.
[43] T. G. Dietterich, “Ensemble learning,” in The Handbook of Brain Theory and Neural Networks, M. Arbib, Ed. MIT Press, 2002, pp. 405–408.
[44] L. Breiman, “Random forests,” Machine learning, vol. 45, no. 1, pp. 5–32, 2001.
[45] S. Jaloudi, “Communication protocols of an industrial internet of things environment: A comparative study,” Future Internet, vol. 11, no. 3, p. 66, 2019.
[46] F. Shu, H. Lu, and Y. Ding, “Novel modbus adaptation method for iot gateway,” in 2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), 2019, pp. 632–637.
[47] Z. Drias, A. Serhrouchni, and O. Vogel, “Taxonomy of attacks on industrial control protocols,” in 2015 International Conference on Protocol Engineering (ICPE) and International Conference on New Technologies of Distributed Systems (NTDS), 2015, pp. 1–6.
[48] S. Bhatia, N. Kush, C. Djamaludin, J. Akande, and E. Foo, “Practical modbus flooding attack and detection,” in Proceedings of the Twelfth Australasian Information Security Conference - Volume 149, ser. AISC ’14. Australian Computer Society, Inc., 2014, pp. 57–65.