Cryptography and Its Applications in Information Security
Editors
Safwan El Assad
René Lozi
William Puech
MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin
Editors
Safwan El Assad, Nantes Université / Polytech Nantes, France
René Lozi, Université Côte d’Azur, France
William Puech, University of Montpellier, France
Editorial Office
MDPI
St. Alban-Anlage 66
4052 Basel, Switzerland
This is a reprint of articles from the Special Issue published online in the open access journal Applied Sciences (ISSN 2076-3417) (available at: https://www.mdpi.com/journal/applsci/special_issues/Cryptography_Applications_Information_Security).
For citation purposes, cite each article independently as indicated on the article page online and as
indicated below:
LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. Journal Name Year, Volume Number,
Page Range.
© 2022 by the authors. Articles in this book are Open Access and distributed under the Creative
Commons Attribution (CC BY) license, which allows users to download, copy and build upon
published articles, as long as the author and publisher are properly credited, which ensures maximum
dissemination and a wider impact of our publications.
The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons
license CC BY-NC-ND.
Contents
Fethi Dridi, Safwan El Assad, Wajih El Hadj Youssef, Mohsen Machhout and René Lozi
The Design and FPGA-Based Implementation of a Stream Cipher Based on a Secure Chaotic
Generator
Reprinted from: Appl. Sci. 2021, 11, 625, doi:10.3390/app11020625, p. 79
Evaristo José Madarro Capó, Carlos Miguel Legón Pérez, Omar Rojas,
Guillermo Sosa-Gómez and Raisa Socorro Llanes
Bit Independence Criterion Extended to Stream Ciphers
Reprinted from: Appl. Sci. 2020, 10, 7668, doi:10.3390/app10217668, p. 99
Rong Huang, Fang Han, Xiaojuan Liao, Zhijie Wang and Aihua Dong
A Novel Intermittent Jumping Coupled Map Lattice Based on Multiple Chaotic Maps
Reprinted from: Appl. Sci. 2021, 11, 3797, doi:10.3390/app11093797, p. 119
About the Editors
Safwan El Assad is Associate Professor (HDR) at Nantes University/Polytech Nantes,
IETR (Institut d’Electronique et des Technologies du numéRique) laboratory, UMR CNRS 6164,
VAADER team, France. From 1988 to 2005, his research activities concerned radar imagery
and digital communications. Building on that background, his research today largely focuses
on chaos-based cryptography: block/stream ciphers, keyed hash functions, authenticated
encryption, and steganography and watermarking systems.
Awards (6): PEDR (Award for Doctoral Supervision and Research): 1994-1997, 1998-2001;
PES (Award for Scientific Excellence): 2010-2013, 2014-2017; PEDR: 2018-2021, 2021-2025.
https://scholar.google.com/citations?user=69Jk1jQAAAAJ&hl=fr.
René Lozi is Emeritus Professor at University Côte d’Azur, Dieudonné Center of Mathematics,
France. He completed the PhD degree with his French State Thesis (on chaotic dynamical systems)
under the supervision of Prof. René Thom (Fields medalist) in 1983. In 1991, he became Full Professor
at University of Nice and IUFM (University Institute for Teacher Training). He served as Director of this
institute (2001-2006) and as Vice-Chairman of the French Board of Directors of IUFM (2004-2006). He
is a member of several editorial boards of international journals. In 1977, he discovered a particular
mapping of the plane having a strange attractor (now commonly known as the “Lozi map”). Nowadays,
his research areas include complexity and emergence theory, dynamical systems, bifurcations,
control of chaos, cryptography based on chaos, and recently memristors (physical devices for
neurocomputing). He works in these fields with renowned researchers from many countries.
He received the Dr. Zakir Husain Award 2012 from the Indian Society of Industrial and Applied
Mathematics during the 12th biennial conference of ISIAM at the University of Punjab, Patiala,
in January 2015.
William Puech received the diploma of Electrical Engineering from the University of Montpellier,
France (1991) and a Ph.D. degree in Signal-Image-Speech from the Polytechnic National Institute of
Grenoble, France (1997), with research activities in image processing and computer vision. He served
as a Visiting Research Associate at the University of Thessaloniki, Greece. From 1997 to 2008, he was
an Associate Professor at the University of Montpellier, France, where he has been a full Professor in
image processing since 2009. His current interests are in the areas of image forensics and security for
safe transfer, storage and visualization, combining data hiding, compression, cryptography and
machine learning. He is head of the ICAR team (Image and Interaction) in the LIRMM, has published
more than 45 journal papers and 140 conference papers, and is associate editor for 5 journals (JASP,
SPIC, SP, JVCIR and IEEE TDSC) in the areas of image forensics and security. Since 2017, he has been
the general chair of the IEEE Signal Processing French Chapter. He was a member of the IEEE
Information Forensics and Security TC between 2018 and 2020, and since 2021 he has been a member
of the IEEE Image, Video and Multidimensional Signal Processing TC.
Editorial
Special Issue on Cryptography and Its Applications in
Information Security
Safwan El Assad, René Lozi and William Puech
1. Introduction
Nowadays, mankind is living in a cyber world. Modern technologies involve fast
communication links between potentially billions of devices through complex networks
(satellite, mobile phone, Internet, Internet of Things (IoT), etc.). The main concern posed by
these entangled complex networks is their protection against passive and active attacks
that could compromise public security (sabotage, espionage, cyber-terrorism) and privacy.
To face these threats, most of the world's web traffic (digital multimedia content such as
images, speech signals, videos, and emails) is protected against security threats occurring
among different societies and within several societal levels. Even governments (rogue or not)
and some of their official agencies are suspected of promoting and actively participating in
the hacking of other government officials, democratic processes, industrial secrets, and
citizens.
Thousands of private or official hackers target the sensitive information of citizens,
industries, and governments. The threat is real, and it is escalating year after year.
The aim of this Special Issue on “Cryptography and Its Applications in Information
Security” was to address the range of problems related to the security of information in
networks and multimedia communications and to bring together researchers, practitioners,
and industry players interested in such questions. Papers on both theoretical and practical
aspects were welcome, including ongoing research projects, experimental results, and recent
developments related to, but not limited to, the following topics: cryptography; chaos-based
cryptography; block and stream ciphers; hash functions; steganography; watermarking;
selective encryption; multimedia data hiding and security; secure FPGA implementation for
cryptographic primitives; security methods for communications; wireless network security
(Internet, WSNs, UMTS, WiFi, WiMAX, WiMedia, and others); sensor and mobile ad hoc
network security; security and privacy in mobile systems; secure cloud computing; security
and privacy in social networks, vehicular networks, and Web services; database security and
privacy; intellectual property protection; lightweight cryptography for green computing;
personal data protection for information systems; protocols for security; cryptanalysis; side
channel attacks; fault injection attacks; and physical layer security for communications.

Citation: El Assad, S.; Lozi, R.; Puech, W. Special Issue on Cryptography and Its Applications in Information Security. Appl. Sci. 2022, 12, 2588. https://doi.org/10.3390/app12052588

Received: 21 February 2022
Accepted: 25 February 2022
Published: 2 March 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

2. The Papers

In this Special Issue, we received a total of 24 submissions and, after peer review,
accepted and published 8 outstanding papers that span several interesting topics: security,
the relationship between chaos-based pseudo-random numbers and stream ciphers, and
blockchain technologies.
Appl. Sci. 2022, 12, 2588. https://doi.org/10.3390/app12052588 https://www.mdpi.com/journal/applsci
In the field of security, four papers are presented. The first one suggests employing a
homomorphic encryption (HE) scheme that can directly perform arithmetic operations on
ciphertexts without decryption to protect the model parameters. Using the HE scheme, the
proposed privacy-preserving federated learning (PPFL) algorithm enables a centralized
server to aggregate encrypted local model parameters without decryption. Furthermore,
the proposed algorithm allows each node to use a different HE private key in the same
FL-based system using a distributed cryptosystem [1].
A second paper in this field proposes a new anomaly detection algorithm for the
Intrusion Detection System (IDS), where a machine learning algorithm is applied to detect
deviations from legitimate traffic, which may indicate an intrusion. It involves a novel
approach based on the transformation of the network flow statistics to gray images on
which the Gray-Level Co-occurrence Matrix (GLCM) is applied together with an entropy
measure recently proposed in the literature, 2D Dispersion Entropy. This approach is
assessed using the recently published IDS data set CIC-IDS2017. The results show that it is
competitive with other approaches proposed in the literature on the same data
set [2].
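The texture step of this pipeline, mapping quantized flow statistics to a small gray image and computing a GLCM on it, can be sketched as follows. The `glcm` helper, the 4×4 image size, and the quantized values are illustrative assumptions, not the actual feature set or image dimensions used in [2]:

```python
# Sketch: quantized flow statistics -> small gray image -> GLCM -> texture
# feature. Values and image size are illustrative, not those used in [2].
import numpy as np

def glcm(img, levels=8, dx=1, dy=0):
    """Count co-occurrences of gray levels at a fixed pixel offset (dx, dy)."""
    m = np.zeros((levels, levels), dtype=int)
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m

# Pretend these are per-flow statistics quantized to 8 gray levels
flow_stats = np.array([3, 1, 4, 1, 5, 2, 6, 5, 3, 5, 0, 7, 2, 3, 4, 6])
img = flow_stats.reshape(4, 4)
M = glcm(img)

# A classic GLCM texture feature: contrast = sum over (i, j) of (i - j)^2 * M[i, j]
i, j = np.indices(M.shape)
contrast = float(((i - j) ** 2 * M).sum())
```

An entropy measure such as 2D dispersion entropy would then be computed in addition; that step is omitted here.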
The main objective of the third paper is the classification of the Strongly Asymmetric
Public Key Agreement (SAPKA) algorithms. SAPKA is a class of key exchanges between
Alice and Bob that was introduced in 2011. The greatest difference from the standard
PKA algorithms is that Bob constructs multiple public keys and Alice uses one of these to
calculate her public key and her secret shared key. Therefore, the number of public keys
and calculation rules for each key differ for each user. Although algorithms with high
security and computational efficiency exist in this class, the relation between the parameters
of SAPKA and its security and computational efficiency has not yet been fully clarified. By
attempting algorithm attacks, the authors found that certain parameters are more strongly
related to security. On this basis, they construct concrete algorithms and a new subclass of
SAPKA, in which the responsibility of maintaining security is significantly more associated
with the secret parameters of Bob than those of Alice [3].
The last paper in security designs a secure chaos-based stream cipher (SCbSC) and
evaluates its hardware implementation performance in terms of computational complexity
and its security. The fundamental element of this system is the proposed secure pseudo-
chaotic number generator (SPCNG). The architecture of the proposed SPCNG includes
three first-order recursive filters, each containing a discrete chaotic map and a mixing
technique using an internal pseudo-random number (PRN). The three discrete chaotic
maps, namely, the 3D Chebyshev map (3D Ch), the 1D logistic map (L), and the 1D skew-
tent map (S), are weakly coupled by a predefined coupling matrix M. The mixing technique
combined with the weak coupling technique of the three chaotic maps allows the system to
be protected against side-channel attacks (SCAs) [4].
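The weak-coupling idea can be illustrated with a two-map toy. This is a hedged sketch only: the actual SPCNG of [4] uses three maps (3D Chebyshev, logistic, skew-tent), first-order recursive filters, an internal PRN, and a predefined coupling matrix M, all omitted here:

```python
# Toy illustration of weakly coupling two chaotic maps with a small
# coupling parameter eps (a simplification, not the SPCNG of [4]).
def logistic(x, r=3.99):
    return r * x * (1 - x)

def skew_tent(x, p=0.3):
    return x / p if x < p else (1 - x) / (1 - p)

def weakly_coupled(x, y, eps=0.05, steps=1000):
    """Iterate two maps; each new state mixes in a small fraction of the other."""
    for _ in range(steps):
        x, y = ((1 - eps) * logistic(x) + eps * skew_tent(y),
                eps * logistic(x) + (1 - eps) * skew_tent(y))
    return x, y

x, y = weakly_coupled(0.123, 0.456)
assert 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0
```

Sensitivity to initial conditions (a tiny perturbation of the seed yields a completely different trajectory) is the property a chaos-based generator builds on.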
Linked to the topic of this paper, two other papers analyze the performance of stream
ciphers. In [5], the bit independence criterion, which was proposed to evaluate the security
of the S-boxes used in block ciphers, is assessed and improved. This paper proposes an
algorithm that extends this criterion to evaluate the degree of independence between the
bits of inputs and outputs of the stream ciphers. The effectiveness of the algorithm is
experimentally confirmed in two scenarios: random outputs independent of the input, in
which it does not detect dependence; and the RC4 cipher, where it detects significant
dependencies related to some known weaknesses. The complexity of the algorithm is
estimated based on the number of inputs l, and the dimensions, n and m, of the inputs and
outputs, respectively.
Alternatively, in [6], a novel intermittent jumping CML system based on multiple
chaotic maps is proposed. The intermittent jumping mechanism seeks to incorporate the
multi-chaos, and to dynamically switch coupling states and coupling relations, varying
with spatiotemporal indices. Extensive numerical simulations and comparative studies
demonstrate that, compared with the existing CML-based systems, the proposed system
has a larger parameter space, better chaotic behavior, and comparable computational
complexity. These results highlight the potential of the proposal for deployment into an
image cryptosystem.
The third topic highlighted in this Special Issue is blockchain technology, either for
digital cash or “digital authorization” for museums. Digital cash is a form of money that is
stored digitally. Its main advantage when compared to traditional credit or debit cards is
the possibility of carrying out anonymous transactions. Diverse digital cash paradigms
have been proposed during recent decades, providing different approaches to avoid the
double-spending fraud, or features such as divisibility or transferability. In [7], a new digital
cash paradigm that includes the so-called no-valued e-coins, which are e-coins that can be
generated free of charge by customers, is proposed. This new paradigm has also proven its
validity in the scope of privacy-preserving pay-by-phone parking systems, and the authors
believe it can become a very versatile building block in the design of privacy-preserving
protocols in other areas of research.
The American Alliance of Museums (AAM) recently stated that nearly a third of the
museums in the United States may close permanently, since museum operations are
facing “extreme financial difficulties”, especially since the outbreak of COVID-19 at the
beginning of 2020. The research published in [8] is aimed at museums using
the business model of “digital authorization”. It proposes an authorization mechanism
that protects the museums’ digital rights in this business model, based on blockchain
technology and the application of cryptography. Its signature and time-stamp mechanism
achieves non-repudiation and timeliness, and the combination of blockchain and smart
contracts achieves verifiability, non-forgery, decentralization, and traceability, as well as
non-repudiation of the cash flow via signatures and digital certificates, for the
digital rights of museums in business.
Author Contributions: All the editors have contributed equally. All authors have read and agreed to
the published version of the manuscript.
Funding: This research received no external funding.
Acknowledgments: This issue would not be possible without the contributions of the authors who
submitted their valuable papers. We would like to thank all reviewers and the editorial team of
Applied Sciences for their great work.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Park, J.; Lim, H. Privacy-Preserving Federated Learning Using Homomorphic Encryption. Appl. Sci. 2022, 12, 734. [CrossRef]
2. Baldini, G.; Ramos, J.L. Intrusion Detection Based on Gray-Level Co-Occurrence Matrix and 2D Dispersion Entropy. Appl. Sci.
2021, 11, 5567. [CrossRef]
3. Iriyama, S.; Jimbo, K.; Regoli, M. New Subclass Framework and Concrete Examples of Strongly Asymmetric Public Key
Agreement. Appl. Sci. 2021, 11, 5540. [CrossRef]
4. Dridi, F.; El Assad, S.; Youssef, W.E.H.; Machhout, M.; Lozi, R. The Design and FPGA-Based Implementation of a Stream Cipher
Based on a Secure Chaotic Generator. Appl. Sci. 2021, 11, 625. [CrossRef]
5. Madarro-Capó, E.J.; Legón-Pérez, C.M.; Rojas, O.; Sosa-Gómez, G.; Socorro-Llanes, R. Bit Independence Criterion Extended to
Stream Ciphers. Appl. Sci. 2020, 10, 7668. [CrossRef]
6. Huang, R.; Han, F.; Liao, X.; Wang, Z.; Dong, A. A Novel Intermittent Jumping Coupled Map Lattice Based on Multiple Chaotic
Maps. Appl. Sci. 2021, 11, 3797. [CrossRef]
7. Borges, R.; Sebé, F. A Digital Cash Paradigm with Valued and No-Valued e-Coins. Appl. Sci. 2021, 11, 9892. [CrossRef]
8. Wang, Y.-C.; Chen, C.-L.; Deng, Y.-Y. Authorization Mechanism Based on Blockchain Technology for Protecting Museum-Digital
Property Rights. Appl. Sci. 2021, 11, 1085. [CrossRef]
Article
Privacy-Preserving Federated Learning Using
Homomorphic Encryption
Jaehyoung Park 1 and Hyuk Lim 2, *
1 School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST),
Gwangju 61005, Korea; [email protected]
2 AI Graduate School, Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
* Correspondence: [email protected]
Abstract: Federated learning (FL) is a machine learning technique that enables distributed devices
to train a learning model collaboratively without sharing their local data. FL-based systems can
achieve much stronger privacy preservation since the distributed devices deliver only local model
parameters trained with local data to a centralized server. However, there exists a possibility that a
centralized server or attackers infer/extract sensitive private information using the structure and
parameters of local learning models. We propose employing a homomorphic encryption (HE) scheme
that can directly perform arithmetic operations on ciphertexts without decryption to protect the
model parameters. Using the HE scheme, the proposed privacy-preserving federated learning (PPFL)
algorithm enables the centralized server to aggregate encrypted local model parameters without
decryption. Furthermore, the proposed algorithm allows each node to use a different HE private
key in the same FL-based system using a distributed cryptosystem. The performance analysis and
evaluation of the proposed PPFL algorithm are conducted in various cloud computing-based FL
service scenarios.
Appl. Sci. 2022, 12, 734. https://doi.org/10.3390/app12020734 https://www.mdpi.com/journal/applsci
can update the global model parameters with the encrypted local gradients based on the
homomorphic operation. Therefore, the distributed devices participating in FL, which
we refer to as clients, do not have to worry about data leakage through local gradients
because they deliver encrypted local gradients to the server. However, the clients must
share the same private key in the FL-based system because homomorphic operations can
only be performed between values encrypted with the same public key. In FL-based systems
where many distributed devices, such as smartphones and Internet of Things (IoT) devices,
participate, the same private key for decryption can be distributed to many clients. Suppose
the same private key is shared with many clients. Then, the probability of the private
key being leaked or a malicious participant accessing other participants’ data increases,
which can weaken privacy protections in FL-based systems. As a result, stealing one
client’s private key can nullify the data privacy protection of all clients participating in FL
systems. To overcome this vulnerability, this paper proposes a privacy-preserving federated
learning (PPFL) algorithm that allows a cloud server to update global model parameters by
aggregating local parameters encrypted by different HE keys in the same FL-based system
using homomorphic operations based on a distributed cryptosystem.
This paper is organized as follows. In Section 2, we present related works on FL and
the privacy issues of FL. Section 3 describes the preliminaries for understanding the FL
algorithm and the cryptosystem for homomorphic operations. In Section 4, we describe
the system and attack models for the proposed PPFL algorithm. Next, Section 5 explains
the proposed PPFL algorithm using the distributed cryptosystem based on an additive
homomorphic encryption (AHE) scheme. Afterwards, Section 6 presents a theoretical
analysis of the proposed PPFL algorithm, and Section 7 presents experimental results
to verify the performance of the proposed PPFL scheme. Finally, Section 8 concludes
the paper.
2. Related Work
FL is one possible solution for preserving privacy in the machine learning field because
the clients participating in the training process deliver only local model parameters trained
with local data to a centralized server. McMahan et al. demonstrated the feasibility of FL
by conducting empirical evaluation in various FL scenarios in [1]. Since then, many studies
have been conducted to improve FL performance for learning accuracy, fairness, robustness,
security, and privacy in various environments, such as IoT, edge, and cloud computing
in [4–6]. In [7], a lightweight federated multi-task learning framework was proposed to
provide fairness among participants and robustness against a poisoning attack that reduces
learning accuracy. In [8], an FL framework using device-to-device communication was
proposed to overcome the degradation in energy and learning efficiency due to frequent
uplink transmissions between participants and physically distant central servers.
The studies on FL can be classified according to how they collect and process data for
FL. In a case where the data have the same feature space and a different sample space, it is
classified as horizontal FL, and in a case where the data have a different feature space and
the same sample space, it is classified as vertical FL in [9]. In vertical FL, data alignment
must be performed for vertical data utilization by sharing several different feature spaces.
In this process, privacy is not protected because raw data exchange may be required. For
preserving data privacy, HE-based vertical FL algorithms were implemented by utilizing
a trusted third party [4,9–11]. An approach to collaboratively train a high-quality tree
boosting model was proposed to simplify FL-based systems by omitting third parties and
showed that the performance of the proposed scheme was as accurate as the performance
of centralized learning techniques in [12]. Horizontal FL is an algorithm in which multiple
devices train a learning model using local data with the same feature space and share
the trained model data to train a global model; the scheme presented in [1] is
a representative horizontal FL algorithm. Horizontal FL can be implemented without a data
alignment process because the local datasets share the same feature space. Although many studies have been
conducted for the development of FL, privacy threats still exist in FL. It was shown that
sensitive private data could be leaked through the local gradients in [2,13], and participants’
data can be inferred through a generative adversarial network using the global and the
local model parameters in [3].
Several studies have been conducted to solve the privacy issues associated with the
local model parameters in an FL-based system in [2,13–15]. Shokri and Shmatikov proposed
a privacy-preserving deep learning (PPDL) algorithm where several distributed participants
collaborate to train a deep learning model using local data; they established a trade-off
relationship between practicality and security for the number of clients participating in
the training process in [13]. Moreover, Aono et al. suggested a PPDL algorithm that
encrypts local model parameters using HE schemes to protect the local and global model
parameters in [2]. In the algorithm proposed in [2], strict key management is required and
reliable channels for conveying ciphertexts must be established because all participants use
the same private key for HE. In [16], HE-based federated learning was proposed, and its
overhead was analyzed. However, all clients participating in training still use the same key
in the system. In [14], based on Shamir’s t-out-of-n secret sharing scheme [17], the authors presented
an algorithm that allows the server to perform updates using local model parameters
containing noise that can be canceled out through the cooperation of the participants in
an FL-based system. The scheme proposed in [14] can prevent leakage of local model
parameters due to the noise contained in the local parameter but can be vulnerable to
insider attacks because participants must actively cooperate. Recently, Xu et al. offered a
technique in which participants verify the integrity of the updated results in the system
that updates the global model parameters using the Secret Sharing scheme in [15].
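The cooperative noise-cancellation idea can be illustrated with a toy pairwise-masking sketch. This is an assumption-based simplification in the spirit of [14,17], not the actual protocol: each pair of participants agrees on a random mask that one adds and the other subtracts, so the masks vanish in the aggregate:

```python
# Toy pairwise masking: individual masked values look random to the server,
# but the masks cancel in the sum (illustrative values, not the scheme of [14]).
import random

weights = [0.5, 1.0, 1.5]                  # each participant's local parameter
n = len(weights)

# pairwise masks: participant i adds s[(i, j)], participant j subtracts it
s = {(i, j): random.uniform(-5, 5) for i in range(n) for j in range(i + 1, n)}

def masked(i):
    m = weights[i]
    for (a, b), v in s.items():
        if a == i:
            m += v
        if b == i:
            m -= v
    return m

server_sum = sum(masked(i) for i in range(n))   # masks cancel pairwise
assert abs(server_sum - sum(weights)) < 1e-9
```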
The algorithms using the HE scheme in [2] and Secret Sharing in [14] have shown that
neural networks can be safely trained without personal information leakage in FL scenarios.
In [2], all training process participants owned the same private key, although the distributed
deep learning system using the HE scheme was designed to protect shared data. For this
reason, all channels between participants and servers must be protected using transport
layer security or secure socket layer. However, as the number of participants increases, the
cost to establish secure channels becomes very high. In addition, since the probability of
one participant’s private key being leaked is the same as the probability of all participants’
private keys being leaked in this system, the risk of private key leakage increases as the
number of participants grows. In the proposed system, each participant can own different
private keys on the same FL-based system. In [14], at least half of the participants must
guarantee their honesty for privacy preservation. If the number of participants in FL is large,
this assumption is reasonable, but there can be a variety of FL scenarios. The proposed
system allows participants to preserve data privacy regardless of the number of honest
participants and can be utilized as a solution to build a flexible PPFL-based system.
This paper presents a PPFL algorithm based on a distributed cryptosystem using
the AHE scheme to protect the local and global model parameters. The participant uses
the HE scheme to encrypt the local model parameters with its private key, and the cloud
server updates the global model parameters with the local model parameter encrypted with
different keys based on the distributed cryptosystem. The proposed PPFL-based system
can achieve robust privacy protection because the proposed algorithm can allow each node
to use a different private key for the HE scheme in the same FL-based system. Furthermore,
a highly flexible FL-based system can be built using our algorithm because clients only
need to encrypt and decrypt model parameters to protect them.
3. Preliminary
3.1. Federated Learning
In FL, multiple distributed servers or devices with local data train a machine learning
model without exchanging local data. Distributed servers or devices share only local model
parameters obtained by training a global learning model delivered from a centralized server
with local data, allowing them to participate in the training process without concern about
data leakage. The centralized server aggregates locally trained model parameters to update
the global model and delivers the updated global parameters to distributed servers or
devices to perform the training process again. This procedure is repeated until convergence
is achieved.
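This train-aggregate-redistribute loop can be sketched as plain federated averaging on a linear model. The datasets, learning rate, and round counts below are illustrative assumptions:

```python
# Sketch of the FL loop described above: clients run local gradient steps,
# the server averages the local models (illustrative data and hyperparameters).
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# three clients, each with its own local dataset (horizontal partition)
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.01 * rng.normal(size=50)
    clients.append((X, y))

w = np.zeros(2)                       # global model held by the server
for _ in range(100):                  # communication rounds
    local_models = []
    for X, y in clients:
        wl = w.copy()
        for _ in range(5):            # local gradient steps on local data only
            wl -= 0.1 * (2 / len(y)) * X.T @ (X @ wl - y)
        local_models.append(wl)
    w = np.mean(local_models, axis=0)  # server aggregates local models

assert np.allclose(w, true_w, atol=0.05)
```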
According to the data distribution characteristics, FL can be categorized into horizontal
federated learning, vertical federated learning, and federated transfer learning in [9].
Horizontal and vertical FL algorithms are applied when local datasets have the same
feature space and different sample spaces and when local datasets have different feature
spaces and the same sample space, respectively. Federated transfer learning is applied to a
scenario where the local datasets have varying features and minimal overlapping samples.
In this case, federated transfer learning utilizes the transfer learning techniques in [18] for
FL-based systems. We consider horizontal FL in this paper. In other words, we assume
that the local datasets have the same feature space and different sample spaces, and we
consider an FL scenario in which many clients, including smartphones and IoT devices,
participate in the training process.
The additive homomorphic property of the scheme can be written as

D_sk_i(E_pk_i(m_1) · E_pk_i(m_2)) = m_1 + m_2,   (1)

where D_sk_i(·) is a decryption function using a private key sk_i, E_pk_i(·) is an encryption
function using a public key pk_i, and m_1 and m_2 are plaintexts. The cloud server can perform
homomorphic addition operations without decryption using (1). Subsequently, fully
homomorphic encryption (FHE), capable of both addition and multiplication operations, was
established in [20] to overcome the limitations of PHE, with which it is challenging to implement
various homomorphic operations [21]. FHE enables a variety of operations to be implemented
using addition and multiplication operations. These HE technologies have led to
the development of PPML algorithms in the cloud and machine learning fields.
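Property (1) is realized, for instance, by the Paillier cryptosystem. A toy sketch with tiny primes follows (purely educational and insecure; Python 3.8+ is assumed for the modular-inverse form of `pow`):

```python
# Toy Paillier cryptosystem illustrating the additive homomorphic property:
# decrypting the product of two ciphertexts yields the sum of the plaintexts.
# Tiny primes for readability; NOT secure.
import math
import random

def keygen(p=293, q=433):
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    g = n + 1                       # simple standard choice of generator
    mu = pow(lam, -1, n)            # valid because g = n + 1
    return (n, g), (lam, mu)

def encrypt(pk, m):
    n, g = pk
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n

pk, sk = keygen()
n = pk[0]
c1, c2 = encrypt(pk, 15), encrypt(pk, 27)
# Multiplying ciphertexts adds the underlying plaintexts:
assert decrypt(pk, sk, (c1 * c2) % (n * n)) == 42
```

This is exactly the operation the cloud server exploits: it multiplies encrypted local parameters and obtains (an encryption of) their sum without ever decrypting.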
A set of partial private keys [psk_i^(1), psk_i^(2), ..., psk_i^(N_s)] for the distributed servers can
be obtained by splitting the private key sk_i, where N_s is the number of distributed servers [23].
We select δ that satisfies δ ≡ 0 (mod sk_i) and δ ≡ 1 (mod K^2) at the same time, and select y
random numbers {a_1, a_2, ..., a_y} from Z*_{sk_i K^2}. Then, we use these values to define the
polynomial p(x) = δ + Σ_{i=1}^{y} a_i x^i. The partial private key psk_i^(j) is obtained by
evaluating the polynomial p(x_j) at a non-zero value x_j from Z*_{sk_i K^2}.
• Encryption: Function that generates a ciphertext E_pk_i(m) ∈ Z*_{K^2} for a plaintext m ∈ Z_K,
using a public key pk_i, where the key size K is p · q. For simplicity, the ciphertext
E_pk_i(m) can be represented by [m]_i;
• Decryption: Function that decrypts a ciphertext [m]_i using a private key sk_i and
returns m;
• Partial decryption: Function that generates a partially decrypted ciphertext by partially
decrypting [m]_i using a partial private key psk_i^(k), k ∈ {1, ..., N_s − 1}, as shown in
Figure 1. For simplicity, the partially decrypted ciphertext PD_{psk_i^(k)}([m]_i) can be
denoted by PD_k([m]_i);
• Combined decryption: Function that obtains and returns m using (N_s − 1) partially
decrypted ciphertexts PD_k([m]_i) for all k ∈ {1, ..., N_s − 1} and the partial private key
psk_i^(N_s). Note that the vector PD([m]_i) signifies [PD_1([m]_i), PD_2([m]_i), ..., PD_{N_s−1}([m]_i)]
in Figure 1.
Figure 1. Cooperative decryption of a ciphertext: the servers compute partial decryptions PD_{psk_i^(k)}(E_pk_i(m)), and the last server applies the combined decryption CD with psk_i^(N_s) to PD([m]_i) to recover the plaintext.
Using the DHC, we have established a PPFL-based system in which the parties can
jointly perform a global model update based on homomorphic operations to preserve the
data privacy of the participants in the FL training process. The proposed PPFL algorithm is
explained in detail in Section 5.
for the private key delivery are built using the secure sockets layer (SSL) or transport layer security (TLS) protocol. Since the private key delivery is performed intermittently, the possibility of an
attacker stealing the private key is extremely low in the system. The cloud server and
computation provider collaborate with each other to update the global model using the
model parameters encrypted with different private keys from clients. Once the cloud
server receives a set of model parameters from a client, it adds a random noise encrypted
with the client’s public key to the set of model parameters, partially decrypts it with the
client’s partial private key, and delivers it to the computation provider. The computation provider obtains the partially decrypted sets of model parameters for the clients and decrypts them using the other partial private keys of the clients. Finally, the computation
provider performs the model aggregation and encrypts it with the public key for each
client, and returns it to the cloud server. The cloud server removes the random noise from
the encrypted global model and sends it to each client. The detailed update process is
described in Section 5.3.
Figure 2 depicts a simple system model comprising a key generation center (KGC),
cloud server (CS), computation provider (CP), and multiple clients. The KGC is a trusted
organization that performs authentication procedures for clients and servers and generates
key pairs. The CS is responsible for securely combining the trained parameters on the clients
and can select clients at every iteration for FL. The CP communicates directly with the CS
and provides computational resources for requests of the CS. A single CP or multiple CPs
can exist in the system, and the CPs and the CS perform cooperative encryption described
in Section 3.3. Each client owns its own private key for decryption and performs local training with its local dataset. In this system model, we make the following assumptions:
• The CS, CP, and the clients may attempt to abuse each other’s data;
• Both the CS and CP are not simultaneously compromised by attackers;
• The CS and the CP do not cooperate to access client information.
Figure 2. System model: the key generation center distributes public/private keys and partial private keys; the clients exchange local and global parameters with the cloud server; the cloud server exchanges intermediate results with the computation provider.
trained local model parameters. Thereafter, the clients transmit the encrypted local model
parameters to the CS. The CS updates the global model parameters with the local model
parameters encrypted with different keys by exploiting the partial homomorphic decryption
capabilities of CPs. As a result, the proposed PPFL algorithm ensures data confidentiality
between the CS and the clients, as well as data confidentiality among the clients because
each client has its own private key and does not send the private key to other third parties.
The detailed procedure of the proposed PPFL is described in the following subsections.
addition, where r is a random integer, and then the CS can obtain E_pk_i(m + r). The CS then forwards the ciphertext to the CPs. (N_s − 1) CPs perform partial decryptions, and the remaining CP performs the combined decryption to obtain the sum of the local value and the random noise, (m + r). As explained in Figure 3, the CP can calculate the average of the sums of local values and random noises (m_ave + r_ave) when receiving the sums from multiple clients. The averaged sum is encrypted and sent back to the CS. Finally, since the CS holds the random values, it can remove the average of the random values from the averaged sum through homomorphic addition and obtain the encrypted average local value E_pk_i(m_ave).
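The masking round-trip can be illustrated with a toy additively homomorphic cryptosystem. The sketch below uses a textbook Paillier construction [19] with tiny, insecure primes; all parameter choices are assumptions for illustration and this is not the DHC scheme of [21]:

```python
# Toy Paillier cryptosystem (insecure parameters, illustration only).
p, q = 11, 13
n = p * q                      # public key n
n2 = n * n
g = n + 1                      # standard choice g = n + 1
lam = 60                       # lcm(p-1, q-1)
mu = pow(lam, -1, n)           # decryption constant

def encrypt(m, r):
    # E(m) = g^m * r^n mod n^2, with r coprime to n
    return pow(g, m, n2) * pow(r, n, n2) % n2

def decrypt(c):
    # L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n
    return (pow(c, lam, n2) - 1) // n * mu % n

m, noise = 3, 5
c = encrypt(m, 7)
c_masked = c * encrypt(noise, 9) % n2               # E(m) * E(r) = E(m + r)
assert decrypt(c_masked) == m + noise               # server sees only m + r
c_unmasked = c_masked * encrypt(n - noise, 4) % n2  # add -r mod n
assert decrypt(c_unmasked) == m                     # noise removed
print(decrypt(c_masked), decrypt(c_unmasked))       # prints 8 3
```

The key point mirrored here is that multiplying ciphertexts adds plaintexts, so a random mask can be applied and later removed without ever decrypting the client's value.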
Figure 3. Secure averaging with random masking: E_pk_i(m) is masked by homomorphic addition with E_pk_i(r), partially decrypted as PD_{psk^(1)}(E_pk_i(m + r)), and combined-decrypted with psk^(N_s) from PD([m + r]_i).
Algorithm 1 describes the proposed secure averaging local model algorithm where
one CS and one CP exist. In the proposed algorithm, local model parameters encrypted
with different keys are input, and global model parameters encrypted with different keys
are output. The CS and CP have the partial private keys psk_i^(1) and psk_i^(2), respectively. The
detailed procedure of the secure averaging local model vector algorithm is as follows:
1. The CS receives the encrypted local model vectors from the clients participating in the
training process. Note that the local model vector encrypted with the public key of
the i-th client is represented as [Wli ]i . Thereafter, the CS generates Nc random vectors
with the same size as the local model vector and encrypts them using the client’s
public key, as shown in lines 2–3 of Algorithm 1;
2. The CS performs homomorphic addition operations with the encrypted local model
vectors using the encrypted random vectors in line 5. The result of homomorphic
addition between [Wl_i]_i and [R_i]_i is represented as [S_i]_i. Then, the CS partially decrypts the result of the homomorphic addition using the partial private key psk_i^(1) in line 6. This process is repeated for the N_c local model vectors. Subsequently, the CS sends the partially decrypted vectors [PD_1([S_1]_1), PD_1([S_2]_2), ..., PD_1([S_{N_c}]_{N_c})] to the CP, as
shown in lines 8–9;
3. As shown in lines 10–11, the CP partially decrypts the partially decrypted vectors using the partial private key psk_i^(2) and obtains [(Wl_1 + R_1), ..., (Wl_{N_c} + R_{N_c})]. Afterwards, the CP adds all the decrypted vectors and divides the sum by the number of clients N_c to obtain a vector containing the average parameters in line 12. The result
is represented as Wsum ;
4. The CP encrypts Wsum using the public keys of the Nc clients in lines 13–15 and sends
the encrypted vectors [[Wsum ]1 , [Wsum ]2 , · · · , [Wsum ] Nc ] to the CS in line 16;
5. Finally, the CS calculates the encrypted average global model vectors for the clients by performing the homomorphic addition operation with the encrypted sum of the random noises, [Σ_{k=1}^{N_c} R_k]_i, as shown in lines 17–19.
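Stripping away the encryption, the data flow of these five steps can be mirrored in plain arithmetic, with homomorphic additions replaced by ordinary additions on stand-in values (all names and sizes below are illustrative):

```python
import random

random.seed(42)
Nc, Nw = 3, 4                       # clients, parameters per model
local = [[random.uniform(-1, 1) for _ in range(Nw)] for _ in range(Nc)]

# Steps 1-2 (CS): mask each "encrypted" local vector with a random vector.
noise = [[random.uniform(-1, 1) for _ in range(Nw)] for _ in range(Nc)]
masked = [[w + r for w, r in zip(local[i], noise[i])] for i in range(Nc)]

# Step 3 (CP): average the masked vectors; individual models stay hidden.
w_sum = [sum(m[j] for m in masked) / Nc for j in range(Nw)]

# Step 5 (CS): subtract the average noise to recover the true average.
avg_noise = [sum(r[j] for r in noise) / Nc for j in range(Nw)]
global_model = [s - a for s, a in zip(w_sum, avg_noise)]

expected = [sum(l[j] for l in local) / Nc for j in range(Nw)]
assert all(abs(g - e) < 1e-12 for g, e in zip(global_model, expected))
print("global model matches plain FedAvg")
```

The CP only ever sees masked vectors, and the CS only ever sees ciphertexts, yet the final global model equals the plain federated average.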
After updating the global model vector by performing the proposed secure averaging
local model vector algorithm, the CP sends the updated global model vector to the clients
for the next federated round. The clients execute the local training process using the updated global model vector, as described in Section 5.2, after decrypting the encrypted global model vector using their own private keys. Thereafter, they send the newly trained local model
vector to the CS, and then the CS and CP work together to update the global model vector.
These procedures are repeated until convergence is achieved.
6. Performance Analysis
6.1. Computational Overhead
6.1.1. Computational Overhead on the Client’s Side
In the proposed PPFL algorithm, additional encryption operations are performed to
protect the trained local model parameters, and extra decryption operations are performed
to reflect the global model parameters to the learning model on the client’s side. In the PCK
scheme used in the proposed algorithm, the exponentiation operation has a dominant effect
on encryption and decryption. The exponentiation operation g^r requires 1.5 × N_r multiplications, where g is a generator of order (p − 1)(q − 1)/2, r ∈ Z_K is a random number, and N_r is the length of r in the DHC scheme in [21]. Thus, the computational complexity of the
encryption operation in the proposed PPFL algorithm is given as O( Nr · Nw ), where Nw is
the number of elements of the local model vector. Similarly, the computational complexity
of the decryption in the proposed PPFL algorithm is also represented as O( Nr · Nw ).
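As a rough check of the 1.5 × N_r multiplication estimate, the left-to-right square-and-multiply exponentiation below counts its modular multiplications; a random exponent has about half its bits set on average, giving roughly 1.5 multiplications per bit (the modulus and exponent length here are arbitrary illustrative values):

```python
import random

def modexp_count(g, r, n):
    """Left-to-right square-and-multiply; returns (g^r mod n, #multiplications)."""
    result, mults = 1, 0
    for bit in bin(r)[2:]:
        result = result * result % n   # one squaring per bit
        mults += 1
        if bit == "1":
            result = result * g % n    # one extra multiply per set bit
            mults += 1
    return result, mults

random.seed(0)
n = (1 << 61) - 1                      # arbitrary modulus for the demo
Nr = 256                               # exponent length in bits
counts = []
for _ in range(200):
    r = random.getrandbits(Nr) | (1 << (Nr - 1))  # force Nr-bit exponents
    value, mults = modexp_count(3, r, n)
    assert value == pow(3, r, n)       # agrees with Python's built-in
    counts.append(mults)

print(sum(counts) / len(counts) / Nr)  # averages close to 1.5 per bit
```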
6.2.2. Communication Overhead between the Cloud Server and the Computation Provider
In order to perform the proposed secure aggregation operation for local model vectors,
we exploit the partial homomorphic decryption capabilities of CPs. The CS sends partially
decrypted ciphertexts to the CP and receives ciphertexts from the CP in the proposed
PPFL-based system. The length of the partially decrypted ciphertext is also 2 × K bits
in [21]. As shown in Algorithm 1, the amount of information communicated between the
CS and the CP increases as the number of clients and model parameters increases. Thus, the
communication overhead between the CS and the CP can be represented as O( Nc · Nw ) bits.
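For concreteness, a back-of-the-envelope calculator for this overhead is sketched below, assuming each partially decrypted ciphertext occupies 2K bits as stated above and that one ciphertext per model parameter per client travels in each direction (the example numbers are illustrative):

```python
def cs_cp_traffic_bits(num_clients, num_params, key_bits):
    """Approximate bits exchanged per aggregation round between CS and CP.

    One 2K-bit partially decrypted ciphertext travels CS -> CP per model
    parameter per client, and one 2K-bit ciphertext returns CP -> CS,
    so the traffic grows as O(Nc * Nw).
    """
    per_ciphertext = 2 * key_bits
    one_way = num_clients * num_params * per_ciphertext
    return 2 * one_way  # both directions

# Example: 10 clients, a 105,506-parameter model, 2048-bit keys.
bits = cs_cp_traffic_bits(10, 105_506, 2048)
print(bits / 8 / 2**20, "MiB per round")
```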
The computational overhead comparison between the PPDL system and the proposed system is shown in Section 7.1, and the communication overhead between the CS and the CP is shown in Table 1.
7. Performance Evaluation
In this section, we evaluate the performance of the proposed algorithm, implemented in Python, on a workstation (3.6 GHz quad-core processor and 8 GB RAM) in terms of computation and communication overhead.
Figure 4. Running time (ms) to execute homomorphic encryption and decryption with respect to the key size (up to 16,000 bits).
Figure 5 shows the running times measured for performing the proposed secure
averaging local model algorithm in Algorithm 1 with respect to the number of clients in
the cryptosystem with different key sizes. The convolutional neural network with 105,506
parameters was used for the simulation study, and the data length used for representing
weights was set to 16 bits. As the number of clients increases, the number of homomorphic
operations increases as the number of parameters to protect using the HE scheme increases,
and thus the running time increases linearly. In Figure 5, the running time increases as the
key size increases. In fact, if the key size is larger, the total number of ciphertexts to be
delivered is smaller. However, as shown in Figure 4, the running time of homomorphic
operations increases exponentially as the key size increases. As a result, the total running
time of Algorithm 1 increases as the key size increases. Nevertheless, a larger key size raises the security level of the system, because an attacker must search a larger key space to break the cryptosystem. Thus, because the computational burden and the security gain are in a trade-off relationship, an appropriate key size can be selected according to the system requirements.
Figure 5. Running time (s) to execute the proposed secure averaging local model algorithm with respect to the number of clients (5–30) in the cryptosystem with different key sizes (K = 1024, 2048, 3072, 7680, and 15,360).
Figure 6 shows the running time for performing the proposed algorithm with respect
to the neural network size of the federated learning. The number of clients was set to 10,
and the data length used for representing weights was set to 16 bits for the simulation study.
As the number of parameters increases, the running time increases because the amount of
information to be processed by the homomorphic operation increases.
Figure 6. Running time (s) to execute the proposed secure averaging local model algorithm with respect to the neural network size (up to 100,000 parameters) in the cryptosystem with different key sizes (K = 1024, 2048, 3072, 7680, and 15,360).
processes are performed at the servers so that the clients can have different keys in the proposed system. Despite the greater computational overhead of the proposed algorithm, the security of FL systems is significantly improved because the clients use different private keys. Therefore, the proposed system can be deployed in a more adversarial environment containing many malicious clients that are difficult to identify. In future work, we will research how to reduce the overhead in PPFL while retaining the same strong security level.
Figure 7. Running time (s) to execute the proposed PPFL and the PPDL using the Paillier cryptosystem with respect to the number of clients (5–30).
required to improve security level in FL were theoretically analyzed, and the performance
of the proposed PPFL algorithm in terms of overhead was evaluated via simulations. In the future, our research will focus on further reducing the computation and communication costs of the proposed PPFL algorithm while retaining the privacy preservation of clients, and on determining an appropriate number of clients participating in FL to expedite the learning and reduce the latency of FL-based services.
Author Contributions: Conceptualization, J.P. and H.L.; methodology, J.P.; investigation, J.P.; formal
analysis, J.P. and H.L.; validation, J.P. and H.L.; writing—original draft preparation, J.P.; writing—
review and editing, H.L.; and supervision, H.L. All authors have read and agreed to the published
version of the manuscript.
Funding: This work was supported by Institute of Information and Communications Technology
Planning and Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-00379,
Privacy risk analysis and response technology development for AI systems).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-efficient learning of deep networks from
decentralized data. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL,
USA, 20–22 April 2017; pp. 1273–1282.
2. Aono, Y.; Hayashi, T.; Wang, L.; Moriai, S. Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans.
Inf. Forensics Secur. 2017, 13, 1333–1345.
3. Hitaj, B.; Ateniese, G.; Perez-Cruz, F. Deep models under the GAN: Information leakage from collaborative deep learning.
In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3
November 2017; pp. 603–618.
4. Zhang, C.; Xie, Y.; Bai, H.; Yu, B.; Li, W.; Gao, Y. A survey on federated learning. Knowl.-Based Syst. 2021, 216, 106775. [CrossRef]
5. Khan, L.U.; Saad, W.; Han, Z.; Hossain, E.; Hong, C.S. Federated learning for internet of things: Recent advances, taxonomy, and
open challenges. IEEE Commun. Surv. Tutor. 2021, 23, 1759–1799. [CrossRef]
6. Lin, J.C.W.; Srivastava, G.; Zhang, Y.; Djenouri, Y.; Aloqaily, M. Privacy-preserving multiobjective sanitization model in 6G IoT
environments. IEEE Internet Things J. 2020, 8, 5340–5349. [CrossRef]
7. Li, T.; Hu, S.; Beirami, A.; Smith, V. Ditto: Fair and robust federated learning through personalization. In Proceedings of the
International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 6357–6368.
8. Lin, F.P.C.; Hosseinalipour, S.; Azam, S.S.; Brinton, C.G.; Michelusi, N. Semi-decentralized federated learning with cooperative
D2D local model aggregations. IEEE J. Sel. Areas Commun. 2021, in press. [CrossRef]
9. Yang, Q.; Liu, Y.; Chen, T.; Tong, Y. Federated machine learning: Concept and applications. ACM Trans. Intell. Syst. Technol.
(TIST) 2019, 10, 1–19. [CrossRef]
10. Ou, W.; Zeng, J.; Guo, Z.; Yan, W.; Liu, D.; Fuentes, S. A homomorphic-encryption-based vertical federated learning scheme for
rick management. Comput. Sci. Inf. Syst. 2020, 17, 819–834. [CrossRef]
11. Zhang, C.; Li, S.; Xia, J.; Wang, W.; Yan, F.; Liu, Y. Batchcrypt: Efficient homomorphic encryption for cross-silo federated learning.
In Proceedings of the 2020 USENIX Annual Technical Conference (USENIXATC 20), Online, 15–17 July 2020; pp. 493–506.
12. Cheng, K.; Fan, T.; Jin, Y.; Liu, Y.; Chen, T.; Papadopoulos, D.; Yang, Q. Secureboost: A lossless federated learning framework.
IEEE Intell. Syst. 2021, in press. [CrossRef]
13. Shokri, R.; Shmatikov, V. Privacy-preserving deep learning. In Proceedings of the ACM SIGSAC Conference on Computer and
Communications Security, Denver, CO, USA, 12–16 October 2015; pp. 1310–1321.
14. Bonawitz, K.; Ivanov, V.; Kreuter, B.; Marcedone, A.; McMahan, H.B.; Patel, S.; Ramage, D.; Segal, A.; Seth, K. Practical secure
aggregation for privacy-preserving machine learning. In Proceedings of the ACM SIGSAC Conference on Computer and
Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 1175–1191.
15. Xu, G.; Li, H.; Liu, S.; Yang, K.; Lin, X. Verifynet: Secure and verifiable federated learning. IEEE Trans. Inf. Forensics Secur. 2019,
15, 911–926. [CrossRef]
16. Fang, H.; Qian, Q. Privacy Preserving Machine Learning with Homomorphic Encryption and Federated Learning. Future Internet
2021, 13, 94. [CrossRef]
17. Shamir, A. How to share a secret. Commun. ACM 1979, 22, 612–613. [CrossRef]
18. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2009, 22, 1345–1359. [CrossRef]
19. Paillier, P. Public-key cryptosystems based on composite degree residuosity classes. In Proceedings of the International Conference
on the Theory and Application of Cryptology and Information Security, Singapore, 14–18 November 1999; pp. 223–238.
20. Gentry, C. Fully homomorphic encryption using ideal lattices. In Proceedings of the Annual ACM Symposium on Theory of
Computing, Bethesda, MD, USA, 31 May–2 June 2009; pp. 169–178.
21. Liu, X.; Deng, R.H.; Choo, K.K.R.; Weng, J. An efficient privacy-preserving outsourced calculation toolkit with multiple keys.
IEEE Trans. Inf. Forensics Secur. 2016, 11, 2401–2414. [CrossRef]
22. Katz, J.; Lindell, Y. Introduction to Modern Cryptography; CRC Press: Boca Raton, FL, USA, 2020.
23. Liu, X.; Choo, K.K.R.; Deng, R.H.; Lu, R.; Weng, J. Efficient and privacy-preserving outsourced calculation of rational numbers.
IEEE Trans. Dependable Secur. Comput. 2018, 15, 27–39. [CrossRef]
24. Barker, E.; Barker, E.; Burr, W.; Polk, W.; Smid, M. Recommendation for Key Management: Part 1: General; National Institute of
Standards and Technology, Technology Administration: Gaithersburg, MD, USA, 2006.
25. Bresson, E.; Catalano, D.; Pointcheval, D. A simple public-key cryptosystem with a double trapdoor decryption mechanism and
its applications. In Proceedings of the International Conference on the Theory and Application of Cryptology and Information
Security, Taipei, Taiwan, 30 November–4 December 2003; Springer: Berlin/Heidelberg, Germany, 2003; pp. 37–54.
Intrusion Detection Based on Gray-Level Co-Occurrence Matrix
and 2D Dispersion Entropy
Gianmarco Baldini 1, *, Jose Luis Hernandez Ramos 1 and Irene Amerini 2
Abstract: The Intrusion Detection System (IDS) is an important tool to mitigate cybersecurity threats
in an Information and Communication Technology (ICT) infrastructure. The function of the IDS is to
detect an intrusion to an ICT system or network so that adequate countermeasures can be adopted.
Desirable features of IDS are computing efficiency and high intrusion detection accuracy. This paper
proposes a new anomaly detection algorithm for IDS, where a machine learning algorithm is applied
to detect deviations from legitimate traffic, which may indicate an intrusion. To improve computing efficiency, a sliding window approach is applied, where the analysis is performed on large sequences of network flow statistics. This paper proposes a novel approach based on the transformation of the network flow statistics into gray images, on which the Gray Level Co-occurrence Matrix (GLCM) is applied together with an entropy measure recently proposed in the literature: the 2D Dispersion Entropy.
This approach is applied to the recently published IDS data set CIC-IDS2017. The results show that the proposed approach is competitive in comparison to other approaches proposed in the literature on the same data set. The approach is applied to two attacks of the CIC-IDS2017 data set, DDoS and Port Scan, achieving Error Rates of 0.0016 and 0.0048, respectively.
Keywords: intrusion detection systems; security; machine learning; communication
Citation: Baldini, G.; Hernandez Ramos, J.L.; Amerini, I. Intrusion Detection Based on Gray-Level Co-Occurrence Matrix and 2D Dispersion Entropy. Appl. Sci. 2021, 11, 5567. https://doi.org/10.3390/app11125567
This paper focuses on an anomaly detection approach where the network flows data
is collected in windows of fixed size, which are then converted to gray images on which the
Gray level Co-occurrence Matrix (GLCM) is calculated. Then, the features (e.g., contrast) of
the GLCM are used as input to a machine learning algorithm for threat detection. In addition, the 2D Dispersion Entropy (2DDE), recently introduced in [4], is also calculated as an additional feature of the GLCM. To the knowledge of the authors, this approach is novel in
IDS literature both from the point of view of the application of GLCM and the application
of 2D Dispersion Entropy. The application of the sliding window and the GLCM allows a
significant dimensionality reduction. First of all, the number of samples of the data set is
reduced by the size of the sliding window (WS in the rest of this paper). For example, the
data from the IDS is processed in windows of size WS = 100 ∗ number of features of the
data set (NF = 78 for the data set used in this paper). Then, the window data is converted
to a grayscale image, which implies a further dimensionality reduction because the output of the GLCM is a matrix of size Q_F × Q_F, where Q_F is the quantization factor of the GLCM. Then, the GLCM features (e.g., contrast, Shannon entropy) plus the 2DDE applied to the GLCM are calculated to implement an additional dimensionality reduction step. Finally, the reduced
data set is provided as an input to a machine learning algorithm. The application of
the Sequential Feature Selection (SFS) algorithm (a wrapper feature selection algorithm)
further reduces the number of features. The challenge is to preserve the discriminating characteristics in the data set, which allow the attack to be detected with significant accuracy.
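The dimensionality-reduction pipeline above hinges on the GLCM. Below is a minimal NumPy sketch of computing a GLCM and its contrast feature; the quantization factor Q_F = 8, the horizontal one-pixel offset, and the random stand-in window of 100 × 78 flow features are assumptions for illustration:

```python
import numpy as np

def glcm(image, levels, offset=(0, 1)):
    """Count co-occurrences of gray-level pairs at the given pixel offset."""
    m = np.zeros((levels, levels), dtype=np.int64)
    dr, dc = offset
    rows, cols = image.shape
    for r in range(rows - dr):
        for c in range(cols - dc):
            m[image[r, c], image[r + dr, c + dc]] += 1
    return m

def contrast(m):
    """GLCM contrast: sum over i, j of (i - j)^2 * p(i, j)."""
    p = m / m.sum()
    i, j = np.indices(m.shape)
    return ((i - j) ** 2 * p).sum()

QF = 8                                          # assumed quantization factor
rng = np.random.default_rng(0)
window = rng.integers(0, 256, size=(100, 78))   # stand-in window: WS=100, NF=78
gray = (window * QF // 256).astype(np.int64)    # rescale to QF gray levels
G = glcm(gray, QF)
print(G.shape, contrast(G))
```

Whatever the window size, the GLCM output stays Q_F × Q_F, which is the dimensionality reduction the text describes.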
The rationales for the approach proposed in this paper are the following. The first reason is related to the choice of using the GLCM, beyond the need for dimensionality reduction
as explained above. The idea is that the sequential structure of the network flows, in
case of an intrusion, is altered in comparison to the legitimate traffic. Since the GLCM is
created by calculating how often pairs of pixels with a specific value and offset occur in
the image, the underlying idea of the approach is that numbers of pairs of pixels will be
altered when an attack is implemented. Such changes will be reflected in the frequencies
of the number of pairs, which (in turn) will have an impact on GLCM features (e.g.,
contrast) or information theory measures like entropy. The second reason for the proposed
approach is that the classical Shannon entropy measure is only based on the histogram
of GLCM elements while it would also be valuable to evaluate the sequences of GLCM
elements since they may provide further information on the presence of the attack. For this
reason, the 2D Dispersion Entropy (2DDE) was introduced in this study. As described in Section 3.4 later in this paper, 2DDE allows the irregularity of images to be analyzed on the basis of the frequency of patterns in the image, which can provide more information than the classical Shannon entropy.
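A simplified sketch of a 2D dispersion entropy in this spirit is shown below: the image is quantized into c classes and the Shannon entropy of the frequencies of m × m patterns is normalized to [0, 1]. The class count c = 3 and embedding size m = 2 are illustrative choices, and the full method of [4] additionally maps pixel values through a normal cumulative distribution function:

```python
import numpy as np
from collections import Counter

def dispersion_entropy_2d(image, classes=3, m=2):
    """Normalized entropy of the frequencies of m x m quantized patterns."""
    x = image.astype(float)
    # Quantize each pixel into one of `classes` levels (0..classes-1).
    lo, hi = x.min(), x.max()
    z = np.clip(np.floor((x - lo) / (hi - lo + 1e-12) * classes), 0, classes - 1)
    # Collect every m x m dispersion pattern.
    rows, cols = z.shape
    patterns = Counter(
        tuple(z[r:r + m, c:c + m].ravel())
        for r in range(rows - m + 1)
        for c in range(cols - m + 1)
    )
    freqs = np.array(list(patterns.values()), dtype=float)
    p = freqs / freqs.sum()
    ent = -(p * np.log(p)).sum()
    return ent / np.log(classes ** (m * m))   # normalize to [0, 1]

rng = np.random.default_rng(1)
noisy = rng.integers(0, 256, size=(32, 32))   # irregular image
flat = np.zeros((32, 32))                     # perfectly regular image
print(dispersion_entropy_2d(noisy), dispersion_entropy_2d(flat))
```

A perfectly regular image contains a single pattern and scores 0, while an irregular image spreads probability mass across many patterns and scores close to 1, which is the property exploited for attack detection.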
This study uses the CIC-IDS2017 data set [5], which was published in 2017 and has been increasingly used by the IDS research community.
The results shown in this paper demonstrate that this approach manages to remain competitive in terms of detection performance in comparison to more sophisticated and computationally demanding approaches based on Deep Learning (DL) applied to the same data set [6,7].
To summarize, the contributions of this paper are the following:
• GLCM is applied to an IDS problem where the network traffic features are transformed
to grayscale images on which GLCM is applied. An extensive evaluation of the GLCM
hyperparameters on detection accuracy is implemented. To the knowledge of the authors, this is the first time that the GLCM, in combination with 2DDE, has been used for the IDS problem. This work has not been published previously, and this is its first submission for review.
• 2D Dispersion Entropy (2DDE) is used as additional GLCM feature. We demonstrate
that the use of this entropy measure contributes significantly to the capability of the
proposed approach to detect a cybersecurity attack.
• The study uses the recent IDS CIC-IDS2017 data set instead of older data sets, which
may not be representative any longer of modern networks.
We highlight that the approach is based only on the network flow features and it does
not attempt to perform a deep-packet inspection on the network traffic. In addition, it is
limited in scope to two specific attacks of the CIC-IDS2017 data set: DDoS and Port Scan
attack since they are the ones with the most significant number of samples in the data set
and they are the ones where the research community has given much attention [7–10], which
is relevant for the comparison of the results of this paper with literature (see Section 4).
The structure of this paper is the following: Section 2 provides the literature review.
Section 3 describes the overall workflow of the approach, the concept of GLCM, the defini-
tion of 2D Dispersion Entropy and the materials (i.e., CICIDS2017 data set) used to evaluate
the approach. In addition, Section 3 describes the machine learning algorithms adopted
for the detection and the evaluation metrics. Section 4 presents the results, including the
findings from the hyperparameters optimization phase and the comparison to the other
approaches used in literature. Finally Section 5 concludes this paper.
2. Related Works
IDSs have been proposed in the literature for more than 20 years. As described in [1], an IDS performs the essential function of detecting unauthorized intruders and attacks against a wide scope of electronic devices and systems: from computers to network infrastructures, ad hoc networks, and so on. Since that seminal survey, many different types of IDS have
been proposed and various classifications of IDS can be found in literature. One early
classification in [1] defines two main IDS categories: offline IDS where the analysis of logs
and audit records is performed some time after the traffic network operation (e.g., the
analysis is executed the day after the network or computer system activity) and the online
(or real-time) IDS where the analysis is performed directly on the traffic or immediately
after the traffic features are calculated (e.g., average duration of the packets or average
time of the connection). For example, the online IDS performs the analysis on a single
or a set of observations (e.g., network flows) at the time after an initial training phase,
while the offline IDS analyzes all the observations of the day before. More recent surveys
like [11–13] provide different taxonomies for IDS. For example, IDSs can be classified
in the category of signature detection or anomaly detection. In signature detection, the
intrusion is detected when the system or network behavior matches an attack signature
stored in the IDS internal databases. Signature-based IDSs have the advantage that they
can be very accurate and effective at detecting known threats, and their mechanism is
easy to understand. On the other hand, signature-based IDSs are ineffective at detecting new attacks and variants of known attacks, because a matching signature for these attacks is still
unknown. In anomaly detection, the activities of a system at an instant (e.g., an observation
or a set of observations of network traffic) are compared against the normal behavior profile
calculated in a training phase against legitimate traffic. Machine Learning (ML) or DL can
be used to evaluate how traffic samples are different from legitimate traffic and they can
be used to classify the network traffic into the proper category. The disadvantages of the anomaly detection approach are the significant computing effort, the difficulty of defining the proper model, and the potentially high number of False Positives (FP) [14].
The method proposed in this paper is anomaly detection, where a dimensionality
reduction is performed to improve the detection time and accuracy. The dimensionality
reduction is implemented using a sliding window approach where the initial data samples
(the network flows data) are collected in windows of size WS (this is the name of the
parameter used in the rest of this paper). Then, features are calculated on the window
set of data. This approach has already been used in the literature to achieve dimensionality reduction [2,3]. In the rest of this section, we identify some key studies with a specific focus
on IDS approaches based on the sliding window concept and/or the use of entropy mea-
sures. We also report on studies where image-based approaches are used in combination
with ML or DL.
Shannon entropy is usually adopted as a feature calculated on the windowed set of
data. The reason is that intrusion attacks have been demonstrated to alter the entropy
of the network flows traffic. For example, the authors in [15] have proposed a detection
method called D-FACE to differentiate legitimate traffic and DDoS attacks. The method
compares the Shannon entropy calculated on the source IP data of the normal traffic flows
with the traffic in a specific time window (e.g., the observation). This entropy difference is
called Information Distance (ID) and is used as the detection metric when the calculated
entropy goes beyond thresholds based on legitimate traffic. In another example, the
authors of [10] have used a sophisticated approach to evaluate the difference between
legitimate traffic and anomalous traffic potentially linked to a DDoS attack by using
Shannon entropy. Then, the authors employ a Kernel Online Anomaly Detection (KOAD)
algorithm using the entropy features to detect input vectors that were suspected to be DDoS.
Another IDS approach based on sliding window and conditional entropy is proposed in [16]
where anomalies related to various attacks including DDoS are detected in a two steps
approach. The maximum entropy method is first used to create a normal model in which
the classes of network packets are distributed and have the best uniform distribution. In
a second step, conditional entropy is then applied to determine the difference between
the distribution of packet classes in current traffic compared to the distribution found as
a result of the maximum entropy method. The authors in [17] have also used a sliding
window approach combined with Shannon entropy to detect Denial of Service Router
Advertisement Flooding Attacks (DSRAFA). A fixed sliding window of 50 packets was
used and a threshold mechanism was adopted to identify traffic anomalies which could
indicate the attack.
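The sliding-window entropy idea recurring in these studies can be sketched generically; the window size of 50, the fixed threshold rule, and the low-entropy attack signature below are illustrative assumptions, not the exact mechanisms of [15–17]:

```python
import math
from collections import Counter

def shannon_entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def flag_windows(source_ips, window_size=50, threshold=1.0):
    """Flag windows whose source-IP entropy drops below a threshold.

    A flood dominated by few sources collapses the entropy of the
    source-IP distribution, which a simple threshold can detect.
    """
    flags = []
    for start in range(0, len(source_ips) - window_size + 1, window_size):
        window = source_ips[start:start + window_size]
        flags.append(shannon_entropy(window) < threshold)
    return flags

# Legitimate-looking window: many distinct sources -> high entropy.
normal = [f"10.0.0.{i % 25}" for i in range(50)]
# Attack-looking window: one dominant source -> near-zero entropy.
attack = ["10.0.0.1"] * 50
print(flag_windows(normal + attack))  # prints [False, True]
```

In practice, the threshold would be calibrated on legitimate traffic rather than fixed, as in the information-distance scheme of [15].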
The data presented in a sliding window can also be transformed to enhance the detection accuracy. With the advent of DL, and of Convolutional Neural Networks (CNN) in particular, an approach adopted by some authors is to convert the batch data of a sliding window into an image, which is then provided as input to a CNN-based detection algorithm. This approach was recently proposed in [18], where the data of the network traffic flows is transformed into images which are given as input to a CNN combined with Long Short-Term Memory (LSTM). A similar approach is adopted in this paper, with the difference that DL is not used, since it can be quite time-consuming; a more conventional texture analysis approach is used instead, together with a novel entropy measure. Another DL approach
is proposed by the authors in [19], where a conditional variational autoencoder is used for intrusion detection in IoT. The conversion of flow features to grayscale images is also adopted in [20], where the authors propose a method which extracts 256 features from the flow and maps them into 16 ∗ 16 grayscale images, which are then used in an improved CNN to classify flows. On the other hand, none of the papers investigated by the authors adopt other image analysis tools for IDS, such as the GLCM adopted in this study. This may be due to the consideration that DL has become the state of the art in image processing, even if it comes at the cost of a significant computational effort.
The approach presented in this paper thus combines the image-based concept of [18,20], where sets of network flows are combined into images, with the approach of [10,15], where an information-theoretic measure (i.e., entropy) is used in combination with conventional machine learning. We show in the Results Section 4 that this approach manages to provide competitive detection results in a time-efficient way in comparison to studies using the same CICIDS2017 data set used in this paper.
2. The sliding window data (of size WS ∗ 78) is converted to gray images by rescaling the values of the network flow features. The rescaling is implemented by linearly converting the original values of the network flows in the sliding window to the range 0–255 (for each network flow feature), obtaining 256 levels of gray. Examples of the resulting gray images for the Legitimate traffic and the Port Scan traffic are shown in Figure 2, where the y-axis represents the id of the network flow feature, while the x-axis represents the flow id. The sliding window is applied in sequential order regardless of the IP origin, as it was created in the public data set [5] used in this paper.
3. The GLCM is applied to the gray images with different values of the GLCM hyperparameters (see Section 3.3 for the definition of the GLCM and its hyperparameters). One of the important hyperparameters is the quantization factor QF. In other words, different GLCMs are created for each of the distances and directions considered in Section 3.3 (even for different values of GD) and for each value of the quantization factor QF. The resulting size of the GLCM is QF ∗ QF.
4. The GLCM features (e.g., contrast) are calculated. In addition, the 2DDE is also calculated on the images. The definition of the 2DDE is presented in Section 3.4.
5. The ML algorithm is applied to the features calculated in the previous step. The algorithm used in this study and the related hyperparameters are described in Section 3.5.
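Steps 2 and 3 above can be sketched in a few lines of NumPy; this is a minimal illustration under our own naming (a production implementation would typically use an optimized GLCM routine), counting each pairing once, i.e., the Not Symmetric variant:

```python
import numpy as np

def window_to_gray(window):
    """Step 2: linearly rescale each feature (row) of a WS*78 window
    to 0..255, giving a grayscale image (features on the y-axis)."""
    img = window.astype(float).T          # shape: (n_features, WS)
    lo = img.min(axis=1, keepdims=True)
    span = np.ptp(img, axis=1, keepdims=True)
    span[span == 0] = 1                   # constant features map to level 0
    return np.round(255 * (img - lo) / span).astype(np.uint8)

def glcm(image, offset, qf):
    """Step 3: QF*QF co-occurrence matrix for one (dy, dx) offset, after
    quantizing the 256 gray levels down to QF levels; pairs counted once."""
    q = (image.astype(int) * qf) // 256   # quantize to 0..QF-1
    dy, dx = offset
    h, w = q.shape
    mat = np.zeros((qf, qf), dtype=int)
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            mat[q[y, x], q[y + dy, x + dx]] += 1
    return mat
```

For the Symmetric variant of step 3, the transpose can simply be added (`mat + mat.T`).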
[Figure: workflow of the proposed approach — (1) network traffic flows (CICIDS2017 data set) are collected with a sliding window of size WS∗78; (2) the windowed data are transformed to grayscale images (recalculated with each new window); (3) Gray-Level Co-occurrence Matrices (GLCM) are computed; (4) 2D Dispersion Entropy and GLCM features are extracted; (5) Machine Learning algorithms are applied to classify Legitimate/Normal traffic versus Attack (DDoS and PortScan) traffic.]
Finally, the hyperparameters of the GLCM and of the ML algorithm are tuned using the Error Rate (ER) as the evaluation metric. The definitions of the ER and of the other evaluation metrics are provided in Section 3.6.
[Figure: two grayscale images; y-axis: network flow feature id (1–78), x-axis: network flow id (1–200).]
(a) Grayscale image of the Legitimate network flows for WS = 200. (b) Grayscale image of the Port Scan network flows for WS = 200.
Figure 2. An example of the grayscale images with WS = 200 for legitimate and Port Scan network flows.
3.2. Materials
To evaluate the proposed approach, the publicly available CICIDS2017 data set described in [5] is used. This data set was chosen because it is relatively recent in comparison to older data sets like the KDD-99 data set, whose limitations are known and discussed in [14,21]. These limitations have prompted the research community to generate new, even simulated, data sets like the ones proposed in [22]. The CICIDS2017 data set is based on a real network where intrusion attacks have been implemented; thus, it satisfies one of the requirements for data sets identified in [21]. As described in [5], the test bed used to implement the attacks was divided into two completely separate networks: the Victim-Network and the Attack-Network. In the Victim-Network, the creators of the CICIDS2017 data set included routers, firewalls and switches, along with different versions of the three common operating systems: Windows, Linux and Macintosh. The Attack-Network is implemented by one router, one switch and four PCs, which run the Kali and Windows 8.1 operating systems. The Victim-Network consists of three servers, one firewall, two switches and ten PCs interconnected by a domain controller (DC) and active directory. The data set contains normal traffic (i.e., legitimate traffic with no attacks) and traffic with the most up-to-date common attacks, collected over five days. We selected two types of attacks for this study: the DDoS attack and the PortScan attack. These attacks were chosen because they are quite representative of intrusion attacks and because they have the largest number of samples in the CICIDS2017 data set. Both attacks were generated on the last day of the data set. The DDoS traffic in this data set was generated with a tool that floods UDP and TCP requests to simulate network-layer DDoS attacks, and HTTP requests to simulate application-layer DDoS attacks. The PortScan attack was executed from all the Windows machines through the main switch. The data set is completely labeled and includes 78 network traffic features, which were extracted using the CICFlowMeter software package described in [5]. Note that the CICFlowMeter outputs 84 features including the label (see [23] for a description of all the features); we removed features 1 (Flow ID), 2 (Source IP), 3 (Source Port), 4 (Destination IP) and 5 (Destination Port), and the last field is used as the label, thus obtaining the 78 features used in this paper.
Two separate data sets are created from the original CICIDS2017 data set: one containing only the legitimate traffic and the Distributed Denial of Service (DDoS) network flows, and another containing only the legitimate traffic and the Port Scan network flows. The two data sets were created by selecting from the whole data set only the network flows labelled as legitimate traffic and as the specific attack (DDoS or PortScan); all the network flows from the other attacks were removed from each data set.
Table 1 shows the number of legitimate/benign traffic samples and the attack samples
for the DDoS and the PortScan attacks.
Table 1. Number of samples for the legitimate/benign traffic and the DDoS and PortScan attacks in
the CICIDS2017 data set considered in the study.
As described in Section 3.5 later in this paper, the data set is subdivided into folds, which contain exclusive portions of the data set with both legitimate traffic and traffic related to the intrusion attack. In this study, a number of folds equal to 3 was selected to ensure that each fold has enough samples of the attack, since the CICIDS2017 data set is unbalanced like many other intrusion data sets: the number of traffic samples related to the intrusion is usually much smaller than the number of legitimate traffic samples. The approach proposed in this paper is applied separately to each fold and then the values are averaged. The optimization step is also performed by averaging the results from all the folds. This technique of subdividing the data set is one of the guidelines for the application
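The fold-based evaluation described above can be sketched as follows; this is a generic stratified split with averaging (the exact fold-construction details are not specified in the paper, so the stratification and seeding here are our assumptions):

```python
import numpy as np

def stratified_folds(labels, n_folds=3, seed=0):
    """Split sample indices into n_folds exclusive folds, keeping the
    (unbalanced) attack/legitimate proportion roughly equal per fold."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        for i, part in enumerate(np.array_split(idx, n_folds)):
            folds[i].extend(part.tolist())
    return folds

def averaged_metric(labels, metric_on_fold, n_folds=3):
    """Apply the detection approach separately to each fold and average
    the resulting metric values (e.g., the Error Rate)."""
    folds = stratified_folds(labels, n_folds)
    return float(np.mean([metric_on_fold(f) for f in folds]))
```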
[Figure: example of the GLCM direction [0 1] (0°) with distance = 1.]
A third hyperparameter (Symmetric or Not Symmetric) determines whether the ordered pairs of values are counted once or twice. When the hyperparameter is set to Symmetric, the GLCM is calculated by counting each pairing twice (once in each order). When it is set to Not Symmetric, the GLCM is calculated by counting each pairing only once.
[Figure: worked example — a 4 ∗ 5 grayscale image (left) and its GLCM (right) computed with QF = 8 and direction [0 1] (0°):

    4x5 grayscale image      GLCM (QF = 8, [0 1], 0°)
    1 1 5 6 8                1 2 0 0 1 0 0 0
    2 3 5 7 1                0 0 1 0 1 0 0 0
    4 5 7 1 2                0 0 0 0 1 0 0 0
    8 5 1 2 5                0 0 0 0 1 0 0 0
                             1 0 0 0 0 1 2 0
                             0 0 0 0 0 0 0 1
                             2 0 0 0 0 0 0 0
                             0 0 0 0 1 0 0 0
]
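The worked example in the figure above (the 4 ∗ 5 image and its GLCM with QF = 8 and direction [0 1]) can be reproduced with a few lines of NumPy; `glcm_horizontal` is our own illustrative helper, and setting `symmetric=True` shows the effect of counting each pairing twice:

```python
import numpy as np

def glcm_horizontal(img, levels, symmetric=False):
    """GLCM for direction [0 1] (0 degrees) at distance 1: count how
    often gray level a appears immediately left of gray level b. With
    symmetric=True each pairing is counted twice (once in each order),
    which is equivalent to adding the transpose."""
    m = np.zeros((levels, levels), dtype=int)
    for row in img:
        for a, b in zip(row[:-1], row[1:]):
            m[a - 1, b - 1] += 1          # gray levels are 1-based here
            if symmetric:
                m[b - 1, a - 1] += 1
    return m

img = np.array([[1, 1, 5, 6, 8],
                [2, 3, 5, 7, 1],
                [4, 5, 7, 1, 2],
                [8, 5, 1, 2, 5]])
g = glcm_horizontal(img, levels=8)                  # Not Symmetric: 16 pairings
g_sym = glcm_horizontal(img, levels=8, symmetric=True)   # equals g + g.T
```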
In the original GLCM definition, it is possible to calculate the GLCM along all the possible directions, but an evaluation of the data set by the authors in this specific IDS context has shown that the additional directions not described in Figure 3 are duplications of the directions already identified; they would add unneeded computing effort, as they would grow the number of features on which the ML has to be applied. A quantitative confirmation that the angles shown in Figure 3 have a higher detection performance than using all the angles of the GLCM is provided in Section 4. Therefore, in the rest of this paper, we will use the 2-tuples [0 GD], [−GD 0], [−GD −GD] and [−GD GD].
As described in the Introduction (Section 1), the idea of using the GLCM in the context of IDS is that the sequential structure of the network flows in case of an intrusion is altered in comparison to the legitimate traffic. Since the GLCM is created by calculating how often pairs of pixels with a specific value and offset occur in the image, the underlying idea of the approach is that the numbers of pairs of pixels will be altered when an attack is implemented. The challenge is that it is not known a priori how the choice of the values of the hyperparameters influences the detection accuracy of the intrusion attack, since this depends on the context (e.g., the topology of the network, the type of traffic and the type of attack). Therefore, an optimization process has to be performed, which is described in detail in Section 4.1.
y_{i,j} = (1 / (σ√(2π))) ∫_{−∞}^{x_{i,j}} e^{−(t−μ)^2 / (2σ^2)} dt    (1)

z_{k,l}^{m,c} = { z_{k,l}^{c}, z_{k,l+1}^{c}, z_{k,l+2}^{c}, . . . , z_{k,l+(mQF−1)}^{c} }

2DDE(m, c) = − Σ_{π=1}^{c^(mQF × mQF)} p(π_{v0,v1,...,v_{(mQF−1)×(mQF−1)}}) × ln p(π_{v0,v1,...,v_{(mQF−1)×(mQF−1)}})    (4)
As in the one-dimensional dispersion entropy, the values of the parameters m and c should be tuned to achieve optimal performance (in this case, the detection of the attack). On the other hand, the parameters m and c are bounded [4] by the size of the time series on which the 2DDE has to operate. As described in [27], even for the one-dimensional dispersion entropy, it is suggested that the number of potential dispersion patterns c^m be smaller than the length of the signal in order to work with reliable statistics. In the two-dimensional case, the rule reported in [4], adapted to this case where the GLCM is a square matrix of size QF ∗ QF, is that (c^(mQF))^2 < (QF − mQF − 1)^2, which limits the space of the values of m and c to a few values given the range of QF. Considering that QF spans from 12 to 48 in this study, the combinations c = 2, m = 3 and c = 3, m = 2 are chosen. It must also be taken into consideration that higher values of m increase the computing time, which is not desirable for a large data set like the one used in this study.
As pointed out in the Introduction, the rationale for using the 2DDE in this study is that the 2DDE allows the analysis of the irregularity of images on the basis of the frequency of the dispersion patterns in the image [4], which can provide more information than the classical Shannon entropy. Since an intrusion attack usually disrupts the regularity of the structure of legitimate traffic, the application of the 2DDE can provide a significant discriminating power for the detection of the attack.
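As an informal illustration only (a simplified sketch, not the exact 2DDE of [4]; the boundary handling and class-mapping constants are our assumptions), the following code maps pixels to c classes through the normal CDF of Equation (1) and computes the entropy of the m ∗ m dispersion-pattern frequencies:

```python
import math
import numpy as np

def two_d_dispersion_entropy(img, m=2, c=3):
    """Simplified 2D dispersion entropy: map each pixel to one of c
    classes through the normal CDF, slide an m*m window over the image,
    and take the Shannon entropy of the dispersion-pattern frequencies."""
    x = img.astype(float)
    mu, sigma = x.mean(), x.std() or 1.0        # guard constant images
    y = 0.5 * (1 + np.vectorize(math.erf)((x - mu) / (sigma * math.sqrt(2))))
    z = np.clip(np.round(c * y + 0.5).astype(int), 1, c)   # classes 1..c
    h, w = z.shape
    counts = {}
    for i in range(h - m + 1):
        for j in range(w - m + 1):
            pattern = tuple(z[i:i + m, j:j + m].ravel())
            counts[pattern] = counts.get(pattern, 0) + 1
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())
```

A perfectly regular image yields a single pattern (entropy 0), while irregular traffic-derived images spread the probability mass over many patterns, increasing the value.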
ER = 1 − (TP + TN) / (TP + FP + FN + TN)    (5)
where TP is the number of True Positives, TN is the number of True Negatives, FP is the
number of False Positives and FN is the number of False Negatives.
To complement the accuracy metric, the True Positive Rate (TPR) and the False Positive
Rate (FPR) are used, which are defined in the following equations:
TPR = TP / (TP + FN)    (6)

FPR = FP / (FP + TN)    (7)
Another method to evaluate the performance of the approach proposed in this paper is the Receiver Operating Characteristic (ROC) curve, which is created by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold settings. A metric based on the ROC curve is the Equal Error Rate (EER), which is the point on the ROC curve that corresponds to an equal probability of misclassifying a positive or a negative sample.
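Equations (5)–(7) and the EER can be computed as follows; the EER estimator here simply picks the sampled ROC point where FPR and the false negative rate (1 − TPR) are closest, which is an approximation of the exact crossing point:

```python
import numpy as np

def rates(tp, tn, fp, fn):
    """ER, TPR and FPR as defined in Equations (5)-(7)."""
    er = 1 - (tp + tn) / (tp + tn + fp + fn)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return er, tpr, fpr

def equal_error_rate(fpr_curve, tpr_curve):
    """EER: the ROC point where the false positive rate equals the
    false negative rate (1 - TPR), taken at the sampled point where
    |FPR - FNR| is smallest."""
    fpr = np.asarray(fpr_curve)
    fnr = 1 - np.asarray(tpr_curve)
    i = np.argmin(np.abs(fpr - fnr))
    return (fpr[i] + fnr[i]) / 2
```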
and Homogeneity is the inverse difference moment on the basis of the terms described
in [31].
Table 2. List of features used in this study (GD is the GLCM distance and the 2-tuples indicate the
angles used to build the GLCM).
As shown in Table 2, the GLCM is calculated on the gray image (created from the sliding window) for different values of the angle for a specific value of the distance GD. Then, the related features for each specific GLCM are calculated. For example, one GLCM is calculated for the angle [0 GD] while another GLCM is calculated for the angle [−GD −GD].
The optimal set of features is selected using forward sequential feature selection (SFS). In the forward sequential search algorithm, features are added one at a time to a candidate subset while evaluating the criterion. Since an exhaustive comparison of the criterion value over all subsets of the 64 features from Table 2 (repeated for all the values of the hyperparameters) is typically infeasible, the forward sequential search moves only in the direction of growing the subset from an initial feature (the one with the lowest ER when all the features are considered). The best ten features are used to calculate the final evaluation metrics: ER, FPR and FNR. Ten features were adopted as the trade-off between the need to limit the number of features for the application of ML and the increase in detection accuracy (beyond ten features, the improvement in detection accuracy was minimal).
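The forward sequential search described above can be sketched as a greedy loop; `score` here stands for any criterion to minimize (e.g., the cross-validated ER of the ML algorithm on the candidate subset), and the stopping rule at k features mirrors the choice of ten made in this paper:

```python
def forward_sfs(features, score, k):
    """Greedy forward sequential feature selection: starting from the
    empty set, repeatedly add the feature whose inclusion minimises the
    criterion returned by score(subset), stopping after k features."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = min(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

The search evaluates O(n · k) subsets instead of the 2^n required by an exhaustive comparison, which is what makes the optimization over 64 features tractable.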
Apart from the application of the ML algorithms, the difference in the discriminating power of the 2DDE in comparison to the other features to detect an attack can be visualized through the trends of the features over the network flows. Figure 5a,b show, respectively, the trend of GLCM-2DDE and GLCM-Entropy (i.e., Shannon Entropy) for the Port Scan attack. The blue plot shows the trend of the specific feature, while the bar graph (purple bars superimposed on the plot) identifies the windows where the attack is implemented and labelled. It can be seen in Figure 5a that the values of GLCM-2DDE (called simply 2DDE in the rest of this paper) are notably higher in correspondence with the Port Scan attack than with the normal legitimate traffic. This difference is less evident for the GLCM-Entropy feature. These differences in values are the reason why the performance of GLCM-2DDE is higher than that of GLCM-Entropy when ML is applied.
We highlight that Figure 5 and the previous paragraph are provided only for informational purposes, to give the reader a visual impression of the difference in the trends in the data set once two different entropy measures are applied to the network flows data. Figure 5 is not used to select features for the classification phase, because the SFS is used for this purpose, as described in Section 4.
(a) Trend of the GLCM and 2DDE feature with m = 2 and c = 3 (FID = 6) on the CICIDS2017 data set (DDoS and legitimate traffic only). (b) Trend of the GLCM and Shannon Entropy feature (FID = 5) on the CICIDS2017 data set (DDoS and legitimate traffic only).
Figure 5. Trends of two features on the CICIDS2017 data set (DDoS and legitimate traffic only) and WS = 100. The purple bars indicate the labels of the DDoS attack.
4. Results
4.1. Optimization
This sub-section provides the results on the optimization of the hyperparameters
described in the previous sections.
A grid approach was used to determine the optimal values of the hyperparameters. While other methods (e.g., gradient-based or meta-heuristic algorithms) could be more efficient, it should be considered that the ranges of values for each hyperparameter are quite limited. In addition, the intention is to show in an explicit way the impact of each hyperparameter on the detection performance. The ER metric is used to determine the optimal values of the hyperparameters.
The summary of the hyperparameters used in this study, their optimal values and their ranges is shown in Table 3. In the rest of this sub-section and the related figures, we show how a specific hyperparameter impacts the detection accuracy of the threat for both the DDoS attack and the Port Scan attack. For each presented result, the other hyperparameters are set to the values identified in Table 3. The Decision Trees (DT) ML algorithm was used to generate the results provided in this sub-section. As shown in Section 4.3, the DT algorithm has a higher detection performance than the SVM and Naive Bayes algorithms.
Table 3. Summary of the hyperparameters in the proposed approach and related optimal values.
The following figures describe the results for the evaluation of the proposed approach
for different values of the hyperparameters and for the different features used in the study.
In most cases, the evaluation of a single hyperparameter is provided while the other
hyperparameters are set to the values described in Table 3 unless otherwise noted.
Figure 6a,b show, respectively for the Port Scan and the DDoS attacks, the impact of the GLCM distance GD for different values of the window size WS. These results are obtained using all the 64 features identified in Table 2. It can be noted that the optimal value of WS is 100 network flows, as the ER increases with larger values of WS. This may be due to the fact that the differences between legitimate traffic and the traffic related to the attack are more evident when WS is relatively small. On the other hand, WS = 100 is the lower limit of WS that allows the GLCM to operate on a grayscale picture large enough to obtain meaningful values. Figure 6a shows that a value of GD = 1 is optimal to detect the Port Scan attack, while Figure 6b shows that a value of GD = 2 is optimal for the DDoS attack. These results seem to indicate that there is no need to use values of GD larger than 2, which would also be more computing intensive.
Then, the impact of the quantization factor QF was evaluated. As described before, the quantization factor is an important factor in the application of the GLCM. A large value of QF provides a higher granularity, which can be beneficial in the application of the ML algorithm for the detection of the threat. On the other hand, a large value of QF makes the calculation of the GLCM features and of the 2DDE more computationally expensive, as the resulting GLCM matrices are larger (the GLCM size is QF ∗ QF). This is an important trade-off, which was investigated for each specific feature and for each attack.
Figure 7a,b show the impact of the QF parameter on the detection accuracy, respectively, for the Port Scan and the DDoS attack for the first 8 features (only the first 8 features are shown in these figures for reasons of space, but subsequent figures will consider all features). The value of WS is set to 100, since the previous Figure 6 has shown that WS = 100 is the optimal value for attack detection. Figure 7a,b provide two important results: the first is that they identify the optimal values of the QF parameter (QF = 40 for the Port Scan attack and QF = 44 for the DDoS attack). The second is that they show that the 2DDE features have a better performance than the other features. This result justifies the assumption made in this paper for the application of the 2DDE to the problem of IDS.
[Figure: ER vs. WS (100–500) for GD = 1, 2, 3 and 4.]
(a) Error Rate (ER) dependence on GLCM distance GD for Port Scan attack. (b) Error Rate (ER) dependence on GLCM distance GD for DDoS attack.
Figure 6. Dependence on GLCM distance GD and Window size WS using best selected features. DT algorithm is used.
Figure 7 shows only the first 8 features. A more extensive analysis of the detection performance of each of the 64 features was therefore carried out by setting the other hyperparameters (GD, QF and WS) to their optimal values. The results are shown in Figure 8a,b, where the ER is reported for each feature identified by the FID identifier. To better visualize the features related to the 2DDE, a red bar is used in the figures. Figure 8a,b show that the 2DDE is able to obtain a consistently high detection accuracy in comparison to the other features across all the 64 features. In particular, for both attacks, the values of m = 2 and c = 3 in the 2DDE definition provide a better performance than the values of m = 3 and c = 2. This result shows the higher detection performance of the 2DDE in comparison to the other features (e.g., Shannon entropy or variance). The results shown in these figures also give an indication of the GLCM angle which performs best. In general, the GLCM distance and angle defined by the 2-tuple [0 GD] (which corresponds to FID = 1 . . . 8) provide better results (in terms of detection accuracy) than the other 2-tuples.
[Figure: ER vs. QF (12–48 in steps of 4) for the first 8 features: FID=1 Contrast, FID=2 Energy, FID=3 Homogeneity, FID=4 Correlation, FID=5 Shannon Entropy, FID=6 2DDE (m=2,c=3), FID=7 2DDE (m=3,c=2), FID=8 Sum of Variances.]
(a) Error Rate (ER) vs. quantization factor of the GLCM QF for the Port Scan attack with WS = 100 for the first 8 features (FID = 1 . . . 8). (b) Error Rate (ER) vs. quantization factor of the GLCM QF for the DDoS attack with WS = 100 for the first 8 features (FID = 1 . . . 8).
Figure 7. Dependence on GLCM distance GD and WS using the first 8 features (FID = 1 . . . 8). DT algorithm is used.
The importance of the 2DDE in comparison to the other features for the IDS problem is also visible once SFS is applied to select the optimal set of features on the basis of the values of the hyperparameters already set. The results of the application of SFS are presented in Table 4, where the 10 best features are shown, respectively, for the DDoS and the Port Scan attack. In Table 4, the 2DDE features are highlighted in red. It can be seen that the 2DDE features are substantially present among the 10 best features, which shows that the application of the 2DDE to this specific problem is an important element in achieving a higher detection accuracy of the attack.
Table 4. Ten best features obtained for the Port Scan and the DDoS attack using the SFS approach.
DT algorithm is used.
[Figure: ER for each of the 64 features (FID = 1–64).]
(a) Error Rate (ER) vs. all features for the PortScan attack with WS = 100, QF = 40 and GD = 1. (b) Error Rate (ER) vs. all features for the DDoS attack with WS = 100, QF = 44 and GD = 2.
Figure 8. Error Rate (ER) relation with all features with WS = 100. The features related to 2DDE are highlighted with a red bar for improved visualization. DT algorithm is used.
[Figure: six bar plots of ER, FPR and FNR vs. the number of samples (All/20, All/18, ..., All/2, All, where 'All' means the whole data set).]
(a) ER for the PortScan attack for different sizes of the data set. (b) ER for the DDoS attack for different sizes of the data set. (c) FPR for the PortScan attack for different sizes of the data set. (d) FPR for the DDoS attack for different sizes of the data set. (e) FNR for the PortScan attack for different sizes of the data set. (f) FNR for the DDoS attack for different sizes of the data set.
Figure 9. ER, FPR and FNR for the PortScan and DDoS attack for different sizes of the whole data set.
To complete the previous results, the ROCs for the PortScan and the DDoS attacks are presented, respectively, in Figure 10a,b. Since the FPR is relatively limited (because the data set is quite unbalanced), a more detailed view of the same ROCs (i.e., a zoom of the previous figures) is presented in Figure 10c,d, respectively, for the Port Scan and the DDoS attacks. The values of the EER for each value of WS are also reported. The results from the ROCs confirm the previous finding that the optimal value of the window size is WS = 100, because an increase of WS produces slightly worse results in terms of ROCs and EER. It can also be seen that the detection of the PortScan attack is slightly worse than that of the DDoS attack. This may be due to the fact that PortScan attacks are more difficult to distinguish from legitimate traffic than DDoS attacks when entropy measures are applied (especially in the CICIDS2017 data set). The structure of the sequences of network flow features in the DDoS attacks can be quite different from legitimate traffic (e.g., since a flooding of messages is implemented), while the PortScan attack traffic may resemble legitimate traffic. The weakness of the proposed approach in achieving an optimal FPR is also discussed in the comparison with the literature results in Section 4.3. We note that the proposed approach instead manages to achieve a very competitive FNR.
[Figure: ROC curves (True positive rate vs. False positive rate) for different values of WS.]
(a) ROCs and related EERs for the PortScan attack. (b) ROCs and related EERs for the DDoS attack. (c) Detailed view of the ROC for the Port Scan attack for different values of WS. (d) Detailed view of the ROC for the DDoS attack for different values of WS.
Figure 10. ROCs and related EERs for the PortScan and DDoS attack for different values of WS. DT with optimal hyperparameter values from Table 3 and optimal set of features from Table 4. The bottom figures show the detailed view of the ROCs.
Table 5. Summary table of the ER, FPR and FNR results obtained with this approach (different
machine learning algorithms) and the results from literature.
The results obtained for the DDoS attack are confirmed by the results obtained for the PortScan attack. The obtained FNR is better than the results reported in the literature, while the ER is also smaller than the results presented in other studies. In particular, our approach achieves an ER similar to the results in [9], which uses a DL approach (i.e., LSTM). On the other hand, the FPR obtained with this approach is higher than the results reported in the literature. Another result shown in Table 5 is that the Decision Tree algorithm has a better detection performance than the SVM and Naive Bayes algorithms. This result is consistent with [5], where the DT provided the optimal detection accuracy.
An evaluation of the use of all the GLCM angles was also implemented to validate the adoption of only a limited set of GLCM angles, as described in Section 3.3. The results are provided in Table 6 using the Decision Tree algorithm. The results in Table 6 show that the subset of GLCM angles selected in this study provides a better performance than using all angles, since the ERs for the subset are smaller than the ERs for all the GLCM angles. The results are consistent for different values of WS and for both the PortScan and DDoS attacks.
high (63 s for the DDoS attack and 83 s for the PortScan attack). These calculated times are based on WS = 100, since this was the window size with the minimum ER, and on the optimal selection of features presented in Table 4. In this study, a laptop with an Intel i7-8550U CPU running at 1.8 GHz with 16 GBytes of RAM and no GPU was used.
Table 6. Comparison on the set of GLCM angles using Error Rate (ER): subset of angles used in this
study in comparison to the use of all the GLCM angles.
5. Conclusions
This study proposes a novel approach to anomaly-based IDS that transforms network flow metrics into grayscale images. Gray-Level Co-occurrence Matrices (GLCM) are then calculated on the grayscale images, and features are extracted from the GLCM. Beyond the application of the well-known GLCM Haralick features (i.e., contrast, homogeneity, entropy), this paper proposes the novel application of 2D Dispersion Entropy (2DDE), recently proposed in the literature. The results show that the application of 2DDE to the GLCM significantly enhances the detection accuracy of the proposed IDS. The approach is applied to the recently published CICIDS2017 data set for two specific attacks: DDoS and Port Scan. The results of this approach are compared with those obtained by other studies on the same CICIDS2017 data set, yielding an Error Rate (ER) that is higher than or comparable with the results obtained with more sophisticated Deep Learning approaches, which require considerably more computing resources than our proposed approach. In addition, the False Negative Rate (FNR) obtained with our approach is significantly better than all other results reported in the literature. On the other hand, the False Positive Rate (FPR) is slightly worse than the results in the literature. This may be due to the transformation of the network flow features into gray-level images and then into GLCM-based features tending to lose the characteristics that distinguish attack-related traffic from normal traffic.
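The GLCM stage of this pipeline can be sketched in a few lines of Python. This is a toy illustration under our own assumptions (a tiny pre-quantized image, a single 0° offset, and only three Haralick features), not the authors' implementation, and it omits the 2DDE step of [4]:

```python
import math
from collections import Counter

def glcm(img, dx=1, dy=0, levels=4):
    """Normalized Gray-Level Co-occurrence Matrix for one pixel offset (angle)."""
    h, w = len(img), len(img[0])
    counts = Counter()
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < h and 0 <= c2 < w:
                counts[(img[r][c], img[r2][c2])] += 1
    total = sum(counts.values())
    return [[counts[(i, j)] / total for j in range(levels)] for i in range(levels)]

def haralick(P):
    """Contrast, homogeneity and entropy of a normalized GLCM."""
    n = len(P)
    contrast = sum(P[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
    homogeneity = sum(P[i][j] / (1 + abs(i - j)) for i in range(n) for j in range(n))
    entropy = -sum(p * math.log2(p) for row in P for p in row if p > 0)
    return contrast, homogeneity, entropy

# Toy 4x4 image already quantized to 4 gray levels (a stand-in for the
# grayscale image derived from the network-flow metrics).
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [2, 2, 3, 3],
       [2, 2, 3, 3]]
contrast, homogeneity, entropy = haralick(glcm(img))
```

In the paper, such features (plus the 2DDE of the GLCM) feed the classifier; here they are simply computed for one offset.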
Future developments will aim to improve the FPR by extending the proposed approach in several directions. One direction is to use Fuzzy Gray-Level Co-occurrence Matrices, which have demonstrated superior performance in some applications but have not yet been applied to IDS problems. Another direction is the application of nonlinear GLCM, where the quantization factor is calculated in a nonlinear way. The significant number of hyperparameters to tune in the approach (both in the GLCM definition and in the 2D dispersion entropy definition) is also a challenge to mitigate for a practical deployment of this approach. One possibility to resolve this challenge is to investigate meta-heuristic algorithms (e.g., particle swarm optimization) to tune the hyperparameters automatically. Another is to investigate hyperparameter optimization on other data sets to generalize the selection of the optimal values. Finally, the combination of GLCM with Deep Learning algorithms will also be considered. For example, Convolutional Neural Networks (CNN) could be applied to the GLCM representations rather than to the initial gray-scale images derived directly from the network flow statistics.
Author Contributions: Conceptualization, G.B., J.L.H.R. and I.A.; methodology, G.B.; writing, G.B., J.L.H.R. and I.A.; funding acquisition, G.B. and J.L.H.R. All authors have read and agreed to the published version of the manuscript.
Funding: This work has been partially supported by the European Commission through project
SerIoT funded by the European Union H2020 Programme under Grant Agreement No. 780139. The
opinions expressed in this paper are those of the authors and do not necessarily reflect the views of
the European Commission.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: This study used the public data set described in [5].
Acknowledgments: We are thankful to Anne Humeau-Heurtier and the other authors of [4] for graciously providing the MATLAB code for the implementation of the 2D Dispersion Entropy.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
References
1. Lunt, T.F. A survey of intrusion detection techniques. Comput. Secur. 1993, 12, 405–418. [CrossRef]
2. Liao, H.J.; Lin, C.H.R.; Lin, Y.C.; Tung, K.Y. Intrusion detection system: A comprehensive review. J. Netw. Comput. Appl. 2013,
36, 16–24. [CrossRef]
3. Dromard, J.; Roudière, G.; Owezarski, P. Online and scalable unsupervised network anomaly detection method. IEEE Trans.
Netw. Serv. Manag. 2016, 14, 34–47. [CrossRef]
4. Azami, H.; da Silva, L.E.V.; Omoto, A.C.M.; Humeau-Heurtier, A. Two-dimensional dispersion entropy: An information-theoretic
method for irregularity analysis of images. Signal Process. Image Commun. 2019, 75, 178–187. [CrossRef]
5. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characteri-
zation; ICISSP: Funchal, Portugal, 2018; pp. 108–116.
6. de Souza, C.A.; Westphall, C.B.; Machado, R.B.; Sobral, J.B.M.; dos Santos Vieira, G. Hybrid approach to intrusion detection in
fog-based IoT environments. Comput. Netw. 2020, 180, 107417. [CrossRef]
7. Yu, X.; Li, T.; Hu, A. Time-series Network Anomaly Detection Based on Behaviour Characteristics. In Proceedings of the
2020 IEEE 6th International Conference on Computer and Communications (ICCC), Chengdu, China, 11–14 December 2020;
pp. 568–572.
8. Al-Sawwa, J.; Ludwig, S.A. Performance evaluation of a cost-sensitive differential evolution classifier using spark–Imbalanced
binary classification. J. Comput. Sci. 2020, 40, 101065. [CrossRef]
9. Hossain, M.D.; Ochiai, H.; Fall, D.; Kadobayashi, Y. LSTM-based Network Attack Detection: Performance Comparison by
Hyper-parameter Values Tuning. In Proceedings of the 2020 7th IEEE International Conference on Cyber Security and Cloud
Computing (CSCloud)/2020 6th IEEE International Conference on Edge Computing and Scalable Cloud (EdgeCom), New York,
NY, USA, 1–3 August 2020; pp. 62–69.
10. Çakmakçı, S.D.; Kemmerich, T.; Ahmed, T.; Baykal, N. Online DDoS attack detection using Mahalanobis distance and Kernel-
based learning algorithm. J. Netw. Comput. Appl. 2020, 168, 102756. [CrossRef]
11. Moustafa, N.; Hu, J.; Slay, J. A holistic review of network anomaly detection systems: A comprehensive survey. J. Netw. Comput.
Appl. 2019, 128, 33–55. [CrossRef]
12. Zarpelão, B.B.; Miani, R.S.; Kawakani, C.T.; de Alvarenga, S.C. A survey of intrusion detection in Internet of Things. J. Netw.
Comput. Appl. 2017, 84, 25–37. [CrossRef]
13. Liu, H.; Lang, B. Machine Learning and Deep Learning Methods for Intrusion Detection Systems: A Survey. Appl. Sci. 2019,
9, 4396. [CrossRef]
14. Sommer, R.; Paxson, V. Outside the closed world: On using machine learning for network intrusion detection. In Proceedings of the 2010 IEEE Symposium on Security and Privacy, Berkeley/Oakland, CA, USA, 16–19 May 2010; pp. 305–316.
15. Behal, S.; Kumar, K.; Sachdeva, M. D-FACE: An anomaly based distributed approach for early detection of DDoS attacks and
flash events. J. Netw. Comput. Appl. 2018, 111, 49–63. [CrossRef]
16. Radivilova, T.; Kirichenko, L.; Alghawli, A.S. Entropy Analysis Method for Attacks Detection. In Proceedings of the 2019 IEEE
International Scientific-Practical Conference Problems of Infocommunications, Science and Technology (PIC S&T), Kiev, Ukraine,
8–11 October 2019; pp. 443–446.
17. Shah, S.B.I.; Anbar, M.; Al-Ani, A.; Al-Ani, A.K. Hybridizing entropy based mechanism with adaptive threshold algorithm to
detect RA flooding attacks in IPv6 networks. In Computational Science and Technology; Springer: Berlin/Heidelberg, Germany, 2019;
pp. 315–323.
18. Zhang, Y.; Chen, X.; Jin, L.; Wang, X.; Guo, D. Network intrusion detection: Based on deep hierarchical network and original flow
data. IEEE Access 2019, 7, 37004–37016. [CrossRef]
19. Lopez-Martin, M.; Carro, B.; Sanchez-Esguevillas, A.; Lloret, J. Conditional variational autoencoder for prediction and feature
recovery applied to intrusion detection in IoT. Sensors 2017, 17, 1967. [CrossRef]
20. Zhou, H.; Wang, Y.; Lei, X.; Liu, Y. A method of improved CNN traffic classification. In Proceedings of the 2017 13th International
Conference on Computational Intelligence and Security (CIS), Hong Kong, China, 15–18 December 2017; pp. 177–181.
21. McHugh, J. Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans. Inf. Syst. Secur. (TISSEC) 2000, 3, 262–294. [CrossRef]
22. Lopez-Martin, M.; Carro, B.; Sanchez-Esguevillas, A. Variational data generative model for intrusion detection. Knowl. Inf. Syst.
2019, 60, 569–590. [CrossRef]
23. Abdulhammed, R.; Musafer, H.; Alessa, A.; Faezipour, M.; Abuzneid, A. Features dimensionality reduction approaches for
machine learning based network intrusion detection. Electronics 2019, 8, 322. [CrossRef]
24. Vijayanand, R.; Devaraj, D. A novel feature selection method using whale optimization algorithm and genetic operators for
intrusion detection system in wireless mesh network. IEEE Access 2020, 8, 56847–56854. [CrossRef]
25. Maseer, Z.K.; Yusof, R.; Bahaman, N.; Mostafa, S.A.; Foozy, C.F.M. Benchmarking of machine learning for anomaly based
intrusion detection systems in the CICIDS2017 dataset. IEEE Access 2021, 9, 22351–22370. [CrossRef]
26. Baldini, G.; Giuliani, R.; Steri, G.; Neisse, R. Physical layer authentication of Internet of Things wireless devices through
permutation and dispersion entropy. In Proceedings of the 2017 Global Internet of Things Summit (GIoTS), Geneva, Switzerland,
6–9 June 2017; pp. 1–6.
27. Rostaghi, M.; Azami, H. Dispersion entropy: A measure for time-series analysis. IEEE Signal Process. Lett. 2016, 23, 610–614.
[CrossRef]
28. Shawe-Taylor, J.; Cristianini, N. Support Vector Machines; Cambridge University Press: Cambridge, UK, 2000; Volume 2.
29. Rish, I. An empirical study of the naive Bayes classifier. In Proceedings of the IJCAI 2001 Workshop Empirical Methods in
Artificial Intelligence, Seattle, WA, USA, 4–6 August 2001; pp. 41–46.
30. Haralick, R.M.; Shanmugam, K.; Dinstein, I.H. Textural features for image classification. IEEE Trans. Syst. Man Cybern. 1973,
SMC-3, 610–621. [CrossRef]
31. Haralick, R.M. Statistical and structural approaches to texture. Proc. IEEE 1979, 67, 786–804. [CrossRef]
Article
New Subclass Framework and Concrete Examples of Strongly
Asymmetric Public Key Agreement
Satoshi Iriyama 1,† , Koki Jimbo 1,†, * and Massimo Regoli 2,†
1 Information Science Department, Tokyo University of Science, 2641, Yamazaki, Noda, Chiba 278-8510, Japan;
[email protected]
2 DICII, Engineering Faculty Via del Politecnico, Universitá di Roma Tor Vergata, 1, 00133 Roma, Italy;
[email protected]
* Correspondence: [email protected]
† These authors contributed equally to this work.
Abstract: Strongly asymmetric public key agreement (SAPKA) is a class of key exchange between
Alice and Bob that was introduced in 2011. The greatest difference from the standard PKA algorithms
is that Bob constructs multiple public keys and Alice uses one of these to calculate her public key and
her secret shared key. Therefore, the number of public keys and calculation rules for each key differ
for each user. Although algorithms with high security and computational efficiency exist in this class,
the relation between the parameters of SAPKA and its security and computational efficiency has
not yet been fully clarified. Therefore, our main objective in this study was to classify the SAPKA
algorithms according to their properties. By attempting algorithm attacks, we found that certain
parameters are more strongly related to the security. On this basis, we constructed concrete algorithms
and a new subclass of SAPKA, in which the responsibility of maintaining security is significantly
more associated with the secret parameters of Bob than those of Alice. Moreover, we demonstrate 1. insufficient but necessary conditions for this subclass, 2. inclusion relations between the subclasses of SAPKA, and 3. concrete examples of this subclass with reports of implementational experiments.
Keywords: public key exchange; security; asymmetric; asymmetric algorithm; cryptography; framework; limited computational power; computationally biased
Citation: Iriyama, S.; Jimbo, K.; Regoli, M. New Subclass Framework and Concrete Examples of Strongly Asymmetric Public Key Agreement. Appl. Sci. 2021, 11, 5540. https://doi.org/10.3390/app11125540
Appl. Sci. 2021, 11, 5540. https://doi.org/10.3390/app11125540 https://www.mdpi.com/journal/applsci
(PQC), has become widespread. PKAs and public key cryptographies based on lattice prob-
lems such as the shortest vector problem (SVP), closest vector problem, and learning with
errors (LWE) are among the most well known methods. Among these, SVP-based PKA and
public key cryptographies, including NTRU prime [6], NTRU-HRSS-KEM [7], and NTRU
Encrypt [8], module LWE-based PKA such as CRYSTALS–Kyber [9], and ring LWE-based
PKA including NewHope [10] are leading approaches in this research area and have been
considered as candidates for the NIST (National Institute of Standards and Technology)
standardization of PQC systems [11,12]. When the parameters are properly selected, the
above algorithms are considered to be resilient against attacks that use quantum computers
and sufficiently computationally efficient to be used in practice.
However, there has been substantial discussion regarding the security of such algo-
rithms. For example, in certain LWE-based algorithms, even if sufficiently large parameters
are selected, a possible weakness has been observed [13–15]. Moreover, the notion that
the difficulty in solving ring LWE is equivalent to that of solving the LWE (the difficulty
of LWE is discussed in [14,16]) has not yet been proven; thus, other cases of weakened
security [9,10] may arise. Weak parameters for NTRU-type PKA algorithms and public key
cryptography are also reported in [17]. Owing to these uncertainties in the parameter set-
tings to maintain security even in an ideal situation (i.e., without assuming limited memory
and computational power), the preparation of secure communication infrastructure with
these new-generation algorithms for less capable devices has resulted in greater difficulty
and insecurity. Although there is no doubt that these algorithms will offer significant bene-
fits even after the post-quantum computer era, security analysis of these algorithms should
continue until users can be provided with a “guide” that explains how to set parameters to
ensure secure PKA and public key cryptography according to the needs and environments
of users.
C(T, E_D) := T × (1/E_D).
Thus, it is given by time (s, ms, or another unit). Let PKA be a set of PKA algorithms
and let TA : PKA × N → N be a function that shows the calculation steps required for Alice
to calculate the N ∈ N bit length of the secret shared key (SSK) of an algorithm Alg ∈ PKA,
which increases monotonically for N. TB denotes the calculation steps for Bob in a similar
manner.
Next, we consider Alg ∈ PKA, which has the following relation:
T_A(Alg, N) = T_B(Alg, N) (1)
for all N ∈ N. In this case, we suppose that the maximal computational cost that Bob is allowed to incur for the SSK calculation of Alg ∈ PKA, denoted by C^B_max, is C^B_max = T_B(Alg, N_0) × (1/E_B), which is achieved when the bit length of the SSK is some N_0 ∈ N, and Bob can compute their SSK for all N ∈ N to satisfy:
C_B(T_B(Alg, N), E_B) := T_B(Alg, N) × (1/E_B) ≤ C^B_max := T_B(Alg, N_0) × (1/E_B), (2)
48
Appl. Sci. 2021, 11, 5540
For the same Alg ∈ PKA, we assume that E_B > E_A and that the maximal computational cost that Bob is allowed to incur, C^B_max, is the same as in (2), where N_0 is the smallest bit size of the SSK that maintains security. Let Alice's computational cost for calculating her SSK of N ∈ N bits be C_A(T_A(Alg, N), E_A) := T_A(Alg, N) × (1/E_A). If Alice needs to calculate her SSK within the cost C^B_max as well as Bob, the following relation must be satisfied:
C_A(T_A(Alg, N), E_A) = T_A(Alg, N) × (1/E_A) ≤ C^B_max = T_B(Alg, N_0) × (1/E_B),
where the equality holds for some N_1 ∈ N; but in this case, N_1 < N_0 must be satisfied, because T_A is a monotonically increasing function of the bit size of the SSK and 1/E_B < 1/E_A. This observation indicates that they must either use an N_1-bit SSK, which is obviously less secure than an N_0-bit SSK, or let Alice incur a cost of T_A(Alg, N_0) × (1/E_A), which is larger than C^B_max. Most PKA algorithms, including the DH algorithm, satisfy (1), and there are many cases in which E_B > E_A in modern society, where IoT techniques are continually being developed; thus, this situation is inevitable in the near future, if not immediately.
We consider determining an algorithm denoted by Alg_{A<B} ∈ PKA, where
T_A(Alg_{A<B}, N) < T_B(Alg_{A<B}, N) (3)
being satisfied for all N ∈ N is one solution to the above undesirable situation. We denote the maximal computational cost that Bob is allowed to incur as C^B_{max,A<B}, which is defined as C^B_{max,A<B} := T_B(Alg_{A<B}, N_0) × (1/E_B), where N_0 ∈ N is the smallest bit size of the SSK that maintains security. In addition to the above T_A = T_B case, we suppose that both Bob and Alice must calculate their SSK within the maximal computational cost that Bob is allowed to incur, C^B_{max,A<B}. In this case, Alice can calculate all N bits of the SSK to satisfy
C_A(T_A(Alg_{A<B}, N), E_A) = T_A(Alg_{A<B}, N) × (1/E_A) ≤ C^B_{max,A<B} = T_B(Alg_{A<B}, N_0) × (1/E_B).
The equality holds when T_A(Alg_{A<B}, N_1) × (1/E_A) = T_B(Alg_{A<B}, N_0) × (1/E_B) holds for some N_1 ∈ N. In this case, it should be noted that N_1 = N_0 is achievable; that is, Alice can calculate the SSK of N_0 bits within time C^B_{max,A<B}, provided that
T_A(Alg_{A<B}, N_0) / T_B(Alg_{A<B}, N_0) ≤ E_A / E_B (4)
holds, which is impossible when (1) holds, because in that case the left-hand side is equal to 1 but the right-hand side is less than 1. As E_A/E_B is given (we may say that E_A/E_B is the communication environment in which the algorithm is used), (4) is not always achieved for a given Alg_{A<B} ∈ PKA and N_0 ∈ N. Conversely, we can determine the minimal environment E_A/E_B in which Alice and Bob can calculate an N_0-bit SSK within time C^B_{max,A<B} using Alg_{A<B} ∈ PKA by simply calculating the left-hand side of (4).
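The role of condition (4) can be illustrated numerically. The step-count functions and device speeds below are toy assumptions of ours, not values from the paper:

```python
def cost(T_steps, E_steps_per_unit_time):
    """C(T, E) := T * (1/E): execution time of T calculation steps on a
    device that performs E steps per unit of time."""
    return T_steps / E_steps_per_unit_time

# Toy step-count functions: a symmetric algorithm with T_A = T_B
# (relation (1)) and a biased one with T_A < T_B (relation (3)).
T_sym = lambda N: N ** 2          # T_A = T_B
T_A_biased = lambda N: N          # Alice's steps in an Alg_{A<B}
T_B_biased = lambda N: N ** 2     # Bob's steps in an Alg_{A<B}

E_A, E_B = 10**6, 10**8           # Alice's device is 100x slower than Bob's
N0 = 256                          # toy "smallest secure" SSK bit size

# Condition (4): T_A(N0)/T_B(N0) <= E_A/E_B lets Alice finish within C^B_max.
sym_ok = T_sym(N0) / T_sym(N0) <= E_A / E_B              # 1 <= 0.01
biased_ok = T_A_biased(N0) / T_B_biased(N0) <= E_A / E_B # 1/256 <= 0.01
```

With these toy numbers, the symmetric algorithm fails (4) while the biased one satisfies it, so only the latter lets the slow Alice stay within Bob's cost bound at the full N_0-bit key size.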
Based on the above considerations, our research goals are as follows:
1. Constructions of Alg_{A<B} ∈ PKA.
2. For Alg_{A<B} ∈ PKA, the determination of T_A(Alg_{A<B}, N_0) and T_B(Alg_{A<B}, N_0) for any N_0 ∈ N.
3. The construction of the PKA class to which Alg_{A<B} ∈ PKA belongs and the introduction of conditions for PKA algorithms to be members of this class.
As mentioned above, goal 2 provides a lower bound of E_A/E_B for calculating the SSK of N_0 bits within time C^B_{max,A<B}. Goal 3 provides instructions on how to construct PKA
algorithms to possess the relation (3). Thus, improving algorithms such as those of [9,10]
to possess this property may be possible by attempting to fix their parameters according to
the class conditions. We do not attempt to improve these algorithms in this study, but this
subject is worthy of consideration and will be one of our most important future works.
We consider that these goals are achievable by fully utilizing the characteristics of the
PKA framework known as strongly asymmetric public key agreement (SAPKA) [18]. The
characteristics, high level of generality, and asymmetry of the key agreement process of
SAPKA are explained in Section 1.2, along with its definition, and concrete methods that
are derived from the characteristics are explained in Section 2.
Note that this study is not focused on how to construct PKA algorithms that are secure against all types of theoretical attacks; rather, it investigates how to reduce Alice's computational complexity while maintaining the security of one given PKA algorithm. Our main theorems (in Section 5) do not provide instructions on how to enhance the security of PKA algorithms, and resilience against attacks such as the man-in-the-middle (MITM) attack is not discussed in this paper (we consider that these topics should be discussed after the existence of Alg_{A<B} is proven; see Section 7).
x1 ◦ x2 ( y ) = x3 ◦ x4 ( y ) (5)
for all y ∈ S , where ◦ denotes the map composition. Equation (5) is a condition for Alice
and Bob to calculate the same SSK (see the key agreement process in Figure 1). The key
agreement process of SAPKA can be described as in Figure 1, and every secret/public key
is displayed in Table 1.
Bob:
  Secret keys: x_1, x_2, x_3, x_4, N_1 : S → S
  Public keys: y_{B,1} = N_1^{-1} ◦ x_4 : S → S; y_{B,2} = x_1 ◦ x_2 : S → S
  SSK: κ_B = x_3 ◦ x_4(x_A) ∈ S
Alice:
  Secret key: x_A ∈ S
  Public key: y_A = N_1^{-1} ◦ x_4(x_A) ∈ S
  SSK: κ_A = x_1 ◦ x_2(x_A) ∈ S
As can be observed from Figure 1 and Table 1, Bob’s public keys are described by
the map compositions and not by the element of S . Sending a map means sending the
calculation rule of the map in combination with a set of parameters, which is the domain
of the map. Thus, Alice simply follows the rules of y B,1 and y B,2 to calculate y A and κ A ,
and to calculate these, she must first receive y B,1 and y B,2 from Bob. Regardless of the
x A that Alice selects (provided that x A ∈ S ), the equality of κ A and κ B holds, because
the compatibility condition (5) holds for all elements of S . The generality mentioned
above arises from the fact that there are only several restrictions for the secret keys of
Bob, namely x1 , x2 , x3 , x4 , N1 , and semi-group S . As the restrictions are only those in (5)
and invertible regarding N1 , Bob has substantial freedom in terms of the choices of these
maps and the algebraic structure. By fixing these maps and S concretely, various PKA
algorithms can be described, including the most well known of these, namely the DH
algorithm (presented in [18]). In this study, we do not attempt to describe new-generation
algorithms such as [7,9,10] in the form of SAPKA. However, we are optimistic that these can
be described because S can be selected as not only scalars but also matrices, for example,
with numerous options for x1 , x2 , x3 , x4 , N1 .
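As a minimal sketch of this freedom, the maps below instantiate the framework over the multiplicative group mod a prime using simple scalar multiplications. The constants are arbitrary toy secrets of ours, not an algorithm from the paper; the point is only how the compatibility condition (5), arranged by construction, forces κ_A = κ_B:

```python
p = 2**61 - 1                      # Mersenne prime; S taken as Z_p \ {0} under multiplication
inv = lambda v: pow(v, p - 2, p)   # modular inverse via Fermat's little theorem

# Bob's secret maps, all S -> S.  x1∘x2 = x3∘x4 holds by construction,
# so the compatibility condition (5) is satisfied for every y in S.
a, b, c, n = 123456789, 987654321, 555555555, 111111111  # toy secrets
x1 = lambda y: a * y % p
x2 = lambda y: b * y % p
x3 = lambda y: c * y % p
x4 = lambda y: a * b % p * inv(c) % p * y % p            # (a*b*c^{-1}) * y
N1 = lambda y: n * y % p

# Bob's public keys are the *composed maps*, i.e. their calculation rules.
y_B1 = lambda y: inv(n) * x4(y) % p    # N1^{-1} ∘ x4
y_B2 = lambda y: x1(x2(y))             # x1 ∘ x2

# Alice: one secret, one public key, one SSK.
x_A = 42
y_A = y_B1(x_A)
kappa_A = y_B2(x_A)

# Bob: x3 ∘ N1 ∘ N1^{-1} ∘ x4 (x_A) = x3 ∘ x4 (x_A).
kappa_B = x3(N1(y_A))
```

Note the asymmetry the text describes: Alice only evaluates the two rules she received, while Bob chose five maps and performs a different computation.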
Another notable characteristic of SAPKA is the asymmetry of the key agreement
process. In this case, the asymmetry means that the number of public keys calculated by
Alice and Bob differ, and thus, the two perform essentially different operations. Owing
to this characteristic, an eavesdropper (Eve) must attempt attacks against a maximum of
two public keys to obtain the secret information of either Bob or Alice. This may allow
Alice to select her secret key from a set of small bit sizes and to reduce her computational
complexity in certain cases. In Section 2, we explain Eve’s strategies for recovering the SSK
from public keys, an observation from her strategies, and the research method derived
from this observation.
Eve’s Strategy 1
If x4 is an invertible map, y B,1 is also invertible; thus, she attempts to determine x A
from (6) and (8):
x_A = y_{B,1}^{-1}(y_A) = x_4^{-1} ◦ N_1 ◦ N_1^{-1} ◦ x_4(x_A). (9)
Subsequently, she can calculate
κ A = x1 ◦ x2 ( x A ).
x_1 ◦ x_2(x_E) = x_3 ◦ x_4(x_E) = x_3 ◦ N_1 ◦ N_1^{-1} ◦ x_4(x_E) = x_3 ◦ N_1(y_A) = κ_B. (10)
The second equation of (10) is obtained from the compatibility condition (5), and the
final equation is obtained from the definition of κ B (see Figure 1 or Step 5 of Section 2.4).
Eve’s Strategy 2
First, Eve attempts to obtain a map N1 from (6) and then attempts to obtain a map
x3,E : S → S that satisfies
Suppose that Bob constructs y B,1 and y B,2 to satisfy the following two requirements:
Then, it is difficult for Eve to obtain x A from (9) and to proceed to (12) in real time;
that is, Eve cannot obtain the SSK in real time. It should be noted that the secret key x A that
Alice selects is not strongly related to Eve’s breaking complexity owing to Requirement 1.
This means that Bob may take substantially more responsibility for maintaining security
than Alice. In this case, Alice can select her secret key space as a small one in terms of the
bit size, provided that an exhaustive search for x A is difficult in real time, and we expect
that this will reduce Alice’s computational complexity for y A and κ A .
2.2. Methods
According to this observation, the methods that can be established for the research
goal can be derived as follows:
1. Introduce algorithms from SAPKA (Section 3).
2. Estimate the breaking complexity of these algorithms under the assumption that the
secret key space of Alice is smaller than that of Bob and verify whether or not Alice’s
small key space reduces the breaking complexity of the algorithms (Sections 4.2 and 4.3).
3. For algorithms for which Alice’s small key space does not reduce the breaking com-
plexity, estimate their computational complexity to calculate the keys of both Alice
and Bob (Sections 4.4 and 4.5).
4. Construct the SAPKA subclass for which the algorithms possess the relation that the
complexity of Alice for her keys is smaller than that of Bob (we call this subclass the "main SAPKA subclass" until we give its definition) and introduce the necessary
conditions for the subclass by generalizing the results of 3 (Sections 5.1 and 5.2).
In Section 6, we implement some of the algorithms in Section 3 and report the exper-
imental results. We can concretely observe what can be offered by the algorithms of the
subclass constructed in Section 5.1.
At the beginning of Section 5, we add restrictions to the SAPKA framework, partic-
ularly for the public keys of Bob, before discussing our main themes. We explain these
restrictions briefly below.
Bob:
  Secret keys: maps x_1, x_2, x_3, x_4, N_1 : S → S and elements x_B, n_B ∈ S
  Public keys: y_{B,1}(y) = N_1^{-1} ◦ x_4(y) = n_B^{-1} y (y ∈ S); y_{B,2}(y) = x_1 ◦ x_2(y) = x_B y (y ∈ S)
  SSK: κ_B = x_3 ◦ x_4(x_A) = x_B x_A
Alice:
  Secret key: x_A ∈ S
  Public key: y_A = N_1^{-1} ◦ x_4(x_A) = n_B^{-1} x_A
  SSK: κ_A = x_1 ◦ x_2(x_A) = x_B x_A
κ_A = x_1 ◦ x_2(x_A) = x_B x_A
κ_B = x_3 ◦ N_1(y_A) = x_3 ◦ N_1 ◦ N_1^{-1} ◦ x_4(x_A) = x_B x_A.
t(n) = an
K_B := M_S × M_S × M_S × M_S × M_S.
x_1 ◦ x_2(y) = x_3 ◦ x_4(y), (13)
y_{B,1} := N_1^{-1} ◦ x_4
y_{B,2} := x_1 ◦ x_2
and sends (y_{B,1}, y_{B,2}) to Alice.
Step 1A Alice selects her secret key x_A from S.
Step 2A Alice calculates her public key y_A as follows:
y_A := y_{B,1}(x_A) = N_1^{-1} ◦ x_4(x_A)
54
Appl. Sci. 2021, 11, 5540
κ_A := y_{B,2}(x_A) = x_1 ◦ x_2(x_A).
κ_B := x_3 ◦ N_1(y_A) = x_3 ◦ N_1 ◦ N_1^{-1} ◦ x_4(x_A) = x_3 ◦ x_4(x_A).
The components are the same as those in Section 3.1. Bob selects their secret keys
x B , n B ∈ S . The key agreement process of the NDH algorithm is described by the
following steps.
κ_A := y_{B,2}(x_A) = x_1 ◦ x_2(x_A) = (g^{x_B})^{x_A} = g^{x_B x_A}.
κ_B := x_3 ◦ N_1(y_A)
    = x_3 ◦ N_1 ◦ N_1^{-1} ◦ x_4(x_A)
    = x_3 ◦ N_1(g^{n_B x_A})
    = x_3(g^{n_B^{-1} n_B x_A})
    = (g^{x_A})^{x_B}
    = g^{x_B x_A}.
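The NDH key agreement above can be sketched as follows. The prime, base, and key sizes are toy choices of ours (far too small for real security), and publishing the group elements g^{n_B} and g^{x_B} stands in for sending the maps y_{B,1} and y_{B,2}:

```python
from math import gcd
from secrets import randbelow

p = 2**127 - 1            # Mersenne prime M127 (toy size, not secure)
g = 3                     # toy base

# --- Bob (Steps 1B-2B): secrets x_B and n_B, with n_B invertible mod p-1.
x_B = randbelow(p - 2) + 1
n_B = 5
while gcd(n_B, p - 1) != 1:
    n_B += 2
g_nB = pow(g, n_B, p)     # defines y_B1: y |-> (g^{n_B})^y
g_xB = pow(g, x_B, p)     # defines y_B2: y |-> (g^{x_B})^y

# --- Alice (Steps 1A-3A)
x_A = randbelow(p - 2) + 1
y_A = pow(g_nB, x_A, p)           # y_A     = g^{n_B x_A}
kappa_A = pow(g_xB, x_A, p)       # kappa_A = g^{x_B x_A}

# --- Bob (Step 3B): N_1 strips n_B from the exponent, then x_3 applies x_B.
nB_inv = pow(n_B, -1, p - 1)                 # inverse of n_B mod p-1
kappa_B = pow(pow(y_A, nB_inv, p), x_B, p)   # (g^{x_A})^{x_B} = g^{x_B x_A}
```

The exponent arithmetic mod p - 1 (Fermat) is an implementation convenience on our part; it realizes N_1(y) = y^{n_B^{-1}} concretely.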
c^{◦M} := c^M if M is a scalar; (c^{◦M})_{ij} := c^{M_{ij}} for i, j ∈ {1, · · · , d} if M is a matrix.
• x_2 := id
• x_3(y)_{a,b} := ∏_{l∈{1,...,d}} ((y)_{l,b})^{(x_B)_{a,l}}
• x_4(y) := g^{◦y}
• N_1(y)_{a,b} := ∏_{l∈{1,...,d}} ((y)_{l,b})^{(N_B^{-1})_{a,l}},
where y ∈ S. The compatibility condition holds for all y ∈ S and a, b ∈ {1, . . . , d}:
x_1 ◦ x_2(y)_{a,b} = ∏_{l∈{1,...,d}} ((g^{◦x_B})_{a,l})^{(y)_{l,b}} = ∏_{l∈{1,...,d}} g^{(x_B)_{a,l}(y)_{l,b}}
Step 2B Bob constructs their public keys y_{B,1}, y_{B,2} as a map for each
y_{B,1}(y)_{a,b} := N_1^{-1} ◦ x_4(y)_{a,b} = ∏_{l∈{1,...,d}} ((g^{◦y})_{l,b})^{(N_B)_{a,l}}
y_{B,2}(y)_{a,b} := x_1 ◦ x_2(y)_{a,b} = ∏_{l∈{1,...,d}} ((g^{◦x_B})_{a,l})^{(y)_{l,b}}
for all a, b ∈ {1, . . . , d}. In this case, y_{B,1}(y)_{a,b} can also be expressed as follows:
y_{B,1}(y)_{a,b} = ∏_{l∈{1,...,d}} ((g^{◦y})_{l,b})^{(N_B)_{a,l}} = ∏_{l∈{1,...,d}} ((g^{◦N_B})_{a,l})^{(y)_{l,b}}.
Bob sends the maps (y_{B,1}, y_{B,2}) to Alice. This is equivalent to sending the matrices g^{◦N_B}, g^{◦x_B} ∈ S.
Step 1A Alice selects her secret key x_A ∈ S.
Step 2A Alice calculates her public key y_A for all a, b ∈ {1, . . . , d} as follows:
(y_A)_{a,b} := y_{B,1}(x_A)_{a,b} = N_1^{-1} ◦ x_4(x_A)_{a,b} = ∏_{l∈{1,...,d}} ((g^{◦N_B})_{a,l})^{(x_A)_{l,b}} = (g^{◦N_B x_A})_{a,b} (14)
and sends it to Bob.
Step 3A Alice calculates the SSK denoted by κ_A for all a, b ∈ {1, . . . , d} as follows:
(κ_A)_{a,b} := y_{B,2}(x_A)_{a,b} := x_1 ◦ x_2(x_A)_{a,b} = ∏_{l∈{1,...,d}} ((g^{◦x_B})_{a,l})^{(x_A)_{l,b}} = (g^{◦x_B x_A})_{a,b}. (15)
Step 3B Bob calculates the SSK denoted by κ_B for all a, b ∈ {1, . . . , d} as follows:
(κ_B)_{a,b} = ∏_{l∈{1,...,d}} ((g^{◦N_B x_A})_{l,b})^{(x_B N_B^{-1})_{a,l}} = g^{∑_{l∈{1,...,d}} (x_B N_B^{-1})_{a,l} (N_B x_A)_{l,b}} = g^{(x_B N_B^{-1} N_B x_A)_{a,b}}
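The matrix algorithm can be sketched for d = 2 with toy parameters. Doing the exponent arithmetic mod q = p - 1 (so that N_B must be invertible mod q) is an implementation assumption on our part rather than a detail stated in the text:

```python
p = 2**31 - 1          # toy Mersenne prime; exponents are reduced mod q = p - 1
q = p - 1
g = 7
d = 2                  # matrix dimension

def matmul(A, B, m):
    """Product of two d x d matrices mod m."""
    return [[sum(A[i][k] * B[k][j] for k in range(d)) % m for j in range(d)]
            for i in range(d)]

def mat_inv2(M, m):
    """Inverse of a 2x2 matrix mod m (assumes the determinant is invertible)."""
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % m
    di = pow(det, -1, m)
    return [[M[1][1] * di % m, -M[0][1] * di % m],
            [-M[1][0] * di % m, M[0][0] * di % m]]

def g_pow(M):
    """Entry-wise exponential g^{oM} mod p."""
    return [[pow(g, M[i][j], p) for j in range(d)] for i in range(d)]

def apply_right(G, X):
    """Entry (a,b) = prod_l G[a][l]^{X[l][b]}; equals g^{o(M X)} when G = g^{oM}."""
    return [[__import__("math").prod(pow(G[a][l], X[l][b], p) for l in range(d)) % p
             for b in range(d)] for a in range(d)]

def apply_left(E, G):
    """Entry (a,b) = prod_l G[l][b]^{E[a][l]}; the exponent matrix acts from the left."""
    return [[__import__("math").prod(pow(G[l][b], E[a][l], p) for l in range(d)) % p
             for b in range(d)] for a in range(d)]

# Bob's secrets: x_B and an invertible N_B (toy values).
x_B = [[3, 1], [4, 1]]
N_B = [[2, 1], [1, 1]]                 # det = 1, so invertible mod q
pub_NB = g_pow(N_B)                    # encodes y_B1 (Step 2B)
pub_xB = g_pow(x_B)                    # encodes y_B2 (Step 2B)

# Alice (Steps 1A-3A)
x_A = [[5, 9], [2, 6]]
y_A = apply_right(pub_NB, x_A)         # y_A     = g^{o(N_B x_A)}, Equation (14)
kappa_A = apply_right(pub_xB, x_A)     # kappa_A = g^{o(x_B x_A)}, Equation (15)

# Bob (Step 3B): the exponent matrix x_B N_B^{-1} (mod q) cancels N_B.
kappa_B = apply_left(matmul(x_B, mat_inv2(N_B, q), q), y_A)
```

Since x_B N_B^{-1} N_B x_A ≡ x_B x_A mod q, both parties obtain g^{◦(x_B x_A)}, matching the Step 3B derivation.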
Remark 1. Let V be a vector space of V := Zdp . Each of the above maps x1 , x2 , x3 , x4 , N1 can also
be considered as the map V → V and the condition
x_1 ◦ x_2(y)_a = ∏_{l∈{1,...,d}} ((g^{◦x_B})_{a,l})^{(y)_l} = (g^{◦x_B y})_a = ∏_{l∈{1,...,d}} ((g^{◦y})_l)^{(x_B)_{a,l}} = x_3 ◦ x_4(y)_a (17)
is satisfied for all y ∈ V and a ∈ {1, . . . , d}, where x B , NB are the same as above. Thus, Alice can
select her secret key x A from V and the computational complexity for calculating y A , κ A , and κ B is
obviously reduced compared with the case when x A ∈ S = M(d, Z p ). The precise breaking and
computational complexities for the CSE of both the x A ∈ V and x A ∈ S cases are investigated in
the following section.
4.1. Assumptions
Let n and m be numbers in N, and n > m. We define two functions | · | : N → N and
| · |S : S → N, which represent the bit lengths of the input. The difference between the two
functions is the domains. The input of the first one is from N, and the second one is from S .
The semi-groups of the above two algorithms are constructed based on a prime number
p, where | p| = n. In this case, we can express the bit size of the element in S as |y|S ≤ n
when S = Z p and |y|S ≤ d2 n when S = M(d, Z p ). We construct subsets denoted by S of
S for both the S = Z p and S = M(d, Z p ) cases, respectively, as follows:
Prior to investigating the breaking complexity of the algorithms in Sections 3.2 and 3.3
under the assumption and a certain condition for n and m, we demonstrate that not only
the complexity, but also the order of complexity (a definition of order is provided in
Definition 5) to compute one discrete logarithm problem (DLP) differ between the cases
when an exponent is selected from a larger bit length set and from a smaller bit length set.
The required settings are as follows:
• a prime number p, where | p| = n;
• a semi-group S := Z p ;
• a subset S := {y ∈ S : |y|S ≤ m} ⊂ S ; and
• a map x g : S → S , x g (y) := gy , where g is a primitive element of S .
We prove that if n = s(m) > m for some monotonically increasing function s : N → N that is not a linear monic polynomial, the calculation of the DLP, namely
x_g^{-1} ◦ x_g(y) = log_g g^y = y,
is easier when y ∈ S than when y ∈ S in terms of the complexity and order (Proposition 1).
First, we name this input y of x_g according to the size of the set to which y belongs, as per the following definition:
Definition 4. Let n and m be numbers in N (n > m), and let n be the bit size of each element of
Z p . For a given map x g : Z p → Z p , defined as
x_g(y) := g^y,
where g is an element of S. If y belongs to the subset S, we call this y an m-bit logarithm; if y belongs to S, y is known as an n-bit logarithm.
The complexity of obtaining the n-bit logarithm refers to the complexity of calculating
x_g^{-1} ◦ x_g(y) when y ∈ S (for the m-bit logarithm, y is in the subset S).
We represent the time complexity using the function T : N → R+, where the input is the bit size and the output is the number of multiplication steps. The complexity of calculating the DLP of the m-bit logarithm is given by T(m) = 2^{m/2} [19,20]. In this case, the complexity of calculating the n-bit logarithm is T(n) = T(s(m)) = 2^{s(m)/2}. Obviously, T(n) > T(m) when n = s(m) > m; however, we should also consider how the complexity increases as m increases. We can compare the growth rates of T(m) and T(s(m)) using Landau's big-O notation with the following definition:
Definition 5. For functions T, g : N → R+, if there exist constants c, m_0 ∈ N such that
T(m) ≤ c · g(m) (18)
is satisfied for all m ≥ m_0, we state that T(m) has an order of g(m) time complexity, and we describe it as
T(m) ∈ O(g(m)). (19)
Therefore, for two complexity functions T, T' : N → R+, where T(m) ≤ T'(m) for all m ∈ N and T(m) ∈ O(g(m)), T'(m) ∈ O(g'(m)), if we wish to state that the growth rate of T' is higher than that of T (the order of T' is higher than that of T), we can simply describe it as follows:
O(g'(m)) ⊂ O(g(m))
or
g'(m) > g(m)
n = s(m) = hm + b,
where h, b ∈ N, h ≠ 1. If O(2^{m/2}) = O(2^{s(m)/2}) holds in this case, there must exist constants c, m_0 ∈ N such that for all m ≥ m_0,
2^{s(m)/2} = 2^{(hm+b)/2} ≤ c · 2^{m/2} (20)
is satisfied. However, in this case, 2^{(m(h-1)+b)/2} ≤ c must hold, so that c cannot be a constant. Thus, in this case,
O(2^{s(m)/2}) ⊂ O(2^{m/2})
holds. When s is a polynomial of degree greater than one, it is obvious that there are no c, m_0 ∈ N satisfying (20) for all m ≥ m_0.
If s is a linear and monic polynomial described as
n = s(m) = m + b,
where b ∈ N, we have
O(2^{s(m)/2}) = O(2^{(m+b)/2}) = O(2^{b/2} · 2^{m/2}) = O(2^{m/2}).
Therefore, if Alice and Bob attempt a key exchange using the DH algorithm, where Bob's secret key xB satisfies xB ∈ S and Alice's secret key xA satisfies xA ∈ S′, Eve should attempt to obtain xA, which is easier for her to obtain than xB. The security
of the algorithms in Sections 3.2 and 3.3 also depends on the difficulty of the DLP, but
these algorithms may not be the same as DH owing to one of Bob’s public keys y B,1 and
especially the map N1 . In Section 2.1, we discussed the requirement for y B,1 to allow Alice
to select her secret key from a small bit length set without reducing the breaking complexity.
For both algorithms, Requirement 2 is satisfied if Bob selects n B , as it is an n-bit logarithm
for x g with the algorithm of Section 3.2, and similarly, NB is selected with the algorithm of
Section 3.3. Of course, in this case, the number n must be sufficiently large to ensure that
the calculation of the n-bit logarithm is not achieved in real time.
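The asymmetric choice of key sizes can be sketched as follows (the parameters below are toy values of our own; in practice, n must be large enough that one n-bit DLP is infeasible):

```python
import secrets

def dh_keypair(g, p, bits):
    """Return (secret exponent of the given bit length, public key g**x mod p)."""
    x = secrets.randbits(bits) | 1
    return x, pow(g, x, p)

# Toy parameters: p = 2**127 - 1 (a Mersenne prime), an m-bit key for Alice
# and an n-bit key for Bob, with m much smaller than n.
p, g = 2**127 - 1, 3
m, n = 32, 96
xA, yA = dh_keypair(g, p, m)   # Alice: small (m-bit) secret
xB, yB = dh_keypair(g, p, n)   # Bob: large (n-bit) secret
assert pow(yB, xA, p) == pow(yA, xB, p)   # both sides derive the same SSK
```

In plain DH, Eve can simply attack the smaller exponent, so this asymmetry weakens the scheme; the algorithms of Sections 3.2 and 3.3 are intended to let Alice use a small m without reducing the breaking complexity.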
At this point, we focus on verifying whether the other algorithms in Sections 3.2 and 3.3 satisfy Requirement 1 of Section 2.1. Subsequently, we attempt to determine the polynomial s; that is, how small m can be and the computational complexity of these algorithms for each key of Alice and Bob when m is selected to be as small as possible (Section 4.4).
As she does not know nB, it appears that she must obtain the n-bit logarithm nB by calculating logg g^{nB} = nB before obtaining the m-bit logarithm xA. However, (21) can be described as
yB,1−1(yA) = x4−1 ◦ N1(yA) = nB−1 · logg yA = logg g^{nB xA} / logg g^{nB} = log_{g^{nB}} g^{nB xA} = xA.
Defining
x_{g^{nB}}(y) := (g^{nB})^y,
we have yB,1−1 = x_{g^{nB}}−1, and we can construct x_{g^{nB}} using only g^{nB}, which is public. For this attack, Eve only needs to calculate x_{g^{nB}}−1(yA) = x_{g^{nB}}−1 ◦ x_{g^{nB}}(xA) to obtain the m-bit logarithm xA. Therefore, it can be said that Alice selecting a small bit length set S′ reduces the breaking complexity for Eve; that is, Requirement 1 of Section 2.1 is not satisfied in this case.
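This attack can be sketched numerically (toy values of our own; a real attacker would use a square-root DLP method rather than brute force for the final step):

```python
# Eve's attack sketched with toy parameters: because g**nB mod p is public,
# she can take logarithms directly to the base g**nB and recover Alice's
# m-bit secret without ever learning nB.
p, g, nB = 101, 2, 17            # toy parameters
xA = 5                           # Alice's small secret key
g_nB = pow(g, nB, p)             # public: Bob's key g**nB
yA = pow(g_nB, xA, p)            # Alice's public key (g**nB)**xA
# Eve solves only an m-bit logarithm to the base g**nB (brute force here):
recovered = next(x for x in range(p - 1) if pow(g_nB, x, p) == yA)
assert recovered == xA           # nB was never needed
```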
• yB,2(y)a,b := (x1 ◦ x2(y))a,b = ∏l∈{1,...,d} ((g◦xB)a,l)^{(y)l,b},
where a, b ∈ {1, . . . , d}, and the following elements:
• g◦xB
• g◦NB
• yA = N1−1 ◦ x4(xA) = g◦(NB xA),
to calculate
yB,1−1(yA)a,b = x4−1 ◦ N1(g◦(NB xA))a,b = logg (g◦(NB−1 NB xA))a,b = logg (g◦xA)a,b = (xA)a,b (22)
for all a, b ∈ {1, . . . , d}. It appears that she first needs to obtain an element NB by calculating logg (g◦NB)a,b for all a, b ∈ {1, . . . , d}, the complexity of which is equivalent to the complexity of calculating d^2 n-bit logarithms for the map xg, because she does not know NB, so as to obtain NB−1. After obtaining NB, she can proceed to the final two equalities of (22), and the complexity of calculating logg (g◦xA)a,b = (xA)a,b for all a, b ∈ {1, . . . , d} is equivalent to the complexity of calculating d^2 m-bit logarithms for the map xg. We now verify that Eve really needs to calculate at least the same amount as to obtain d^2 n-bit logarithms by calculating logg (g◦NB)a,b for all a, b ∈ {1, . . . , d}. In Attack 1, we attempt to
find other descriptions for the map yB,1, as in Section 4.2. In Attack 2, we attempt an attack in which Eve avoids calculating d^2 DLPs for NB, and we compare the breaking complexity of the attack with an exhaustive search for xA ∈ S′. In Attack 3, we attempt to determine the least complexity that Eve requires to obtain xA. Hereafter, we refer to the logarithm of a matrix M ∈ M(d, Zp) entrywise; that is, (logg M)a,b := logg (M)a,b for all a, b ∈ {1, . . . , d}.
Attack 1
Using the known element g◦ NB , Eve attempts the following calculation for all a, b ∈
{1, . . . , d}:
log_{g◦NB}(yA) = log_{g◦NB} g◦(NB xA). (23)
However, unlike in Section 4.2, the relation log_{g◦NB} g◦(NB xA) = xA does not hold. She obtains X ∈ S instead of xA ∈ S′ from (23):
log_{g◦NB} g◦(NB xA) = log_{g◦NB} (g◦NB)◦X = X,
where X satisfies the entrywise system
NB • X = NB xA, (24)
with • denoting the Schur (entrywise) product; in general, X ≠ xA.
Attack 2
We introduce an attack in which Eve does not calculate logg g◦ NB = NB . The public
key of Alice y A is described as
yA = ( (yA)a,b )a,b∈{1,...,d}, where (yA)a,b = ∏l∈{1,...,d} ((g◦NB)a,l)^{(xA)l,b};
that is, entry (a, b) of yA is the product over l of (g◦NB)a,l raised to the power (xA)l,b.
Step 1 For row b ∈ {1, . . . , d} of yA, Eve creates a matrix Wb = (w(b)a,l) ∈ M(d, Zp) to satisfy
w(b)a,l = ((g◦NB)a,l)^{(X)l,b}, a, l ∈ {1, . . . , d},
i.e., entrywise candidates for the factors of (yA)a,b.
Step 2 She computes the candidate exponents
v(b)a,l := log_{(g◦NB)a,l} w(b)a,l. (25)
Step 3 She checks, for each l ∈ {1, . . . , d}, whether the consistency condition (26)
is true.
Step 4 If (26) is true for all l ∈ {1, . . . , d}, Eve can recover row b ∈ {1, . . . , d} of x A :
As NB is an invertible matrix, there exists exactly one ( X1,b , X2,b , . . . , Xd,b )t ∈ Zdp
to satisfy system (24) for a given y A , g◦ NB of row b ∈ {1, . . . , d}; thus, it must be
( x A1,b , x A2,b , . . . , x Ad,b )t . If (26) is false in some l ∈ {1, . . . , d}, Eve returns to Step 1.
In Step 2, Eve needs to calculate at least two DLPs of m-bit logarithms; thus, she requires 2 · 2^{m/2} multiplications. As Zp forms a group under multiplication, for any arbitrary element e in Zp, there exists a unique w(b)a,d ∈ Zp such that
e · w(b)a,d = (yA)a,b.
In total, the attack requires
2^{nd(d−1)+m/2+1} = 2^{s(m)d(d−1)+m/2+1}
multiplications, which is much larger than the total number of multiplications required for an exhaustive search on each row of xA ∈ S′, namely,
m · 2^{dm}.
Attack 3
Suppose that Eve attempts to determine x A directly from the formula of Alice’s public
key y A :
(yA)a,b = ∏l∈{1,...,d} ((g◦NB)a,l)^{(xA)l,b}. (27)
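Formula (27) is a "matrix product" in which entrywise multiplication is replaced by modular exponentiation and addition by multiplication. A small sketch (the function name and toy values are our own):

```python
def schur_exp_product(G, X, p):
    """out[a][b] = product over l of G[a][l] ** X[l][b] (mod p), as in (27)."""
    d = len(G)
    out = [[1] * d for _ in range(d)]
    for a in range(d):
        for b in range(d):
            for l in range(d):
                out[a][b] = out[a][b] * pow(G[a][l], X[l][b], p) % p
    return out

# Alice's public key is then yA = schur_exp_product(G_NB, xA, p),
# with G_NB the public matrix g∘NB.
```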
(yA)a,b = ∏l∈{1,...,d} ((g◦NB)a,l)^{(X)l,b} (28)
for all a, b ∈ {1, . . . , d} and given y A , g◦ NB . The least complexity for finding X from (28) is
expressed by the following theorem:
Theorem 1. The complexity of solving (28) for X ∈ S is equal to or greater than the complexity of
obtaining NB from logg g◦ NB .
Proof. Suppose that Eve can determine X = xA ∈ S′ from (28) for a given g◦NB and yA with a complexity denoted by the function T : N → R+ defined as T(m) := p(m), where p is a polynomial. Under this assumption, Eve can solve system (28) even when xA ∈ S (thus, X ∈ S) because, in this case, the complexity is T(s(m)) = p(s(m)), which is also
a polynomial. It should be noted that solving (28) is equivalent to solving the following
system for all a, b ∈ {1, . . . , d}:
(yA)^t_{a,b} = ∏l∈{1,...,d} ((g◦NB)^t_{l,b})^{(X)^t_{a,l}} = ∏l∈{1,...,d} ((g◦NB)^t_{l,b})^{(xA)^t_{a,l}} (29)
for all a, b ∈ {1, . . . , d}. This emphasizes that Eve can solve the system of both variable
matrices by left multiplication and right multiplication on the Schur exponent of a given
matrix. Therefore, once she has obtained X ∈ S and calculates g◦ X within polynomial time,
the following system can also be solved for an unknown NB ∈ S :
(yA)a,b = ∏l∈{1,...,d} ((g◦X)l,b)^{(NB)a,l} (30)
Requirements
• n: sufficiently large for one DLP;
• m: 2^{dm} must be sufficiently large for an exhaustive search.
As mentioned in Section 4, the complexity of one n-bit logarithm, denoted by the function T : N → R+, is T(n) = T(s(m)) = 2^{s(m)/2}, and the complexity of an exhaustive search for each row of xA, denoted by the function T′ : N → R+, is T′(m) = 2^{dm}. (In fact, Eve must solve d^2 DLPs or d exhaustive searches over a space of size 2^{dm}, but the factors d^2 and d are so trivial for exponential functions that we consider these complexities for only one number.) If Alice and Bob would like the breaking complexity for one DLP and that for an exhaustive search for one row of xA to be equalized, the polynomial s : N → N mentioned at the beginning of Section 4.3;
that is, the relation between n and m, is obtained by the following theorem:
Theorem 2. If the time complexity for obtaining one n-bit logarithm is equal to the complexity of an exhaustive search within a vector space of size 2^{dm}, the polynomial s : N → N for m ∈ N is obtained by
s(m) = 2dm.
Proof. We obtain
T(n) = T(s(m)) = T′(m) ⇐⇒ s(m)/2 = dm ⇐⇒ s(m) = 2dm. (31)
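The relation of Theorem 2 can be read off directly; the (m, d) pairs below are our own example values, with n = 1024 matching the parameter used in Section 6:

```python
def n_from(m, d):
    """Theorem 2: n = s(m) = 2*d*m equalizes one n-bit DLP (2**(n/2) steps)
    and an exhaustive search over one row of xA (2**(d*m) candidates)."""
    return 2 * d * m

assert n_from(128, 4) == 1024
assert 2 ** (n_from(128, 4) // 2) == 2 ** (4 * 128)   # equal breaking costs
```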
With this relation, what does the computational complexity for Alice and Bob become? From the calculations of the form g◦X in (32) and (33), Alice's computational complexities for yA and her SSK, denoted by the functions TyA, TκA : N → R+ of m, become
TyA(m) := d^3 m (34)
TκA(m) := d^3 m. (35)
From the calculations of g◦ x B , g◦ NB ,
and (16), Bob’s computational complexities for
y B,1 , y B,2 , and their SSK, denoted by the functions TyB,1 , TyB,2 , Tκ B : N → R+ of n, become
TκB(n) := d^3 n (38)
As n = s(m) = 2dm, TyB,1, TyB,2, and TκB can be described as functions of m as follows:
TyA(m) + TκA(m) = 2d^3 m < 4d^3 m + 2d^4 m = TyB,1(s(m)) + TyB,2(s(m)) + TκB(s(m)). (39)
When d is given, TyA(m) + TκA(m) and TyB,1(s(m)) + TyB,2(s(m)) + TκB(s(m)) have the same order of complexity; 2d^4 and 2d^3 are considered as coefficients.
Of course, we can also consider the total complexities for Alice and Bob by the functions TA, TB : N → R+ of d, which, from (39), are defined as
TA(d) := 2md^3, TB(d) := 4md^3 + 2md^4.
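The comparison (39) can be sketched as a multiplication-count model (a sketch under the coefficients stated in the text, with n = 2dm from Theorem 2):

```python
def alice_cost(m, d):
    """T_yA(m) + T_kA(m) = d**3*m + d**3*m, from (34) and (35)."""
    return 2 * d**3 * m

def bob_cost(m, d):
    """T_yB1 + T_yB2 + T_kB as a function of m via n = 2*d*m,
    i.e. 4*d**3*m + 2*d**4*m as in (39)."""
    return 4 * d**3 * m + 2 * d**4 * m

m, d = 128, 4
assert alice_cost(m, d) < bob_cost(m, d)   # inequality (39): Alice works less
```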
Remark 2. As mentioned in Remark 1, Alice can select her secret key xA not only as a matrix, but also as a vector, whereas xB and NB are selected as matrices. Let V be the vector space V := Zdp, where |y| ≤ dn (y ∈ V), let V′ := {y ∈ V : |y| ≤ dm} ⊂ V, and suppose that xA is selected from V′. In this case, the breaking strategy against this algorithm for Eve is as follows:
• solving d^2 DLPs for NB and d DLPs for xA, and
• an exhaustive attack for xA within V′, the size of which is 2^{dm}.
That is, the breaking complexity is equivalent to the case when xA ∈ S′ in terms of the order of complexity. Thus, there is also no problem in selecting n and m from the relation of Theorem 2 in
this case. The computational complexities for y A , κ A , and κ B , which are denoted by the functions
Ty A ,vec , Tκ A ,vec , and Tκ B ,vec : N → R+ , are
TyA,vec(m) := d^2 m
TκA,vec(m) := d^2 m
TκB,vec(n) := d^2 n
for a given d. It can be observed that the complexities are reduced from (34), (35), and (38). Moreover,
the calculation of (33) is independent for each element a ∈ {1, . . . , d}, and thus, it can be computed
in parallel. The calculation is also parallelized for each l ∈ {1, . . . , d}. In Section 6, we report the
results of the implementational experiments of SEDH in the case of x A ∈ V . To determine how
rapidly the calculation is performed, particularly for Alice, we compare the calculation speed with
that of the usual DH algorithm.
5. Generalization
In this section, all investigations are conducted under the following assumptions:
• the numbers n, m ∈ N and a non-linear, monic polynomial s, where n = s(m) > m for all m ∈ N;
• a number p, where | p| = n;
• a semi-group S , where |y|S ≤ kn (y ∈ S , k ∈ N);
• a subset S′ := {y ∈ S : |y|S ≤ km} ⊂ S; and
• xA ∈ S′.
In the above, | · |S is the same notation as that in Section 4.1. Furthermore, the constant
k is uniquely determined by the properties of S , such as the dimensions of the matrices
and vectors. For example, k = 1 when S = Z p and k = d2 when S = M(d, Z p ).
As mentioned in Section 2.3, we first construct a SAPKA subclass, in which the security
of each algorithm is ensured by a non-easily invertible map (Definition 1). Thereafter, we
attempt to construct the main SAPKA subclass mentioned in Section 2.2. We introduce
inclusion relations between other subclasses and necessary conditions for algorithms to
belong to the main subclass in Section 5.2.
We present the definition of the subclass based on the non-easily invertible map.
x1 ◦ x2 ( y ) = x3 ◦ x4 ( y ) (40)
x1 ◦ x2 ( y ) = f g ◦ x1 ◦ x2 ( y ) (41)
x1 ◦ x2(e) ∉ S ∨ N1−1 ◦ x4(e) ∉ S (43)
are satisfied for all y ∈ S , where e is an identity element of S , the quintuple C is a member of the
SAPKA f g class, and we express this relation as
C ∈ SAPKA f g .
From (40) and (41), and the compatibility condition of (13), the equation
x1 ◦ x2 ( y ) = f g ◦ x1 ◦ x2 ( y ) = f g ◦ x3 ◦ x4 ( y ) = x3 ◦ x4 ( y ) (44)
Table 4. Public keys and SSK of Alice and Bob for the SAPKAfg class.
Bob
  Public keys: yB,1(y) = N1−1 ◦ x4(y) = fg ◦ N1−1 ◦ x4(y) (y ∈ S);
               yB,2(y) = x1 ◦ x2(y) = fg ◦ x1 ◦ x2(y) (y ∈ S)
  SSK: κB = x3 ◦ N1(yA) = fg ◦ x3 ◦ N1 ◦ N1−1 ◦ x4(xA) ∵ (44)
Alice
  Public key: yA = N1−1 ◦ x4(xA) = fg ◦ N1−1 ◦ x4(xA)
  SSK: κA = x1 ◦ x2(xA) = fg ◦ x1 ◦ x2(xA)
Moreover, (41) and (42) ensure that the maps x1 ◦ x2 and N1−1 ◦ x4 and thus the
element x A , cannot be disclosed by Eve within polynomial time. If x1 ◦ x2 (e) ∈ S and
N1−1 ◦ x4 (e) ∈ S , the complexities for calculating
x1 ◦ x2 (ye) = yx1 ◦ x2 (e) = yx3,E ◦ N1,E ◦ N1−1 ◦ x4 (e) = x3,E ◦ N1,E ◦ N1−1 ◦ x4 (ye)
or
x1 ◦ x2 (ye) = x1 ◦ x2 (e)y = x3,E ◦ N1,E ◦ N1−1 ◦ x4 (e)y = x3,E ◦ N1,E ◦ N1−1 ◦ x4 (ye)
holds for all y ∈ S . Thus, only if Eve knows the calculation rule of map x3 ◦ N1 can she
obtain a map x3,E ◦ N1,E to satisfy
Of course, if the maps x1, x2, x3, x4, N1 are not linear, this attack may be impossible, but if (43) holds and n is sufficiently large to make the calculation of fg−1 ◦ fg(y) (y ∉ S′) difficult in real time, Eve cannot proceed to (46). Thus, Requirement 2 of Section 2.1 is
achieved for all algorithms in the SAPKA f g class.
• x1(y) := xB y
• x2 := id
• x3(y) := xB y
• x4 := id
• N1(y) := NB−1 y
• fg(y) := g◦y,
(40) holds, and (41) and (42) are satisfied according to Table 6.
Definition 7. For the quintuple C such that C ∈ SAPKA f g , if N1 = x4 = id, we state that C is
a member of the symmetric SAPKA f g class, and we express this relation as
C ∈ SAPKA f g ,symmetry .
where y ∈ S . This means that Bob sending y B,1 is equivalent to simply informing Alice of
the construction rule of y A . Alice can calculate y A without any secret/public information
from Bob. Symmetry means that the number of public keys that both Alice and Bob con-
struct is essentially one, and they can construct their public keys without any public/secret
information from the other.
For CNDH in Section 3.2, if the parameter n B is equal to 1, CNDH describes the usual
DH. We denote this case of CNDH as CDH , and we obtain
Definition 8. For a given non-easily invertible map fg : S → S, if the set to which the input y belongs is S′, we refer to the input y as a km-bit fg−1 element. If the set to which the input y belongs is S, we refer to the input y as a kn-bit fg−1 element.
The complexity of obtaining the kn-bit fg−1 element means the complexity of calculating fg−1 ◦ fg(y) when y ∈ S (for the km-bit fg−1 element, y ∈ S′).
We define the following functions of bit size that demonstrate the complexity of calcu-
lating each public key and SSK as well as the complexity for Eve to obtain a certain key:
• TyB,1 : N → R+ : the complexity required for Bob to construct y B,1 ;
• TyB,2 : N → R+ : the complexity required for Bob to construct y B,2 ;
• Ty A : N → R+ : the complexity required for Alice to calculate y A ;
• Tκ A : N → R+ : the complexity required for Alice to calculate her SSK κ A ;
• Tκ B : N → R+ : the complexity required for Bob to calculate their SSK κ B ;
• TEve xA : N → R+ : the least complexity required for Eve to obtain Alice's secret key xA from public information, where the input of TEve xA is m; and
• Tfg−1 : N → R+ : the complexity required to compute fg−1 ◦ fg(y) for map fg, where the input of Tfg−1 is m if y ∈ S′ and n if y ∈ S.
Furthermore, for TEve xA and Tfg−1, we define the following functions to describe their orders:
• hEve xA : N → R+, to obtain the relation TEve xA ∈ O(hEve xA), and
• hfg−1 : N → R+, to obtain the relation Tfg−1 ∈ O(hfg−1).
Example 1.
• If TyB,1 , TyB,2 , and Tκ B are defined as functions of n, Ty A and Tκ A are defined as functions of
m, and the polynomial s is concretely determined, we can compare the total computational
complexity for Alice and Bob as per Section 4.5.
• The least complexity for Eve to obtain xA in Section 3.3 is determined as
TEve xA(m) ≥ d^2 · 2^{s(m)/2}.
• The complexity of obtaining the d^2 n-bit fg−1 elements, where fg is the non-easily invertible map of CSE as defined in Table 6, is expressed as
Tfg−1(n) = Tfg−1(s(m)) = d^2 · 2^{s(m)/2}.
In general, the complexity of obtaining the kn-bit fg−1 element for a given non-easily invertible map fg : S → S is described by Tfg−1(n) = Tfg−1(s(m)), and that of the km-bit fg−1 element is Tfg−1(m). We can compare not only the complexity but also the order of complexity of the two using the map hfg−1 defined above, as per the following Remark 3.
g
Remark 3. When the order of complexity for computing the km-bit fg−1 element for map fg is equal to O(2^{am}), where a ∈ R+ \ {0}:
holds, and the equality holds if and only if s is a linear monic polynomial, n = s(m) = m + t, for some t ∈ N. This can be proven in a similar manner to Proposition 1; in our setting, s is restricted to be a non-linear, monic polynomial, so the equality never holds.
When the order of complexity for computing the km-bit fg−1 element is greater than O(2^{am}), although the sufficient and necessary condition of s for the equality of (47) may not be the same, the strict inequality still holds because, for all m, the ratio hfg−1(s(m)) / hfg−1(m) is much larger and cannot be a constant in this case. This means that when s is a non-linear, monic polynomial, the problem of obtaining the kn-bit fg−1 element by computing fg−1 ◦ fg(y) (y ∈ S) and the problem of obtaining the km-bit fg−1 element by computing fg−1 ◦ fg(y) (y ∈ S′) belong to different orders of complexity classes. Furthermore, needless to say, Tfg−1(s(m)) > Tfg−1(m) holds.
holds, C is a member of the computationally unbiased SAPKA f g class, and we write this relation as
C ∈ SAPKA f g ,A= B .
Relation (49) indicates that the breaking complexity of an algorithm that is constructed from C ∈ SAPKAfg,A<B is equal to or greater than the complexity that is required to obtain the kn-bit fg−1 element, even if xA is selected from S′. Furthermore, (51) indicates that when C ∈ SAPKAfg,A=B, the breaking complexity is equal to or less than that of obtaining the km-bit fg−1 element. As noted in Remark 3, not only does Tfg−1(s(m)) > Tfg−1(m) hold, but the orders of complexity are also different, so there cannot exist a C such that C ∈ SAPKAfg,A<B ∩ SAPKAfg,A=B. Thus, we obtain Lemma 1 in Section 5.2.
According to the attack introduced in Section 4.3, we obtain the following results.
Theorem 3. For CSE in Section 3.3, if Attack 2 of Section 4.3 and the exhaustive attack for xA ∈ S′ are the only attacks that avoid calculating logg g◦NB = NB, or if other attacks exist but their order of complexity is equal to or greater than O(2^{dm}),
Proof. According to Attack 3 in Section 4.3 and the descriptions of TEve xA(m) and Tfg−1(s(m)) in Example 1, we obtain
TEve xA(m) ≥ d^2 · 2^{s(m)/2} = Tfg−1(s(m)).
TyB,1(s(m)) + TyB,2(s(m)) + TκB(s(m)) = 4md^3 + 2md^4 > TyA(m) + TκA(m) = 2md^3
yA = fg′(xA)
for all y ∈ S, m ∈ N.
Proof. We prove the contrapositive. We assume that for C ∈ SAPKAfg,A<B, there exists a map fg′ : S → S satisfying
fg ◦ N1−1 ◦ x4(y) = fg′(y) (54)
and
Tfg′−1 ≤ Tfg−1, (55)
so that
yA = fg ◦ N1−1 ◦ x4(xA) = fg′(xA).
This indicates that xA is a km-bit fg′−1 element for the map fg′. In combination with (55), the relation
TEve xA(m) = Tfg′−1(m) ≤ Tfg−1(m)
Corollary 2.
CNDH ∈ SAPKA f g ,A= B
Remark 4. Theorem 1 of Section 4.3 implies that there does not exist a map fg′ satisfying fg′(y) = fg ◦ N1−1 ◦ x4(y) with Tfg′−1 ≤ Tfg−1 for the algorithm of CSE.
Corollary 3.
SAPKA f g ,symmetry ⊂ SAPKA f g ,A= B .
Moreover, CNDH satisfies CNDH ∉ SAPKAfg,symmetry but CNDH ∈ SAPKAfg,A=B according to Corollary 2. Thus, SAPKAfg,symmetry ⊂ SAPKAfg,A=B is obtained.
Figure: Inclusion relations among the SAPKA subclasses, with SAPKAfg,symmetry as the innermost class.
as well as the necessary conditions (52) and (53) as sufficient conditions for the SAPKA f g ,A< B
class have not yet been confirmed. These will be investigated in our subsequent paper.
6. Implementational Experiments
In this section, we report the simple experimental results. All experiments were
performed in the following environment:
• OS: Windows 10 Pro
• Processor: 2.90 GHz Intel Core i7-10700
• RAM: 8 GB
• Language: JAVA
Tables 7–9 present the calculation times for Alice (yA, κA) and Bob (yB,1, yB,2, κB) according to d and m while n was fixed at 1024, 3072, and 5120 bits. As can be observed
from Tables 7–9, the calculation time for Bob increased as d increased. This is because the
calculation complexities of (32) and (33) were dependent only on d, whereas n was fixed.
For Alice’s computation, there were cases in which the calculation time decreased as d
increased (Tables 8 and 9). This can be explained by (32) and (33) for each a ∈ {1, . . . , d},
being independent, and an increase in d resulted in a decrease in m. Therefore, these values
can be computed in parallel with a smaller m when d increases. When d > 4, the calculation time increases despite the decrease in the size of m, owing to a shortage of processor cores for efficient parallel computing.
Moreover, the difference in the order of complexity for Alice and Bob can be observed
from Figure 4, where m was fixed as 128 bits and d increased from 2 to 16.
Table 7. Calculation time (ms) for Alice and Bob when n = 1024.
Table 8. Calculation times (ms) for Alice and Bob when n = 3072.
Table 9. Calculation times (ms) for Alice and Bob when n = 5120.
Figure 4. Comparison of calculation times (ms) for Alice and Bob according to d (m = 128).
We verified how rapidly the calculation of Alice could be achieved by comparing the
speed with that of the usual DH. Table 10 and Figure 5 compare the calculation times of
SEDH and DH for y A and κ A for the same security level (n) from 1024 to 7168 bits. The
calculation speeds in the following table and figure are the averages of 50 calculations.
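The comparison can be reproduced in spirit with a simple timing harness (a sketch of our own, not the authors' Java implementation; the modulus and exponent shapes are our assumptions):

```python
import secrets
import time

def avg_exp_time(p, g, exp_bits, reps=50):
    """Average time of one modular exponentiation with an exp_bits-bit exponent,
    the dominant operation in computing a public key and an SSK."""
    t0 = time.perf_counter()
    for _ in range(reps):
        x = secrets.randbits(exp_bits) | 1
        pow(g, x, p)
    return (time.perf_counter() - t0) / reps

# A 1024-bit odd modulus; for timing purposes, primality is irrelevant here.
p = (1 << 1024) - 159
t_dh = avg_exp_time(p, 3, 1024)   # DH Alice: one n-bit exponentiation
t_se = avg_exp_time(p, 3, 128)    # SEDH Alice: m-bit exponents (m = n/(2d), d = 4)
assert t_se < t_dh                # smaller exponents make Alice's side faster
```

With square-and-multiply, an m-bit exponent costs roughly m/n of an n-bit one, which is the trend Table 10 and Figure 5 illustrate for Alice.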
Table 10. Comparison of calculation times (ms) for Alice with DH.
Figure 5. Calculation times (ms) of SEDH (Alice) and DH (Alice) according to the length of n (bits).
7. Conclusions
In this study, we have estimated the breaking complexity and computational com-
plexity of two SAPKA class algorithms. One of these potentially possesses the property
whereby the security is less dependent on the secret key space of Alice than that of Bob. This
property allows Alice to calculate her SSK with less complexity than Bob in the algorithm.
Moreover, we generalized these algorithms and constructed SAPKA subclasses ac-
cording to the security/efficiency properties. We constructed several subclass algorithms
in which the above property holds. We expect that algorithms in this class will aid in
constructing secure communication infrastructures, even with a small capability of devices
for one side (Alice’s computational asymmetry). The necessary conditions for this class
and inclusion relations with other SAPKA subclasses were also investigated.
In the following Future Works section, we list several open points, including new problems obtained from this study (indicated by •) and problems that are not discussed in this paper but must be investigated for the practical use of the presented algorithms and frameworks (indicated by ◦).
Future Works
• Is it true that SAPKA f g ,A< B ∪ SAPKA f g ,A= B = SAPKA f g ? (Remark 5)
• Is it true that if the relations (52) and (53) hold for C ∈ SAPKA f g , C is a member of
the SAPKA f g ,A< B class? That is, are (52) and (53) also sufficient conditions for the
SAPKAfg,A<B class? As mentioned in Section 2.2, this paper does not show how to construct secure PKA algorithms that are resilient to all types of theoretical attacks. However, if sufficient conditions for the SAPKAfg,A<B class are found, algorithms that cannot be broken within polynomial time can easily be constructed, because such algorithms are protected by the non-easily invertible map fg (page 23). Of course, the conditions might differ depending on the required security level, but establishing sufficient conditions would certainly help us show how to construct secure PKA algorithms.
• Does S need to be selected as a multiplicative semi-group/group? For example, the
algorithms in Section 3.3 are effective even in vector space, which forms an additive
group (Remark 1). We should investigate how we can construct PKA algorithms
on algebraic structures other than multiplicative semi-groups and then compare the breaking and computational complexities with those of the original algorithms.
◦ We should attempt to describe algorithms such as LWE-based or NTRU-type algo-
rithms with the SAPKA parameters and verify whether these algorithms belong to
the SAPKA f g ,A< B class. Otherwise, would it be possible to modify these to belong to
the SAPKA f g,A< B class?
◦ It is well known that key agreement algorithms are vulnerable to man-in-the-middle (MITM) attacks, and algorithms in the SAPKA class are no exception without additional measures. Moreover, the number of public keys, especially for Bob, is larger than in symmetric-type algorithms, so the way to manage public keys for SAPKA algorithms must be discussed further. Adaptability to techniques such as digital signatures [21,22] and public key certification [23] must also be well discussed.
◦ As mentioned in [1], the SSKs that Alice and Bob calculate should be ephemeral, and each key must have sufficient randomness. For the practical use of the algorithms introduced in this paper, whether the public keys and SSKs are sufficiently random to prevent Eve from deducing the SSK must be investigated.
Author Contributions: Conceptualization, K.J., S.I. and M.R.; Formal analysis, K.J. and S.I.; Imple-
mentation, K.J.; Writing—original draft, K.J.; Writing—review and editing, K.J., S.I. and M.R. All
authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
References
1. Shannon, C.E. Communication Theory of Secrecy Systems. Bell Syst. Tech. J. 1949, 28, 656–715. [CrossRef]
2. Diffie, W.; Hellman, M. New directions in cryptography. IEEE Trans. Inf. Theory 1976, 22, 644–654. [CrossRef]
3. Rivest, R.L.; Shamir, A.; Adleman, L. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM
1978, 21, 120–126. [CrossRef]
4. Adrian, D.; Bhargavan, K.; Durumeric, Z.; Gaudry, P.; Green, M.; Halderman, J.A.; Heninger, N.; Springall, D.; Thomé, E.;
Valenta, L.; et al. Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice. In Proceedings of the 22nd ACM SIGSAC
Conference on Computer and Communications Security, Denver, CO, USA, 12–16 October 2015; pp. 5–17.
5. Shor, P.W. Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer. SIAM J.
Comput. 1997, 26, 1484–1509. [CrossRef]
6. Bernstein, D.J.; Chuengsatiansup, C.; Lange, T.; van Vredendaal, C. NTRU Prime: Reducing Attack Surface at Low Cost.
In Selected Areas in Cryptography—SAC 2017; Lecture Notes in Computer Science; Adams, C., Camenisch, J., Eds.; Springer: Cham,
Switzerland, 2017; Volume 10719. [CrossRef]
7. Hülsing, A.; Rijneveld, J.; Schanck, J.; Schwabe, P. High-Speed Key Encapsulation from NTRU. In Cryptographic Hardware and
Embedded Systems—CHES 2017; Lecture Notes in Computer Science; Fischer, W., Homma, N., Eds.; Springer: Cham, Switzerland,
2017; Volume 10529. [CrossRef]
8. Hoffstein, J.; Pipher, J.; Silverman, J.H. NTRU: A ring-based public key cryptosystem. In Algorithmic Number Theory—ANTS 1998;
Lecture Notes in Computer Science; Buhler, J.P., Ed.; Springer: Berlin/Heidelberg, Germany, 1998; Volume 1423. [CrossRef]
9. Bos, J.; Ducas, L.; Kiltz, E.; Lepoint, T.; Lyubashevsky, V.; Schanck, J.M.; Schwabe, P.; Seiler, G.; Stehle, D. CRYSTALS—Kyber: A
CCA-Secure module-lattice-based KEM. In Proceedings of the 2018 IEEE European Symposium on Security and Privacy (EuroS
and P), London, UK, 24–26 April 2018; pp. 353–367. [CrossRef]
10. Alkim, E.; Ducas, L.; Poppelmann, T.; Schwabe, P. Post-quantum key exchange—A new hope. In Proceedings of the 25th USENIX
Security Symposium (USENIX Security 16), Austin, TX, USA, 10–12 August 2016; pp. 327–343.
11. Post-Quantum Cryptography Competition Round 2 Submissions. Available online: https://csrc.nist.gov/projects/post-quantum-
cryptography/round-2-submissions (accessed on 16 March 2021).
12. PQC Standardization Process: Third Round Candidate Announcement. Available online: https://csrc.nist.gov/News/2020/pqc-
third-round-candidate-announcement (accessed on 16 March 2021).
13. Micciancio, D.; Regev, O. Worst-case to average-case reductions based on Gaussian measures. J. Comput. 2007, 37, 267–302.
[CrossRef]
14. Regev, O. On lattices, learning with errors, random linear codes, and cryptography. J. ACM 2009, 56, 1–40. [CrossRef]
15. Laine, K.; Lauter, K. Key Recovery for LWE in Polynomial Time. IACR Cryptology ePrint Archive 2015, no. 176. Available online:
https://eprint.iacr.org/2015/176.pdf (accessed on 14 June 2021).
16. Peikert, C. Public-key cryptosystems from the worst-case shortest vector problem. In Proceedings of the forty-first annual ACM
symposium on Theory of computing, Bethesda, MD, USA, 31 May–2 June 2009; pp. 333–342.
17. Coppersmith, D.; Shamir, A. Lattice Attacks on NTRU. In Advances in Cryptology—EUROCRYPT ’97; Lecture Notes in Computer
Science; Fumy, W., Ed.; Springer: Berlin, Germany, 1997; Volume 1233. [CrossRef]
18. Accardi, L.; Iriyama, S.; Regoli, M.; Ohya, M. Strongly Asymmetric Public Key Agreement Algorithms; Technical Report ISEC2011-20;
IEICE: Tokyo, Japan, 2011; pp. 115–121.
19. Pollard, J. Monte Carlo Methods for Index Computation (mod p). Math. Comput. 1978, 32, 918–924.
20. Pohlig, S.; Hellman, M. An improved algorithm for computing logarithms over GF(p) and its cryptographic significance (Corresp.). IEEE Trans. Inf. Theory 1978, 24, 106–110. [CrossRef]
21. Lamport, L. Constructing Digital Signatures from a One-Way Function; Technical Report CSL-98; SRI International: Menlo Park, CA,
USA, 1979; Volume 238.
22. Merkle, R.C. A Certified Digital Signature, Conference on the Theory and Application of Cryptology; Springer: New York, NY, USA, 1989.
23. Cooper, D.; Santesson, S.; Farrell, S.; Boeyen, S.; Housley, R.; Polk, W. Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile; RFC 5280, 2008; pp. 1–151. Available online: https://datatracker.ietf.org/doc/html/rfc5280 (accessed on 14 June 2021).
applied
sciences
Article
The Design and FPGA-Based Implementation of a Stream
Cipher Based on a Secure Chaotic Generator
Fethi Dridi 1,2 , Safwan El Assad 2, *, Wajih El Hadj Youssef 1 , Mohsen Machhout 1 and René Lozi 3
1 Electronics and Microelectronics Laboratory (EmE), Faculty of Sciences of Monastir, University of Monastir,
5019 Monastir, Tunisia; [email protected] (F.D.); [email protected] (W.E.H.Y.);
[email protected] (M.M.)
2 IETR (UMR 6164) Laboratory, CNRS, University of Nantes, F-44000 Nantes, France
3 J. A. Dieudonné (UMR 7351) Laboratory, CNRS, University of Cote d’Azur, 06103 Nice, France;
[email protected]
* Correspondence: [email protected]
Abstract: In this study, we designed a secure chaos-based stream cipher (SCbSC) in VHDL on an FPGA board, and we evaluated its hardware implementation performance in terms of computational complexity and security. The fundamental element of the system is the proposed secure pseudo-chaotic number generator (SPCNG). The architecture of the proposed SPCNG includes three first-order recursive filters, each containing a discrete chaotic map and a mixing technique using an internal pseudo-random number (PRN). The three discrete chaotic maps, namely, the 3D Chebyshev map (3D Ch), the 1D logistic map (L), and the 1D skew-tent map (S), are weakly coupled by a predefined coupling matrix M. The mixing technique, combined with the weak coupling of the three chaotic maps, protects the system against side-channel attacks (SCAs). The proposed system was implemented on a Xilinx XC7Z020 PYNQ-Z2 FPGA platform. Logic resources, throughput, and cryptanalytic and statistical tests showed a good tradeoff between efficiency and security. Thus, the proposed SCbSC can be used as a secure stream cipher.
Keywords: chaos-based stream cipher; SPCNG; 3D Chebyshev; logistic; skew-tent; FPGA; performance

Citation: Dridi, F.; El Assad, S.; El Hadj Youssef, W.; Machhout, M.; Lozi, R. The Design and FPGA-Based Implementation of a Stream Cipher Based on a Secure Chaotic Generator. Appl. Sci. 2021, 11, 625. https://doi.org/10.3390/app11020625

Received: 30 November 2020; Accepted: 5 January 2021; Published: 11 January 2021

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

The protection of information against unauthorized eavesdropping and exchange is essential, in particular for military, medical, and industrial applications. Nowadays, cryptographic attacks are increasingly numerous and sophisticated; consequently, new effective and fast techniques of information protection have appeared or are under development. In this context, recent works have focused on designing new chaos-based algorithms, which provide reliable security while minimizing hardware cost and computing time. Chaos theory was first observed in a computer simulation by Edward Lorenz in 1963 [1]. A chaotic system, although deterministic and not truly random, has unpredictable behavior due to its high sensitivity to the initial conditions and control parameters, which constitute the secret key. It can generate an aperiodic analog signal whenever its phase space is continuous (i.e., with an infinity of values). However, when its phase space is discrete (with a finite set of values), its orbits must be periodic, even if with a very long period.

In the field of chaos-based digital communication systems, the chaotic signal has been one of the main concerns in recent decades and is widely used to secure communication. In chaos-based cryptography, discrete chaotic maps are used in most chaotic systems (encryption, steganography, watermarking, hash functions) to generate pseudo-random chaotic sequences with robust cryptographic properties [2–10]. In a stream cipher, the pseudo-random number generator (PRNG) is the most important component since all the security of the system depends on it. For this reason, a new category of pseudo-chaotic number generator
(PCNG) has been recently built to secure stream-data [11–14]. These PCNGs use combined
chaotic maps because single chaotic maps are not secure for use in stream ciphers.
In 2017, M. Abu Taha et al. [15] designed a novel stream cipher based on an efficient
chaotic generator; the results obtained from the cryptographic analysis and of common
statistical tests indicate the robustness of the proposed stream cipher. In 2018, Ons et al. [16]
developed two new stream ciphers based on pseudo-chaotic number generators (PCNGs)
that integrate discrete chaotic maps and use the weak coupling and switching technique
introduced by Lozi [17]. Indeed, the obtained results show that the proposed stream ciphers
can be used in practical applications, including secure network communication.
In 2019, Ding et al. [18] proposed a new lightweight stream cipher in which a chaotic system and two nonlinear feedback shift registers (NFSRs) are used.
The results show that the stream cipher has good cryptographic characteristics. In 2020,
Abdelfatah et al. [19] proposed several efficient multimedia encryption techniques based
on four combined chaotic maps (Arnold Map, Lorenz Map, Chebyshev Map, and logistic
Map) using serial or parallel connections. With the rapid growth of Internet of Things
(IoT) technology that connects devices with low power consumption and low computing
resources, the hardware implementation of chaotic and non-chaotic ciphers is more suitable
than a software implementation. Note that few chaotic encryption systems have been realized in hardware [20–22].
In this study, we designed an efficient chaos-based stream cipher (SCbSC) using a
proposed secure PCNG. Then, we addressed the hardware implementation and evaluated
the performance in terms of resilience against cryptanalytic attacks and in terms of hard-
ware metrics (areas, throughput, and efficiency). The proposed system uses three weakly
coupled chaotic maps (3D Chebyshev, logistic, and skew-tent) and integrates a masking
technique in the recursive cells to resist side-channel attacks (SCAs). Its implementation on
a Xilinx XC7Z020 PYNQ-Z2 FPGA hardware platform achieves a throughput of 1.1 Gbps at an operating frequency of 37.25 MHz.
The main contributions of the proposed chaotic system are as follows. First of all, the introduction of countermeasures against side-channel attacks (SCAs), achieved using a masking technique on the recursive cells, and against divide-and-conquer attacks on the initial vector (IV), achieved using a weak coupling matrix. Second, its hardware implementation on a Xilinx XC7Z020 PYNQ-Z2 FPGA platform and the evaluation of its performance in terms of computational complexity and security.
The remainder of this paper is organized as follows. Section 2 presents the architecture of the proposed secure chaos-based stream cipher. Section 3 presents the hardware implementation of the proposed secure pseudo-chaotic number generator (SPCNG) on the Xilinx XC7Z020 PYNQ-Z2 FPGA platform and analyzes its performance. Section 4 investigates the performance of the proposed SCbSC in terms of hardware metrics and cryptanalytic analysis. Finally, Section 5 summarizes the whole paper.
Figure 1. General scheme of the stream cipher: at encryption, the keystream generated from the secret key K and the IV is combined with the plaintext Pi to produce the ciphertext Ci; decryption applies the same keystream to recover the plaintext.

Figure 2. Architecture of the proposed SPCNG: three recursive cells built around the logistic, skew-tent, and 3D Chebyshev (3D Ch) maps, weakly coupled by the matrix [M]; the delayed products KL × XLC(n−1), KS × XSC(n−1), and KT × XTIC(n−1) are masked by the internal pseudo-random numbers PRNL, PRNS, and PRNT, and an LFSR producing Q(n) runs in parallel with the 3D Chebyshev map to form the output X(n).
The M-matrix weak coupling technique creates an interdependence between the three chaotic maps that prevents an attacker from using a divide-and-conquer approach on the first
128-bit IV. Indeed, for each new sample calculation, an attacker must take into account
the three chaotic maps together. Besides, the use of the logistic map and especially the
3D Chebyshev map (which we have discretized) adds robustness to the system against
algebraic attacks. Finally, the three recursive one-delay cells are protected against SCAs by
using a mixing technique based on three internal pseudo-random numbers: PRNL, PRNS,
and PRNT respectively, shown in red.
The proposed SPCNG takes as input an initial vector (IV) and a secret key (K). The IV
of the system provides the initial vectors of the three chaotic maps, IVL, IVS, and IVT; the
initial condition XS0 of the skew-tent map; and the initial seeds X0_L, X0_S, and X0_T (of
128 bits each) of the three pseudo-random numbers PRNL, PRNS, and PRNT. The output
of each PRN is of size N = 32 bits. The secret key K provides the initial conditions and
parameters of the SPCNG listed in Table 1.
Table 1. Initial conditions and parameters of the SPCNG provided by the secret key K.

Symbol — Definition
XL0 and XT0 — The initial conditions of the logistic and 3D Chebyshev chaotic maps, respectively, ranging from 1 to 2^N − 1.
XLC1, XSC1, and XTIC1 — The initial conditions of the delayed values in the recursive cells of the logistic, skew-tent, and 3D Chebyshev maps, respectively, in the range [1, 2^N − 1].
Q0 — The initial value of the Linear Feedback Shift Register (LFSR), whose feedback polynomial is x^32 + x^22 + x^2 + x + 1.

Note that XLC1, XSC1, and XTIC1 mean XLC(−1), XSC(−1), and XTIC(−1).
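As an illustration, one step of a 32-bit LFSR with this feedback polynomial can be sketched as follows (a minimal Fibonacci-style sketch in Python; the shift direction and bit ordering inside the hardware LFSR are our assumptions, not taken from the paper):

```python
def lfsr_step(q):
    """One step of a 32-bit Fibonacci LFSR with feedback polynomial
    x^32 + x^22 + x^2 + x + 1 (taps at bits 32, 22, 2, and 1).
    Shift direction and bit ordering are assumptions of this sketch."""
    fb = ((q >> 31) ^ (q >> 21) ^ (q >> 1) ^ q) & 1  # XOR of the tapped bits
    return ((q << 1) | fb) & 0xFFFFFFFF              # shift in the feedback bit
```

Starting from any nonzero seed Q0, the state never becomes all-zero, so the register keeps producing a usable sequence Q(n) to run alongside the 3D Chebyshev map.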
The models of the discrete logistic, skew-tent, and 3D Chebyshev maps are respectively
given by:
• The discrete logistic map [25]:
XL(n) = ⌊ XL(n−1) × (2^N − XL(n−1)) / 2^(N−2) ⌋   if XL(n−1) ∉ {3 × 2^(N−2), 2^N},
XL(n) = 2^N − 1                                    otherwise.   (1)
x(n) = x(n−1) / p               if 0 ≤ x(n−1) ≤ p,
x(n) = (1 − x(n−1)) / (1 − p)   if p ≤ x(n−1) ≤ 1.   (4)
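To make Equations (1) and (4) concrete, the two 1D maps can be sketched as follows (a Python sketch with N = 32; function and variable names are ours, and the floor division in Eq. (1) is written in integer arithmetic):

```python
N = 32
MOD = 1 << N  # 2^N

def logistic_discrete(x):
    """Discrete logistic map of Eq. (1) on N-bit integers."""
    if x not in (3 * (1 << (N - 2)), MOD):
        return (x * (MOD - x)) >> (N - 2)  # floor division by 2^(N-2)
    return MOD - 1

def skew_tent(x, p):
    """Real-valued skew-tent map of Eq. (4), with parameter 0 < p < 1."""
    return x / p if 0 <= x <= p else (1 - x) / (1 - p)
```

In the SPCNG itself the skew-tent map is used in a discretized N-bit form with parameter Ps; the real-valued version above only illustrates the shape of Eq. (4).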
Figure 3. (a) Mapping of the 3D Chebyshev map (3D Ch); (b) its attractor.
Figure 4. (a) Histogram of the 3D Ch; (b) histogram of the 3D Ch in parallel with a linear feedback
shift register (LFSR).
where XL(0), XS(0) and XT (0) are the initial values (inputs) of the three chaotic maps
defined as follows:
XL(0) = ( IVL + XL0 + KL × XLC1)
XS(0) = ( IVS + XS0 + KS × XSC1) (10)
XT (0) = ( IVT + XT0 + KT × XTIC1)
Afterward, for n ≥ 2 and n ≤ Ns , we calculate the samples by the following
relations:
XL(n) = Logistic[ mod(KL × XLC(n−1), 2^N) ]   (11)
XS(n) = SkewT[ mod(KS × XSC(n−1), 2^N), Ps ]   (12)
XT(n) = 3D Ch[ mod(KT × XTIC(n−1), 2^N) ]   (13)
where Ns is the number of the desired samples, and XLC (n − 1), XSC (n − 1), and
XTIC (n − 1) are the unmasked inputs of the three chaotic maps.
The coupling system is defined by the following relation:

[XLC(n), XSC(n), XTIC(n)]^T = M × [XL(n), XS(n), XTI(n)]^T,   (14)

where:

M = | M11  ε12  ε13 |
    | ε21  M22  ε23 |   (15)
    | ε31  ε32  M33 |
with M11 = (2 N − ε 12 − ε 13 ), M22 = (2 N − ε 21 − ε 23 ), and M33 = (2 N − ε 31 − ε 32 ).
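Since each row of M then sums to 2^N, and the state words are N-bit integers, the matrix–vector product of Equation (14) is naturally reduced modulo 2^N; under that assumption (ours), the coupling step can be sketched as:

```python
N = 32
MOD = 1 << N

def couple(xl, xs, xti, eps):
    """Weak coupling of the three map outputs by the matrix M of Eq. (15).

    eps = (e12, e13, e21, e23, e31, e32) are the small coupling constants;
    the diagonal terms are M11 = 2^N - e12 - e13, etc., so each row of M
    sums to 2^N. The modulo-2^N reduction is an assumption of this sketch.
    """
    e12, e13, e21, e23, e31, e32 = eps
    m11, m22, m33 = MOD - e12 - e13, MOD - e21 - e23, MOD - e31 - e32
    xlc = (m11 * xl + e12 * xs + e13 * xti) % MOD
    xsc = (e21 * xl + m22 * xs + e23 * xti) % MOD
    xtic = (e31 * xl + e32 * xs + m33 * xti) % MOD
    return xlc, xsc, xtic
```

Because the off-diagonal ε terms are small, each coupled state is dominated by its own map while still depending on the other two, which is what defeats a per-map divide-and-conquer attack.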
XL(n), XS(n), and XT (n) are the outputs of the chaotic maps: logistic, skew-tent, and
3D Chebyshev respectively, and
where XLCM(n), XSCM(n), and XTICM(n) represent the masked outputs of the recursive cells: logistic, skew-tent, and 3D Chebyshev, respectively, and PRNL(n), PRNS(n), and PRNT(n) are random integer values in the range [1, 2^N − 1], generated by a Xorshift-family pseudo-random number generator. To get the same output X(n) for the same secret key and the same IV, the masking operations are reversed at the inputs of the chaotic maps.

Note that the PRNs are based on the Xoshiro RNG developed by David Blackman and Sebastiano Vigna [29], which serves as a parameter module for the PRNs. The Xoshiro construction itself is based on the Xorshift concept invented by George Marsaglia [30]. The masking operation is an effective countermeasure to protect the implementation against power-analysis-based side-channel attacks (SCAs) [31,32]. Note that the VHDL implementation of these PRNs produces 32 bits at each clock cycle.
Algorithm 1 summarizes the full operation of the proposed SPCNG.
Samples generation:
M11 = (2^N − ε12 − ε13);
M22 = (2^N − ε21 − ε23);
M33 = (2^N − ε31 − ε32);
while n ≥ 2 and n < Ns do
    Internal state;
    Unmasking operations:
    XLC(n−1) × KL = XLCM(n−1) × KL ⊕ PRNL(n−1) × KL;
    XSC(n−1) × KS = XSCM(n−1) × KS ⊕ PRNS(n−1) × KS;
    XTIC(n−1) × KT = XTICM(n−1) × KT ⊕ PRNT(n−1) × KT;
    XL(n) = Logistic[ mod(KL × XLC(n−1), 2^N) ];
    XS(n) = SkewT[ mod(KS × XSC(n−1), 2^N), Ps ];
    XT(n) = 3D Ch[ mod(KT × XTIC(n−1), 2^N) ];
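The unmasking operations in Algorithm 1 rely on the involution property of XOR: applying the same PRN word twice restores the original state word. A minimal sketch of this countermeasure (names are ours):

```python
N = 32

def mask(x, prn):
    """Mask an N-bit internal state word with a fresh PRN word (XOR)."""
    return x ^ prn

def unmask(xm, prn):
    """XOR with the same PRN word recovers the original state word."""
    return xm ^ prn

# Round-trip: masking then unmasking is the identity.
state, prn = 0x12345678, 0x9ABCDEF0
assert unmask(mask(state, prn), prn) == state
```

Because the stored state is always XORed with a fresh random word, the power consumption of the register updates is decorrelated from the secret-dependent values, which is the point of the SCA countermeasure.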
[Figure: Xilinx design flow of the SPCNG. Design entry (Spcng.VHD with inputs IV and K, testbench Tb_spcng.VHD) is followed by behavioral simulation of the output X(n) with Xsim; synthesis (syntax check and design optimization); design implementation with the XDC constraints file (translate, map, place & route); and timing simulation to determine Fmax.]
Finally, we generated a programming file (BIT) to program the Xilinx device PYNQ-
Z2 FPGA.
Max. Freq. = 1 / (T − WNS) [MHz],   (19)

where T = 8 ns is the target clock period (F = 1/T = 125 MHz) and WNS is the worst negative slack of the clock signal in the intra-clock-paths section.

Efficiency = Throughput / Slices [Mbps/Slices].   (21)
The proposed SPCNG versions were implemented on a Xilinx XC7Z020 PYNQ-Z2
FPGA hardware platform.
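As a sanity check on Equation (19) and the timing figures of Table 2, the frequency and throughput of a generator emitting 32 bits per clock cycle can be computed as follows (a sketch; the WNS value is the one reported for the XOR-with-LFSR version):

```python
def max_freq_mhz(t_ns, wns_ns):
    """Achievable clock frequency (Eq. 19) from the target period T and
    the worst negative slack WNS, both in nanoseconds."""
    return 1000.0 / (t_ns - wns_ns)

def throughput_mbps(freq_mhz, bits_per_cycle=32):
    """Throughput of a generator producing bits_per_cycle bits per clock."""
    return freq_mhz * bits_per_cycle

f = max_freq_mhz(8.0, -18.018)  # T = 8 ns, WNS = -18.018 ns
```

With these inputs, f is about 38.43 MHz and the throughput about 1230 Mbps, matching the last column of Table 2.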
Table 2. Comparison of the proposed SPCNG design versions on the ZYNQ PYNQ-Z2 FPGA.

                                Chaotic Multiplexing          XOR Operation
Versions                        Without LFSR  With LFSR       Without LFSR  With LFSR
Area: LUTs                      3744/7.04%    3763/7.07%      3586/6.74%    3599/6.77%
Area: FFs                       1066/1%       1130/1.06%      1064/1%       1128/1.06%
Area: Slices *                  1079/8.11%    1087/8.17%      1031/7.75%    1029/7.74%
Area: DSPs                      25/11.36%     25/11.36%       22/10%        22/10%
Speed: WNS [ns]                 −18.968       −19.062         −19.632       −18.018
Speed: Max. Freq. [MHz]         37.08         36.95           36.18         38.43
Speed: Throughput [Mbps]        1186.59       1182.46         1158.07       1229.91
Efficiency [Mbps/Slices]        1.09          1.08            1.12          1.19
NIST                            Successful    Successful      Successful    Successful

* Note: Each slice contains four LUTs with 6 inputs and eight FFs.
The four SPCNG versions have the same general structure but are completely different
in their output function and slightly different in their internal state. The differences between
the versions of columns 1 and 2 on the one hand, and the versions of columns 3 and 4 on
the other hand, are in the output function used, as shown in Table 2. Indeed, versions 1
and 2 use a chaotic multiplexing technique as output function, where the sequence X(n) is
controlled by a chaotic sample Xth(n) and a threshold Tth, as follows:

X(n) = XSC(n)    if 0 < Xth(n) < Tth,
X(n) = XTIC(n)   otherwise,   (22)

with Xth(n) = XLC(n) ⊕ XSC(n) and Tth = 0.8 × 2^N.
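The chaotic multiplexing output function of Equation (22) amounts to a threshold-controlled selector; a minimal sketch (N = 32, names are ours):

```python
N = 32
TTH = int(0.8 * (1 << N))  # Tth = 0.8 x 2^N

def output_mux(xlc, xsc, xtic):
    """Chaotic multiplexing (Eq. 22): the control sample Xth = XLC xor XSC
    selects which coupled state word becomes the output X(n)."""
    xth = xlc ^ xsc
    return xsc if 0 < xth < TTH else xtic
```

With Tth = 0.8 × 2^N, roughly 80% of the output samples come from the skew-tent branch and the rest from the 3D Chebyshev branch.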
Version 2, compared to version 1, contains a LFSR in parallel with the 3D Chebyshev
map. Version 4 is the one shown in Figure 2, and version 3 is the same as version 4,
but without the LFSR. Moreover, all SPCNG versions successfully passed the 15 NIST
tests. However, versions without LFSR did not pass certain sub-tests. For the chaotic
multiplexing technique, we found only one failed sub-test out of 148 non-overlapping
template sub-tests, and for the XOR operation, we found three failed sub-tests out of
148 non-overlapping template sub-tests. Therefore, based on all results in Table 2, we
chose version 4, which is the best (in terms of resources used, throughput, and efficiency)
compared to other versions, to be used in the SCbSC system.
nature of it is highlighted by the histogram and figures of the uniform and uncorrelated
distribution of its iterates (Figures 7 and 8).
Figure 7. (a) Mapping of a sequence X(n) of 3,125,000 samples, generated by the proposed SPCNG
and the mapping of 1000 samples taken randomly from X(n) in (b).
Figure 8. Histogram.
where:
• Nc = 1000: the number of classes.
• Oi: the number of calculated samples in the ith class Ei.
• Ei = Ns/Nc: the expected number of samples of a uniform distribution.
• Ns = 3,125,000: the number of samples produced.
After that step, we obtain: χ2ex = 909.46 < χ2th ( Nc − 1; α) = 1073.64 (for Nc = 1000
and α = 0.05). The experimental value of the chi-square test is less than the theoretical one,
asserting the histogram’s uniformity. This test was performed on 100 different sequences
using 100 different secret keys, and all sequences were uniform.
This means that the sequences produced by the proposed SPCNG are indistinguishable from random integer sequences.
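The uniformity test above can be reproduced with a short script (a sketch; the class assignment by integer division assumes samples in [0, domain_size), and the threshold 1073.64 is the chi-square critical value for Nc − 1 = 999 degrees of freedom at α = 0.05):

```python
from collections import Counter

def chi2_stat(samples, nc, domain_size):
    """Pearson chi-square statistic over nc equal-width classes:
    sum over classes of (Oi - Ei)^2 / Ei, with Ei = Ns / Nc."""
    expected = len(samples) / nc
    counts = Counter(s * nc // domain_size for s in samples)
    return sum((counts.get(i, 0) - expected) ** 2 / expected
               for i in range(nc))
```

A generated sequence is declared uniform when the statistic falls below the critical value, as in the experiment reported above.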
SCbSC resources used (table fragment): LUTs 3631/6.83%.
The comparison of the hardware metrics of several chaotic and non-chaotic systems
(from eSTREAM project phase-2 focus hardware profile) is summarized in Table 5. This
comparison is difficult to interpret due to the differences in characteristics of the FPGAs
tested—particularly for the clock rate parameter. However, considering the clock rate of the
FPGA board and the efficiency achieved, we can make this comparison. Thus, the SCbSC system presents competitive hardware metrics compared to those obtained from most other chaotic and non-chaotic systems, except the Trivium cipher. However, since 2007, different types of attacks have been applied to eSTREAM ciphers, revealing some weaknesses, in particular in the Trivium cipher [34,35]. Indeed, in Trivium, AND gates are the only nonlinear elements available to thwart attacks that exploit, among other things, the linearity of the linear feedback shift registers.
Table 5. Hardware metrics usage comparison of several chaotic and non-chaotic systems.
A robust cryptosystem should also be sensitive to the secret key; that is, changing a single bit in the secret key must produce a completely different encrypted image. This sensitivity is conventionally measured by two parameters: the NPCR (number of pixel change rate) and the UACI (unified average changing intensity) [39]. In addition to these two parameters, which operate on bytes, we use the Hamming distance HD, which operates on bits (in our opinion, HD is more precise than the NPCR and UACI parameters). The expressions of these parameters are given below, with C1 and C2 being two ciphered images of the same plain image P.
NPCR = (1/(M × N)) × Σ_{i,j} D(i,j) × 100%,   (25)

with

D(i,j) = 1 if C1(i,j) ≠ C2(i,j); D(i,j) = 0 if C1(i,j) = C2(i,j),   (26)

where M and N are the width and height of C1 and C2. The NPCR measures the percentage of differing pixels between the two ciphered images.

UACI = (1/(M × N × 255)) × Σ_{i,j} |C1(i,j) − C2(i,j)| × 100%,   (27)

which measures the average intensity of the differences between the two images.

HD(C1, C2) = (1/Nb) × Σ_{i=1}^{Nb} (C1(i) ⊕ C2(i)),   (28)

where Nb is the number of bits in an image.
Table 6. Number of pixel change rate (NPCR), unified average changing intensity (UACI), and
HD values.
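Equations (25)–(28) can be computed directly on flattened 8-bit images; a minimal sketch (function name is ours):

```python
def npcr_uaci_hd(c1, c2):
    """Key-sensitivity metrics (Eqs. 25-28) between two equal-size 8-bit
    ciphered images, given as flat lists of pixel values in [0, 255].
    Returns (NPCR in %, UACI in %, HD as the fraction of differing bits)."""
    n = len(c1)
    npcr = 100.0 * sum(a != b for a, b in zip(c1, c2)) / n
    uaci = 100.0 * sum(abs(a - b) for a, b in zip(c1, c2)) / (n * 255)
    hd = sum(bin(a ^ b).count("1") for a, b in zip(c1, c2)) / (n * 8)
    return npcr, uaci, hd
```

For a secure cipher, NPCR is expected near 99.6%, UACI near 33.46%, and HD near 0.5 for two ciphertexts of the same image under keys differing in one bit.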
Figure 9. Result of Lena image. (a) Lena image, (b) histogram of Lena image, (c) encrypted Lena,
and (d) histogram of encrypted Lena.
Figure 10. Result of Pepper image. (a) Pepper image, (b) histogram of Pepper image, (c) encrypted
Pepper, and (d) histogram of encrypted Pepper.
Figure 11. Result of Baboon image. (a) Baboon image, (b) histogram of Baboon image, (c) encrypted
Baboon, and (d) histogram of encrypted Baboon.
Figure 12. Result of Barbara image. (a) Barbara image, (b) histogram of Barbara image, (c) encrypted
Barbara, and (d) histogram of encrypted Barbara.
Figure 13. Result of Boats image. (a) Boats image, (b) histogram of Boats image, (c) encrypted Boats,
and (d) histogram of encrypted Boats.
It was observed that the histograms of the ciphered images are very close to the
uniform distribution and are completely different from the plain images. We applied
the chi-square test, using Equation (23), on ciphered images to statistically confirm their
uniformity. Nc = 2^8 = 256 is the number of levels, Oi is the calculated occurrence frequency
of each gray level i ∈ [0, 255] in the histogram of the ciphered image, and Ei is the expected
occurrence frequency of the uniform distribution, calculated by Ei = image size in bytes/Nc .
The distribution of the histogram tested is uniform if it satisfies the following condition:
χ2ex < χ2th ( Nc − 1, α) = 293.24 (for Nc = 256 and α = 0.05). The results obtained for the
chi-square test, given in Table 7, indicate that the histograms of the ciphered images tested
are uniform because their experimental values are smaller than the theoretical values.
Entropy Analysis
The random behavior of the ciphered image can be quantitatively measured by entropy
information given by Shannon [40]:
H(C) = − Σ_{i=0}^{Nc−1} P(ci) × log2(P(ci)),   (29)
where H (C ) is the entropy of the encrypted image, and P(ci ) is the probability of each gray
level appearance (ci = 0, 1, . . . , 255). In the case of equal probability levels, the entropy
is maximum (= 8). The closer the experimental entropy value is to the maximum value, the more robust the encryption algorithm. We give in Table 8 the results obtained from the entropy test on the plain and encrypted images. It is clear that the obtained entropies of the ciphered images are close to the optimal value. From these results, the proposed stream cipher offers a high level of resilience.
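Equation (29) can be evaluated in a few lines (a sketch; pixels is a flat list of 8-bit gray levels):

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy of Eq. (29), in bits per pixel; maximal (= 8)
    when all 256 gray levels are equiprobable."""
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(pixels).values())
```

An image in which every gray level occurs equally often reaches the maximum of 8 bits per pixel, which is the benchmark the ciphered images approach in Table 8.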
Correlation Analysis
In an original image, each pixel is highly correlated with adjacent pixels in the horizontal, vertical, and diagonal directions. A good encryption algorithm should produce encrypted images with correlation and redundancy as low as possible (close to zero) between adjacent pixels. To assess the correlation, we proceeded as follows: first, we randomly selected 8000 pairs of two adjacent pixels from the image; then, we calculated the correlation coefficients using the following equation:
ρxy = Cov(x, y) / (√D(x) × √D(y)),   (30)

where:

Cov(x, y) = (1/N) × Σ_{i=1}^{N} [xi − E(x)][yi − E(y)],   (31)

E(x) = (1/N) × Σ_{i=1}^{N} xi,   (32)

D(x) = (1/N) × Σ_{i=1}^{N} [xi − E(x)]²,   (33)
where x and y are the grayscale values of two adjacent pixels in the image. The obtained
results are shown in Table 9.
Table 9. Correlation coefficients of two adjacent pixels in the plain and ciphered images.
It appears from Table 9 that the correlation coefficients for the plain images are close to 1, which shows that their pixels are highly correlated, whereas for the encrypted images the correlation coefficients are close to 0, which shows that adjacent pixels are uncorrelated. Therefore, there is no similarity between the plain and encrypted images, demonstrating the very good confusion achieved by the proposed SCbSC.
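The correlation estimate of Equations (30)–(33) can be sketched as follows (x and y hold the gray levels of the randomly selected adjacent-pixel pairs; names are ours):

```python
import math

def corr_coeff(x, y):
    """Correlation coefficient rho_xy of Eqs. (30)-(33) between two
    equal-length sequences of gray-level values."""
    n = len(x)
    ex, ey = sum(x) / n, sum(y) / n                        # E(x), E(y)
    cov = sum((a - ex) * (b - ey) for a, b in zip(x, y)) / n
    dx = sum((a - ex) ** 2 for a in x) / n                 # D(x)
    dy = sum((b - ey) ** 2 for b in y) / n                 # D(y)
    return cov / math.sqrt(dx * dy)
```

Values near ±1 indicate strong correlation (plain images); values near 0 indicate the decorrelation expected of a well-ciphered image.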
According to all these results of the histogram, entropy, and correlation, the proposed
stream cipher presents a good ability to resist statistical attacks.
5. Conclusions
In this paper, we designed and implemented, on a Xilinx PYNQ-Z2 FPGA hardware platform using VHDL, a novel chaos-based stream cipher (SCbSC) built on a proposed secure pseudo-chaotic number generator (SPCNG). The proposed chaotic system includes countermeasures against side-channel attacks (SCAs) and uses a weak coupling matrix, which prevents divide-and-conquer attacks on the initial vector (IV). Next, we analyzed the cryptographic properties of the proposed SPCNG and evaluated its hardware metrics. The results obtained demonstrate, on the one hand, the high degree of security and, on the other hand, the good hardware metrics achieved by the SPCNG. After that, we realized the SCbSC system and asserted its resilience against cryptanalytic attacks. Further, we evaluated its hardware metrics and compared them to those of some chaotic and non-chaotic systems. All the results obtained indicate that the proposed SCbSC is a good candidate for encrypting private data. Our future work will focus on designing a chaos-based block cipher to secure IoT data and on checking hardware implementations using non-volatile FPGA technology, which reduces the side-channel attack possibilities in real-field applications.
Author Contributions: Writing—original draft, F.D.; Writing—review & editing, F.D. and S.E.A.;
Validation, W.E.H.Y., M.M. and R.L. All authors have read and agreed to the published version of the
manuscript.
Funding: This research received no external funding.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Lorenz, E.N.; Haman, K. The essence of chaos. Pure Appl. Geophys. 1996, 147, 598–599.
2. Wang, X.-Y.; Zhang, J.-J.; Zhang, F.-C.; Cao, G.-H. New chaotical image encryption algorithm based on Fisher–Yatess scrambling
and DNA coding. Chin. Phys. B 2019, 28, 040504. [CrossRef]
3. Belazi, A.; Abd El-Latif, A.A.; Belghith, S. A novel image encryption scheme based on substitution-permutation network and
chaos. Signal Process. 2016, 128, 155–170. [CrossRef]
4. Amigo, J.; Kocarev, L.; Szczepanski, J. Theory and practice of chaotic cryptography. Phys. Lett. A 2007, 366, 211–216. [CrossRef]
5. Kocarev, L. Chaos-based cryptography: A brief overview. IEEE Circuits Syst. Mag. 2001, 1, 6–21. [CrossRef]
6. Acho, L. A chaotic secure communication system design based on iterative learning control theory. Appl. Sci. 2016, 6, 311.
[CrossRef]
7. Datcu, O.; Macovei, C.; Hobincu, R. Chaos Based Cryptographic Pseudo-Random Number Generator Template with Dynamic
State Change. Appl. Sci. 2020, 10, 451. [CrossRef]
8. Abdoun, N.; El Assad, S.; Manh Hoang, T.; Deforges, O.; Assaf, R.; Khalil, M. Designing Two Secure Keyed Hash Functions Based
on Sponge Construction and the Chaotic Neural Network. Entropy 2020, 22, 1012. [CrossRef] [PubMed]
9. Battikh, D.; El Assad, S.; Hoang, T.M.; Bakhache, B.; Deforges, O.; Khalil, M. Comparative Study of Three Steganographic Methods
Using a Chaotic System and Their Universal Steganalysis Based on Three Feature Vectors. Entropy 2019, 21, 748. [CrossRef]
10. Liao, T.-L.; Wan, P.-Y.; Yan, J.-J. Design of synchronized large-scale chaos random number generators and its application to secure
communication. Appl. Sci. 2019, 9, 185. [CrossRef]
11. Pareek, N.K.; Patidar, V.; Sud, K.K. Image encryption using chaotic logistic map. Image Vis. Comput. 2006, 24, 926–934. [CrossRef]
12. Kocarev, L.; Jakimoski, G. Logistic map as a block encryption algorithm. Phys. Lett. A 2001, 289, 199–206. [CrossRef]
13. François, M.; Grosges, T.; Barchiesi, D.; Erra, R. Pseudo-random number generator based on mixing of three chaotic maps.
Commun. Nonlinear Sci. Numer. Simul. 2014, 19, 887–895. [CrossRef]
14. Wang, X.-Y.; Qin, X. A new pseudo-random number generator based on CML and chaotic iteration. Nonlinear Dyn.
2012, 70, 1589–1592. [CrossRef]
15. Taha, M.A.; Assad, S.E.; Queudet, A.; Deforges, O. Design and efficient implementation of a chaos-based stream cipher. Int. J.
Internet Technol. Secur. Trans. 2017, 7, 89–114. [CrossRef]
16. Jallouli, O.; El Assad, S.; Chetto, M.; Lozi, R. Design and analysis of two stream ciphers based on chaotic coupling and multiplexing
techniques. Multimed. Tools Appl. 2018, 77, 13391–13417. [CrossRef]
17. Lozi, R. Emergence of randomness from chaos. Int. J. Bifurc. Chaos 2012, 22, 1250021. [CrossRef]
18. Ding, L.; Liu, C.; Zhang, Y.; Ding, Q. A new lightweight stream cipher based on chaos. Symmetry 2019, 11, 853. [CrossRef]
19. Abdelfatah, R.I.; Nasr, M.E.; Alsharqawy, M.A. Encryption for multimedia based on chaotic map: Several scenarios.
Multimed. Tools Appl. 2020. [CrossRef]
20. Gautier, G.; Le Glatin, M.; El Assad, S.; Hamidouche, W.; Déforges, O.; Guilley, S.; Facon, A. Hardware Implementation of
Lightweight Chaos-Based Stream Cipher. In Proceedings of International Conference on Cyber-Technologies and Cyber-Systems,
Porto, Portugal, 22 September 2019; 5p.
21. Tanougast, C. Hardware implementation of chaos based cipher: Design of embedded systems for security applications. In
Chaos-Based Cryptography; Springer: Berlin/Heidelberg, Germany, 2011; pp. 297–330.
22. Koyuncu, İ.; Tuna, M.; Pehlivan, İ.; Fidan, C.B.; Alçın, M. Design, FPGA implementation and statistical analysis of chaos-ring
based dual entropy core true random number generator. Analog Integr. Circuits Signal Process. 2020, 102, 445–456. [CrossRef]
23. Nguyen, R. Penetration Testing on a C-Software Implementation aff1709rns006-c; Internal Report; Secure-IC SAS: Cesson-Sévigné,
France, 2018.
24. Nguyen, R.; Facon, A.; Guilley, S.; Gautier, G.; El Assad, S. Speed-up of SCA Attacks on 32-bit Multiplications. In Proceedings of
the International Conference on Codes, Cryptology, and Information Security, Rabat, Morocco, 22–24 April 2019; pp. 31–39.
25. Peng, J.; You, M.; Yang, Z.; Jin, S. Research on a block encryption cipher based on chaotic dynamical system. In Proceedings of the
Third International Conference on Natural Computation (ICNC 2007), Haikou, China, 24–27 August 2007; pp. 744–748.
26. Masuda, N.; Jakimoski, G.; Aihara, K.; Kocarev, L. Chaotic block ciphers: From theory to practical algorithms. IEEE Trans. Circuits
Syst. I Regul. Pap. 2006, 53, 1341–1352. [CrossRef]
27. El Assad, S. Chaos-Based Cryptography, Internal Report; University of Nantes: Nantes, France, 2019.
28. Jallouli, O. Chaos-Based Security under Real-Time and Energy Constraints for the Internet of Things. Ph.D. Thesis, University of Nantes, Nantes, France, 2017.
29. Blackman, D.; Vigna, S. Scrambled linear pseudorandom number generators. arXiv 2018, arXiv:1805.01407.
30. Vigna, S. Further scramblings of Marsaglia’s xorshift generators. J. Comput. Appl. Math. 2017, 315, 175–181. [CrossRef]
31. Coron, J.-S.; Rondepierre, F.; Zeitoun, R. High order masking of look-up tables with common shares. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2018, 40–72. [CrossRef]
32. Coron, J.-S.; Roy, A.; Vivek, S. Fast evaluation of polynomials over binary finite fields and application to side-channel counter-
measures. In International Workshop on Cryptographic Hardware and Embedded Systems; Springer: Berlin/Heidelberg, Germany, 2014;
pp. 170–187.
33. Rukhin, A.; Soto, J.; Nechvatal, J.; Smid, M.; Barker, E. A Statistical Test Suite for Random and Pseudorandom Number Generators for
Cryptographic Applications; Booz-allen and Hamilton Inc.: McLean, VA, USA, 2001.
34. Manifavas, C.; Hatzivasilis, G.; Fysarakis, K.; Papaefstathiou, Y. A survey of lightweight stream ciphers for embedded systems.
Secur. Commun. Networks 2016, 9, 1226–1246. [CrossRef]
35. Maximov, A.; Biryukov, A. Two trivial attacks on Trivium. In International Workshop on Selected Areas in Cryptography; Springer:
Berlin/Heidelberg, Germany, 2007; pp. 36–55.
36. Gaj, K.; Southern, G.; Bachimanchi, R. Comparison of hardware performance of selected Phase II eSTREAM candidates.
In Proceedings of the State of the Art of Stream Ciphers Workshop (SASC 2007), eSTREAM, ECRYPT Stream Cipher Project,
Report, Lausanne, Switzerland, 31 January–1 February 2007.
37. Bulens, P.; Kalach, K.; Standaert, F.-X.; Quisquater, J.-J. FPGA implementations of eSTREAM phase-2 focus candidates with
hardware profile. In Proceedings of the State of the Art of Stream Ciphers Workshop (SASC 2007), eSTREAM, ECRYPT Stream
Cipher Project, Report, Lausanne, Switzerland, 31 January–1 February 2007.
38. Schneier, B. Applied Cryptography: Protocols, Algorithms, and Source Code in C; John Wiley & Sons: Hoboken, NJ, USA, 2007.
39. Wu, Y.; Noonan, J.P.; Agaian, S. NPCR and UACI randomness tests for image encryption. CYber J. Multidiscip. J. Sci. Technol. Sel.
Areas Telecommun. 2011, 1, 31–38.
40. Wu, Y.; Zhou, Y.; Saveriades, G.; Agaian, S.; Noonan, J.P.; Natarajan, P. Local Shannon entropy measure with statistical tests for
image randomness. Inf. Sci. 2013, 222, 323–342. [CrossRef]
Bit Independence Criterion Extended
to Stream Ciphers
Evaristo José Madarro-Capó 1 , Carlos Miguel Legón-Pérez 1 , Omar Rojas 2 ,
Guillermo Sosa-Gómez 2, * and Raisa Socorro-Llanes 3
1 Institute of Cryptography, University of Havana, Havana 10400, Cuba; [email protected] (E.J.M.-C.);
[email protected] (C.M.L.-P.)
2 Facultad de Ciencias Económicas y Empresariales, Universidad Panamericana, Álvaro del Portillo 49,
Zapopan, Jalisco 45010, Mexico; [email protected]
3 Faculty of Informatics, Technological University of Havana (UTH), CUJAE, Havana 19390, Cuba;
[email protected]
* Correspondence: [email protected]; Tel.: +52-3313682200
Received: 30 September 2020; Accepted: 26 October 2020; Published: 29 October 2020
Abstract: The bit independence criterion was proposed to evaluate the security of the S-boxes used
in block ciphers. This paper proposes an algorithm that extends this criterion to evaluate the degree
of independence between the bits of inputs and outputs of the stream ciphers. The effectiveness of
the algorithm is experimentally confirmed in two scenarios: random outputs independent of the
input, in which it does not detect dependence, and in the RC4 cipher, where it detects significant
dependencies related to some known weaknesses. The complexity of the algorithm is estimated based
on the number of inputs l, and the dimensions, n and m, of the inputs and outputs, respectively.
Keywords: bit independence criterion; bit independence; RC4; stream cipher; complexity
1. Introduction
Randomness is an essential component in the security of cryptographic algorithms [1,2].
In particular, stream ciphers are composed of pseudo-random number generators and base their
security on the statistical characteristics of these generators [1]. Several stream ciphers can be found in
the literature whose description is based on different methods for the generation of pseudo-random
numbers [3].
In practice, to determine if a generator is suitable to be used for cryptographic purposes,
several statistical tests are usually applied on it to measure the randomness of its outputs [4–6].
There are numerous statistical tests for measuring the randomness of the outputs of a pseudo-random
number generator, among them those grouped into the NIST [7], Diehard [8], TestU01 [9],
and Knuth [10] batteries, among others [2]. However, despite the large number of statistical tests in
these batteries, none of them measure the correlation between the inputs and outputs of the stream
cipher; they only measure the randomness of the outputs, which is a necessary, but not sufficient,
condition to consider the generator for use in cryptography.
To consider a stream cipher secure, there must be no statistically significant correlation between
the structure of its inputs and outputs. If “patterns” depending on the structure of the cipher input are
generated in the output of stream ciphers, this could provide information about the input used. In the
literature, there are reports of cryptanalysis based on this type of weakness [11,12]. In this way, it is
essential to avoid the previous weakness and to have methods to detect it in the design and evaluation
stage of the algorithm; in particular, it is necessary to have statistical tests that are capable of detecting
the existence of significant statistical dependencies between the inputs and outputs of stream ciphers.
In general, very few statistical tests have been reported for detecting statistical dependencies
between the inputs and outputs of a stream cipher. Therefore, the design of statistical tests that
evaluate ciphers in this sense is highly important in cryptography.
The strict avalanche criterion (SAC) and the bit independence criterion (BIC) were proposed
in [13] to evaluate the strength of the S-boxes used in block ciphers [14]. These two criteria measure
different characteristics of the effect that changing an input bit has on the output bits: while the SAC
verifies uniformity in the distribution of each output bit, the BIC measures the degree of independence
between the output bits [15]. The SAC has been extended to be applied to stream ciphers [16–22].
In [22], the RC4 stream cipher [23] was evaluated through the SAC and the existence of statistical
dependence between the input bits and outputs of the RC4 was detected for inputs of large size.
This confirms the results obtained in [24–27], where the existence of related inputs in RC4 was reported.
The idea developed in [22] was to determine the behavior of the distribution of the bits in the output by
changing any bit in the input. In the design of stream ciphers, the distribution behavior of the output
elements must be uniformly distributed, regardless of the bit that is being changed at the input [5].
Otherwise, the outputs could provide information on the input bits, which constitutes a weakness
that, in the worst-case scenario, could lead to an attack. A discussion of attacks on stream ciphers
can be found in [28]. However, the BIC has not, to the best of our knowledge, been applied to assess
the degree of statistical independence between the output bits of stream ciphers when a bit of the
input is changed. In this paper, we propose an algorithm that extends this criterion to evaluate the
degree of independence between the input bits and the outputs of the stream ciphers. The effectiveness
of the algorithm was experimentally confirmed in two scenarios: random outputs independent of
the input, in which it does not detect dependence, and in the RC4 cipher, where it detects significant
dependencies related to some known weaknesses [22,24–26].
2. Preliminaries
A stream cipher can be viewed as a function f : F_2^n → F_2^m that transforms a binary input
vector X = (x_1, ..., x_n) of n bits into a binary output vector Y = f(X) = (y_1, ..., y_m) of m bits,
where n, m ∈ N. In [13], the difference between the outputs Y = f(X) and Y^i = f(X^i), corresponding
to the inputs X and X^i, is called the avalanche vector and denoted by V^i = Y ⊕ Y^i, where X^i = X ⊕ e_i,
with 1 ≤ i ≤ n and e_i the unit vector with 1 in the i-th component. In V^i = Y ⊕ Y^i = (v^i_1, v^i_2, ..., v^i_m),
each v^i_j ∈ F_2, with 1 ≤ j ≤ m, is called an avalanche variable (see Table A1, Appendix A).
Given the set D = {X_1, ..., X_l} of l inputs X_r of n bits, with 1 ≤ r ≤ l, a binary matrix H^i is
constructed for each e_i, 1 ≤ i ≤ n. To construct the matrix H^i, the avalanche vectors V^i_r = Y_r ⊕ Y^i_r =
(v^i_r1, v^i_r2, ..., v^i_rm) are calculated, with Y_r = f(X_r) and Y^i_r = f(X_r ⊕ e_i). It is said that f satisfies the BIC if,
by changing any bit i in the l inputs X_r ∈ D, every pair of avalanche variables v^i_·j and
v^i_·k is independent, with 1 ≤ j, k ≤ m. The matrix H^i will be called the SAC matrix associated with
the vector e_i and is shown in Table 1.
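As a concrete illustration, the construction of an SAC matrix H^i can be sketched as follows. This is a minimal sketch in Python, not the authors' implementation: the toy cipher f, the encoding of bit vectors as integers, and the convention that bit 1 is the most significant bit are all illustrative assumptions.

```python
def sac_matrix(f, inputs, i, n):
    """Build the SAC matrix H^i: one row per input X_r, where each row is the
    avalanche vector V_r^i = f(X_r) XOR f(X_r XOR e_i), encoded as an integer."""
    e_i = 1 << (n - i)  # unit vector e_i; assumption: bit 1 is the most significant bit
    return [f(x) ^ f(x ^ e_i) for x in inputs]

# Toy example: for the identity map f(x) = x, every avalanche vector equals e_i,
# so each column v^i_.j is constant and the avalanche variables are trivially dependent.
rows = sac_matrix(lambda x: x, [0b0000, 0b1010, 0b0111], 1, 4)
```

A real evaluation would replace the identity map by the stream cipher under test and draw the l inputs at random.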
To measure the degree of independence between the pairs of avalanche variables,
Webster and Tavares [13] used Pearson's correlation coefficient. In [29], the maximum value of these
coefficients was used as a test statistic, denoted here by

BIC_Pearson(f) = max_{1 ≤ i ≤ n, 1 ≤ j < k ≤ m} |ρ(v^i_·j, v^i_·k)|.

If all pairs of avalanche variables v^i_·j and v^i_·k are independent, then ideally, BIC_Pearson(f) = 0. Therefore,
in practice, when BIC_Pearson(f) ≈ 0, it is concluded that f satisfies the BIC.
Appl. Sci. 2020, 10, 7668
Table 1. The SAC matrix H^i associated with the vector e_i.

V^i_1 :  v^i_11  v^i_12  ...  v^i_1j  ...  v^i_1k  ...  v^i_1m
  ...
V^i_r :  v^i_r1  v^i_r2  ...  v^i_rj  ...  v^i_rk  ...  v^i_rm
  ...
V^i_l :  v^i_l1  v^i_l2  ...  v^i_lj  ...  v^i_lk  ...  v^i_lm
The RC4 has two main components: the key scheduling, and the pseudo-random number
generator. The key scheduling generates an internal random permutation S of values from 0 to 255,
from an initial permutation, a (random) key K of l bytes, and two pointers i and j. The maximal
key length is l = 256 bytes (see Algorithm 1).
The main part of the algorithm is the pseudo-random number generator that produces one-byte
output in each step. As usual, for stream ciphers, the encryption will be an XOR of the pseudo-random
sequence with the message (see Algorithm 2).
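Since Algorithms 1 and 2 are only referenced here, a compact Python rendering of the two components may help; this is a standard textbook transcription of RC4 [23], not the authors' exact pseudocode.

```python
def ksa(key):
    """Key scheduling: produce the internal permutation S of 0..255 from the key."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]  # swap
    return S

def prga(S, length):
    """Pseudo-random generation: output one keystream byte per step."""
    S = S[:]  # work on a copy so the scheduled state can be reused
    i = j = 0
    out = []
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return out

# Encryption is an XOR of the keystream with the message; for the key b"Key",
# the keystream begins eb 9f 77 81 (the widely published RC4 test vector).
keystream = prga(ksa(b"Key"), 4)
```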
The weaknesses found can be classified according to the theme they exploit, some of which are:
1. Weak keys.
2. Key recovery from the state.
3. Key recovery from the key-stream.
4. State recovery attacks.
5. Biases and distinguishers.
While the fifth point is the most studied subject in the literature, the third point is the most serious
attack mounted on RC4. The theme exploited in this paper has been deeply studied; in particular,
Grosul and Wallach [24] demonstrated that certain related key-pairs generate similar output bytes in
RC4. Later, Matsui [25] reported colliding key pairs for RC4 for the first time, and then stronger key
collisions were found in [26]. For the RC4 stream cipher, several modifications have been proposed;
while some modified only certain components or some operations, others completely changed the
algorithm (see [42]). It is important to note that even RC4 variants have had a lot of attention in the
scientific community (see [43]).
Step 2. Evaluate the independence between the avalanche variables v^i_·j and v^i_·k.
The following sections describe each of these steps and end with the proposal of an algorithm to
evaluate the BIC in stream ciphers.
3.2. Test of Independence between Two Avalanche Variables v^i_·j and v^i_·k
Second difference. In [13], Pearson’s correlation coefficient ρ was used to measure the degree of
independence between the pairs of avalanche variables. The use of such a coefficient in [13,29] has two
main disadvantages: the first one is that it only detects linear correlations, and the second one is that
the critical region for the rejection of the null hypothesis is not explicitly defined, i.e., a threshold is
not defined below which BIC_Pearson(f) ≈ 0 is decided. This can lead to imprecise decisions
when dealing with small coefficient values. In order to solve the first aforementioned
disadvantage, mutual information can be applied to measure the degree of independence between
pairs of avalanche variables [44], but in this case, it is important to determine which estimator to use,
since there are no estimators of unbiased entropy of minimal variance; the second disadvantage can
be solved by defining the critical region using a transformation of the correlation coefficient of the
type t = √((N − 2)ρ² / (1 − ρ²)), where t follows a Student's t distribution with N − 2 degrees
of freedom [45].
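The transformation above is straightforward to compute; a small sketch (the example values of ρ and N are illustrative, and the critical value for the decision would come from t-distribution tables with N − 2 degrees of freedom):

```python
from math import sqrt

def t_from_correlation(rho, N):
    """Student's t transformation of a Pearson correlation coefficient:
    t = sqrt((N - 2) * rho^2 / (1 - rho^2)), with N - 2 degrees of freedom."""
    return sqrt((N - 2) * rho**2 / (1 - rho**2))

# Example: rho = 0.5 over N = 27 samples gives t = sqrt(25 * 0.25 / 0.75) ~= 2.887
t = t_from_correlation(0.5, 27)
```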
Another approach is that when v^i_·j and v^i_·k are independent, then s^i_jk = v^i_·j ⊕ v^i_·k is balanced [46].
In this work, independence will be evaluated by measuring the adjustment of HW(s^i_jk) to the binomial
distribution B(l, 1/2), where HW(·) is the Hamming weight. This allows setting a threshold for the
decision criterion on independence between v^i_·j and v^i_·k.
Since H^i is a binary matrix, the adjustment to the binomial distribution will be measured by the
χ²-test with 1 degree of freedom, with the test hypotheses given by:

H_0: HW(s^i_jk) ∼ B(l, 1/2),
H_1: HW(s^i_jk) ≁ B(l, 1/2).
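A sketch of this per-pair test in Python (columns given as 0/1 lists; the χ² critical value 6.635 corresponds to a significance of 0.01 with 1 degree of freedom, and the helper name is illustrative):

```python
def reject_pair(col_j, col_k, chi2_crit=6.635):
    """Chi-square test of HW(s) against B(l, 1/2), where s = col_j XOR col_k.
    With the two cells (ones and zeros of s), the statistic reduces to
    (2*HW - l)^2 / l. Returns 1 if H0 (independence) is rejected, 0 otherwise."""
    l = len(col_j)
    hw = sum(a ^ b for a, b in zip(col_j, col_k))  # Hamming weight of s
    chi2 = (2 * hw - l) ** 2 / l
    return 1 if chi2 > chi2_crit else 0

# Identical columns give a totally unbalanced s (HW = 0) and are rejected;
# columns whose XOR is perfectly balanced are not.
dep = reject_pair([0, 1] * 500, [0, 1] * 500)        # HW(s) = 0   -> reject H0
ind = reject_pair([0, 1] * 500, [0, 1, 1, 0] * 250)  # HW(s) = l/2 -> keep H0
```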
where

T^i(m, α_1) = T^i = Σ_{j=1}^{m−1} Σ_{k>j}^{m} t(v^i_·j, v^i_·k, α_1),   (5)

and

t(v^i_·j, v^i_·k, α_1) = 1 if H_0 is rejected for v^i_·j and v^i_·k with significance α_1,
t(v^i_·j, v^i_·k, α_1) = 0 otherwise.   (6)

The variable T^i counts the number of rejections of the null hypothesis H_0 in the matrix H^i.
Expected number of rejections of H_0. In each of the n SAC matrices H^i, C_2^m pairs of columns are
formed; thus, the number of rejections T satisfies

0 ≤ T ≤ n · C_2^m.   (7)
When T = 0, we have the ideal case for compliance with the BIC, since all the pairs of columns
are independent, while as T grows, the number of non-independent column pairs increases.
Under the hypothesis test above, with a significance level α1 , the expected number of rejections of
H0 is:
E(T^i | H_0) = α_1 · C_2^m,   (8)

for each matrix H^i. In total, among the n SAC matrices,

E(T | H_0) = α_1 · n · C_2^m   (9)

rejections of H_0 are expected.
The random variable
T = Σ_{i=1}^{n} Σ_{j=1}^{m−1} Σ_{k>j}^{m} t(v^i_·j, v^i_·k, α_1),   (10)
follows a binomial distribution B(n · C_2^m, α_1). Taking into account that generally α_1 < 0.1,
this distribution can be approximated, in this case, by the Poisson distribution with parameter
λ = α_1 · n · C_2^m. Since λ is large, due to large values of n · C_2^m, the Poisson distribution can in
turn be approximated by the Normal distribution with mean and variance

E(T | H_0) = σ²(T | H_0) = λ = α_1 · n · C_2^m.   (11)

Thus,

Z_T = (T − E(T | H_0)) / √(σ²(T | H_0)) ∼ N(0, 1).   (12)
Decision criteria. To compare the Z_T value with the N(0, 1) distribution, a significance level α_2
is selected. Then, it is decided that f does not satisfy the BIC, with significance level α_2, if Z_T > Z_{1−α_2}.
It can be seen that if 0 ≤ T ≤ E(T | H_0), then Z_T decreases with respect to Z_{1−α_2} and
Z_T > Z_{1−α_2} is not satisfied, so the BIC is fulfilled. On the other hand, if T ≫ E(T | H_0), then Z_T
grows as T increases, so Z_T > Z_{1−α_2} is satisfied and BIC compliance is rejected.
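The standardization and decision rule can be sketched as follows; the function name is illustrative, the quantile Z_{1−α_2} is computed with Python's statistics.NormalDist, and the variance uses the binomial form α_1 · n · C_2^m · (1 − α_1) discussed below.

```python
from math import comb, sqrt
from statistics import NormalDist

def bic_decision(T, n, m, alpha1, alpha2):
    """Standardize the rejection count T as in Equation (12) and test Z_T > Z_{1-alpha2}."""
    d = n * comb(m, 2)               # number of column pairs over the n SAC matrices
    mean = alpha1 * d                # E(T | H0)
    var = alpha1 * d * (1 - alpha1)  # sigma^2(T | H0), binomial form
    z = (T - mean) / sqrt(var)
    z_crit = NormalDist().inv_cdf(1 - alpha2)
    return z, z > z_crit             # True -> BIC compliance is rejected

# With n = 64, m = 32 and alpha1 = 0.01, E(T | H0) = 0.01 * 64 * 496 = 317.44,
# so an observed T near that value keeps H0.
z, rejected = bic_decision(T=317, n=64, m=32, alpha1=0.01, alpha2=0.001)
```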
Normality of the test statistic T. In the expression of T there are n · C_2^m Bernoulli variables
t(v^i_·j, v^i_·k, α_1), whose distributions under H_0 and H_1 are different:
Under H_0, all variables t(v^i_·j, v^i_·k, α_1) are independent, identically distributed, and take the value
of 1 with probability p_ijk = P(t(v^i_·j, v^i_·k, α_1) = 1) = α_1, so T exactly follows a binomial distribution
B(n · C_2^m, α_1). Although generally α_1 ≤ 0.1, the binomial distribution B(n · C_2^m, α_1) can be
approximated by the Normal distribution, with mean E(T | H_0) = α_1 · n · C_2^m and variance
σ²(T | H_0) = α_1 · n · C_2^m (1 − α_1), taking into account that n · C_2^m grows very quickly with m.
Under H_1, the variables t(v^i_·j, v^i_·k, α_1) that appear in the expression of T are not identically
distributed, since the rejection of the BIC means that there are several matrices H^i for which the
hypothesis H_0 of independence between v^i_·j and v^i_·k is rejected. In this case, p_ijk ≠ α_1 and may
differ as i, j, k vary. For this reason, T does not directly follow a binomial distribution.
However, it is still possible to approximate the distribution of T by the Normal distribution. For this it
is sufficient to calculate the mean
P_{n·C_2^m} = (1 / (n · C_2^m)) Σ_{i=1}^{n} Σ_{j=1}^{m−1} Σ_{k>j}^{m} p_ijk,   (13)
between the probabilities of all the variables t(vi· j , vi·k , α1 ) and the distribution of T can be approximated
by the binomial distribution B(n · C2m , Pn·C2m ). This distribution, in turn, can be approximated by the
Normal distribution, taking into account high values of n · C2m . The precision of this approximation
depends on the difference between the probabilities pijk involved in Pn·C2m , therefore the variance value
between these probabilities can be a measure of the quality of the approximation.
When comparing the distribution of T under H_0 and H_1, similarities and differences are observed.
They are similar in that, in both cases, T follows a Normal distribution, but there are two differences:
the first and most important is between the expected values of both distributions (it will be
higher under H_1), and the second refers to the level of adjustment to this distribution (it may be lower
under H_1). In the rest of this work, the proposed method to evaluate the BIC in stream ciphers will be
called the BIC test.
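Putting the pieces together, the BIC test can be sketched end to end. Everything below is an illustrative assumption rather than the authors' Algorithm 3: the toy hash-based cipher, the small parameters (well below the bounds derived in the next section), and the helper names.

```python
import hashlib
from math import comb, sqrt

def bic_test(f, n, m, l, alpha1=0.01, chi2_crit=6.635):
    """Count, over all n SAC matrices, the column pairs whose XOR fails the
    chi-square balance test, and return the standardized count Z_T."""
    T = 0
    inputs = range(l)  # toy input set D; a real test would draw l random n-bit inputs
    for i in range(n):
        e_i = 1 << i
        rows = [f(x) ^ f(x ^ e_i) for x in inputs]            # avalanche vectors
        cols = [[(r >> j) & 1 for r in rows] for j in range(m)]
        for j in range(m - 1):
            for k in range(j + 1, m):
                hw = sum(a ^ b for a, b in zip(cols[j], cols[k]))
                if (2 * hw - l) ** 2 / l > chi2_crit:         # H0 rejected for this pair
                    T += 1
    d = n * comb(m, 2)
    return (T - alpha1 * d) / sqrt(alpha1 * d * (1 - alpha1))

# Toy "cipher": truncate a SHA-256 hash of the input to m bits.
def toy_f(x):
    return int.from_bytes(hashlib.sha256(x.to_bytes(2, "big")).digest()[:2], "big")

z = bic_test(toy_f, n=8, m=16, l=512)
```

For a good random function, Z_T should fall near the center of N(0, 1), while a cipher with input-output dependencies yields large positive values.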
it causes an increase in the number of operations. In practice, the idea is to obtain a good
cost-effectiveness trade-off by using a value of l that maintains the fit while keeping the number of
operations practical.
Using the confidence interval for proportions [47], it is possible to obtain a value of l0 , such that
prefixing l > l0 achieves a good fit. This confidence interval is given by
P( −Z_{α_1/2} < (p̂ − p) / √(pq/l) < Z_{α_1/2} ) = 1 − α_1.   (14)
Example 1. Calculation of the lower bound l_0 for l. A value l_0 is needed from which, with high
probability, q̂ ≈ p̂ ≈ 0.5 is satisfied. Then, substituting a significance level α_1 = 0.01 and a deviation e
whose absolute value satisfies |e| = |p̂ − 0.5| ≤ 0.03, we get

l_0 = (Z²_{0.005} · 0.25) / 0.03² ≈ 2189.

In this way, for the significance level α_1 and the deviation e selected, it is concluded that l must be
chosen such that l > l_0 = 2189.
Example 2. Convergence of p̂ and deviation e. Table 2 shows the behavior of the deviation e observed
for several l, l > l_0 = 2189, with n = 64 and m = 32. It can be seen how, for most of the estimated e,
the imposed condition |e| ≤ 0.03 is met.

Table 2. Mean value p̂ and deviation |e| observed for several values of l.
Selection of n, m under the null hypothesis H0 . The number n of inputs and the number m of outputs
influence the sample size for the calculation of the number T of rejections of H0 . In general, we will
have d = n · C2m pairs of columns to check and it is expected, with probability α1 , that λ = α1 · d pairs
of columns will be rejected.
Let λ_0 = α_1 · d_0 be some default value of λ from which the distribution of T can be approximated
by N(0, 1). It is necessary to select n and m such that d > d_0 is satisfied and a value of λ such that
λ > λ_0 is obtained. It is advisable to select a high value of λ_0 that avoids the use of corrections and
provides a good fit.
It is known that increasing λ_0 provides better precision in the approximation of the Poisson
distribution by the Normal distribution. To obtain d_0, we can use the confidence interval for
proportions [47], this time with a one-tailed Normal approximation. So, we have
P( (p̂ − p) / √(pq/d) < Z_{α_2} ) = 1 − α_2.   (16)
Example 3. Calculation of the lower bound d_0 for d. Substituting p = 0.01, q = 0.99, with a significance
level α_2 = 0.001 and a deviation |e| of 0.003, we obtain

d_0 = (Z²_{0.001} · 0.01 · 0.99) / 0.003² ≈ 10,503.

Then, λ_0 = α_1 · d_0 ≈ 0.01 · 10,503 ≈ 105; therefore, for the values of α_1 and e chosen, it is enough to
select values of n and m such that λ > λ_0 ≈ 105. In Table 3, for α_1 = 0.01, the values of n and m from
which λ > λ_0 = 105 are highlighted in italics.
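The bound in Example 3 can be reproduced with Python's standard-library statistics.NormalDist; small differences in the last digits come from how Z_{0.001} is rounded.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(1 - 0.001)  # Z_{0.001} ~= 3.09
p, q, e = 0.01, 0.99, 0.003
d0 = z**2 * p * q / e**2             # ~= 10,503 pairs of columns
lam0 = 0.01 * d0                     # lambda_0 = alpha_1 * d0 ~= 105
```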
Table 3. λ values for multiple values of n and m with α1 = 0.01. Values of n and m are highlighted in
italics from which λ > λ0 = 105.
n \ m      8       16       32        64
8        2.24     9.6     39.68    161.28
16       4.48    19.2     79.36    322.56
32       8.96    38.4    158.72    645.12
64      17.92    76.8    317.44   1290.24
128     35.84   153.6    634.88   2580.48
256     71.68   307.2   1269.76   5160.96
512    143.36   614.4   2539.52  10,321.92
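The entries of Table 3 are simply λ = α_1 · n · C_2^m; a one-liner reproduces them (the function name is illustrative):

```python
from math import comb

def lam(n, m, alpha1=0.01):
    """Expected number of H0 rejections among the n * C(m, 2) column pairs."""
    return alpha1 * n * comb(m, 2)

# Reproduces Table 3: e.g. lam(8, 8) = 2.24 and lam(64, 64) = 1290.24
```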
To select n, m and l, the trade-off between reducing computational cost and maximizing
effectiveness can be taken into account. However, it is very important to be careful when selecting
which values to use, since minimizing computational cost could limit the effectiveness of the BIC
method and overestimate the quality of the stream cipher. It is advised to prioritize
increasing effectiveness.
The values l ∈ {4096, 8192, 16,384, 32,768} will be varied, in order to verify the influence of the
parameters n, m, and l on the fit of Z_T. The values of n and m with the lowest
computational cost were selected, that is, the values of n and m that provide the lowest values of λ
such that λ > λ_0 = 105.
The values of n and m are taken as powers of two, since current ciphers work with inputs
and outputs of these sizes, and l as well, to speed up the computation of the BIC method in terms of
execution time. However, it is important to note that the BIC method can be used
for any values of n, m, and l, as long as the requirements outlined in the previous section are met.
Normality of ZT in H i random matrices. Figure 1 corresponds to the observed distribution of 1000
values of ZT , for each pair of parameters n and m, and each value of l.
Figure 1. Observed distribution of 1000 values of Z_T in random H^i matrices for various values of n, m,
and l: (a) n = m = 32; (b) n = m = 64.
Table 4. Mean E(Z_T) of the observed distribution of Z_T in random H^i matrices.

(n, m)     l = 4096    l = 8192    l = 16,384   l = 32,768
(32, 32)  −0.150216    0.044674    0.214355    −0.210047
(64, 64)  −0.268244    0.110717    0.137298    −0.383549
(8, 64)   −0.154163   −0.0239      0.173869    −0.164926
(64, 32)  −0.118008    0.05765     0.236807    −0.175659
The analysis of Figure 1 and Tables 4 and 5, suggests the fulfillment of the hypothesis H0 about
the distribution of ZT ∼ N (0, 1), for all the values of the parameters l, n, m selected. As can be seen in
Tables 4 and 5, by varying l, n, m, the values E( ZT | H0 ) and σ2 ( ZT | H0 ) of the observed distribution of
ZT maintain the fit to the parameters μ = 0 and σ2 = 1 expected in a distribution N (0, 1). Figure 1
shows the bell shape and approximate symmetry of the obtained distributions.
Table 5. Variance σ²(Z_T) of the observed distribution of Z_T in random H^i matrices.

(n, m)     l = 4096    l = 8192    l = 16,384   l = 32,768
(32, 32)   0.906793    0.938818    1.06287      0.930434
(64, 64)   1.04064     1.02569     0.97230      1.04773
(8, 64)    0.972279    0.939652    1.05472      0.923853
(64, 32)   0.994157    0.997373    1.06138      0.983795
Normality Test. The Shapiro–Wilks [48] test for normality was applied to all selected parameter
sets. The results are shown in Figure 2 and Table 6.
Figure 2. Fit of the observed distribution of Z_T to N(0, 1) in random H^i matrices: (a) n = m = 32;
(b) n = m = 64.
In Figure 2 we can see how the observed distributions of Z_T, for all the values of l, n, m, fit the
distribution N(0, 1). Table 6 shows the p-values corresponding to the Shapiro-Wilk normality test for
each of the chosen parameter sets.
Table 6. p-values of the Shapiro-Wilk test of normality for samples of Zt , in random H i matrices,
that satisfy the BIC.
l n = m = 32 n = m = 64 n = 8 and m = 64 n = 64 and m = 32
4096 0.252382 0.724504 0.262997 0.482318
8192 0.127693 0.573267 0.161048 0.326505
16,384 0.296125 0.653315 0.141577 0.524475
32,768 0.309173 0.37739 0.210961 0.237133
It is observed that in all cases, the p-values are greater than the usual values assumed for α,
such as 0.01 or 0.05, and are consistent with the assumed normality hypothesis. The higher the value of
n = m, the higher the p-value, which corresponds to the influence of these parameters on the value of
λ (see Table 3).
BIC test application on H i random matrices. To evaluate the behavior of the BIC test in random
matrices, each Zt was compared with the critical value Z1−α2 , and the number of rejections of H0
was counted. Tables 7 and 8 show the results for various levels of significance α2 and l = 16,384.
The observed number of rejections is expected to correspond to that expected according to the selected
α2 level, which would allow choosing α2 , to obtain zero rejections in this scenario.
Table 7. Expected E(#[Z_T > Z_{1−α_2} | H_0]) and observed #[Z_T > Z_{1−α_2} | H_0] number of rejections in
samples of 1000 values of Z_T for n = m, in random H^i matrices.

α_2       E(#[Z_T > Z_{1−α_2} | H_0])    n = m = 32    n = m = 64
0.05       50                             31            43
0.01       10                              8             9
0.001       1                              1             1
0.0001      0                              0             0
Table 8. Expected E(#[Z_T > Z_{1−α_2} | H_0]) and observed #[Z_T > Z_{1−α_2} | H_0] number of rejections in
samples of 1000 values of Z_T for n ≠ m, in random H^i matrices.

α_2       E(#[Z_T > Z_{1−α_2} | H_0])    n = 8, m = 64    n = 64, m = 32
0.05       50                             36               42
0.01       10                              7                5
0.001       1                              0                1
0.0001      0                              0                0
For the value of α2 = 0.0001 located in the last row of both tables, no statistical dependence is
detected as expected in random matrices, confirming the effectiveness of the criterion and illustrating
the importance of the proper selection of α2 , according to the number d = n · C2m of pairs of columns
whose independence is evaluated. For the values of l, n, m, α1 , α2 used, such that no Type I error is
made, the probability of making a Type II error must be calculated and the values that minimize it
must be chosen. In this sense, experiments will be carried out in the second scenario on a stream cipher.
Figure 3. Distribution of the sample of 1000 values of ZT for SAC matrices generated with RC4 with
n = m ∈ {32, 64, 128, 160, 256}.
Table 9. Expected value Ê(Z_T) and variance σ̂²(Z_T) of Z_T for SAC matrices generated with the RC4.
To verify the normality of the data, the Shapiro–Wilks [48] normality test was applied to all the
selected parameter sets. The results are shown in Figure 4 and Table 10.
Figure 4. Normality test of the sample of 1000 values of ZT for SAC matrices generated with the RC4
with n = m ∈ {32, 64, 128, 160, 256}.
Table 10. p-values of the Shapiro-Wilk test of normality on samples of Zt for SAC matrices generated
with the RC4 with n = m ∈ {32, 64, 128, 160, 256}.
(n, m) p-Values
(32, 32) 0.103538
(64, 64) 0.582878
(128, 128) 0.382171
(160, 160) 0.943337
(256, 256) 0.673625
In Figure 4 we can see how, by increasing the values of m = n, the Normal distribution N(μ, 1) of
the statistic Z_T is maintained; however, the value of μ increases (see Figure 3 and Table 9).
It is observed that in all cases the p-values are greater than the usual values assumed for α, such as
0.01 or 0.05, and the samples maintain normality.
In Table 11 it is noted how in RC4 the effectiveness of the criterion increases as the values of n
and m increase. That is, increasing the values m = n increases the number of correct decisions to reject
H0. As mentioned, it is known that increasing the value of n in RC4 increases the probability of finding
very similar, or even identical, outputs for inputs that differ in a few bits [22,24–26].
Table 11. Expected E(# [ ZT > Z1−α2 | H0 ]) and observed # [ ZT > Z1−α2 ] number of rejections in 1000
repetitions of the BIC test in SAC matrices generated with the RC4. All cases in which the observed
number of rejections exceeds the expected value are indicated in italics.
α_2       E(#[Z_T > Z_{1−α_2} | H_0])    n = m = 32    64     128    160    256
0.05       50                             77    155    497    731   1000
0.01       10                             22     44    272    462    993
0.001       1                              2     11     91    215    953
0.0001      0                              0      0     19     74    843
This experiment confirms the effectiveness of the BIC test by detecting dependence between the
inputs and outputs of RC4 and allows us to conclude that, in RC4, the effectiveness is an increasing
function of the value of the parameters n = m.
An important feature in statistical tests is the determination of type I and type II errors [2].
Under H_0, we have that v^i_·j and v^i_·k are independent; the type I error then consists in rejecting
independence when it holds, thereby deciding that the cipher has a weakness when it does not.
Meanwhile, not rejecting H_0 when there is a dependency means deciding that the cipher passes the
BIC when in fact it does not, committing a type II error.
Table 12 shows the proportion of Type I and II errors, committed by the BIC test, for some
parameter sets.
Table 12. Proportion of type I and II errors made by the BIC test.
It can be seen that for α2 = 0.0001 type I and II errors are not made.
The outputs of RC4 [23] are known to pass numerous statistical tests [49]; however, they do not
satisfy the BIC statistical test proposed in this work. This shows that the BIC statistical test complements
the classic randomness tests and therefore constitutes a tool worth considering when evaluating
stream ciphers.
5. Conclusions
An algorithm was proposed to extend the application of the Bit Independence Criterion (BIC)
to stream ciphers. This algorithm detects the existence of statistical dependence between the inputs
and outputs of a stream cipher. The effectiveness of the BIC test was experimentally confirmed
when applied to random matrices, in which it does not detect dependence, and to the RC4 cipher,
detecting statistical dependencies between the inputs and outputs of this cipher that are related to
previously reported weaknesses.
The algorithm depends on the number n of bits of the inputs, the number m of bits of the outputs,
and the number l of inputs used. These parameters determine its complexity. The results achieved
confirm the importance of varying the n and m parameters to apply the BIC criteria in the evaluation
of stream ciphers. For RC4, the effectiveness of the criterion is an increasing function of the n and
m parameters.
To guarantee the effectiveness of the proposed BIC test, it is recommended to select parameter
values greater than the minimum values estimated in this article and, from that minimum, to increase
them according to the available computing power, estimating the time using the complexity
expressions presented for the algorithm.
The BIC statistical test complements the classical statistical tests of randomness, as it broadens
the evaluation of stream ciphers by measuring the degree of independence between the input of the
cipher and its outputs, thus measuring statistical characteristics beyond the randomness of their
output sequences.
In future work, we plan to apply this test to other stream ciphers, investigate the optimal
choice of the m and n parameters, and compare the effectiveness of the criterion using mutual
information, Pearson's coefficient with the transformation mentioned, and the criteria applied in
this work. The behavior of the proposal will be verified experimentally as the sample size increases,
and an implementation variant using parallelism for Algorithm 3 will be investigated.
Author Contributions: Conceptualization, E.J.M.-C., G.S.-G. and C.M.L.-P.; methodology, E.J.M.-C., G.S.-G. and
C.M.L.-P.; software, E.J.M.-C., G.S.-G. and C.M.L.-P.; validation, E.J.M.-C., R.S.-L. and O.R.; formal analysis,
E.J.M.-C., G.S.-G., O.R., R.S.-L. and C.M.L.-P.; investigation, E.J.M.-C., G.S.-G., O.R., R.S.-L. and C.M.L.-P.;
writing—original draft preparation, E.J.M.-C., G.S.-G., O.R., R.S.-L. and C.M.L.-P.; writing—review and editing,
E.J.M.-C., G.S.-G., O.R., R.S.-L. and C.M.L.-P.; supervision, E.J.M.-C., G.S.-G., O.R., R.S.-L. and C.M.L.-P. All
authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
Table A1. Notation table.
f : F_2^n → F_2^m The function that transforms n input bits into m output bits
X = ( x1 , . . . , x n ) n-bit input binary vector
Y = f ( X ) = ( y1 , . . . , y m ) m-bit output binary vector
ei Unit vector with 1 in the i-th component with 1 ≤ i ≤ n
Xi Vector resulting from the operation X i = X ⊕ ei for input X
Yi m-bit output binary vector corresponding to input X i , Y i = f ( X i )
V i = Y ⊕ Y i = (v1i , v2i , . . . , vim ) Avalanche vector associated with vector ei and input X
vij ∈ F2 Avalanche variable associated to vector ei and input X with 1 ≤ j ≤ m
D = { X1 , . . . , X l } Set of l inputs Xr , with 1 ≤ r ≤ l
Xri Vector resulting from the operation Xri = Xr ⊕ ei for the input Xr
References
1. Marton, K.; Suciu, A.; Ignat, I. Randomness in digital cryptography: A survey. Rom. J. Inf. Sci. Technol.
2010, 13, 219–240.
2. Demirhan, H.; Bitirim, N. Statistical Testing of Cryptographic Randomness. J. Stat. Stat. Actuar. Sci.
2016, 9, 1–11.
3. ECRYPT Stream Cipher Project. 2011. Available online: http://cr.yp.to/streamciphers.html (accessed on
5 July 2020).
4. Yerukala, N.; Kamakshi Prasad, V.; Apparao, A. Performance and statistical analysis of stream ciphers in
GSM communications. J. Commun. Softw. Syst. 2020, 16, 11–18. [CrossRef]
5. Gorbenko, I.; Kuznetsov, A.; Lutsenko, M.; Ivanenko, D. The research of modern stream ciphers.
In Proceedings of the 2017 4th International Scientific-Practical Conference Problems of Infocommunications.
Science and Technology (PIC S&T), Kharkov, Ukraine, 10–13 October 2017; pp. 207–210. [CrossRef]
6. Upadhya, D.; Gandhi, S. Randomness evaluation of ZUC, SNOW and GRAIN stream ciphers. Adv. Intell.
Syst. Comput. 2017, 508, 55–63. [CrossRef]
7. Rukhin, A.; Soto, J.; Nechvatal, J. A Statistical Test Suite for Random and Pseudorandom Number Generators for
Cryptographic Applications; Technical Report April; Booz-Allen and Hamilton Inc.: Mclean, VA, USA, 2010.
8. Marsaglia, G. The Marsaglia Random Number CDROM Including the Diehard Battery of Tests of
Randomness. Florida State University, 1995. Available online: http://stat.fsu.edu/pub/diehard/
(accessed on 5 July 2020).
9. L’ecuyer, P.; Simard, R. TestU01: A C library for empirical testing of random number generators. ACM Trans.
Math. Softw. TOMS 2007, 33. [CrossRef]
10. McClellan, M.T.; Minker, J.; Knuth, D.E. The Art of Computer Programming, Vol. 3: Sorting and Searching;
Addison-Wesley Professional: Boston, MA, USA, 1974; Volume 28, p. 1175. [CrossRef]
11. Shi, Z.; Zhang, B.; Feng, D.; Wu, W. Improved key recovery attacks on reduced-round Salsa20 and ChaCha.
Lect. Notes Comput. Sci. 2013, 7839 LNCS, 337–351. [CrossRef]
12. Maitra, S.; Paul, G. New form of permutation bias and secret key leakage in keystream bytes of RC4.
In International Workshop on Fast Software Encryption; Springer: Berlin/Heidelberg, Germany, 2008;
Volume 5086 LNCS, pp. 253–269. [CrossRef]
13. Hancock, P.A. On the Design of Time. Ergon. Des. 2018, 26, 4–9. [CrossRef]
14. Qureshi, A.; Shah, T. S-box on subgroup of Galois field based on linear fractional transformation. Electron. Lett.
2017, 53, 604–606. [CrossRef]
15. Naseer, Y.; Shah, T.; Shah, D.; Hussain, S. A Novel Algorithm of Constructing Highly Nonlinear S-p-boxes.
Cryptography 2019, 3, 6. [CrossRef]
16. Turan, M.S. On Statistical Analysis of Synchronous Stream Ciphers. arXiv 2008, arXiv:1011.1669v3.
17. Duta, C.L.; Mocanu, B.C.; Vladescu, F.A.; Gheorghe, L. Randomness Evaluation Framework of Cryptographic
Algorithms. Int. J. Cryptogr. Inf. Secur. 2014, 4, 31–49. [CrossRef]
18. Castro, J.C.H.; Sierra, J.M.; Seznec, A.; Izquierdo, A.; Ribagorda, A. The strict avalanche criterion randomness
test. Math. Comput. Simul. 2005, 68, 1–7. [CrossRef]
19. Mishra, P.R.; Gupta, I.; Pillai, N.R. Generalized avalanche test for stream cipher analysis. In Proceedings of the
International Conference on Security Aspects in Information Technology, Haldia, India, 19–22 October 2011;
Volume 7011 LNCS, pp. 168–180. [CrossRef]
20. Srinivasan, C.; Lakshmy, K.V.; Sethumadhavan, M. Measuring diffusion in stream ciphers using statistical
testing methods. Def. Sci. J. 2012, 62, 6–10. [CrossRef]
21. Sosa-Gómez, G.; Rojas, O.; Páez-Osuna, O. Using hadamard transform for cryptanalysis of pseudo-random
generators in stream ciphers. EAI Endorsed Trans. Energy Web 2020, 7. [CrossRef]
115
Appl. Sci. 2020, 10, 7668
22. Madarro Capó, E.J.; Cuellar, O.J.; Legón Pérez, C.M.; Gómez, G.S. Evaluation of input—Output statistical
dependence PRNGs by SAC. In Proceedings of the 2016 International Conference on Software Process
Improvement (CIMPS), Aguascalientes, Mexico, 12–14 October 2016; pp. 1–6. [CrossRef]
23. Paul, G.; Maitra, S. RC4: Stream cipher and its variants. RC4 Stream Cipher Its Var. 2011, 1–281. [CrossRef]
24. Grosul, A.L.; Wallach, D.S. A Related-Key Cryptanalysis of RC4; Rice University: Houston, TX, USA, 2000;
pp. 1–13.
25. Matsui, M. Key collisions of the RC4 stream cipher. In International Workshop on Fast Software Encryption; Springer:
Berlin/Heidelberg, Germany, 2009; Volume 5665 LNCS, pp. 38–50. [CrossRef]
26. Chen, J.; Miyaji, A. How to find short RC4 colliding key pairs. In International Conference on Information Security;
Springer: Berlin/Heidelberg, Germany, 2011; Volume 7001 LNCS, pp. 32–46. [CrossRef]
27. Maitra, S.; Paul, G.; Sarkar, S.; Lehmann, M.; Meier, W. New Results on Generalization of Roos-Type Biases
and Related Keystreams of RC4. In International Conference on Cryptology in Africa; Springer:
Berlin/Heidelberg, Germany, 2013; pp. 222–239. [CrossRef]
28. Maximov, A. Some Words on Cryptanalysis of Stream Ciphers; Citeseer: Lund, Sweden, 2006.
29. Vergili, I.; Yücel, M.D. Avalanche and bit independence properties for the ensembles of randomly chosen
n × n s-boxes. Turk. J. Electr. Eng. Comput. Sci. 2001, 9, 137–145.
30. Karell-Albo, J.A.; Legón-Pérez, C.M.; Madarro-Capó, E.J.; Rojas, O.; Sosa-Gómez, G. Measuring
independence between statistical randomness tests by mutual information. Entropy 2020, 22, 741. [CrossRef]
31. Ibrahim, H.; Khurshid, K. Performance Evaluation of Stream Ciphers for Efficient and Quick Security of
Satellite Images. Int. J. Signal Process. Syst. 2019, 7, 96–102. [CrossRef]
32. Gorbenko, I.; Kuznetsov, A.; Gorbenko, Y.; Vdovenko, S.; Tymchenko, V.; Lutsenko, M. Studies on statistical
analysis and performance evaluation for some stream ciphers. Int. J. Comput. 2019, 18, 82–88.
33. RC4 Cipher Is No Longer Supported in Internet Explorer 11 or Microsoft Edge. Available online:
https://support.microsoft.com/en-us/help/3151631/rc4-cipher-is-no-longer-supported-in-internet-
explorer-11-or-microsoft (accessed on 5 July 2020).
34. SSL Configuration Required to Secure Oracle HTTP Server after Applying Security Patch Updates.
Available online: https://support.oracle.com/knowledge/Middleware/2314658_1.html (accessed on
5 July 2020).
35. Satapathy, A.; Livingston, J. A Comprehensive Survey on SSL/ TLS and Their Vulnerabilities. Int. J.
Comput. Appl. 2016, 153, 31–38. [CrossRef]
36. Soundararajan, E.; Kumar, N.; Sivasankar, V.; Rajeswari, S. Performance analysis of security algorithms.
In Advances in Communication Systems and Networks; Springer: Singapore, 2020; Volume 656, pp. 465–476.
[CrossRef]
37. Jindal, P.; Makkar, S. Modified RC4 variants and their performance analysis. In Microelectronics, Electromagnetics
and Telecommunications; Springer: Singapore, 2019; Volume 521, pp. 367–374. [CrossRef]
38. Parah, S.A.; Sheikh, J.A.; Akhoon, J.A.; Loan, N.A.; Bhat, G.M. Information hiding in edges: A high
capacity information hiding technique using hybrid edge detection. Multimed. Tools Appl. 2018, 77, 185–207.
[CrossRef]
39. Tyagi, M.; Manoria, M.; Mishra, B. Effective data storage security with efficient computing in cloud.
Commun. Comput. Inf. Sci. 2019, 839, 153–164. [CrossRef]
40. Dhiman, A.; Gupta, V.; Singh, D. Secure portable storage drive: Secure information storage. Commun. Comput.
Inf. Sci. 2019, 839, 308–316. [CrossRef]
41. Nita, S.; Mihailescu, M.; Pau, V. Security and Cryptographic Challenges for Authentication Based on
Biometrics Data. Cryptography 2018, 2, 39. [CrossRef]
42. Zelenoritskaya, A.V.; Ivanov, M.A.; Salikov, E.A. Possible Modifications of RC4 Stream Cipher.
Mech. Mach. Sci. 2020, 80, 335–341. [CrossRef]
43. Jindal, P.; Singh, B. Optimization of the Security-Performance Tradeoff in RC4 Encryption Algorithm.
Wirel. Pers. Commun. 2017, 92, 1221–1250. [CrossRef]
44. Verdú, S. Empirical estimation of information measures: A literature guide. Entropy 2019, 21, 720. [CrossRef]
45. Hutson, A.D. A robust Pearson correlation test for a general point null using a surrogate bootstrap
distribution. PLoS ONE 2019, 14. [CrossRef]
46. Liu, F.; Dong, Q.; Xiao, G. Probabilistic analysis methods of S-boxes and their applications. Chin. J. Electron.
2009, 18, 504–508.
116
Appl. Sci. 2020, 10, 7668
47. Walpole, R.E.; Myers, R.H. Probability & Statistics for Engineers & Scientists; Pearson Education Limited:
London, UK, 2012.
48. Siraj-Ud-Doulah, M. A Comparison among Twenty-Seven Normality Tests. Res. Rev. J. Stat. 2019, 8, 41–59.
49. Riad, A.M.; Shehat, A.R.; Hamdy, E.K.; Abou-Alsouad, M.H.; Ibrahim, T.R. Evaluation of the RC4 algorithm
as a solution for converged networks. J. Electr. Eng. 2009, 60, 155–160.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional
affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
117
Article
A Novel Intermittent Jumping Coupled Map Lattice Based on
Multiple Chaotic Maps
Rong Huang 1,2 , Fang Han 1,2, *, Xiaojuan Liao 3 , Zhijie Wang 1,2 and Aihua Dong 1,2
1 College of Information Science and Technology, Donghua University, Shanghai 201620, China;
[email protected] (R.H.); [email protected] (Z.W.); [email protected] (A.D.)
2 Engineering Research Center of Digitized Textile & Apparel Technology, Ministry of Education,
Donghua University, Shanghai 201620, China
3 College of Information Science and Technology, Chengdu University of Technology, Chengdu 610059, China;
[email protected]
* Correspondence: [email protected]; Tel.: +86-021-67792315
Abstract: Coupled Map Lattice (CML) usually serves as a pseudo-random number generator for
encrypting digital images. Based on our analysis, the existing CML-based systems still suffer from problems such as a limited parameter space and local chaotic behavior. In this paper, we propose a novel
intermittent jumping CML system based on multiple chaotic maps. The intermittent jumping mecha-
nism seeks to incorporate the multi-chaos, and to dynamically switch coupling states and coupling
relations, varying with spatiotemporal indices. Extensive numerical simulations and comparative
studies demonstrate that, compared with the existing CML-based systems, the proposed system has
a larger parameter space, better chaotic behavior, and comparable computational complexity. These
results highlight the potential of our proposal for deployment into an image cryptosystem.
Appl. Sci. 2021, 11, 3797. https://doi.org/10.3390/app11093797 https://www.mdpi.com/journal/applsci
existing ones [1,7,13,14] in terms of the aforementioned analyses, and thus highlight the
potential for deployment into a practical image cryptosystem.
Figure 1. The bifurcation diagrams and Lyapunov exponents for (a) Logistic map, (b) Sine map, and (c) Chebyshev
map, respectively.
Clearly, the coupling mechanism in Equation (1) is static, due to the constant coupling
coefficient and the regularity of the coupling relations. To alleviate this problem, the NCML
system [7] defines a non-adjacent coupling mechanism as follows:
where the two spatial positions u(l) and v(l) are obtained from the Arnold cat map. That is:

$$\begin{bmatrix} u(l) \\ v(l) \end{bmatrix} = \left( \begin{bmatrix} 1 & s \\ t & st+1 \end{bmatrix} \begin{bmatrix} l \\ l \end{bmatrix} \right) \bmod L + \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \tag{3}$$
where s and t are the parameters of the Arnold cat map. However, as discussed before, the
Arnold cat map used in Equation (3) belongs to a time-invariant transformation, and tends
to sample the same spatial position repeatedly within a single time step.
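To make the time-invariance concrete, Equation (3) can be sketched as below; the function name `arnold_positions` and the values of `s`, `t`, and `L` are illustrative, not taken from the paper:

```python
# Sketch of the non-adjacent coupling positions of Eq. (3): the Arnold
# cat map sends lattice index l to the coupled pair (u(l), v(l)).
def arnold_positions(l, s, t, L):
    """Return the 1-based coupled positions (u(l), v(l)) for index l."""
    u = (1 * l + s * l) % L + 1
    v = (t * l + (s * t + 1) * l) % L + 1
    return u, v

L_size = 8
pairs = [arnold_positions(l, s=1, t=1, L=L_size) for l in range(1, L_size + 1)]
# Because Eq. (3) depends only on l and not on the time step n, the same
# pairs recur at every iteration -- the transformation is time-invariant.
```

This is exactly the weakness discussed above: the sampled positions never change from one time step to the next.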
The DCML system [13] provides another way to break the static defect, which can be
formulated as:
where the coupling coefficient is no longer a constant, but varies with the spatiotemporal
indices l and n. To be specific, the DCML system [13] leverages an auxiliary Logistic map
of the form:
$$\varepsilon_{k+1} = f(\varepsilon_k) = \mu_{\mathrm{aux}} \cdot \varepsilon_k \cdot (1 - \varepsilon_k), \tag{5}$$
where k = 1, 2, · · · , LN, to generate the sequence of dynamical coupling coefficients. In practical use, this sequence is reshaped into a matrix of size L × N so that it aligns with the spatiotemporal indices in Equation (4). The auxiliary parameter μ_aux is set to 3.99 to achieve outstanding dynamics, and the initial value ε_1 is set to e.
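As a minimal sketch (the function and variable names are ours), the coefficient matrix of Equation (5) can be generated and reshaped as follows:

```python
import numpy as np

# Sketch of the DCML dynamical coupling coefficients, Eq. (5): an
# auxiliary Logistic map with mu_aux = 3.99 is iterated L*N times and
# the sequence is reshaped into an L x N matrix so that eps_n(l) can be
# looked up by the spatiotemporal indices (l, n).
def dcml_coefficients(e, L, N, mu_aux=3.99):
    eps = np.empty(L * N)
    eps[0] = e                              # initial value eps_1 = e
    for k in range(L * N - 1):
        eps[k + 1] = mu_aux * eps[k] * (1.0 - eps[k])
    return eps.reshape(L, N)

coeffs = dcml_coefficients(e=0.3, L=4, N=5)
```

Since the auxiliary map is one-dimensional, the whole matrix can be produced before the main CML iteration starts, which is why it adds no asymptotic cost later in the complexity analysis.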
The NDCML system [14], which combines the non-adjacent coupling mechanism and
the dynamical coupling coefficient together, can be viewed as a natural extension of the
NCML [7] and DCML [13] systems. That is:
where u(l) and v(l) are determined by Equation (3), and ε_n(l) is obtained from the auxiliary Logistic map defined in Equation (5).
In this paper, we propose a novel Intermittent Jumping Coupled Map Lattice (IJCML),
in which the pseudo-random information generated from the Logistic map, Sine map, and
Chebyshev map will be integrated together in a dynamical manner. The definitions of the
Sine map and the Chebyshev map can be described by the following equations:
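As a sketch, assuming the commonly used textbook definitions x_{n+1} = a·sin(πx_n) for the Sine map and x_{n+1} = cos(k·arccos(x_n)) for the Chebyshev map (the paper's exact forms in Equations (7) and (8) may differ in parameterization):

```python
import math

# Assumed standard forms of the two auxiliary maps; the parameter
# values a = 0.999 (mentioned later in the text) and k = 4 are
# illustrative choices.
def sine_map(x, a=0.999):
    """Sine map x_{n+1} = a*sin(pi*x_n); chaotic for a close to 1."""
    return a * math.sin(math.pi * x)

def chebyshev_map(x, k=4):
    """Chebyshev map x_{n+1} = cos(k*arccos(x_n)) on [-1, 1]."""
    return math.cos(k * math.acos(x))
```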
phenomenon that a lattice sometimes interacts with other ones (when w_n(l) ≥ 0.5), and other times updates alone (when w_n(l) < 0.5). Clearly, the coupling states are mutually different between lattices.
When w_n(l) ≥ 0.5, the spatial positions of the coupled lattices are u_n(l) and v_n(l), respectively. In this paper, we abandon the Arnold cat map, and resort to a chaotic map to determine u_n(l) and v_n(l). Specifically, the pseudo-random information w_n(l) is reused here in the following form:
$$u_n(l) = \operatorname{mod}\!\left(w_n(l) \cdot 10^{7}, L\right) + 1, \qquad v_n(l) = \operatorname{mod}\!\left(w_n(l) \cdot 10^{7}, L\right) + 1, \tag{10}$$
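A sketch of the position selection in Equation (10); the truncation of w_n(l)·10⁷ to an integer before the modulo, and the function name, are our assumptions:

```python
# Sketch of Eq. (10): the pseudo-random signal w_n(l) produced by the
# Chebyshev map is scaled by 10^7 and reduced modulo L, so that every
# value of w lands on a valid lattice index in {1, ..., L}.
def jump_position(w, L):
    return int(w * 1e7) % L + 1

positions = [jump_position(w, 10) for w in (0.123456, 0.5, 0.999999)]
```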
Figure 2. S-box of AES. The substitution value is determined by the intersection of the row with
index ‘x’ and the column with index ‘y’.
As shown in Equations (9) and (10), the proposed IJCML system combines the Logistic
map, the Sine map, and the Chebyshev map together in a nonlinear complex way. First,
the Logistic map serves as the nonlinear mapping function as usual. Second, the Sine
map plays a key role in the coupling term. In this paper, the parameter of the Sine map
is set to 0.999, so that, in each time step, the state variables of the lattices at the spatial
positions un (l ) and vn (l ) will be updated along chaotic trajectories. Third, the Chebyshev
map generates the pseudo-random signal that is used to ensure the dynamic switching
of coupling states and coupling relations. Additionally, as long as we protect the initial
state variables of the Sine map and the Chebyshev map (e.g., treating them as keys), it
is virtually impossible for an adversary to infer the mixing behaviors. This alleviates the
problem of deterministic mixing in the existing works [17,24]. This work provides an
intermittent jumping mechanism, which can effectively promote the turbulence evolution
amongst lattices.
where δ(·) denotes the Dirac delta function. Equation (11) reveals that there are two factors
determining the diffusion energy. One is the total number of times that the lth lattice is
sampled after N iterations. Another is the square of the coupling coefficient. Note that,
for a specific CML-based system, the spatiotemporal variability of u_n(l), v_n(l), and ε_n(l) in Equation (11) may be absent. For example, for the original CML system, we set u_n(l) = l − 1, v_n(l) = l + 1, and ε_n(l) = e, respectively. Since this analysis
is primarily concerned with whether the distribution is uniform or not, rather than the
magnitude of E(l), we normalize the diffusion energy as follows:
which measures how far the lth lattice’s diffusion energy is spread to the others. Note
that, for the IJCML system, the calculations of Equations (11) and (13) are triggered only
when w_n(l) ≥ 0.5. Larger inter-lattice diffusion distances with a uniform distribution are desirable.
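A hedged sketch of the diffusion energy computation: the text around Equation (11) says E(l) is driven by how often the lth lattice is sampled and by the square of the coupling coefficient; the exact accumulation below and the normalization by the total energy are our assumptions, not the published formula:

```python
import numpy as np

# Assumed form of the diffusion energy: each time the l-th lattice is
# sampled as a coupling partner (the delta terms of Eq. (11)), the
# squared coupling coefficient is accumulated, and the result is
# normalized to a distribution.
def normalized_diffusion_energy(u, v, eps):
    """u, v, eps: (N, L) arrays of 1-based positions and coefficients."""
    N, L = u.shape
    E = np.zeros(L)
    for n in range(N):
        for lp in range(L):
            E[u[n, lp] - 1] += eps[n, lp] ** 2   # delta(u_n(l') - l) term
            E[v[n, lp] - 1] += eps[n, lp] ** 2   # delta(v_n(l') - l) term
    return E / E.sum()

u = np.array([[1, 2], [2, 1]])
v = np.array([[2, 1], [1, 2]])
E = normalized_diffusion_energy(u, v, np.full((2, 2), 0.5))
```

With symmetric sampling, as here, the normalized energies are uniform, which is the ideal behavior discussed below.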
Figure 3 exhibits results for the diffusion energy analysis. In this simulation study,
we set e to 0.99. The blue histogram in the upper panel is the normalized distribution of
diffusion energies, while the bottom panel shows the histogram of the inter-lattice diffusion
distances. In addition, we print the standard deviation (std⁻) of the blue histogram, and print the average (avg⁺) and the standard deviation (std⁻) of the orange one, where the superscript "+" (or "−") indicates that a higher (or lower) value is better.
Figure 3. Diffusion energy analysis for (a) the CML system, (b) the NCML system, (c) the DCML system, (d) the NDCML
system, and (e) the proposed IJCML system, respectively.
We find that although the CML system achieves the best uniformity with zero-valued
standard deviations, D (l ) equals 1 for all l, reflecting that each lattice only interacts with its
adjacent counterpart in a regular way. As shown in Figure 3b, the inter-lattice diffusion
distances reach larger values, which demonstrates that the NCML system indeed enlarges
the range of coupling. However, there are several peaks that appear periodically in the
histograms. This is due to the use of the Arnold cat map, which repeatedly samples the
same spatial position at a regular period. Such non-uniform distribution implies that only
several lattices may dominate the diffusion of spatiotemporal dynamics, leading to the local
chaotic behaviors. As shown in Figure 3c, the dynamical coupling coefficient can slightly
modulate the diffusion energies. However, it fails to equalize NDCML’s histograms. In
contrast, the proposed IJCML system equalizes the distribution of diffusion energies, and
achieves larger inter-lattice diffusion distances at the same time. This validates that the
IJCML system can uniformly spread the diffusion energies to non-adjacent lattices, so as to
encourage the diffusion of spatiotemporal dynamics.
125
Appl. Sci. 2021, 11, 3797
Figure 4. Power spectrum analysis for the CML system, the NCML system, the DCML system, the NDCML system, and the
proposed IJCML system.
By comparison, we find that the parameter e has a relatively small influence on the
power spectrum. Thus, the following discussion mainly focuses on the effect of different
values of μ. As we see, for the baseline systems, almost all the power is concentrated at
zero-frequency component when μ = 2.49 or 2.99. This means that there is almost no
fluctuation in the corresponding sequence, so that it belongs to a direct-current signal.
When μ = 3.49, the baseline systems distribute the power onto several frequency compo-
nents. This implies that the corresponding sequence contains periodic transitions between
several state variables, so that it belongs to a deterministic signal. When μ = 3.99, all five power spectra share a similar noise-like form, meaning that all the systems have entered the complete turbulence pattern [4]. Remarkably, even when μ = 2.49, 2.99,
or 3.49, the IJCML system’s power is evenly dispersed over all frequency components, in
the sense that there exists only one small peak submerged in the noise-like fluctuations.
This phenomenon demonstrates that the proposed system behaves better than the baseline
ones, and indeed enlarges the parameter space of spatiotemporal chaos.
where l = 1, 2, · · · , L. The notations P(l ) and Q(l ) record the number of ones and
zeros, respectively, in the lth binary sequence, and we have N = P(l ) + Q(l ). The value
of equilibrium degree lies in the interval [0, 1], and the higher the value, the better the
equilibrium degree. For some bit-level image cryptosystems, the equilibrium degree
analysis is helpful to check whether a CML-based system can generate pseudo-random
binary sequences.
Figure 5 shows results for the equilibrium degree analysis under the setting 2.8 ≤ μ < 4
and 0 ≤ e ≤ 1. In this simulation study, the sequence x_n(l) generated by the lth lattice is first binarized through the operation mod(x_n(l) · 10^7, 2), where n = 1, 2, · · · , N. We then calculate the average value of ED(l) over all lattices, where l = 1, 2, · · · , L.
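The binarization and scoring can be sketched as follows; the scoring formula ED = 1 − |P − Q|/N is our assumption, chosen to be consistent with "lies in [0, 1], higher is better", and Equation (14) may differ:

```python
import numpy as np

# Sketch of the equilibrium degree analysis: a real-valued sequence is
# binarized via mod(x_n(l) * 10^7, 2) (truncating to an integer first,
# an assumed detail) and the balance of ones and zeros is scored.
def equilibrium_degree(x):
    bits = np.floor(x * 1e7).astype(np.int64) % 2
    P = int(bits.sum())            # number of ones
    Q = bits.size - P              # number of zeros, so N = P + Q
    return 1.0 - abs(P - Q) / bits.size

rng = np.random.default_rng(0)
ed = equilibrium_degree(rng.random(10000))   # near 1 for a balanced source
```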
Figure 5. Equilibrium degree analysis for (a) the CML system, (b) the NCML system, (c) the DCML
system, (d) the NDCML system, and (e) the proposed IJCML system, respectively.
As we see, when μ < 2.99, the average values of equilibrium degrees are all equal to 0
for the baseline systems. When 2.99 ≤ μ ≤ 3.57, the baseline systems become trapped in
an unstable transition zone, over which the average values are drastically varying. When
μ > 3.57, the DCML system reaches a flat plateau with the average value close to 1 as
shown in Figure 5c. For the CML system, however, there exists an obvious sunken area
embedded into the flat plateau at the range 3.57 < μ < 3.99 and 0 < e < 0.18. For the
NDCML system, two straight ravines occur at μ = 3.74 and 3.84, respectively, and directly
go through the flat plateau. The NCML system has a wavelike plateau, which involves
multiple sunken areas and ravines. This means that these baseline systems fail to achieve
a stable equilibrium degree even when the nonlinear mapping function f ( x ) has entered
into the chaotic regime (i.e., μ > 3.57). In contrast, due to the mixture of multiple chaotic
maps, the proposed IJCML system possesses a more stable and larger flat plateau as shown
in Figure 5e. If e is set to 0, the IJCML system’s equilibrium degree degrades considerably,
which validates that the intermittent jumping mechanism is helpful to enhance the chaotic
behaviors. Based on these observations, we state that the IJCML system outperforms the
baseline ones in terms of the equilibrium degree.
$$H(c) = -\sum_{c=0}^{9} p(c) \log_2 p(c), \tag{15}$$
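Equation (15) can be sketched as follows; the specific quantization rule (scaling by the number of levels and truncating) is our assumption:

```python
import numpy as np

# Sketch of Eq. (15): a sequence is quantized into ten levels
# c in {0, ..., 9} and the Shannon entropy of the level histogram is
# computed; the maximum possible value is log2(10), about 3.3219 bits.
def information_entropy(x, levels=10):
    c = np.clip((np.asarray(x) * levels).astype(int), 0, levels - 1)
    p = np.bincount(c, minlength=levels) / c.size
    p = p[p > 0]                    # empty bins contribute 0*log(0) = 0
    return float(-(p * np.log2(p)).sum())
```

A perfectly uniform sequence over the ten levels attains the maximum log₂10, while a constant sequence scores zero.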
Figure 6. Information entropy analysis for (a) the CML system, (b) the NCML system, (c) the DCML
system, (d) the NDCML system, and (e) the proposed IJCML system, respectively. The symbol ‘max’
on the vertical axis signifies the maximum value of the average information entropy, namely 3.3219.
We see that, for the baseline systems, the average values of H(c) form a stair-stepped
upward trend as μ increases, and the two notable steps appear at μ = 3.02 and 3.48,
respectively. When μ > 3.48, the DCML system has a smooth surface, in which the average
values are independent of e. For the NDCML system, two straight ravines that appear at
μ = 3.74 and 3.84, respectively, go through the surface. This phenomenon is consistent with
the observation in Section 3.3, demonstrating that there still exist defects in the NDCML
system even when μ is set to a higher value. For the CML system, an L-shaped valley with
larger area is embedded into the surface. This analysis provides a warning that one should
carefully select the parameters before using a CML-based image cryptosystem. Equipped
with the non-adjacent coupling mechanism, the NCML system reduces the area of the
valley, but fails to construct a smooth surface. For the IJCML system, the average value of
H(c) increases along the axes of μ and e, and the leaf-like surface covers a larger parameter
space of spatiotemporal chaos. These results verify that the IJCML system is superior to
the baseline ones in terms of information entropy.
where the superscript i (or j) stands for the lattice's index, while the symbol MAX is a constant, representing the maximum possible value of H(c^i) (or H(c^j)). In this simulation study, we quantize a state variable into ten levels as before, so that MAX = log₂10 ≈ 3.3219. In Equation (16), I(c^i; c^j) is the mutual information between c^i and c^j, and its definition is written as:

$$I(c^i; c^j) = \sum_{c^i=0}^{9} \sum_{c^j=0}^{9} p(c^i, c^j) \log_2 \frac{p(c^i, c^j)}{p(c^i) \cdot p(c^j)}. \tag{17}$$
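Equation (17) can be sketched as below. The combination MIED = MAX − I(c^i; c^j) is our assumption about Equation (16), consistent with the behavior described in the text (independent lattices give values near MAX); the published form may differ:

```python
import numpy as np

# Mutual information of Eq. (17) over ten quantization levels, plus an
# assumed MIED = MAX - I(c^i; c^j) with MAX = log2(10).
MAX = np.log2(10)

def mutual_information(ci, cj, levels=10):
    joint = np.zeros((levels, levels))
    for a, b in zip(ci, cj):          # empirical joint histogram
        joint[a, b] += 1
    joint /= joint.sum()
    pi, pj = joint.sum(axis=1), joint.sum(axis=0)   # marginals
    mask = joint > 0
    return float((joint[mask] * np.log2(joint[mask] / np.outer(pi, pj)[mask])).sum())

def mied(ci, cj):
    return MAX - mutual_information(ci, cj)
```

Identical sequences give I = H(c) (so MIED drops), while statistically independent ones give I ≈ 0 and MIED near MAX.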
Figure 7. Resulting values of MIED for the inter-lattice independence analysis. The four settings of
{μ, e}, from left to right, are {3.49, 0.1}, {3.49, 0.9}, {3.99, 0.1}, and {3.99, 0.9}, respectively. The symbol
‘max’ on the vertical axis signifies the maximum value of MIED, namely 3.3219.
When μ = 3.99, the values of MIED constitute a plateau lying close to MAX. The
groove along the principal diagonal corresponds to the self-independence of a specific
lattice, namely MIED(c^i, c^i), where i = 1, 2, · · · , L. Clearly, a flat plateau with a narrow
groove indicates good performance of the inter-lattice independence. For the CML system,
increasing e reinforces the interactions between two lattices, and thus results in a wider
groove and four sunken regions around the corners. A similar phenomenon also happens
in the DCML system, implying that a dynamical coupling coefficient is insufficient to
eliminate the coupling-caused correlations. For the NCML and NDCML systems, there
exist multiple diagonal grooves embedded into the flat plateau due to the use of Arnold
cat map. From the first two columns of Figure 7e, we find that the IJCML system can
reach a higher value of MIED when μ = 3.49 and e = 0.9. More significantly, the IJCML
system forms the flattest plateau with a unit-width groove when μ = 3.99. These compar-
isons demonstrate that the IJCML system has a larger parameter space for generating independent pseudo-random sequences.
Further, we calculate the average values of MIED over all lattices via Equation (18)
as follows:
$$\frac{\sum_{i=1}^{L} \sum_{j=1,\, j \neq i}^{L} \mathrm{MIED}(c^{i}, c^{j})}{L \cdot (L-1)}, \tag{18}$$
and investigate the inter-lattice independence under the setting 3.5 ≤ μ < 4 and 0 ≤ e ≤ 1.
The corresponding results are displayed in Figure 8.
Figure 8. Average values of MIED for the inter-lattice independence analysis. (a) Result for the CML system. (b) Result
for the NCML system. (c) Result for the DCML system. (d) Result for the NDCML system. (e) Result for the proposed
IJCML system. The symbol ‘max’ on the vertical axis signifies the maximum value of the average information entropy,
namely 3.3219.
As we see, the CML and NCML systems share the same defect, in the sense that a
cuboid-shaped valley appears at the range 0.12 < e < 0.18. This result suggests that the
CML and NCML systems may be unsuitable for a multi-lattice-based image cryptosystem.
It is clear that the DCML and NDCML systems are well-behaved when μ > 3.7, and are
both insensitive to e. In most cases, the IJCML system achieves higher values than the
baseline systems, especially when 3.5 < μ < 3.7 and 0.85 < e ≤ 1. This result reflects
that the intermittent jumping mechanism can effectively compensate for the performance
degeneration, and thus demonstrates that the IJCML system surpasses the baseline ones in
terms of the inter-lattice independence.
where λ denotes the Lyapunov exponent of F(x), while i stands for the time step. Typically,
a positive Lyapunov exponent indicates that F(x) has chaotic behaviors with the trajectories
diverging exponentially.
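The standard numerical estimate of a one-dimensional map's Lyapunov exponent, averaging ln|F′(x_i)| along a trajectory, can be sketched for the Logistic map (the function name, burn-in, and iteration counts are illustrative choices):

```python
import math

# Numerical Lyapunov exponent of the Logistic map F(x) = mu*x*(1 - x),
# whose derivative is F'(x) = mu*(1 - 2x): average ln|F'(x_i)| along a
# trajectory after discarding the transient.
def lyapunov_logistic(mu, x0=0.3, n=20000, burn_in=1000):
    x = x0
    for _ in range(burn_in):              # discard the transient
        x = mu * x * (1.0 - x)
    acc = 0.0
    for _ in range(n):
        acc += math.log(abs(mu * (1.0 - 2.0 * x)))
        x = mu * x * (1.0 - x)
    return acc / n
```

For example, μ = 3.99 lies deep in the chaotic regime (positive exponent), while μ = 3.2 yields a stable period-two orbit (negative exponent).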
Kolmogorov-Sinai Entropy Density (KSED), which incorporates the Lyapunov expo-
nents of all lattices, is devoted to measuring the overall dynamics of a CML-based system.
Formally, its definition is written as:
$$\mathrm{KSED} = \frac{1}{L} \sum_{l=1}^{L} \lambda^{+}(l), \tag{20}$$
where λ+ (l ) represents the positive Lyapunov exponent of the lth lattice. In other words,
Equation (20) overlooks all negative Lyapunov exponents during its calculation.
In addition, Kolmogorov-Sinai Entropy Breadth (KSEB) [32] is used, in this paper, to count the proportion of chaotic lattices. We calculate its value as follows:

$$\mathrm{KSEB} = L^{+}/L, \tag{21}$$

where L⁺ denotes the number of lattices whose Lyapunov exponents are positive. Clearly,
the higher the values of KSED and KSEB, the better chaotic behaviors of a dynamical system.
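The two measures can be sketched directly from a list of per-lattice Lyapunov exponents (the function names are ours):

```python
# Sketch of Eqs. (20)-(21): KSED averages the positive Lyapunov
# exponents over all L lattices (negative ones are discarded), while
# KSEB is the fraction of lattices with a positive exponent.
def ksed(lyapunov_exponents):
    L = len(lyapunov_exponents)
    return sum(lam for lam in lyapunov_exponents if lam > 0) / L

def kseb(lyapunov_exponents):
    L = len(lyapunov_exponents)
    return sum(1 for lam in lyapunov_exponents if lam > 0) / L

# e.g. three chaotic lattices out of four:
exponents = [0.6, 0.4, -0.2, 0.2]
```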
Figure 9 exhibits the resulting values of KSED under the setting 3 ≤ μ < 4 and
0 ≤ e ≤ 1. Starting from μ = 3.57, the DCML system possesses a smooth surface with
positive curvatures, validating that increasing μ can constantly strengthen the chaotic be-
haviors. For the NDCML system, however, there exist two objectionable ravines appearing
at μ = 3.74 and 3.84, and straightly passing through the surface. The CML and NCML
systems behave unstably when μ > 3.57, as the values of KSED are varying sharply and
a distinct valley appears at the range 3.57 < μ < 3.99 and 0.11 < e < 0.16. The IJCML
system achieves numerous high values of KSED that form a flat plateau. Noticeably, even
when μ < 3.57, setting e to a higher value can still lead the system into the chaotic regime.
Thus, the proposed system has a larger parameter space of spatiotemporal chaos, which
benefits from the intermittent jumping mechanism.
Figure 9. Resulting values of KSED for the Lyapunov exponent analysis. (a) Result for the CML system. (b) Result for
the NCML system. (c) Result for the DCML system. (d) Result for the NDCML system. (e) Result for the proposed
IJCML system.
In Figure 10, we display the resulting values of KSEB. The maximum value of KSEB,
being equal to 1.0, means that all the lattices in the system are activated into spatiotemporal
chaos. At first glance, the IJCML system’s flat plateau, as shown in Figure 10e, is of the
largest area. We note that, among the baseline systems, only the DCML system has a stable
flat plateau when μ > 3.57. For the other baseline systems, there exist deep valleys or
straight ravines embedded into the flat plateau. In other words, even though μ is set to
a higher value, some lattices are still far from spatiotemporal chaos. Therefore, one shall
spend time inspecting the dynamics carefully for each lattice before deploying them in an
image cryptosystem.
Figure 10. Resulting values of KSEB for the Lyapunov exponent analysis. (a) Result for the CML
system. (b) Result for the NCML system. (c) Result for the DCML system. (d) Result for the NDCML
system. (e) Result for the proposed IJCML system.
Further, we design a statistical tool called Cumulative Percentage Function (CPF) for
analyzing KSEB. The CPF is to reflect the percentage of parameter pairs {μ, e} making KSEB
less than or equal to a specific threshold α. We show the plots of CPFs in Figure 11. The
ideally optimal case is that all possible combinations of μ and e can lead a system into the
fully chaotic regime, where all its lattices have the chaotic behaviors, namely KSEB = 1.0.
Obviously, the ideally optimal case corresponds to an impulse function centered at 1.0.
Among all CPFs, the pink one is closest to the impulse function, consisting of a wide flat
region and a narrow step region. The higher the step occurs at α = 1.0 in Figure 11, the
larger the area of flat plateau in Figure 10. Based on above observations, we state that
the proposed IJCML system is better than the baseline ones in terms of the Lyapunov
exponent analysis.
Figure 12. Bifurcation diagrams. The setting of e, from left to right, is 0.1, 0.5, and 0.9, respectively.
For the baseline systems, we can clearly observe that period-doubling bifurcations
form a route to spatiotemporal chaos. For example, the NDCML system gets trapped
in period-two oscillations when μ = 3.4. Two bifurcation points at μ = 3.45 double the
periods of the orbits and give rise to pitchfork bifurcations. This process is repeated as
μ increases. Starting from μ = 3.57, it is virtually impossible to observe the pitchfork
bifurcations because the number of bifurcation points becomes large and the gaps between
the bifurcation points are negligible. Unfortunately, the chaotic oscillations of the baseline
systems may occasionally be interspersed with periodic windows in some cases. For
example, when setting μ to 3.63, we can observe the periodic windows in the NDCML
system, where the oscillations suddenly degenerate into several periods. Comparing the
baseline systems’ bifurcation diagrams, we find that they roughly share the same phe-
nomenon. That is, the period-doubling bifurcations trigger chaotic oscillations interspersed
with periodic windows. By contrast, the proposed IJCML system has a large number of
bifurcation points with tiny gaps between them even when μ = 3.4. This demonstrates
that the IJCML system enters the chaotic regime early. Additionally, the IJCML system
greatly reduces the periodic windows compared with the baseline ones. As shown in
Figure 12e, a higher coupling coefficient introduces more pseudo-random motions from
the multiple chaotic maps, so as to further enhance the chaotic oscillations. Based on these
comparisons, we claim that the IJCML system behaves better than the baseline ones in
terms of the bifurcation diagram analysis.
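The construction of a bifurcation diagram like Figure 12 can be sketched for a single Logistic lattice; for each μ, the map is iterated past its transient and the visited states are recorded (the function name and iteration counts are illustrative):

```python
# Sketch of one column of a bifurcation diagram: iterate the Logistic
# map at a fixed mu, discard the transient, and keep the states that
# the orbit visits. Plotting these states against mu reveals the
# period-doubling route to chaos and the periodic windows.
def bifurcation_points(mu, x0=0.3, burn_in=500, keep=100):
    x = x0
    for _ in range(burn_in):
        x = mu * x * (1.0 - x)
    states = []
    for _ in range(keep):
        x = mu * x * (1.0 - x)
        states.append(x)
    return states

# At mu = 3.4 the map sits on a stable period-two orbit, so only two
# distinct states survive after the transient.
pts = bifurcation_points(3.4)
```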
Figure 13. Space-amplitude plots for spatiotemporal behavior analysis. The setting of μ, from left to right, is 3.15, 3.57, and
3.99, respectively.
When μ = 3.15, the CML system simply creates period-two responses. The other
baseline systems twist the period-two responses at a regular pace, resulting in an X-shaped
pattern. The twisting points correspond to stable solutions of a system, whose positions
depend on the initial state variables. In contrast, the X-shaped pattern is unrecognizable
for the IJCML system. These results reveal that, when the baseline systems are restricted
to the frozen random pattern [4], the IJCML system has advanced into the defect chaotic
diffusion pattern [4].
When μ = 3.57, we can observe the period-doubling behaviors from the space-
amplitude plots of the baseline systems, in the sense that the long-thin X-shaped pattern
evolves into a compound version. Specifically, there exist multi-period responses twisted
alternately, and the dynamic range of the solutions has been enlarged at the same time. In
particular, some of the lattices in the DCML system, for example, the 40th one, have entered
the defect chaotic diffusion pattern [4]. These results reveal that the baseline systems are
transitioning from the frozen random pattern [4] to the defect chaotic diffusion pattern [4].
When μ = 3.99, the period-changing behaviors become complex and unstable,
resulting in extremely twisted and superimposed responses. Remarkably, the dynamic
range is further extended towards the upper and lower bounds. These results reveal that
all the systems have entered the complete turbulence pattern [4], while the IJCML system
has the strongest ability to approach the lower bound. Consequently, we state that the
IJCML system has better spatiotemporal behaviors than the baseline ones.
traverses the L lattices. Hence, all five CML-based systems investigated in this paper
have quadratic time complexity, namely O(NL).
Second, we theoretically analyze the space complexity. Clearly, it is necessary to
save the resulting pseudo-random matrix of size N × L. This requires quadratic space
complexity, namely O(NL). Compared with the original CML system [1], the variants, i.e.,
the NCML system [7], the DCML system [13], the NDCML system [14], and the proposed
IJCML system need to occupy additional space to save the intermediate variables obtained
from the auxiliary chaotic maps. See Equations (3), (5), and (8) for details. Fortunately,
since these auxiliary chaotic maps can be separately executed out of the two nested “for”
loops, the final space complexity remains the same, namely O(NL).
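The two nested "for" loops behind the O(NL) cost can be sketched with a generic diffusively coupled map lattice; the specific IJCML, NCML, DCML, and NDCML update rules of Equations (3), (5), and (8) are not reproduced in this excerpt, so the logistic local map, the symmetric nearest-neighbor coupling, and all parameter values below are illustrative assumptions only.

```python
def logistic(x, mu=3.99):
    return mu * x * (1.0 - x)

def run_cml(N, L, e=0.3):
    """Iterate a diffusively coupled map lattice for N time steps over
    L lattice sites with a periodic boundary.  The two nested loops are
    the source of the O(NL) time cost; the auxiliary chaotic sequences
    used by the variants could be precomputed before entering them, which
    is why the asymptotic complexity is unchanged."""
    row = [(l + 1) / (L + 1) for l in range(L)]   # arbitrary initial states
    matrix = []
    for n in range(N):                            # outer loop: time steps
        new = [0.0] * L
        for l in range(L):                        # inner loop: lattice sites
            left, right = row[l - 1], row[(l + 1) % L]
            new[l] = ((1 - e) * logistic(row[l])
                      + (e / 2) * (logistic(left) + logistic(right)))
        row = new
        matrix.append(row)
    return matrix

m = run_cml(1000, 100)   # pseudo-random matrix of size 1000 x 100
```

Storing `matrix` row by row is exactly the N × L buffer counted in the space-complexity argument.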
However, it is undeniable that the variants may require more time and space consumption
in practice. To make an intuitive comparison, we experimentally count the average
time for running 1000 iterations over 100 lattices, and inspect the size of the additional
space. The parameters μ and e are set to [3, 3.99] and [0, 0.99], respectively, and both of
them take 0.01 as step size. Our computing device is a desktop computer with a 2.90 GHz
Intel i7-10700 central processing unit, 16.00 GB memory. Our programming environment is
Matlab R2017a installed on the Windows 10 operating system. The intermediate variables
are saved in the “.mat” format.
Table 1 lists the simulation results. The IJCML system takes 0.5400 s, on average,
to generate a pseudo-random matrix of size 1000 × 100, which is faster than the NCML
system [7], the DCML system [13], and the NDCML system [14], and is comparable to the
original CML system [1]. The high efficiency of the IJCML system is due to the intermittent
jumping mechanism because the coupling operation, as shown in Equation (9), will be
skipped with a probability of about 50%. On the other hand, the NCML system [7]
only requires 3.42 KB space to additionally save the time-invariant spatial positions. In
contrast, the remaining three variants prepare about 746.00 KB space to accommodate the
dynamical coupling coefficients [13,14] or the intermediate variables wn (l ) obtained from
the Chebyshev map. Fortunately, a space of size 746.00 KB is almost negligible for modern
hardware devices. Consequently, the computational complexity of the proposed IJCML is
better than or comparable to the baseline systems [1,7,13,14].
4. Conclusions
In this paper, we propose a novel IJCML system based on multiple chaotic maps. The
intermittent jumping mechanism establishes a new coupling mode, in which not only the
coupling states but also the coupling relations dynamically vary with the spatiotemporal
indices. We conduct extensive numerical simulations and comparative studies to analyze
the proposed system from the following aspects: the diffusion energy analysis, the power
spectrum analysis, the equilibrium degree analysis, the information entropy analysis,
the inter-lattice independence analysis, the Lyapunov exponent analysis, the bifurcation
diagram analysis, the spatiotemporal behavior analysis, and the computational complexity
analysis. The simulation results adequately demonstrate that, compared with the baseline
systems [1,7,13,14], the proposed IJCML system has better chaotic behaviors, which brings
stronger spatiotemporal chaos for a single lattice and higher independence between lattices.
Our future work is to study effective ways of discretizing the IJCML system, and to
consider the issues of practical realization and circuit implementation [33].
Author Contributions: Conceptualization, R.H. and F.H.; methodology, R.H., F.H., and Z.W.; project
administration, R.H., F.H., and X.L.; funding acquisition, R.H., F.H., and X.L.; software, R.H. and
A.D.; writing, R.H.; validation, X.L. and A.D. All authors have read and agreed to the published
version of the manuscript.
Funding: This research was funded in part by the Fundamental Research Funds for the Central
Universities (17D110408), the National Natural Science Foundation of China (11972115, 62001099,
61806171), and the National Key Research and Development Program of China (2019YFC1521300).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data sharing not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Kaneko, K. Spatiotemporal intermittency in coupled map lattices. Prog. Theor. Phys. 1985, 74, 1033–1044. [CrossRef]
2. Kaneko, K. Turbulence in coupled map lattices. Phys. D 1986, 18, 475–476. [CrossRef]
3. Kaneko, K. Lyapunov analysis and information flow in couple map lattices. Phys. D 1986, 23, 436–447. [CrossRef]
4. Kaneko, K. Pattern dynamics in spatiotemporal chaos: Pattern selection, diffusion of defect and pattern competition intermittency.
Phys. D 1989, 34, 1–41. [CrossRef]
5. Kaneko, K. Overview of coupled map lattices. Chaos 1992, 2, 279–282. [CrossRef] [PubMed]
6. Fridrich, J. Symmetric ciphers based on two-dimensional chaotic maps. Int. J. Bifurcat. Chaos 1998, 8, 1259–1284. [CrossRef]
7. Zhang, Y.Q.; Wang, X.Y. A symmetric image encryption algorithm based on mixed linear-nonlinear coupled map lattice. Inf. Sci.
2014, 273, 329–351. [CrossRef]
8. Zhang, Y.Q.; Wang, X.Y. A new image encryption algorithm based on non-adjacent coupled map lattices. Appl. Soft Comput.
2015, 26, 10–20. [CrossRef]
9. Guo, S.F.; Liu, Y.; Gong, L.H.; Yu, W.Q.; Gong, Y.L. Bit-level image cryptosystem combining 2D hyper-chaos with a modified
non-adjacent spatiotemporal chaos. Multimed. Tools Appl. 2018, 77, 21109–21130. [CrossRef]
10. Wang, X.Y.; Zhao, H.Y.; Wang, M.X. A new image encryption algorithm with nonlinear-diffusion based on multiple coupled map
lattices. Opt. Laser Technol. 2019, 115, 42–57. [CrossRef]
11. Zhang, H.; Wang, X.Q.; Xie, H.W.; Wang, C.P.; Wang, X.Y. An efficient and secure image encryption algorithm based on
non-adjacent couple maps. IEEE Access 2020, 8, 122104–122120. [CrossRef]
12. Liu, L.Y.; Zhang, Y.Q.; Wang, X.Y. A novel method for constructing the S-box based on spatiotemporal chaotic dynamics. Appl. Sci.
2018, 8, 2650. [CrossRef]
13. Wang, X.Y.; Feng, L.; Wang, S.B.; Zhang, C.; Zhang, Y.Q. Spatiotemporal chaos in coupled logistic map lattice with dynamic
coupling coefficient and its application in image encryption. IEEE Access 2018, 6, 39705–39724.
14. Wang, X.Y.; Feng, L.; Li, R.; Zhang, F.C. A fast image encryption algorithm based on non-adjacent dynamically coupled map
lattice model. Nonlinear Dyn. 2019, 95, 2797–2824. [CrossRef]
15. Wang, X.Y.; Zhao, H.Y.; Feng, L.; Ye, X.L.; Zhang, H. High-sensitivity image encryption algorithm with random diffusion based
on dynamic-coupled map lattices. Opt. Laser Eng. 2019, 122, 225–238. [CrossRef]
16. Tao, Y.; Cui, W.H.; Zhang, Z. Spatiotemporal chaos in multiple dynamically coupled map lattices and its application in a novel
image encryption algorithm. J. Inf. Secur. Appl. 2020, 55, 102650. [CrossRef]
17. Wang, X.Y.; Qin, X.M.; Liu, C.M. Color image encryption algorithm based on customized globally coupled map lattices.
Multimed. Tools Appl. 2019, 78, 6191–6209. [CrossRef]
18. Zhang, Y.Q.; Wang, X.Y.; Liu, L.Y.; He, Y.; Liu, J. Spatiotemporal chaos of fractional order logistic equation in nonlinear coupled
lattices. Commun. Nonlinear Sci. Numer. Simulat. 2017, 52, 52–61. [CrossRef]
19. Zhang, Y.Q.; Wang, X.Y.; Liu, L.Y.; Liu, J. Fractional order spatiotemporal chaos with delay in spatial nonlinear coupling. Int. J.
Bifurcat. Chaos 2018, 28, 1850020. [CrossRef]
20. Lv, X.P.; Liao, X.F.; Yang, B. A novel pseudo-random number generator from coupled map lattice with time-varying delay.
Nonlinear Dyn. 2018, 94, 325–341. [CrossRef]
21. Lv, X.P.; Liao, X.F.; Yang, B. Bit-level plane image encryption based on coupled map lattice with time-varying delay. Mod. Phys.
Lett. B 2018, 32, 1850124. [CrossRef]
22. Wang, M.X.; Wang, X.Y.; Wang, C.P.; Xia, Z.Q.; Zhao, H.Y.; Gao, S.; Zhou, S.; Yao, N.M. Spatiotemporal chaos in cross coupled
map lattice with dynamic coupling coefficient and its application in bit-level color image encryption. Chaos Soliton. Fract.
2020, 139, 110028. [CrossRef]
23. Wang, M.X.; Wang, X.Y.; Zhao, T.T.; Zhang, C.; Xia, Z.Q.; Yao, N.M. Spatiotemporal chaos in improved cross coupled map lattice
and its application in a bit-level image encryption scheme. Inf. Sci. 2021, 544, 1–24. [CrossRef]
24. Wang, X.Y.; Guan, N.N.; Zhao, H.Y.; Wang, S.W.; Zhang, Y.Q. A new image encryption scheme based on coupling map lattices
with mixed multi-chaos. Sci. Rep. 2020, 10, 9784. [CrossRef]
25. Zhang, Y.Q.; He, Y.; Wang, X.Y. Spatiotemporal chaos in mixed linear-nonlinear two-dimensional coupled logistic map lattice.
Physica A 2018, 490, 148–160. [CrossRef]
26. Lu, G.Q.; Smidtaite, R.; Howard, D.; Ragulskis, M. An image hiding scheme in a 2-dimensional coupled map lattice of matrices.
Chaos Soliton Fract. 2019, 124, 78–85. [CrossRef]
27. Kumar, S.; Kumar, R.; Kumar, S.; Kumar, S. Cryptographic construction using coupled map lattice as a diffusion model to
enhanced security. J. Inf. Secur. Appl. 2019, 46, 70–83. [CrossRef]
28. He, Y.; Zhang, Y.Q.; Wang, X.Y. A new image encryption algorithm based on two-dimensional spatiotemporal chaotic system.
Neural Comput. Appl. 2020, 32, 247–260. [CrossRef]
29. Zou, C.Y.; Wang, X.Y.; Li, H.F.; Wang, Y.Z. Enhancing the kinetic complexity of 2-D digital coupled chaotic lattice. Nonlinear Dyn.
2020, 102, 2925–2943. [CrossRef]
30. FIPS PUB 197. Advanced Encryption Standard; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2001.
31. Wolf, A.; Swift, J.B.; Swinney, H.L.; Vastano, J.A. Determining Lyapunov exponents from a time-series. Phys. D 1985, 16, 285–317.
[CrossRef]
32. Zhang, Y.Q.; Wang, X.Y.; Liu, J.; Chi, Z.L. An image encryption scheme based on the MLNCML system using DNA sequences.
Opt. Laser Eng. 2016, 82, 95–103. [CrossRef]
33. Kocarev, L. Chaos-based cryptography: A brief overview. IEEE Circ. Syst. Mag. 2001, 1, 6–21. [CrossRef]
Article
A Digital Cash Paradigm with Valued and No-Valued e-Coins
Ricard Borges 1,2,† and Francesc Sebé 1,2, *,†
Abstract: Digital cash is a form of money that is stored digitally. Its main advantage when compared
to traditional credit or debit cards is the possibility of carrying out anonymous transactions. Diverse
digital cash paradigms have been proposed during the last decades, providing different approaches
to avoid double-spending fraud, or features like divisibility or transferability. This paper presents
a new digital cash paradigm that includes the so-called no-valued e-coins, which are e-coins that can
be generated free of charge by customers. A vendor receiving a payment cannot distinguish whether
the received e-coin is valued or not, but the customer will receive the requested digital item only in
the former case. A straightforward application of bogus transactions involving no-valued e-coins is
the masking of consumption patterns. This new paradigm has also proven its validity in the scope of
privacy-preserving pay-by-phone parking systems, and we believe it can become a very versatile
building block in the design of privacy-preserving protocols in other areas of research. This paper
provides a formal description of the new paradigm, including the features required for each of its
components together with a formal analysis of its security.
Appl. Sci. 2021, 11, 9892. https://doi.org/10.3390/app11219892 https://www.mdpi.com/journal/applsci
the vendor (the payee). The vendor cannot determine whether the received e-coin was
valued or not while the customer receives the requested digital item only when a valued
e-coin has been spent.
This paradigm has proven its validity in the scope of pay-by-phone parking appli-
cations. In [17], a driver, after parking their car in a regulated zone, acquires tickets for
some consecutive short time intervals during which their car is expected to be parked. So
as to mask the expected parking duration, all the drivers always request the same amount
of tickets. Those tickets belonging to intervals after the expected parking duration are
paid through no-valued e-coins. An ad-hoc construction belonging to this digital cash
paradigm was presented in [17]. Nevertheless, that initial design assumed the use of a
fixed suite of cryptosystems due to restrictions concerning the key size of the
cryptography used. More precisely, the cleartexts of the cryptosystem used by the vendor
for signing the issued e-coins had to be large enough to accommodate public keys of the
cryptosystem used for encrypting the acquired digital item. The role of the mentioned
cryptosystems will be explained later in Section 4.
In this paper, we propose how the mentioned key size limitation can be eliminated.
The novel construction also avoids the need in [17] for a timestamp authority. Instead,
customers timestamp their transactions by themselves.
Section 1 has presented an introduction to digital cash systems. Section 2 briefly
reviews the cryptographic tools required in our construction. Next, Section 3 presents a
construction based on the use of OAEP (Optimal Asymmetric Encryption Padding [18]),
which allows the simulation of digital signatures over messages of arbitrary length. After
that, the novel digital cash paradigm is detailed in Section 4. Some cryptosystems providing
the required features of the new paradigm are discussed in Section 5. Section 6 is devoted
to analyzing the security of the proposal. Next, experimental results are summarized in
Section 7, while Section 8 concludes the paper.
2. Preliminaries
This section provides a brief introduction to the cryptographic primitives used by the
proposed paradigm.
further requires finding a piece of data whose digest matches the obtained one. This is
unfeasible if an appropriate one-way hash function is being used for digest computation.
The following lemma states a property of OAEP, which is crucial for our construction.
Lemma 1. Given a message M and a k0-bit string Y, it is hard to find an {r, X} pair so that
(X, Y) = OAEPm,k0(M, r).
Proof. Given M and Y, one must find a bitstring r satisfying Equation (1).
Since G acts as a random oracle, after choosing r, the output of G(r) is assumed
to be random, so that M ⊕ G(r) is also random. Function H is also a random oracle, so
that H(M ⊕ G(r)) is random, and so is r ⊕ H(M ⊕ G(r)). Hence, the probability that the
resulting random string matches Y is 2^(−k0), so the expected number of trials needed for
finding such an r is 2^(k0), which is unfeasible if k0 is large enough. Given that r is k0 bits
long, there are exactly 2^(k0) candidates, so such an r may not even exist.
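The counting argument in the proof can be checked empirically at a toy scale. The sketch below models a k0 = 8-bit random-oracle output with the first byte of SHA-256 and measures the average number of fresh trials needed to hit a fixed target value; the seed, the target, and the 8-bit width are illustrative choices, and the expected trial count is 2^8 = 256.

```python
import hashlib
import random

K0_BITS = 8                  # toy k0; a real deployment uses a large k0
TARGET = 42                  # the fixed k0-bit string Y to be matched

def oracle(r):
    # the first byte of SHA-256 stands in for a k0-bit random oracle
    return hashlib.sha256(r).digest()[0]

random.seed(1)
trial_counts = []
for _ in range(200):
    n = 1
    while oracle(random.randbytes(16)) != TARGET:
        n += 1
    trial_counts.append(n)
avg = sum(trial_counts) / len(trial_counts)
# avg lands near 2**K0_BITS = 256, matching the 2^k0 expected trials
```

At a realistic k0 (say 128 bits) the same geometric distribution makes the search computationally unfeasible, which is the content of Lemma 1.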
M = X ⊕ G(H(X) ⊕ Y). (2)
In this case, an equivalent analysis can be applied, leading to a 2^(−m) success probability,
which is even harder, as typically m ≫ k0.
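The OAEP transform and the inversion in Equation (2) can be sketched with hash-based mask functions. Equation (1) is not reproduced in this excerpt, so the standard OAEP form X = M ⊕ G(r), Y = r ⊕ H(X) is assumed; G and H below are illustrative stand-ins (SHA-256 in counter mode and a truncated SHA-256), not the paper's concrete parameter choices.

```python
import hashlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def G(r, n):
    """Expand the k0-bit seed r to n bytes (SHA-256 in counter mode)."""
    out, c = b"", 0
    while len(out) < n:
        out += hashlib.sha256(r + c.to_bytes(4, "big")).digest()
        c += 1
    return out[:n]

def H(x, n):
    """Compress x to an n-byte (k0-bit) digest."""
    return hashlib.sha256(x).digest()[:n]

def oaep_encode(M, r):                 # assumed Equation (1):
    X = xor(M, G(r, len(M)))           #   X = M xor G(r)
    Y = xor(r, H(X, len(r)))           #   Y = r xor H(X)
    return X, Y

def oaep_decode(X, Y):                 # Equation (2): M = X xor G(H(X) xor Y)
    return xor(X, G(xor(Y, H(X, len(Y))), len(X)))

M = b"an encoded public key"
r = b"0123456789abcdef"                # a k0-bit random seed (16 bytes here)
X, Y = oaep_encode(M, r)
assert oaep_decode(X, Y) == M          # the round trip recovers M
```

Decoding needs no secret: anyone holding (X, Y) recovers M, which is why OAEP here serves as an invertible encoding rather than as encryption.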
4.1. Overview
Our proposal is a pre-paid digital cash paradigm. Customers acquire valued e-coins
in advance and store them in an e-wallet. They will later be spent against the vendor when
purchasing items. The paradigm is composed of two actors:
• Vendor. A vendor sells digital products online and participates in the issuance of
valued e-coins after being paid for them.
• Customers. They manage an e-wallet containing valued e-coins. These e-coins are
acquired in advance and stored until spent during a purchase procedure. Customers
can generate no-valued e-coins on their own.
No-valued e-coins can be spent against the vendor but, in such a case, the customer
will not receive any product back. No-valued e-coins enable bogus purchases aiming to
mask consumption patterns. The vendor cannot distinguish whether the e-coin involved
in a transaction was valued or not.
All the e-coin components but the last one (SignPKV (Y )) are always generated by
the customer. If the e-coin is valued, signature SignPKV (Y ) is computed by the vendor;
otherwise, it is simulated by the customer. Regarding the components of an e-coin,
• vS /QS is a private/public key-pair of a public key cryptosystem allowing digital
signature computation. Hence, data signed with vS can be validated under QS. QS has
been OAEPPA-encoded (with plaintext awareness) into (XS, YS) = OAEPPA(QS, rS)
for some random rS .
• v R /Q R is a private/public key-pair of a public key cryptosystem allowing data en-
cryption. Hence, data encrypted under Q R can only be decrypted by providing v R .
Q R has been OAEP-encoded into ( XR , YR ) = OAEP( Q R , r R ) for some random r R .
• Let Y = YS ⊕ YR . Then, {Y, SignPKV (Y )} is a digest-signature tuple which can be
validated under PKV .
The private key v R corresponding to Q R is not known, and the corresponding part of
the tuple is empty (∅).
Note that we need a cryptosystem in which the probability of obtaining a valid public
key in a pseudo-random manner is relatively high (step 3). More details are given in
Section 5.3.
5. Cryptosystems Choice
This section provides an assessment on the features to be provided by the cryptosys-
tems chosen to implement the paradigm.
• If the e-coin is no-valued, the tuple is simulated by the customer. The vendor does not
take part in this process (Section 4.4, step 1).
Hence, the signature scheme chosen for such {Y, SignPKV (Y )} tuples has to en-
able both:
• The computation of blind signatures.
• The generation of simulated digest-signature tuples.
Next, we discuss two feasible options.
5.3.1. ECIES
The Elliptic Curve Integrated Encryption Scheme (ECIES) [23] is an elliptic curve-
based public key encryption scheme whose security rests on the assumed intractability of
the Elliptic Curve Discrete Logarithm Problem (ECDLP).
Such a cryptosystem is set by choosing an elliptic curve E represented as an expression
of the form shown in Equation (5):
Y 2 = X 3 + AX + B, (5)
with A, B being elements of a finite field F such that its set of points E(F) has a cardinality
divisible by a large prime q. An order-q point P of E(F) is also chosen. Throughout this
section, we assume q is prime.
An ECIES private key is generated by choosing a random v ∈ {0, . . . , q − 1}. The
corresponding public key is the point of E(F) computed as Q = vP.
The probability that a random point ( x, y) ∈ F × F belongs to E(F) is negligible since
its components should satisfy Equation (5). Nevertheless, this drawback can be addressed
by representing elliptic curve points in compressed form. A point ( x, y) ∈ E(F) can be
represented as ( x, b) with b being a Boolean indicating whether y > −y. In this way, a
randomly generated compressed point ( x, b) belongs to E(F) (and hence it is a public key),
if its x-component satisfies that x3 + Ax + B is a quadratic residue in F. This happens with
a close to 1/2 probability [17].
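The stated close-to-1/2 acceptance probability can be verified on a toy curve by applying Euler's criterion (t is a quadratic residue mod an odd prime p iff t^((p−1)/2) ≡ 1 (mod p)) to every x-coordinate in a small prime field. The parameters p = 10007, A = 2, B = 3 are arbitrary illustrative choices, not a cryptographically secure curve.

```python
def x_is_on_curve(x, A, B, p):
    """True when x is the x-component of some point of E(F_p), i.e. when
    x^3 + Ax + B is zero or a quadratic residue mod p (Euler's criterion)."""
    t = (pow(x, 3, p) + A * x + B) % p
    return t == 0 or pow(t, (p - 1) // 2, p) == 1

p, A, B = 10007, 2, 3        # toy parameters (not a secure curve)
valid = sum(x_is_on_curve(x, A, B, p) for x in range(p))
ratio = valid / p            # close to 1/2: about half of the randomly
                             # drawn compressed points (x, b) are public keys
```

This is why the no-valued e-coin generation procedure only occasionally has to redraw its pseudo-random candidate key.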
5.3.2. ElGamal
ElGamal [24] is a public key cryptosystem whose security rests on the assumed
intractability of the Discrete Logarithm Problem (DLP).
This cryptosystem is set by choosing a large prime q satisfying that p = 2q + 1 is also
prime. The cryptosystem is built on the order-q multiplicative subgroup of Z∗p . An order-q
element g is chosen during the setup.
A private key is generated by choosing a random x ∈ {0, . . . , q − 1} and the corre-
sponding public key is computed as y = g x (mod p).
A randomly selected element from Z∗p turns out to be a public key if its order is q. This
happens with a probability close to 1/2, since Z∗p contains exactly p − 1 = 2q elements, of
which q − 1 have the desired order.
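This count can be checked exhaustively with a toy safe prime. Taking q = 11 and p = 2q + 1 = 23 (both prime), an element y of Z∗p has order q exactly when y ≠ 1 and y^q ≡ 1 (mod p), and enumerating all elements confirms that q − 1 of the p − 1 candidates qualify.

```python
q = 11
p = 2 * q + 1                       # 23; both q and p are prime

def has_order_q(y):
    # order exactly q iff y^q = 1 (mod p) and y is not the identity;
    # y = p - 1 has order 2 and already fails the y^q test
    return y != 1 and pow(y, q, p) == 1

order_q_count = sum(has_order_q(y) for y in range(1, p))
acceptance = order_q_count / (p - 1)
# order_q_count == q - 1 == 10, so acceptance == 10/22, close to 1/2
```

For cryptographic sizes of q the ratio (q − 1)/(2q) is indistinguishable from 1/2, matching the acceptance rate stated above.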
6. Security Analysis
A digital cash system like the one presented in this paper should satisfy the following
security requirements:
1. Valued e-coins cannot be forged by malicious customers;
2. E-coins cannot be double-spent;
3. Customers cannot be falsely accused of double-spending an e-coin.
Although the vendor can be assumed to be a somewhat trusted party, Req. 3 is still
needed to prevent malicious double-spenders from claiming they are being accused falsely.
The following lemmas address the fulfillment of the enumerated requirements.
Lemma 2. Valued e-coins of the proposed digital cash paradigm cannot be forged.
Proof. Let us recall that an e-coin is a tuple of the form shown in Equation (3). An e-coin
can only be spent if private key vS is known. Otherwise, the digital signature required
at step 1 of the “Spending” protocol cannot be computed. Hence, QS must be generated
together with vS , and the YS component of its OAEP encoding (with plaintext awareness) is
obtained pseudo-randomly by calling OAEPPA ( QS , rS ) for some random rS . Hence, the YS
component cannot be chosen by a dishonest party aiming to forge an e-coin. Note also that
the YS component of spent e-coins is checked not to be part of an already spent e-coin. In
this way, there is no point in taking the YS component of a new e-coin from an existing one.
If the forger then simulates the {Y, SignPKV (Y )} digest-signature pair, the resulting Y
cannot be chosen (otherwise the underlying signature scheme would be forgeable), so that
YR = Y ⊕ YS cannot be chosen either and, after taking any XR, the public key QR obtained
from ( Q R , r R ) = OAEP−1 ( XR , YR ) is pseudo-random and its private key remains unknown.
In this way, the resulting e-coin is no-valued. Lemma 1 guarantees that given YR and some
chosen Q R , finding a {r R , XR } pair satisfying the previous expression is unfeasible.
Alternatively, the forger could generate a v R /Q R key-pair and OAEP-encode it into
( XR , YR ). In this case, the obtained YR component is pseudo-random and so the resulting
Y = YS ⊕ YR is. Hence, the signature SignPKV (Y ) over Y cannot be obtained by the forger
without the participation of the vendor.
Lemma 3. E-coins of the proposed digital cash paradigm cannot be double-spent.
Proof. When an e-coin is spent, the vendor stores a record which includes its YS component.
Hence, any attempt to spend the same e-coin in the future will be detected.
Lemma 4. Customers cannot be falsely accused of double-spending an e-coin.
Proof. Customers spending an e-coin are required to digitally sign a timestamped sequence
using the vS private key. This digital signature can be validated under public key QS . Only
the customer who generated an e-coin knows its vS secret key.
A vendor claiming that an e-coin is being double-spent is required to provide the
signed timestamped sequence of the first time the e-coin was spent (Section 4.5, step 4). If
the claim is false, they will be unable to provide it.
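The bookkeeping assumed by the two proofs above can be sketched as a vendor-side registry keyed by the YS component of each spent e-coin; the class and method names below are illustrative, not part of the paper's protocol.

```python
class SpentRegistry:
    """Vendor-side record of spent e-coins, keyed by their YS component.
    The stored signed timestamped sequence is the first-spend evidence
    that must be produced to back up a double-spending claim."""

    def __init__(self):
        self._spent = {}                       # YS -> signed sequence

    def accept(self, ys, signed_sequence):
        if ys in self._spent:
            # double spending detected: return the first-spend evidence
            return False, self._spent[ys]
        self._spent[ys] = signed_sequence
        return True, None

reg = SpentRegistry()
ok, _ = reg.accept(b"ys-coin-1", b"signed-seq-t1")      # first spend
dup, ev = reg.accept(b"ys-coin-1", b"signed-seq-t2")    # re-spend attempt
# ok is True; dup is False and ev equals the stored b"signed-seq-t1"
```

Because the evidence is a customer signature verifiable under QS, a vendor who cannot produce it cannot sustain a false accusation, which is the substance of the last proof.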
7. Experimental Results
The proposed paradigm has been validated through a prototype implemented in
Java. Cryptographic operations involving large integers use the java.math.BigInteger
library. Hash digests have been computed using the SHA-224 [25] function. Regarding the
employed cryptosystems, we have chosen the following:
• Vendor’s key-pair (Section 5.1): RSA with 2048 bit keys.
• Cryptosystem for e-coin transaction signature (Section 5.2): ECDSA [26] with 224 bit keys.
• Cryptosystem for product encryption (Section 5.3): ECIES with 224 bit keys.
Our experiments have measured the running time of the “Valued e-coin genera-
tion” (Section 4.3), “No-valued e-coin generation” (Section 4.4), and “Spending an e-coin”
(Section 4.5) procedures. The prototype has been run on several personal computers. Aver-
age running times from 500 executions have been measured. As expected, computers with
a faster processor lead to better running times. We have also observed that the running
time benefits from parallel execution mode.
Table 1 shows the average running time of the “Valued e-coin generation” and “No-
valued e-coin generation” procedures. Let us recall that the generation of valued e-coins
involves both the customer and the server (which is required to compute a blind signature)
while the procedure for generating no-valued ones is run entirely by the customer. The
table shows that the generation of a no-valued e-coin takes some more time than a valued
one. This is due to the fact that step 3 of the procedure for generating no-valued e-coins
sometimes has to be run more than once. In our experiments, in which we have
implemented the ECIES cryptosystem, there is a 50% chance of having to run it again.
In the fastest tested processor, in parallel mode, generation of a valued e-coin and a no-
valued one takes around 3 and 4 ms, respectively, leading to generation rates of 333 and
250 e-coins per second, respectively.
Table 2 shows the running time of the “Spending an e-coin” procedure at the vendor
part. We focus on this part of the process due to the fact that a vendor may receive a lot of
concurrent payments. We do not distinguish between spending a valued or a no-valued
e-coin since the procedure is exactly the same in both cases. The fastest running time,
obtained on an AMD Ryzen 7 processor in parallel mode, indicates that receiving an e-coin
payment takes 2.69 ms, so that around 371 payments can be processed in a single second.
Table 2. Vendor-side running time (ms) of the “Spending an e-coin” procedure.

                System                          Server time (ms)
Processor       Cores   Threads   GHz           Serial   Parallel
AMD Athlon      4       4         2.80          51.24    13.99
Intel i5-8350U  4       8         1.70–3.60     26.67    7.37
Intel i7-6700   4       8         3.40–4.00     22.68    5.12
Intel i7-8700   6       12        3.20–4.60     20.69    3.59
AMD Ryzen 7     8       16        3.70–4.30     25.41    2.69
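The throughput quoted in the text follows directly from the per-payment time in the last table row; the small check below reproduces the arithmetic.

```python
per_payment_ms = 2.69                         # AMD Ryzen 7, parallel mode
payments_per_second = int(1000 / per_payment_ms)
# 1000 / 2.69 = 371.7..., so around 371 payments per second,
# matching the figure quoted in the text
```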
8. Conclusions
This paper has presented a novel digital cash paradigm in which customers are able
to generate no-valued e-coins by themselves. Such no-valued e-coins can be spent like
regular valued ones in such a way that the vendor receiving a payment is unable to
distinguish between both situations. The customer only receives the requested digital
product when the spent e-coin is a valued one. This new paradigm fits in scenarios in
which customers may wish to mask their consumption patterns through bogus transactions
like pay-per-view TV or music platforms. The paradigm has already proven its validity in
privacy-preserving pay-by-phone parking systems, enabling drivers to keep their
expected parking time secret.
In our future research, we plan to investigate the design of privacy-preserving proto-
cols, which include the presented digital cash paradigm as a building block.
Author Contributions: Conceptualization, R.B. and F.S.; methodology, R.B. and F.S.; validation,
R.B. and F.S.; formal analysis, F.S.; writing—original draft preparation, F.S.; writing—review and
editing, R.B.; funding acquisition, F.S. All authors have read and agreed to the published version of
the manuscript.
Funding: This research was funded by the Spanish Ministry of Science, Innovation and Universities
grant number MTM2017-83271-R.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data is contained within the article.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design
of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or
in the decision to publish the results.
References
1. Chaum, D. Blind Signatures for Untraceable Payments. In Advances in Cryptology; Chaum, D., Rivest, R.L., Sherman, A.T., Eds.;
Springer: Boston, MA, USA, 1983; pp. 199–203.
2. Brands, S. Untraceable Off-line Cash in Wallet with Observers. In Advances in Cryptology—CRYPTO’93; Stinson, D.R., Ed.;
Springer: Berlin/Heidelberg, Germany, 1994; pp. 302–318.
3. Eng, T.; Okamoto, T. Single-term divisible electronic coins. In Advances in Cryptology—EUROCRYPT’94; De Santis, A., Ed.;
Springer: Berlin/Heidelberg, Germany, 1995; pp. 306–319.
4. Nakanishi, T.; Sugiyama, Y. Unlinkable Divisible Electronic Cash. In Information Security; Goos, G., Hartmanis, J., van Leeuwen,
J., Pieprzyk, J., Seberry, J., Okamoto, E., Eds.; Springer: Berlin/Heidelberg, Germany, 2000; pp. 121–134.
5. Canard, S.; Gouget, A. Divisible E-Cash Systems Can Be Truly Anonymous. In Advances in Cryptology—EUROCRYPT 2007; Naor,
M., Ed.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 482–497.
6. Au, M.H.; Susilo, W.; Mu, Y. Practical Anonymous Divisible E-Cash from Bounded Accumulators. In Financial Cryptography and
Data Security; Tsudik, G., Ed.; Springer: Berlin/Heidelberg, Germany, 2008; pp. 287–301.
7. Liu, J. Efficient Arbitrarily Divisible E-Cash Applicable to Secure Massive Transactions. IEEE Access 2019, 7, 59299–59310.
[CrossRef]
8. Bourse, F.; Pointcheval, D.; Sanders, O. Divisible E-Cash from Constrained Pseudo-Random Functions. In Advances in
Cryptology—ASIACRYPT 2019; Galbraith, S.D., Moriai, S., Eds.; Springer International Publishing: Cham, Switzerland, 2019;
pp. 679–708.
9. Rivest, R.L.; Shamir, A. PayWord and MicroMint: Two simple micropayment schemes. In International Workshop on Security
Protocols; Springer: Berlin/Heidelberg, Germany, 1996; pp. 69–87.
10. Oros, H.; Popescu, C. A Secure and Efficient Off-Line Electronic Payment System for Wireless Networks. Int. J. Comput. Commun.
Control. 2010, V, 551–557. [CrossRef]
11. Sai Anand, R.; Madhavan, C. An Online, Transferable E-Cash Payment System. In Progress in Cryptology —INDOCRYPT 2000;
Roy, B., Okamoto, E., Eds.; Springer: Berlin/Heidelberg, Germany, 2000; pp. 93–103.
12. Bauer, B.; Fuchsbauer, G.; Qian, C. Transferable E-Cash: A Cleaner Model and the First Practical Instantiation. In Public-Key
Cryptography—PKC 2021; Garay, J.A., Ed.; Springer International Publishing: Cham, Switzerland, 2021; pp. 559–590.
13. Nakamoto, S. Bitcoin: A Peer-to-Peer Electronic Cash System. 2009. pp. 1–9. Available online: https://bitcoin.org/bitcoin.pdf
(accessed on 22 September 2021).
14. Wood, G. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Proj. Yellow Pap. 2021, 151, 1–32.
15. Park, K.W.; Baek, S.H. OPERA: A Complete Offline and Anonymous Digital Cash Transaction System with a One-Time Readable
Memory. IEICE Trans. Inf. Syst. 2017, 100, 2348–2356. [CrossRef]
16. European Central Bank. Report on Digital Euro; Tech. Report; Frankfurt am Main, Germany, 2020. Available online: https:
//www.ecb.europa.eu/pub/pdf/other/Report_on_a_digital_euro~4d7268b458.en.pdf (accessed on 22 September 2021).
17. Borges, R.; Sebé, F. An efficient privacy-preserving pay-by-phone system for regulated parking areas. Int. J. Inf. Secur. 2021,
20, 715–727. [CrossRef]
18. Bellare, M.; Rogaway, P. Optimal Asymmetric Encryption—How to Encrypt with RSA; Springer: Berlin/Heidelberg, Germany, 1995;
pp. 92–111.
19. Schneier, B. Applied Cryptography: Protocols, Algorithms, and Source Code in C, 2nd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA,
1995.
20. Goldwasser, S.; Micali, S.; Rackoff, C. The knowledge complexity of interactive proof systems. SIAM J. Comput. 1989, 18, 186–208.
[CrossRef]
21. Rivest, R.L.; Shamir, A.; Adleman, L. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Commun. ACM
1978, 21, 120–126. [CrossRef]
22. Boldyreva, A. Threshold Signatures, Multisignatures and Blind Signatures Based on the Gap-Diffie-Hellman-Group Signature
Scheme. In Public Key Cryptography—PKC 2003; Desmedt, Y.G., Ed.; Springer: Berlin/Heidelberg, Germany, 2002; pp. 31–46.
23. Gayoso, V.; Hernandez, L.; Sánchez, C. A Survey of the Elliptic Curve Integrated Encryption Scheme. J. Comput. Sci. Eng. 2010,
2, 7–13.
24. ElGamal, T. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE Trans. Inf. Theory 1985,
31, 469–472. [CrossRef]
25. Handschuh, H. SHA Family (Secure Hash Algorithm). In Encyclopedia of Cryptography and Security; van Tilborg, H.C.A., Ed.;
Springer: Boston, MA, USA, 2005; pp. 565–567. [CrossRef]
26. Johnson, D.; Menezes, A.; Vanstone, S.A. The Elliptic Curve Digital Signature Algorithm (ECDSA). Int. J. Inf. Secur. 2001, 1, 36–63.
[CrossRef]
153
Article
Authorization Mechanism Based on Blockchain Technology for
Protecting Museum-Digital Property Rights
Yun-Ciao Wang 1 , Chin-Ling Chen 2,3,4, * and Yong-Yuan Deng 2, *
1 National Museum of Marine Biology and Aquarium, Pingtung 94450, Taiwan; [email protected]
2 Department of Computer Science and Information Engineering, Chaoyang University of Technology,
Taichung 41349, Taiwan
3 School of Information Engineering, Changchun Sci-Tech University, Changchun 130600, China
4 School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361005, China
* Correspondence: [email protected] (C.-L.C.); [email protected] (Y.-Y.D.)
Featured Application: Museums not only achieve the goal of promoting social education, but
also solve their financial problems.
Abstract: In addition to the exhibition, collection, research, and educational functions of the museum,
the development of a future museum includes the trend of leisure and sightseeing. Although the
museum is a non-profit organization, if it can provide digital exhibits and collections under the
premises of “intellectual property rights” and “cultural assets protection”, and license them for
value-added applications in various fields, it can generate revenue from digital licensing and cover the expenses of
museum operations. This will be a new trend in the sustainable development of museum operations.
Especially since the outbreak of COVID-19 at the beginning of this year (2020), the American Alliance
of Museums (AAM) recently stated that nearly a third of the museums in the United States may
be permanently closed since museum operations are facing “extreme financial difficulties.” This
research is aimed at museums using the business model of “digital authorization”. It proposes an
authorization mechanism based on blockchain technology protecting the museums’ digital rights in
the business model and the application of cryptography. The signature and timestamp mechanisms
achieve non-repudiation and timeliness, while the combination of blockchain and smart contracts
achieves verifiability, unforgeability, decentralization, and traceability, as well as the non-repudiation
of the cash flow through signatures and digital certificates, for the digital rights of museums
in business. The proposed business model achieves sustainable development. Museums not only
achieve the goal of promoting social education, but also solve their financial problems.
Citation: Wang, Y.-C.; Chen, C.-L.; Deng, Y.-Y. Authorization Mechanism Based on Blockchain Technology for Protecting Museum-Digital Property Rights. Appl. Sci. 2021, 11, 1085. https://doi.org/10.3390/app11031085
to the data and files on various utensils, paintings, and calligraphy, specimens, and doc-
uments that have been processed through digital processes. The “digital collections”
authorization originated at the beginning of photography in the 19th century. The British
Museum accepted donations of photographic images, as well as professional photogra-
phers’ photo collections, and sold the collections taken in the museum and the records of
museum activity photos; this was the beginning of the museum’s image recording and
image authorization [2].
In the 20th century, international museums and governments implemented the digiti-
zation plan of various museum collections based on the mission of collection preservation
and promotion of cultural policies. Currently, in the 21st century, digital technology is
booming, and museums have entered the era of digitization. The production of a huge
number of digital images allows museums to hold numerous copyrights, signaling an
important turning point for image authorization. The international museum community
has invested a lot of money and human resources starting more than ten years ago to
digitize its collections on a large scale. For example, J. Paul Getty Trust, associated with
the Getty Museum, paid US$4.2 million from 1997 to 2002 in funding to establish the
“Electronic Cataloguing Initiative”, which sponsored 21 Los Angeles area museums whose
main collections are visual arts.
In 2009, the foundation launched the Online Scholarly Catalogue Initiative (OSCI) in cooperation
with the J. Paul Getty Museum and eight other institutions: the Arthur M. Sackler Gallery and
Freer Gallery of Art; the Los Angeles County Museum of Art; the National Gallery of Art in
Washington, DC; the San Francisco Museum of Modern Art; the Seattle Art Museum; and the
Tate and the Walker Art Center. The goal of
the alliance is to create models for online catalogs, which will greatly increase access to
museum collections to provide interdisciplinarity and the latest research, and innovate
how to conduct, introduce, and use this research [3]. In 2002, the Culture Online Project of
the British Department for Culture, Media and Sport was founded [4]; the British Museum
established the “Merlin Project” in 2006 along with other projects, which are all efforts
related to the museum’s digital collection.
The core of the museum is its collection and heritage, the physical evidence of human
survival and its environment. This includes two levels of connotation: One is the cultural
relic entity in the museum’s collection; the other is the information resources that recreate
the cultural relic entity and reveal its original information and cultural connotation, including
text introductions, images, videos, three-dimensional models, etc. Museum experts and
scholars research and publish works on a certain collection or collection preservation
technology, as well as works of collection pictures taken by museum photographers, etc.,
all belonging to the collection resources.
The “Creative Economy Report 2010” of UNESCO [4] points out that “cultural her-
itage” is the source of all art forms, and the soul of culture and creative industries, which
brings together history, anthropology, ethnology, aesthetics, and social perspectives, while
influencing people’s creativity. The intellectual property authorization of the museum
means that the museum authorizes the copyright of its collection resources, including
cultural relics, specimens, and artworks, to other institutions for the development of cultural
derivatives, transforming cultural resources into cultural goods and establishing effective
communication with consumers. It forms a unique brand of museums and reflects the
intention of museums to develop products [5]. The authorized person pays the correspond-
ing fee to the authorizer, and the authorizer gives the authorized person corresponding
guidance and assistance. In particular, museums in various countries with rich collections
can serve as models for brand authorization.
Brand authorization began in the United States in the early 20th century. When Dis-
ney’s classic cartoon image of Mickey Mouse became famous, a furniture merchant paid
Walt Disney US$300 in exchange for the right to print the image of Mickey Mouse on its
products. Disney is recognized as the originator of international brand authorization. Cur-
rently, brand authorization has become a global industry with a relatively mature operation
model and a complete industrial chain. According to the “2019 Global Licensing Industry
Under this authorization model, the authorized party often directly participates in
the use of the digital image resources of cultural relics by third-party manufacturers.
The advantage is that it is not only conducive to the museum as the authorized party to
promptly understand the development of digital image resources, but is also given an
in-depth understanding of the connotations of the collection by the relevant departments
of the museum, which is often helpful to the successful development of digital resources.
However, the shortcomings of this authorization model are also obvious. Because the
authorized party is a state-owned museum, the nature of its public welfare institutions
often makes it limited in authorization methods, scope, personnel incentives, and so on, so
it can easily lead to insufficient responses to market demand and changes.
(2) Proxy authorization model of museum digitized collections
The proxy authorization model refers to the model in which the museum does not
directly act as the authorized subject, but entrusts an agent or an authorization platform
as an intermediary, authorizes through a contract with the authorized party, and finally
uses the digital resources of the collection in the manner agreed to in the contract. In
this model, there will be two authorization behaviors: The first time is the authorization
by the museum to the agent or the authorization platform, and the second time is the
authorization by the agent or the authorization platform to the third party. The process of
this type of authorization mode is shown in Figure 2. The Louvre Museum in France and
the Solomon R Guggenheim Museum in the United States are typical representatives of
this authorization model.
The entrusted authorization model means that the museum authorizes an agent to
sign an authorization contract with the authorized person on behalf of the museum, a
common museum proxy authorization model. In the proxy authorization model, agents
as authorized intermediaries often have rich authorization management experience and
mature customer groups, respond quickly to market demand, and have strong marketing
capabilities, which can assist museums in rapidly opening up the authorization market,
thereby promoting the rapid development of the museum’s cultural and creative production
industry. However, agents, as market entities dominated by economic interests, tend to
ignore the public welfare contained in cultural relics, significantly weakening the museum’s
ability to control the use of the digital collection by authorized third parties. In this process,
third parties are driven by market interests in the development and utilization of authorized
resources, so the cultural and economic risks faced by museums will increase accordingly.
The platform authorization model is similar to the entrusted authorization model, but
there are differences in the scale of the authorizing party and the authorized party. Under
the entrusted authorization model, it is usually one-to-one, that is, a museum entrusts
property rights. The signature and timestamp mechanism of cryptography is used to
achieve non-repudiation, and the smart contract achieves transparency, unforgeability,
and traceability; this mechanism will thereby solve the above-mentioned problems faced
by museum-digital rights management.
The rest of this article is organized as follows. The second section provides preliminary
knowledge. The third section discusses the proposed methods for two kinds of authority
mechanisms in the business model. The fourth section presents an analysis of the proposed
scheme. The fifth section includes a discussion and comparison of the proposed scheme
with related works. Finally, we present the conclusion and future works.
2. Preliminary
2.1. Smart Contract
A smart contract is a special agreement that is used when making a contract in the
blockchain. It contains code functions and can interact with other contracts, guide decisions,
store data, etc. The main function of smart contracts is to provide verification and execution
of the conditions stipulated in the contract. Smart contracts allow credible transactions
without the need for a third party. These transactions are traceable and irreversible. The
concept of smart contracts was first proposed in 1994 by Nick Szabo [27,28], a computer
scientist and cryptography expert. The purpose of smart contracts is to provide better
security than traditional contract methods and to reduce other transaction costs associated
with the contract.
2.2. ECDSA
In cryptography, the Elliptic Curve Digital Signature Algorithm (ECDSA) offers a vari-
ant of the Digital Signature Algorithm (DSA), which uses elliptic curve cryptography [29].
As with elliptic-curve cryptography in general, the bit size of the public key believed to be
needed for ECDSA is about twice the size of the security level, in bits. For example, at a
security level of 80 bits (meaning an attacker requires a maximum of about 2^80 operations
to find the private key), the size of an ECDSA public key would be 160 bits, whereas the
size of a DSA public key is at least 1024 bits. On the other hand, the signature size is
the same for both DSA and ECDSA: Approximately 4t bits, where t is the security level
measured in bits; that is, about 320 bits for a security level of 80 bits.
The signature and verification process of ECDSA is as follows: Suppose Alice wants
to send a message to Bob. Initially, both parties must reach a consensus on the curve
parameters (CURVE, G, n). In addition to the field equation of the curve, the base point G
on the curve and the multiplication order n of the base point G are also required. Alice also
needs a private key d_A and a public key Q_A, where Q_A = d_A · G. If the message Alice wants
to send is m, Alice needs to choose a random value k in [1, n − 1], calculate z = h(m),
(x1, y1) = k · G, r = x1 mod n, s = k^(−1) · (z + r · d_A) mod n, and send the ECDSA signature pair
(r, s) together with the original message m to Bob. After receiving the signature pair (r, s)
and the original message m, Bob will verify the correctness of the ECDSA signature. Bob
first calculates z = h(m), u1 = z · s^(−1) mod n, u2 = r · s^(−1) mod n, (x1, y1) = u1 · G + u2 · Q_A,
and checks whether r = x1 mod n. If it passes the verification, then Bob confirms that the ECDSA signature
and message m sent by Alice are correct.
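The signing and verification flow just described can be sketched in pure Python. The secp256k1 curve and SHA-256 are our illustrative choices (the paper does not fix a particular curve or hash function), and the code is a teaching sketch, not hardened for production use:

```python
# Minimal ECDSA sign/verify over secp256k1 with SHA-256 as h (illustrative
# choices; not constant-time, not production-grade). Requires Python 3.8+.
import hashlib
import secrets

# Curve parameters (CURVE, G, n) for secp256k1
p = 2**256 - 2**32 - 977
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (Gx, Gy)

def ec_add(P, Q):
    """Point addition; None represents the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, p) % p   # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p    # chord slope
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, P):
    """Scalar multiplication k*P by double-and-add."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def h(m: bytes) -> int:
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n

def sign(d_A: int, m: bytes):
    """Alice: z = h(m); (x1, y1) = k*G; r = x1 mod n; s = k^-1 (z + r*d_A) mod n."""
    z = h(m)
    while True:
        k = secrets.randbelow(n - 1) + 1             # random k in [1, n-1]
        x1, _ = ec_mul(k, G)
        r = x1 % n
        s = pow(k, -1, n) * (z + r * d_A) % n
        if r and s:
            return (r, s)

def verify(Q_A, m: bytes, sig) -> bool:
    """Bob: u1 = z*s^-1, u2 = r*s^-1, (x1, y1) = u1*G + u2*Q_A, check r = x1 mod n."""
    r, s = sig
    if not (0 < r < n and 0 < s < n):
        return False
    z = h(m)
    u1 = z * pow(s, -1, n) % n
    u2 = r * pow(s, -1, n) % n
    P = ec_add(ec_mul(u1, G), ec_mul(u2, Q_A))
    return P is not None and P[0] % n == r

d_A = secrets.randbelow(n - 1) + 1   # Alice's private key
Q_A = ec_mul(d_A, G)                 # public key Q_A = d_A * G
sig = sign(d_A, b"digital content request")
print(verify(Q_A, b"digital content request", sig))   # True
print(verify(Q_A, b"tampered message", sig))          # False
```

Note how verification recomputes the point u1·G + u2·Q_A and compares its x-coordinate with r, exactly the check r ≟ x1 mod n above.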
3. Method
3.1. System Architecture
Figure 3 is the system architecture diagram.
In this study, we use the Elliptic Curve Digital Signature Algorithm (ECDSA),
blockchain, and smart contracts to design a traceable authorization mechanism for the
museum’s digital content resource. There are six parties involved in this study: Museum
(M), Content Administrator (CA), Licensee (L), Blockchain Center (BCC), Proxy (P), and
Bank (B).
(a) Museum (M): The museum is the owner of the digital content. The museum collects
the cultural relics and is responsible for the generation and management of the
museum’s digital content resource. The digital content resource is classified and
protected by the museum.
(b) Content Administrator (CA): The CA is a cloud platform of the museum. It is responsible
for reviewing the Licensee’s request and determining whether to allow access to the
digital content resource.
(c) Licensee (L): When citizens or institutions want to access the digital content resource
of the museum, the Licensee should pay a premium to the museum.
(d) Blockchain Center (BCC): This center records the access information of the digital
right resource for the Licensee. The BCC accepts the parties’ registration and issues
the identity certificate and public/private key pair to each party.
(e) Proxy (P): The proxy is an agency of the museum. After CA authenticates the Li-
censee’s identity, P is responsible for actually cloud authorization for the Licensee to
access the museum’s digital content resource.
(f) Bank (B): The Bank is authorized by a Licensee to pay a premium to the museum.
We briefly illustrate the scenarios in the following steps.
• Step 1: Registration phase:
Museum, Licensee, Proxy, and Bank need to register with Blockchain Center; the
Blockchain Center issues the identity certificate and public/private key pair to each party.
• Step 2: Digital content production phase:
The DCA classifies the museum’s resources, encrypts these resources into a protected
digital resource, and then stores it in the CA. The CA also uploads the detailed categories
into the Blockchain center.
• Step 3: Authentication phase and issuing invoice phase:
After the Licensee proposes to access digital resource requests, the CA reviews the
Licensee’s qualifications and then issues the invoice.
• Step 4: Payment phase:
After payment, the Licensee requests the Bank to issue a certificate for the museum
to authenticate this payment. The Content Administrator then authenticates the Licensee’s
identity. The Content Administrator performs one of the following cases.
Case 1: Generates the authorized key to the Licensee directly.
Case 2: Generates a proxy key to the Agency, and the Agency transfers it to the
Licensee.
• Step 5: Digital content browsing phase:
After the Licensee receives the authorized key, the Licensee uses it to decrypt the
protected digital content. The digital content can be read (or played) normally.
In the proposed smart contract, we have developed key information that will be stored
in the blockchain. In the structure of the lm/la/am smart contract, we developed the field
of id (identification), transaction detail, certificate, and timestamp. In the structure of the
ml/ma/al smart contract, we developed the field of id, transaction detail, transaction id,
and timestamp. In the structure of lc/lp/pc smart contract, we developed the field of id,
transaction detail, payment information, and timestamp. In the structure of the cl/cp/pl
smart contract, we developed the field of id, transaction detail, authentication key, and
timestamp. In the initialization phase, the blockchain center also issues the public and
private key pairs for all roles.
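For concreteness, the four record structures described above can be sketched as plain data types. This is a hypothetical Python rendering: the class names and field spellings are ours, inferred from the paper's description, not the authors' actual contract code.

```python
# Hypothetical sketch of the on-chain record structures described above.
# Field names paraphrase the paper's description of the lm/la/am, ml/ma/al,
# lc/lp/pc, and cl/cp/pl smart contracts; they are illustrative only.
from dataclasses import dataclass

@dataclass
class RequestRecord:          # lm / la / am smart contracts
    id: str                   # identification of the requesting role
    transaction_detail: str
    certificate: str          # identity certificate issued by the BCC
    timestamp: int

@dataclass
class InvoiceRecord:          # ml / ma / al smart contracts
    id: str
    transaction_detail: str
    transaction_id: str       # TID linking request and invoice
    timestamp: int

@dataclass
class PaymentRecord:          # lc / lp / pc smart contracts
    id: str
    transaction_detail: str
    payment_information: str
    timestamp: int

@dataclass
class AuthorizationRecord:    # cl / cp / pl smart contracts
    id: str
    transaction_detail: str
    authentication_key: str
    timestamp: int

# Example (dummy values): a payment record as it might be stored on-chain
rec = PaymentRecord("L-001", "digital image license", "invoice #42 paid", 1625000000)
print(rec.payment_information)   # invoice #42 paid
```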
Figure 4. Each role of the system registers with the Blockchain Center.
• Step 1: Role X generates an identity IDX , and sends it to the Blockchain Center.
• Step 2: The Blockchain center generates an ECDSA private key d X based on the role
X, calculates:
Q X = d X G. (1)
If the identity of the registered role is verified, the smart contract Xins will be triggered,
and the content is presented as follows (Scheme 2):
Then the blockchain center will transmit IDX , (d X , Q X ), PKX , SKX , Cert X to role X.
• Step 3: The role X stores (d X , Q X , PKX , SKX , Cert X ) .
the archive itself, as an annotation explanation for the archive itself and various media
materials, as well as an indexing tool for users to inquire.
• Step 3: Through the overall planning of the collection environment, a suitable infor-
mation system can be constructed, and the functions of digital data preservation and
management can be achieved through the operation of the system. When a Licensee
wants to access these multimedia materials, it must first obtain legal authorization
from the Content Administrator (CA).
• Step 4: The CA will provide the Licensee with an authorization key; the Licensee can
use the authorization key to unlock the information provided by the CA and get a
decryption key, which can be used to obtain the plaintext of multimedia messages.
The details will be introduced in the following phase.
u_L−M1 = z_L−M · s_L−M^(−1) mod n, (10)
u_L−M2 = r_L−M · s_L−M^(−1) mod n, (11)
(x_L−M, y_L−M) = u_L−M1 · G + u_L−M2 · Q_L, (12)
x_L−M ≟ r_L−M mod n. (13)
If the verification is passed, CA will get the relevant content request information and
trigger the smart contracts lmins and lmchk. The content is as follows (Scheme 3):
The CA calculates:
BC_L−M = h(r_L−M, s_L−M), (14)
(ID_BC, BC_L−M) will also be uploaded to the blockchain center. Then the CA generates
a random value k_M−L and calculates:
(x_M−L, y_M−L) = k_M−L · G, (16)
r_M−L = x_M−L mod n, (17)
s_M−L = k_M−L^(−1) · (z_M−L + r_M−L · d_M) mod n, (18)
Enc_M−L = E_PK_L(ID_M, M_M−L, TID, invoice, TS_M−L, ID_BC), (19)
and sends ID_M, Enc_M−L, (r_M−L, s_M−L) to the Licensee.
• Step 3: The Licensee first calculates:
u_M−L1 = z_M−L · s_M−L^(−1) mod n, (23)
u_M−L2 = r_M−L · s_M−L^(−1) mod n, (24)
(x_M−L, y_M−L) = u_M−L1 · G + u_M−L2 · Q_M, (25)
x_M−L ≟ r_M−L mod n. (26)
If the verification is passed, the content request information is confirmed by CA, and
the smart contracts mlins and mlchk will be sent. The content is as follows (Scheme 4):
BC M− L = h(r M− L , s M− L ), (27)
u_L−A1 = z_L−A · s_L−A^(−1) mod n, (36)
u_L−A2 = r_L−A · s_L−A^(−1) mod n, (37)
(x_L−A, y_L−A) = u_L−A1 · G + u_L−A2 · Q_L, (38)
x_L−A ≟ r_L−A mod n. (39)
If the verification is passed, the proxy will get the relevant content request information
and trigger the smart contracts lains and lachk. The content is as follows (Scheme 5):
u_A−M1 = z_A−M · s_A−M^(−1) mod n, (49)
u_A−M2 = r_A−M · s_A−M^(−1) mod n, (50)
(x_A−M, y_A−M) = u_A−M1 · G + u_A−M2 · Q_A, (51)
x_A−M ≟ r_A−M mod n. (52)
If the verification is passed, the CA will get the relevant content request information
and trigger the smart contracts amins and amchk. The content is as follows (Scheme 6):
The CA calculates:
BC A− M = h(r A− M , s A− M ), (53)
( IDBC , BC A− M ) will also be uploaded to the blockchain center.
• Step 5: The CA generates a random value k M− A and calculates:
u_M−A1 = z_M−A · s_M−A^(−1) mod n, (62)
u_M−A2 = r_M−A · s_M−A^(−1) mod n, (63)
(x_M−A, y_M−A) = u_M−A1 · G + u_M−A2 · Q_M, (64)
x_M−A ≟ r_M−A mod n. (65)
If the verification is passed, the content request information is confirmed by the proxy,
and the smart contracts mains and machk will be sent. The content is as follows (Scheme 7):
u_A−L1 = z_A−L · s_A−L^(−1) mod n, (75)
u_A−L2 = r_A−L · s_A−L^(−1) mod n, (76)
(x_A−L, y_A−L) = u_A−L1 · G + u_A−L2 · Q_A, (77)
x_A−L ≟ r_A−L mod n. (78)
If the verification is passed, the content request information is confirmed by the CA,
and the smart contracts alins and alchk will be sent. The content is as follows (Scheme 8):
BC A− L = h(r A− L , s A− L ), (79)
z L−C = h( IDL , ML−C , Cert L , TID, Cert pay , TS L−C , IDBC ), (87)
If the verification is passed, the content administrator will get the relevant payment
information and trigger the smart contracts lcins and lcchk. The content is as follows
(Scheme 9):
( IDBC , BCL−C ) will also be uploaded to the blockchain center. Then the content
administrator generates a random value k C− L and calculates:
C1 = Z_r · G, (94)
C2 = Z_r · b · G + P_m, (95)
z_C−L = h(ID_C, M_C−L, (C1, C2), TS_C−L, ID_BC), (96)
(x_C−L, y_C−L) = k_C−L · G, (97)
r_C−L = x_C−L mod n, (98)
s_C−L = k_C−L^(−1) · (z_C−L + r_C−L · d_C) mod n, (99)
Enc_C−L = E_PK_L(ID_C, M_C−L, (C1, C2), TS_C−L, ID_BC), (100)
and sends IDC , EncC− L , (rC− L , sC− L ) to the Licensee.
• Step 3: The Licensee first calculates:
( IDC , MC− L , (C1 , C2 ), TSC− L , IDBC ) = DSK L ( EncC− L ), (101)
uses
TS NOW − TSC− L ≤ ΔT (102)
to confirm whether the timestamp is valid, verifies the correctness of the ECDSA signature,
and then calculates:
( IDBC , BCC− L ) will also be uploaded to the blockchain center. Finally, the APP calculates:
P_m = C2 − b · C1 (109)
to successfully obtain the identity of the digital content. This step is performed automati-
cally by the smart contract, and the Licensee cannot skip the verification process privately.
Enc L− P = EPKP ( IDL , ML− P , Cert L , TID, Cert pay , TS L− P , IDBC ), (114)
and sends IDL , Enc L− P , (r L− P , s L− P ) to the proxy.
• Step 2: The Proxy first calculates:
( IDL , ML− P , Cert L , TID, Cert pay , TS L− P , IDBC ) = DSKP ( Enc L− P ), (115)
uses
TS NOW − TS L− P ≤ ΔT (116)
to confirm whether the timestamp is valid, verifies Cert L with PK L and Cert pay with
PK BANK , verifies the correctness of the ECDSA signature, and then calculates:
u_L−P1 = z_L−P · s_L−P^(−1) mod n, (118)
u_L−P2 = r_L−P · s_L−P^(−1) mod n, (119)
(x_L−P, y_L−P) = u_L−P1 · G + u_L−P2 · Q_L, (120)
x_L−P ≟ r_L−P mod n. (121)
If the verification is passed, the proxy will get the relevant payment information and
trigger the smart contracts lpins and lpchk. The content is as follows (Scheme 11):
z P−C = h( IDP , IDL , MP−C , Cert P , Cert L , TID, Cert pay , TSP−C , IDBC ), (130)
u_P−L1 = z_P−L · s_P−L^(−1) mod n, (160)
u_P−L2 = r_P−L · s_P−L^(−1) mod n, (161)
(x_P−L, y_P−L) = u_P−L1 · G + u_P−L2 · Q_P, (162)
x_P−L ≟ r_P−L mod n. (163)
If the verification is passed, the authorization information is confirmed by Licensee,
and the smart contracts plins and plchk will be sent. The content is as follows (Scheme 14):
( IDBC , BCP− L ) will also be uploaded to the blockchain center. Finally, the APP calculates:
P_m = C2 − b · C1 (165)
to obtain the identity of the digital content successfully. This step is performed auto-
matically by the smart contract, and the Licensee cannot skip the verification process
privately.
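The elliptic-curve ElGamal step referenced in Equations (94), (95), (109), and (165) appears garbled in this reprint. Under the standard reading, with b assumed to be the Licensee's private scalar, Z_r a fresh random value, and P_m a curve point encoding the content identity, the hiding and recovery of P_m can be sketched as follows (secp256k1 stands in for the paper's unspecified curve):

```python
# EC-ElGamal sketch of our reading of Eqs. (94), (95), (109):
#   C1 = Zr*G,  C2 = Zr*b*G + Pm,  recovery Pm = C2 - b*C1.
# b is assumed to be the Licensee's private scalar (illustration only).
import secrets

# secp256k1 parameters (stand-in for the paper's unspecified curve)
p = 2**256 - 2**32 - 977
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (Gx, Gy)

def ec_add(P, Q):
    """Point addition; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_neg(P):
    return None if P is None else (P[0], -P[1] % p)

def ec_mul(k, P):
    """Scalar multiplication by double-and-add."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

b = secrets.randbelow(n - 1) + 1    # Licensee's private key (assumption)
B = ec_mul(b, G)                    # Licensee's public point b*G
Pm = ec_mul(0xC0FFEE, G)            # point encoding the content identity (dummy)

Zr = secrets.randbelow(n - 1) + 1   # fresh random scalar per encryption
C1 = ec_mul(Zr, G)                  # Eq. (94)
C2 = ec_add(ec_mul(Zr, B), Pm)      # Eq. (95): Zr*b*G + Pm

recovered = ec_add(C2, ec_neg(ec_mul(b, C1)))   # Eq. (109): C2 - b*C1
print(recovered == Pm)              # True
```

The recovery works because C2 − b·C1 = Z_r·b·G + P_m − b·Z_r·G = P_m, so only the holder of b can unmask the content identity.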
4. Analysis
In this section, we analyze the requirements of digital rights management as follows.
4.1. Verifiable
Digital certificate verification allows the identity of the Licensee to be publicly verified,
and the authorization information is published based on the openness and transparency
of the information on the chain, truly realizing high efficiency and specialization in the
field of digital copyright.
Let’s take the message transmitted by the Licensee (L) and Content Administrator
(CA) as an example. When CA sends a message signed by ECDSA to L, L first ver-
ifies the correctness of the time stamp and signature, then generates blockchain data
BCC− L = h(rC− L , sC− L ), and uses IDBC as an index to upload the blockchain data to the
Blockchain Center (BCC). That is to say, after verifying the correctness of the time stamp
and signature for each role that receives the message, it also verifies the correctness of the
blockchain data generated by the previous role. Therefore, our proposed solution achieves
the characteristics of public verification through blockchain technology and ECDSA digi-
tal signature.
4.2. Trustless
The identity of the authorized object of digital content is verified by the Digital
Content Administrator. The authorization period is controlled by the Digital Content
Administrator. The Licensee cannot occupy or transfer privately. Any nodes that participate
in the system do not need to trust each other. The operation of the system and operating
rules are open and transparent, and all information is open. A node cannot deceive other
nodes. In this way, the trust relationship between nodes is realized, making it possible to
obtain trust between nodes at a low cost. For example, when Licensee (L) requests digital
content authorization from the Content Administrator (CA), CA will send an authorization
message to L. This message Pm = f ( IDDC , Vtime) contains the digital content ID and
the authorization period, and L will be unable to privately occupy or transfer the
digital content.
4.3. Unforgery
The timestamp and signature mechanism irreversibly generates, for the data placed in each
block, a string composed of random numbers and letters. The original text cannot
be inferred from this string, thus effectively solving the trust problem. After the hash
function operation, the messages are described as follows.
The hash value cannot be reversed back to the original content, so this agreement
achieves the characteristic that the message cannot be tampered with.
4.4. Traceable
After the digital content is on the chain, the data block containing the copyright
information is permanently stored on the blockchain and cannot be tampered with. All
transaction traces can be traced throughout the entire process, which can be used as a
digital certificate to deal with infringement. For example: When we want to verify and trace
whether the blockchain data between the Licensee (L) and Content Administrator (CA)
is legal, we can compare and verify BC_L−C ≟ h(r_L−C, s_L−C) and BC_C−L ≟ h(r_C−L, s_C−L).
When we want to verify and trace whether the blockchain data between the Licensee (L)
and Proxy (P) is legal, we can compare and verify BC_L−P ≟ h(r_L−P, s_L−P) and
BC_P−L ≟ h(r_P−L, s_P−L). When we want to verify and trace whether the blockchain data
between the Proxy (P) and Content Administrator (CA) is legal, we can compare and verify
BC_P−C ≟ h(r_P−C, s_P−C) and BC_C−P ≟ h(r_C−P, s_C−P).
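The audit comparison above can be sketched as follows; SHA-256 and the fixed 32-byte encoding are our stand-ins for the paper's unspecified hash h:

```python
# Sketch of the traceability check: both sides derive the on-chain record
# BC = h(r, s) from the exchanged ECDSA signature pair, so an auditor can
# recompute it and compare with the chain entry. SHA-256 stands in for h.
import hashlib

def bc(r: int, s: int) -> str:
    """On-chain record BC = h(r, s) for a signature pair (r, s)."""
    data = r.to_bytes(32, "big") + s.to_bytes(32, "big")
    return hashlib.sha256(data).hexdigest()

# Dummy signature pair exchanged between Licensee (L) and Content Administrator (CA)
r_LC, s_LC = 0x1234, 0x5678
BC_LC = bc(r_LC, s_LC)             # value uploaded to the Blockchain Center

# Audit: recompute from the recorded pair and compare with the chain entry
print(bc(r_LC, s_LC) == BC_LC)     # True: record is consistent
print(bc(r_LC, s_LC + 1) == BC_LC) # False: tampering detected
```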
4.5. Non-Repudiation
The content of the message sent by each role is signed by the sender with its ECDSA
private key. After receiving the message, the receiver will verify the message with the
sender’s public key. If the message is successfully verified, the sender will not deny the
content of the message transmitted. Table 1 summarizes the non-repudiation property for each
role in this scheme.
Phase | Signature | Sender | Receiver | Signature Verification
Authentication and issuing invoice phase (direct authorization) | (r_L−M, s_L−M) | L | CA | x_L−M ≟ r_L−M mod n
Authentication and issuing invoice phase (direct authorization) | (r_M−L, s_M−L) | CA | L | x_M−L ≟ r_M−L mod n
Authentication and issuing invoice phase (entrusted authorization) | (r_L−A, s_L−A) | L | P | x_L−A ≟ r_L−A mod n
4.7. Timeliness
In our proposed scheme, the Content Administrator (CA) is responsible for the pro-
duction and management of the digital content property rights and the identity verification
of the Licensee (L); the Content Administrator (CA) is also responsible for the issuance
of a time-sensitive playback license, and the Licensee’s playback key identification code
cannot permanently occupy the playback of digital content. The Licensee must obtain the
decryption key through the authorization key. However, the authorization key contains
the digital content ID and the authorization period. If the authorization period expires,
the Licensee will be unable to obtain the decryption key; that is, it cannot perform digital
content playback. Thus, we do not worry about the leakage of digital property rights.
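The freshness test TS_NOW − TS ≤ ΔT applied to every received message in the protocol (e.g., Equations (102) and (116)) can be sketched as a small helper; the tolerance value here is our illustrative choice, as the paper does not fix ΔT:

```python
# Sketch of the timestamp freshness check TS_NOW - TS <= dT used throughout
# the protocol (e.g., Eqs. (102) and (116)). DELTA_T is an assumed value.
import time

DELTA_T = 300.0  # seconds of allowed transmission delay (our assumption)

def is_fresh(ts_message, ts_now=None):
    """Accept the message only if its timestamp lies within the window."""
    if ts_now is None:
        ts_now = time.time()
    return 0 <= ts_now - ts_message <= DELTA_T

now = time.time()
print(is_fresh(now - 10, now))    # True: recent message accepted
print(is_fresh(now - 900, now))   # False: stale message rejected
```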
4.8. Decentralization/Distribution
In the proposed scheme, the information handled by each role is signed by the role
with a private key, and the circulation of all information is open and transparent. A node
cannot deceive other nodes. In this way, the trust relationship between nodes is realized,
making it possible to obtain trust between nodes at a low cost. Thus, the proposed scheme
achieves decentralization and distribution.
4.9. Sustainability
The proposed scheme provides two kinds of authority mechanisms. It not only helps
to transform the on-site museum visit into an online visit to a museum’s digital collections, but
also promotes social education and contributes to the sustainable operation of the museum
via our proposed method.
Table 2. Computation cost of each role in each phase.

System role registration phase:
    BCC: 1TMul; CA: N/A; P: N/A; L: N/A
Authentication and issuing invoice phase (direct authorization):
    BCC: N/A; CA: 7TMul + 3TH + 2TCmp + 2TSig; P: N/A; L: 7TMul + 3TH + 1TCmp + 2TSig
Authentication and issuing invoice phase (entrusted authorization):
    BCC: N/A; CA: 7TMul + 3TH + 3TCmp + 2TSig; P: 7TMul + 3TH + 2TCmp + 2TSig; L: 7TMul + 3TH + 1TCmp + 2TSig
Payment verification and browsing phase (direct authorization):
    BCC: N/A; CA: 9TMul + 3TH + 3TCmp + 2TSig; P: N/A; L: 7TMul + 3TH + 1TCmp + 2TSig
Payment verification and browsing phase (entrusted authorization):
    BCC: N/A; CA: 10TMul + 3TH + 4TCmp + 2TSig; P: 7TMul + 3TH + 3TCmp + 2TSig; L: 7TMul + 3TH + 1TCmp + 2TSig

TMul: Multiplication operation; TH: Hash function operation; TCmp: Comparison operation; TSig: Signature operation.
Table 2 presents the computation cost of every phase and role in this scheme. We analyze
the phase with the highest computational cost, the payment verification and browsing phase
(entrusted authorization). The CA requires 10 multiplication operations, 3 hash function
operations, 4 comparison operations, and 2 signature operations. The Proxy requires 7
multiplication operations, 3 hash function operations, 3 comparison operations, and 2
signature operations. The Licensee requires 7 multiplication operations, 3 hash function
operations, 1 comparison operation, and 2 signature operations. The proposed method thus
has a reasonable computational cost.
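As a quick sanity check, the operation counts for this costliest phase can be tallied programmatically; the dictionary below simply transcribes the corresponding row of Table 2.

```python
# Operation counts per role for the payment verification and browsing
# phase (entrusted authorization), transcribed from Table 2.
costs = {
    "CA": {"TMul": 10, "TH": 3, "TCmp": 4, "TSig": 2},
    "P":  {"TMul": 7,  "TH": 3, "TCmp": 3, "TSig": 2},
    "L":  {"TMul": 7,  "TH": 3, "TCmp": 1, "TSig": 2},
}

# Total number of cryptographic operations each role performs.
total_ops = {role: sum(ops.values()) for role, ops in costs.items()}
```

With concrete per-operation timings for a target platform, the same dictionary would yield wall-clock estimates per role.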
the longest messages, exchanged in the proxy authorization phases, are 5056 bits, which
takes 0.361 ms in a 3.5 G (14 Mbps) communication environment, 0.051 ms in a 4 G (100 Mbps)
environment, and 0.253 μs in a 5 G (20 Gbps) environment [34]. The proposed scheme
therefore has excellent communication performance.
Table 3. Communication cost of each phase.

Phase                                                            Message Length   Rounds   3.5G (14 Mbps)   4G (100 Mbps)   5G (20 Gbps)
System role registration phase                                   3552 bits        2        0.254 ms         0.036 ms        0.178 μs
Authentication and issuing invoice phase (direct authorization)  2528 bits        2        0.181 ms         0.025 ms        0.126 μs
Authentication and issuing invoice phase (proxy authorization)   5056 bits        4        0.361 ms         0.051 ms        0.253 μs
Payment verification and browsing phase (direct authorization)   2528 bits        2        0.181 ms         0.025 ms        0.126 μs
Payment verification and browsing phase (proxy authorization)    5056 bits        4        0.361 ms         0.051 ms        0.253 μs
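The Table 3 figures follow directly from time = message length / channel rate; the short sketch below reproduces the 5056-bit proxy-authorization rows under the stated channel rates.

```python
# Transmission-time estimates as in Table 3: time = bits / rate.
RATES_BPS = {"3.5G": 14e6, "4G": 100e6, "5G": 20e9}   # bits per second

def tx_time_ms(bits, rate_bps):
    """Transmission time in milliseconds for a message of `bits` bits."""
    return bits / rate_bps * 1000.0

ms_35g = tx_time_ms(5056, RATES_BPS["3.5G"])          # ~0.361 ms
ms_4g  = tx_time_ms(5056, RATES_BPS["4G"])            # ~0.051 ms
us_5g  = tx_time_ms(5056, RATES_BPS["5G"]) * 1000.0   # ~0.253 us
```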
5.3. Comparison
In this section, Table 4 compares the proposed scheme with related work that employs
blockchain and smart contract technologies.
Table 4. Comparison of the proposed and existing digital right management surveys.
This research provides museum exploration based on direct authorization and proxy
authorization, combined with a cash flow payment verification mechanism. Cryptographic
signature and timestamp mechanisms are applied to achieve non-repudiation (Table 1),
while blockchain and smart contracts provide verifiability, tamper resistance, and
traceability; digital signatures and digital certificates resolve the non-repudiation of the
cash flow. Table 2 shows that the method has a reasonable computational cost, and Table 3
shows that the proposed solution has a low communication cost and can improve the
efficiency of authorization. Table 4 compares this digital rights management scheme with
existing digital rights management surveys and presents a complete treatment of the
museum’s digital rights in combination with the financial flow. In addition to realizing the
museum’s social education mission, the increased revenue from digital rights supports the
long-term operation, and thus the sustainable development, of the museum.
In the future, this research will focus on establishing a promotion platform for the
authorization mechanism of a blockchain-based museum alliance chain, to achieve a
win-win situation of resource sharing and economic benefit. In addition, world trade
organizations have globalized the economy and integrated international markets, so
international economic and trade disputes will inevitably continue to arise. Governments
have added or strengthened arbitration regulations to resolve disputes over the profits
generated by digital property management; such disputes can be resolved through
international legal arbitration. This research provides a good foundation for future work on
the authorization of a museum collection alliance chain and on dispute resolution
arbitration mechanisms.
Author Contributions: The authors’ contributions are summarized below. Y.-C.W. and C.-L.C. made
substantial contributions to the conception and design. C.-L.C. and Y.-Y.D. were involved in drafting
the manuscript. C.-L.C. and Y.-Y.D. acquired and analyzed the data and conducted its
interpretation. The critically important intellectual content of this manuscript was revised by Y.-C.W. All
authors have read and agreed to the published version of the manuscript.
Funding: This research was supported by the Ministry of Science and Technology, Taiwan, R.O.C.,
under contract number MOST 109-2221-E-324-021.
Informed Consent Statement: This study is based solely on theoretical research and does
not involve humans.
Data Availability Statement: The data used to support the findings of this study are available from
the corresponding author upon request.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this paper:
q A k-bit prime number
GF(q) The finite field of order q
E The elliptic curve defined over GF(q)
G A generating point based on the elliptic curve E
IDx A name representing identity x
kx A random value used in x’s elliptic curve signature
(rx , sx ) Elliptic curve signature value of x
Mx-y A message from x to y
IDBC An index value of blockchain message
BCx Blockchain message of x
PKX /SKX The asymmetric public/private key pair of X
EPKX (M) Use X’s public key PKx to encrypt the message M
DSKX (M) Use X’s private key SKx to decrypt the message M
TID The transaction identity
IDDC An identity of digital content
keym A symmetric key containing KeyID and Seed
Certx A digital certificate of x conforms to the X.509 standard
h(.) Hash function
A =? B Verify whether A is equal to B
References
1. Parry, R. Recoding the Museum: Digital Heritage and the Technologies of Change; Routledge: London, UK, 2007; pp. 58–81.
2. Fenton, R. Photographer of the 1850s; South Bank: London, UK, 1988.
3. The Getty Foundation. Available online: https://www.getty.edu/foundation/initiatives/current/osci/ (accessed on 30 November 2020).
4. Creative Economy Report 2010. United Nations Conference on Trade and Development. Available online: https://unctad.org/
system/files/official-document/ditctab20103_en.pdf (accessed on 23 January 2021).
5. Chiou, S.-C.; Wang, Y.-C. The example application of genetic algorithm for the framework of cultural and creative brand design
in Tamsui Historical Museum. Soft Comput. 2018, 22, 2527–2545. [CrossRef]
6. UNESCO. Convention for the Safeguarding of the Intangible Cultural Heritage. 2003. Available online: http://unesdoc.unesco.
org/images/0013/001325/132540e.pdf (accessed on 30 November 2020).
7. Chang, C.-W.; Wang, S.-I.; Yang, C.-J.; Shao, K.-T. Fish fauna in subtidal waters adjacent to the National Museum of Marine
Biology and Aquarium. Platax 2011, 8, 41–51. [CrossRef]
8. Liu, M.-C. Image management procedures of the National Museum of Marine Biology and Aquarium. Museol. Q. 2013, 27.
[CrossRef]
9. ARTouch Editorial Department. The Epidemic Is Not Far Away: 1/3 of the US Museums May Be Permanently Closed,
and Japanese Exhibitions with No Works. Available online: https://artouch.com/news/content-12951.html (accessed on 26
November 2020).
10. Chen, H.Y.; Wang, H.A.; Lin, C.L. Using watermarks and offline DRM to protect digital images in DIAS. In Proceedings of the
International Conference on Theory and Practice of Digital Libraries; Springer: Berlin/Heidelberg, Germany, 2007; pp. 529–531.
11. Thomas, T.; Emmanuel, S.; Subramanyam, A.V.; Kankanhalli, M.S. Joint watermarking scheme for multiparty multilevel DRM
architecture. IEEE Trans. Inf. Forensics Secur. 2009, 4, 758–767. [CrossRef]
12. Tsai, M.J.; Luo, Y.F. Service-oriented grid computing system for digital rights management (GC-DRM). Expert Syst. Appl. 2009, 36,
10708–10726. [CrossRef]
13. Chen, C.L. A secure and traceable E-DRM system based on mobile device. Expert Syst. Appl. 2008, 35, 878–886. [CrossRef]
14. Chen, C.L. An all-in-one mobile DRM system design. Int. J. Innov. Comput. Inf. Control 2010, 6, 897–911.
15. Chen, C.L.; Tsaur, W.J.; Chen, Y.Y.; Chang, Y.C. A secure mobile DRM system based on cloud architecture. Comput. Sci. Inf. Syst.
2014, 11, 925–941. [CrossRef]
16. Hassan, H.E.R.; Tahoun, M.; ElTaweel, G.S. A robust computational DRM framework for protecting multimedia contents using
AES and ECC. Alex. Eng. J. 2020, 59, 1275–1286. [CrossRef]
17. Zhao, B.; Fang, L.; Zhang, H.; Ge, C.; Meng, W.; Liu, L.; Su, C. Y-DWMS: A digital watermark management system based on
smart contracts. Sensors 2019, 19, 3091. [CrossRef]
18. Ma, Z.; Jiang, M.; Gao, H.; Wang, Z. Blockchain for digital rights management. Future Gener. Comput. Syst. 2018, 89, 746–764.
[CrossRef]
19. Vishwa, A.; Hussain, F.K. A blockchain based approach for multimedia privacy protection and provenance. In Proceedings of the
2018 IEEE Symposium Series on Computational Intelligence (SSCI), Bengaluru, India, 18–21 November 2018; pp. 1941–1945.
20. Ma, Z.; Huang, W.; Bi, W.; Gao, H.; Wang, Z. A master-slave blockchain paradigm and application in digital rights management.
China Commun. 2018, 15, 174–188. [CrossRef]
21. Ma, Z.; Huang, W.; Gao, H. Secure DRM scheme based on Blockchain with high credibility. Chin. J. Electron. 2018, 27, 1025–1036.
[CrossRef]
22. Lu, Z.; Shi, Y.; Tao, R.; Zhang, Z. Blockchain for digital rights management of design works. In Proceedings of the 2019 IEEE 10th
International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 18–20 October 2019; pp. 596–603.
23. American Association of Museums. Museums for a New Century, a Report of the Commission on Museums for a New Century; American
Association of Museums: Washington, DC, USA, 1984.
24. Ma, Z. Digital rights management: Model, technology and application. China Commun. 2017, 14, 156–167.
25. Du Toit, J. Protecting private data using digital rights management. J. Inf. Warf. 2018, 17, 64–77.
26. Mrabet, H.; Belguith, S.; Alhomoud, A.; Jemai, A. A survey of IoT security based on a layered architecture of sensing and data
analysis. Sensors 2020, 20, 3625. [CrossRef]
27. Szabo, N. Smart contracts: Building blocks for digital markets. EXTROPY J. Transhumanist Thought 1996, 18, 16.
28. Szabo, N. The Idea of Smart Contracts. 1997. Available online: http://www.fon.hum.uva.nl/rob/Courses/InformationInSpeech/
CDROM/Literature/LOTwinterschool2006/szabo.best.vwh.net/smart_contracts_idea.html (accessed on 26 November 2020).
29. Han, W.; Zhu, Z. An ID-based mutual authentication with key agreement protocol for multiserver environment on elliptic curve
cryptosystem. Int. J. Commun. Syst. 2014, 27, 1173–1185. [CrossRef]
30. Boneh, D.; Lynn, B.; Shacham, H. Short signatures from the Weil pairing. In Proceedings of the International Conference on the Theory
and Application of Cryptology and Information Security; Springer: Berlin/Heidelberg, Germany, 2001; pp. 514–532.
31. Chen, C.-L.; Yang, T.-T.; Chiang, M.-L.; Shih, T.-F. A privacy authentication scheme based on cloud for medical environment. J.
Med. Syst. 2014, 38, 143. [CrossRef]
32. Chen, C.-L.; Yang, T.-T.; Shih, T.-F. A secure medical data exchange protocol based on cloud environment. J. Med. Syst. 2014, 38, 112.
[CrossRef]
33. Blaze, M.; Bleumer, G.; Strauss, M. Divertible protocols and atomic proxy cryptography. In Proceedings of the International
Conference on the Theory and Applications of Cryptographic Techniques; Springer: Berlin/Heidelberg, Germany, 1998; pp. 127–144.
34. Marcus, M.J. 5G and IMT for 2020 and beyond. IEEE Wirel. Commun. 2015, 22, 2–3. [CrossRef]