Implementation of A New Lightweight Encryption Design For Embedded Security
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/TIFS.2014.2365734, IEEE Transactions on Information Forensics and Security
I. INTRODUCTION
1556-6013 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
TABLE 1
COMPARISON OF LIGHTWEIGHT ALGORITHMS
Lightweight Algorithm   Block Size (bits)   Key Length (bits)   GEs
HIGHT                   64                  128                 3048
mCrypton                64                  64                  2420
                                            96                  2681
                                            128                 3758
SEA                     96                  96                  3758
TEA                     64                  128                 2355
ICEBERG                 64                  128                 7732
CLEFIA                  128                 128                 2488
PRESENT                 64                  128                 1884
Criteria compared for table lookup and GRP:
Maximum no. of instructions for 64 bit word size permutation
Maximum no. of instructions for 128 bit word size permutation
No. of cycles for scheduling of permutation instructions on superscalar processor
Maximum number of instructions for permuting 64 bit with control bits
Reported values: 23, 47, 16, 31, 13
Memory requirement for 64 bit permutation: GRP 48 bytes, table lookup 16 kb
Paper [16] shows that key generation by GRP achieves very high
speed because most of the permutation instructions reside in
this block.
Fig. 2: Block diagram for 128 bit GRP Permutation with key generation
TABLE 3
PRESENT AND CLEFIA MEMORY REQUIREMENT
Algorithm   Key Size   E/D   Flash Memory (bytes)   RAM Memory (bytes)
PRESENT     128 bit    E     3200                   1320
PRESENT     128 bit    D     3252                   1384
CLEFIA      128 bit    E     4708                   1256
CLEFIA      128 bit    D     4880                   1256
research work which focuses on the compact hardware
implementation of a cipher. Bit permutation instructions are
efficient in such implementations. Their complexity gives
them an edge in the cryptographic environment. A well-known
cipher such as DES is also implemented with the help of bit
permutation instructions, but it falls prey to attacks because
of its short key length. Among all bit permutation
instructions, GRP has proved to be efficient in terms of
cryptographic properties, memory size and total gate count.
OMFLIP has poor differential properties and its structure is
easily susceptible to attacks. Bit permutation instructions are
widely studied and currently supported by all word-oriented
processors. We found that GRP is compact and maps well onto
PRESENT and CLEFIA. In this paper, we present the results of
implementing most of the standard algorithms in order to
identify their memory requirements, gate equivalents and
power consumption. All standard algorithms are implemented
and compared on the same platform, the LPC2129, a 32 bit
processor by NXP (Philips).
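For readers unfamiliar with GRP, the following minimal Python sketch illustrates the grouping behaviour commonly described for the instruction in the bit permutation literature: data bits whose control bit is 0 are collected on the left and those whose control bit is 1 on the right, each group keeping its original order. The function names `grp` and `grp_inverse` and the word-width parameter `n` are illustrative assumptions, and the sketch is not the embedded implementation evaluated in this paper.

```python
def grp(data: int, control: int, n: int = 64) -> int:
    """Illustrative GRP permutation on an n bit word.

    Bits of `data` with control bit 0 are gathered, in order, into the
    most significant part of the result; bits with control bit 1 are
    gathered, in order, into the least significant part.
    """
    left, right = [], []
    for i in range(n - 1, -1, -1):          # scan from the most significant bit
        bit = (data >> i) & 1
        if (control >> i) & 1:
            right.append(bit)
        else:
            left.append(bit)
    result = 0
    for bit in left + right:
        result = (result << 1) | bit
    return result


def grp_inverse(data: int, control: int, n: int = 64) -> int:
    """Undo grp() for the same control word (as needed for decryption)."""
    zeros = n - bin(control & ((1 << n) - 1)).count("1")
    left_idx = n - 1                        # next unread bit of the left group
    right_idx = n - 1 - zeros               # next unread bit of the right group
    result = 0
    for i in range(n - 1, -1, -1):
        if (control >> i) & 1:
            bit = (data >> right_idx) & 1
            right_idx -= 1
        else:
            bit = (data >> left_idx) & 1
            left_idx -= 1
        result = (result << 1) | bit
    return result


if __name__ == "__main__":
    # 8 bit example: data 10110010, control 01101001 -> 11010100
    print(format(grp(0b10110010, 0b01101001, 8), "08b"))
```

Because the control word fully determines where each bit lands, the same word can be reused to invert the permutation, which is why a single routine pair suffices for both encryption and decryption in this sketch.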
We have implemented AES for 128 bit, GRP for 128 bit and 64
bit, PRESENT for 64 bit, CLEFIA for 128 bit and DES for
128 bit block size, with key lengths of 64, 80 and 128 bits.
Fig. 6 shows the memory requirement of the P-box of AES [3]
and PRESENT [8] compared with GRP [16] and OMFLIP [17],
computed using the KEIL 4.0 simulator and the LPC2129.
the encryption / decryption outputs. The baud rate for this
application was set at 9600 bps.
In the PRESENT-GRP module, 64 bit / 128 bit blocks were
passed through the S-box of PRESENT and, after mapping
according to PRESENT, the output was passed to the
permutation layer, which performed encryption based on the
GRP algorithm. Keys at each stage were applied based on the
key generation method of GRP. In GRP key generation, the
inputs are the bit positions given by the user; from these, GRP
generates a sequence of 0s and 1s that serves as the key for the
encryption and decryption process. GRP has a very robust
key generation mechanism, which is a necessity in a
cryptographic environment. GRP performs key generation as
well as encryption with fast bit permutations.
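The data path described above (PRESENT substitution followed by a GRP permutation layer) can be sketched in Python as follows. The S-box values are those of the published PRESENT specification; the names `sbox_layer`, `grp` and `hybrid_round`, and the choice of using the GRP control word directly as the per-round key material, are assumptions made for illustration and are not taken from the authors' code.

```python
# 4 bit S-box of PRESENT as published in the cipher's specification.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def sbox_layer(state: int, n: int = 64) -> int:
    """Apply the PRESENT S-box to every 4 bit nibble of an n bit state."""
    out = 0
    for shift in range(0, n, 4):
        out |= PRESENT_SBOX[(state >> shift) & 0xF] << shift
    return out


def grp(data: int, control: int, n: int = 64) -> int:
    """GRP: bits with control bit 0 move left, bits with control bit 1 move right."""
    left, right = [], []
    for i in range(n - 1, -1, -1):
        bit = (data >> i) & 1
        (right if (control >> i) & 1 else left).append(bit)
    result = 0
    for bit in left + right:
        result = (result << 1) | bit
    return result


def hybrid_round(state: int, control: int, n: int = 64) -> int:
    """One illustrative round: substitution layer, then GRP permutation.

    Assumption: the control word acts as the key-dependent input of the round.
    """
    return grp(sbox_layer(state, n), control, n)


if __name__ == "__main__":
    print(hex(hybrid_round(0x0, 0x5555555555555555)))
```

In this arrangement the S-box supplies confusion while the key-dependent GRP step supplies diffusion; the number of rounds and the exact key schedule are left open here because the text does not specify them at this point.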
We have designed an optimized version of GRP by making
small changes at the algorithmic level. The GRP design
presented in previous papers [25][33] needs 3224 bytes of
FLASH memory. In this paper, a compressed version of GRP
is reported, a design that consumes only 3088 bytes of
memory. This optimized structure has been achieved by
reducing many arrays to just two arrays. We have designed an
entirely new logic which needs far fewer arrays and runs
much faster than the existing structure. The execution time of
our design is nearly half that of the old GRP design at software
level. The observations we made while optimizing the code
are listed as follows:
- Using the character data type instead of the integer data type
- Reducing instructions, i.e. making one instruction perform three operations instead of using three different instructions for different operations
- Using local variables instead of global variables, since global variables consume more space
- Using a minimum number of functions
- Writing binary to hex / hex to binary functions instead of using (defined) power functions
- Using minimum variables as Recursive
- Making complex logic for long processes
- Passing only one data type into a function instead of multiple, for example GRP (a)
All these steps help to achieve a more optimized structure
of GRP, which results in 1789 GEs for 64 bit permutation.
Table 4 shows the memory requirements of past implementations
of GRP and of the optimized GRP design. To the best of our
knowledge, this is the most optimized design for 128 bit GRP.
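One of the listed optimizations, converting between binary and hex without power functions, can be illustrated with shifts and masks. The sketch below is written in Python for readability rather than in the embedded C used on the target; the helper names `bits_to_int` and `int_to_bits` are hypothetical.

```python
def bits_to_int(bits):
    """Pack a most-significant-first list of 0/1 values into an integer.

    Uses only shift and OR, so no power function is evaluated per bit.
    """
    value = 0
    for bit in bits:
        value = (value << 1) | (bit & 1)
    return value


def int_to_bits(value, width):
    """Unpack an integer into a most-significant-first list of `width` bits."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


if __name__ == "__main__":
    word = bits_to_int([1, 0, 1, 0, 1, 1, 1, 1])
    print(format(word, "02X"))       # hex form of the packed bits
    print(int_to_bits(word, 8))
```

On a small microcontroller the same idea avoids linking a floating point `pow` routine, which is one plausible source of the code size savings reported above.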
TABLE 4
OPTIMIZED GRP DESIGN
Memory Size of Old GRP 128          3224 bytes
Memory Size of Optimized GRP 128    2944 bytes
Standard Cell   Process    Library        Cell Name   GE
NOT             0.18 µm    UMCL18G212T3   HDINVBD1    0.67
AND             0.18 µm    UMCL18G212T3   HDAND2D1    1.33
OR              0.18 µm    UMCL18G212T3   HDOR2D1     1.33
MUX             0.18 µm    UMCL18G212T3   HDMUX2D1    2.33
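Using the per-cell GE figures listed above, the gate-equivalent count of a small netlist is a weighted sum of its cell counts. The following sketch shows that calculation; the function name `gate_equivalents` and the example cell counts are hypothetical and do not correspond to any design measured in this paper.

```python
# GE cost per standard cell, taken from the table above.
GE_PER_CELL = {"NOT": 0.67, "AND": 1.33, "OR": 1.33, "MUX": 2.33}


def gate_equivalents(cell_counts):
    """Return the total GE count for a mapping of cell name -> instance count."""
    return sum(GE_PER_CELL[cell] * count for cell, count in cell_counts.items())


if __name__ == "__main__":
    # Hypothetical netlist: 3 inverters, 2 AND gates and 1 multiplexer.
    example = {"NOT": 3, "AND": 2, "MUX": 1}
    print(round(gate_equivalents(example), 2))
```

GE totals obtained this way depend on the chosen library, so figures from different standard-cell libraries are only roughly comparable.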
Fig. 10: GEs comparison chart of all standard algorithms with PRESENT-GRP
TABLE 6
DIFFERENTIAL PROPERTIES OF GRP PERMUTATIONS
Operation: GRP
Type A   $(e_s, 0) \to e_t$   $0 < p \le 1/2 + 1/2^{n}$
Type B   $(0, e_t)$           For any t, E(||) = n/4
Type C   $(e_s, e_t)$         For any t, E(||) = n/4
non-linear S-box uses a 4 bit structure, which yields fewer GEs
and lower power consumption. Extra properties of the S-box help
PRESENT to achieve the desired avalanche effect. The
theorem describing the effect of differential cryptanalysis
on the S-box of PRESENT states that any five-round differential
characteristic of PRESENT has a minimum of ten
active S-boxes, and the results from papers [8][26] show that
PRESENT has a very good and compact S-box. The 16 S-boxes
of PRESENT are divided into four groups. From
papers [8][26], the characteristics of the S-box are outlined below:
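As one example of such a characteristic, the differential behaviour of the 4 bit S-box can be checked directly by building its difference distribution table. The sketch below uses the S-box values from the published PRESENT specification; the function name `difference_distribution_table` is an illustrative choice.

```python
# 4 bit S-box of PRESENT as published in the cipher's specification.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def difference_distribution_table(sbox):
    """ddt[dx][dy] counts inputs x with sbox[x] ^ sbox[x ^ dx] == dy."""
    size = len(sbox)
    ddt = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            dy = sbox[x] ^ sbox[x ^ dx]
            ddt[dx][dy] += 1
    return ddt


ddt = difference_distribution_table(PRESENT_SBOX)
# Largest count over all non-zero input differences.
max_count = max(max(row) for row in ddt[1:])

if __name__ == "__main__":
    print(max_count, max_count / len(PRESENT_SBOX))
```

The largest count divided by 16 gives the best single S-box differential probability, which is the quantity that bounds on the number of active S-boxes build upon.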
TABLE 7
LINEAR APPROXIMATION OF GRP PERMUTATIONS
Operation: GRP
Type L   $(e_s, 0, e_t)$     $b \le 1/4 + 1/2^{n+1}$   Maximum with s = t = 0
Type M   $(e_s, e_u, e_t)$   $b \le 1/4 - 1/2^{n+1}$   Maximum with s = u = t = 0
$2^{6} \times \epsilon^{7} = 2^{6} \times (2^{-7})^{7} = 2^{-43}$
Therefore, under the assumption that a cryptanalyst need only
approximate 28 of the 31 rounds by linear cryptanalysis in
order to mount an attack, the cipher requires of the order of
$2^{84}$ known plaintext / ciphertext pairs ($2^{86}$ if taken as
the inverse square of the bias), which exceeds the available
text limit [8]. Besides linear and differential attacks, other
established attacks include structural attacks, algebraic
attacks and key schedule attacks. Some ciphers have
strong word-like structures, where the words are typically
bytes. Algebraic attacks have had better success when applied
to stream ciphers than to block ciphers. Algebraic attacks and
key schedule attacks are unlikely to pose a threat to
PRESENT. These properties of PRESENT are sufficient to
resist key schedule-based attacks.
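The arithmetic behind the 28 round bias bound can be verified exactly with rational numbers. The sketch below also evaluates the common rule of thumb that the number of known texts grows with the inverse square of the bias; that rule is an assumption introduced here for illustration, whereas the data requirement quoted in the text is taken from [8]. The helper name `log2_exact` is hypothetical.

```python
from fractions import Fraction

# Bound on the bias of a 4 round linear approximation, 2^-7.
FOUR_ROUND_BIAS = Fraction(1, 2 ** 7)

# Seven 4 round blocks chained over 28 rounds: 2^6 * (2^-7)^7.
bias_28_rounds = 2 ** 6 * FOUR_ROUND_BIAS ** 7

# Rule-of-thumb data requirement (assumption): inverse square of the bias.
known_texts = 1 / bias_28_rounds ** 2


def log2_exact(x):
    """Return k such that x == 2**k, for x an exact power of two."""
    x = Fraction(x)
    if x >= 1:
        return x.numerator.bit_length() - 1
    return -(x.denominator.bit_length() - 1)


if __name__ == "__main__":
    print(log2_exact(bias_28_rounds), log2_exact(known_texts))
```

Either figure is far above the $2^{64}$ distinct plaintext blocks available for a 64 bit block size, which is the sense in which the attack exceeds the available text limit.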
IX. CONCLUSION
Bit permutation instructions increase the strength of a block
cipher by allowing it to perform arbitrary permutations
efficiently in log(n) steps as compared to n. GRP performs
fast bit permutation and uses a subword sorter that makes the
operation faster and can increase the throughput in
applications like scanning an image, performing bubble sort
and in the permutation layer of block ciphers. GRP generates
the control words faster, which helps in increasing the
performance of many embedded systems. Block ciphers like
RC5 and RC6 use DDR instructions, which make them vulnerable
to differential attacks. This in turn increases the number of
rounds and the memory requirements. Replacing DDR
with GRP not only adds cryptographic strength to the cipher,
but also reduces the memory requirements and the power
consumption. GRP has better differential and linear
cryptanalysis properties. Though it cannot completely
replace DDR, it adds strength to the block cipher by
removing the weaknesses of DDR. Other primitives like hash
functions and stream ciphers may also benefit from the
introduction of bit permutation instructions. GRP
has all these good properties that provide strength in a
cryptographic environment, but it lacks an S-box, which is
necessary to provide a more secure design. This shifted our
focus to finding a lightweight S-box that can be mapped onto
GRP to obtain a secure and efficient hybrid crypto structure.
In the search for an S-box, our research was motivated by
the ISO standard for lightweight cryptography, and we shifted
our focus to lightweight design. In this paper, we present a
design with a permutation box (P-box) that uses GRP for 128
and 64 bit block sizes. Key generation is also achieved through
GRP. For designing a lightweight secure cipher, the confusion
property is a must and it should be well mapped with the
diffusion property. The S-boxes of various lightweight
block ciphers were fused with the P-box of GRP and
compared in terms of memory space and GEs. We have
implemented the latest lightweight ciphers and interfaced them
with GRP. In order to achieve the very compact cipher
implementation reported in this paper, we have carefully
designed the permutation box, which has resulted in a much
lower gate count. We have carefully analyzed and studied the
linear and differential cryptanalysis of the P-box of GRP and
found it resistant to attacks like brute force attacks. Similarly,
the S-boxes of recent ciphers like PRESENT, CLEFIA, TEA etc.
have been considered and implemented on a 32 bit processor.
The GRP design is known to be hardware efficient and needs very
few GEs, which meets the security requirements of a lightweight
cryptographic design. PRESENT is an engineered cipher
whose S-box is the most compact substitution box among all
the lightweight variants and has good linear and differential
properties. PRESENT's S-box results in a very compact
implementation that consumes merely 21 GEs for a single 4
bit S-box. The PRESENT-GRP implementation requires far
fewer bytes of RAM and Flash memory than other lightweight
algorithms, and even fewer than PRESENT alone.
This paper proposes a novel approach by introducing a
hybrid system that is compact in terms of memory requirements
and is best suited for lightweight cryptographic design.
ACKNOWLEDGMENT
The authors would like to thank Symbiosis Institute of
Technology, Pune, Symbiosis International University, Pune
and Prof. Ayan Mahanolobis from IISER, Pune, and eminent
professionals from the automotive embedded domain, in Pune
for providing suggestions and valuable inputs that helped us to
carry out this research successfully. Special thanks to Axel
York Poschmann and Zhijie Jerry Shi, whose thesis and work
motivated this research and also provided us with some ideas
to carry on the work in the future.
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]
[26]
[27]
[28]
[29]
[30]
[31]
[32]
[33]
[34]
[35]
[36]
[37]
[38] B. Kaliski and Y. L. Yin, "On the security of the RC5 encryption
algorithm," RSA Laboratories Technical Report TR-602, available at
www.rsa.com/rsalabs/aes, September 1998.
[39] S. Devadas and S. Malik, "A survey of optimization techniques targeting
low power VLSI circuits," in ACM/IEEE Conference on Design
Automation, pages 242-247, 1995.
[40] Zhijie Jerry Shi and Ruby B. Lee, "Bit permutation instructions:
architecture, implementation, and cryptographic properties," 2004.