Modified pqsigRM RM Code-Based Signature Scheme
ABSTRACT We present a novel code-based signature scheme called modified pqsigRM. This scheme
is based on a modified Reed–Muller (RM) code, which reduces the signing complexity and key size
compared with existing code-based signature schemes. In fact, it strengthens pqsigRM submitted to
NIST for post-quantum cryptography standardization. The proposed scheme has the advantage of the
pqsigRM decoder and uses public codes that are more difficult to distinguish from random codes. We use
(U , U + V )-codes with the high-dimensional hull to overcome the disadvantages of code-based schemes.
The proposed decoder samples from coset elements with small Hamming weight for any given syndrome
and efficiently finds such an element. Using a modified RM code, the proposed signature scheme resists
various known attacks on RM-code-based cryptography. For 128 bits of classical security, the signature size
is 4096 bits, and the public key size is less than 1 MB.
INDEX TERMS Cryptography, digital signatures, error correction codes, post-quantum cryptography
(PQC), Reed-Muller (RM) codes.
I. INTRODUCTION
Recently, code-based cryptographic algorithms have been extensively studied in post-quantum cryptography (PQC). Code-based cryptography is based on the syndrome decoding problem and its variants. The syndrome decoding problem is to find a vector e satisfying $He^T = s^T$ and wt(e) ≤ w, where H is a parity check matrix of a random (n, k) code, s is a random syndrome vector, w is a small value, and wt(e) denotes the Hamming weight of a vector e. Berlekamp and McEliece first proved the hardness of the syndrome decoding problem [19] and McEliece proposed a cryptosystem based on Goppa codes [22].
Courtois, Finiasz, and Sendrier proposed the CFS signature scheme [2], which is a code-based signature scheme using a full-domain hash (FDH) approach. In this scheme, t! hashes and decodings are required on average to sign a message when an (n, k) Goppa code with error correction capability t is used. It is proposed to use high-rate Goppa codes, which have relatively small error correction capability $t = \frac{n-k}{\log n}$, to reduce the signing time. Therefore, it has a large signing complexity and certain drawbacks in terms of parameter scaling. Moreover, it has been shown in [4] that high-rate Goppa codes can be distinguished from random codes. This falsifies the assumption of existential unforgeability under a chosen message attack (EUF-CMA) security proof in [17], which is based on the indistinguishability of Goppa codes. Although Morozov et al. claimed to have proved the strong EUF-CMA security of the CFS signature scheme without the indistinguishability of Goppa codes [18], the large key size and expensive signing remain as drawbacks.
There are several variants of the CFS signature scheme, such as signature schemes using LDGM codes [7] and blockwise-triangular secret key [9]. To find a signature with small Hamming weight, the scheme in [7] uses a sparse coset element added to a codeword with small Hamming weight. Even though this is efficient and has a small key size, an attack algorithm was presented in [6]. An attack algorithm for the signature scheme using a blockwise-triangular secret key was also proposed [8].

The associate editor coordinating the review of this manuscript and approving it for publication was Sedat Akleylek.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
177506 VOLUME 8, 2020
Y. Lee et al.: Modified pqsigRM: RM Code-Based Signature Scheme
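As a rough numerical check of the t! figure quoted for the CFS scheme, the following sketch counts the fraction of syndromes that a t-error-correcting (n, k) code can decode: the number of error patterns of weight at most t divided by the $2^{n-k}$ possible syndromes. The function name and the sample parameters (n = 2^16, t = 9, n − k = 144, chosen so that t = (n − k)/log2 n) are assumptions made for illustration and are not taken from this paper.

```python
from math import comb, factorial


def decodable_fraction(n: int, k: int, t: int) -> float:
    """Fraction of the 2^(n-k) syndromes reachable by an error of weight <= t."""
    patterns = sum(comb(n, i) for i in range(t + 1))
    return patterns / 2 ** (n - k)


# Illustrative parameters only (assumed for this sketch).
n, t = 2 ** 16, 9
k = n - 16 * t  # n - k = t * log2(n) = 144

p = decodable_fraction(n, k, t)
print(f"decodable fraction: {p:.3e}")
print(f"expected signing attempts: {1 / p:.0f}")
print(f"t! for comparison: {factorial(t)}")
```

Under these assumed parameters the expected number of attempts is close to t!, which is why the signing cost grows so quickly with t.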
The Kabatianskii-Krouk-Smeets (KKS) signature scheme [30] and its variants [31], [32] take a different approach from the CFS signature scheme. However, owing to the attack proposed in [36], these are considered (at best) to be one-time signature schemes. Moreover, from the attacks in [37], it is known that the parameters in the KKS scheme and its variants should be carefully chosen.
SURF is a variant of the CFS signature scheme using (U, U + V)-codes [29]. SURF uses (n, $k_U + k_V$) binary codes defined by {(u|u + v) | u ∈ U, v ∈ V}, where U and V are (n/2, $k_U$) and (n/2, $k_V$) random binary codes, respectively. A variant of the Prange decoder is applied to SURF to find an error vector with a small Hamming weight. The security of SURF is based on the decoding-one-out-of-many (DOOM) problem, in which a solution for the syndrome decoding problem is sought in the presence of several syndromes. Unfortunately, as it has been demonstrated that the hull of any (U, U + V)-code is highly probable to be a two-repetition code when U and V are random binary codes [29], the hull of the public key can be used for key attacks on SURF. In the recently proposed signature scheme, Wave [35], the generalized ternary (U, U + V)-codes are used instead of binary codes as they efficiently resist the hull attack in [29]. Moreover, finding errors with large Hamming weight for the given syndrome allows small parameters. A tighter security reduction using rejection sampling and preimage samplable functions [34] was proved in [35].
In this paper, a new code-based signature scheme using binary codes with a (U, U + V)-code as its subcode is proposed. For two linear codes $C_1$ and $C_2$, $C_2$ is called a subcode of $C_1$ if all codewords in $C_2$ are in $C_1$. The subcode used in the proposed signature scheme is a binary (U, U + V)-code, where U and V are obtained by modifying the RM codes. We design V and $U^\perp$ to have a sufficient number of common codewords, where $U^\perp$ denotes the dual code of U. Using the relationships between U and V, it is shown that the proposed signature scheme resists the attack for (U, U + V)-codes in [29]. Further, an efficient and randomized decoding algorithm is proposed. This algorithm makes it possible to reduce the key size and signature length. As the codes in the proposed signature scheme are a modification of RM codes, the decoding algorithm makes use of the recursive structure. The proposed signature scheme is an improvement of pqsigRM [1] submitted to NIST for PQC standardization, and it resolves the weaknesses of early versions of pqsigRM by modifying the public code. Moreover, we ensure the indistinguishability of the public code of the proposed signature scheme.
The rest of this paper is organized as follows. In Section II, we discuss FDH code-based signature schemes and RM codes. A new code-based signature scheme, called modified pqsigRM, using modified RM codes is proposed in Section III. In Section IV, the security of the proposed signature scheme is analyzed, and it is proved that the signature scheme is EUF-CMA secure. The proof is based on two ad-hoc problems and the assumption that these are hard. The two problems are analyzed in Section V. Considering state-of-the-art attacks, we suggest security parameters in Section VI. The paper is concluded in Section VII.

II. PRELIMINARIES
A. BASIC NOTATION
A vector is denoted in boldface in the form of a column vector. $(x_0|x_1)$ denotes the concatenation of two vectors $x_0$ and $x_1$. For example, h(m|r) means the hash function h with input (m|r), where (m|r) represents the concatenation of the binary representation of vector m and a random value r. Matrices are denoted by a boldfaced capital letter, for example, A. Matrix multiplication is denoted by · or can be omitted when it is unnecessary. Codes and probability distributions are denoted in calligraphic fonts, for example C, and they can be distinguished by context. $x^\sigma$ denotes that a vector x is permuted by a permutation σ, for example, $x^\sigma = (x_1, x_3, x_2, x_0)$, where $x = (x_0, x_1, x_2, x_3)$ and σ = (1, 3, 2, 0).

B. CFS SIGNATURE SCHEME
The CFS signature scheme is an algorithm that applies the FDH methodology to the Niederreiter cryptosystem. The CFS signature scheme is based on Goppa codes, as is the McEliece cryptosystem. A summary of the CFS signature scheme is given in Algorithm 1.
As described in Algorithm 1, the signing process iterates until a decodable syndrome is obtained. The probability that a given random syndrome can be decoded is
$$\frac{\sum_{i=0}^{t}\binom{n}{i}}{2^{n-k}} \simeq \frac{1}{t!}.$$
Hence, the error correction capability $t = \frac{n-k}{\log n}$ should be sufficiently small to reduce the number of iterations. Thus, the high-rate Goppa codes should be used. Regarding the key size, the complexity of the decoding attack on the CFS signature scheme is known to be a small power of the key size, namely, $\approx \mathrm{keysize}^{t/2}$. Hence, the key size should be fairly large to meet a certain security level. In summary, the CFS signature scheme is insecure and inefficient owing to the use of Goppa codes.

C. REED–MULLER CODES AND RECURSIVE DECODING
RM codes were introduced by Muller [23] and Reed [24], and their decoding algorithm, so-called recursive decoding, was proposed in [10]. There are various definitions of RM codes, but we adopt a recursive definition here as recursive decoding is defined by using this structure. An RM code RM(r, m) is a linear binary $(n = 2^m,\ k = \sum_{i=0}^{r}\binom{m}{i})$ code, where r and m are integers. RM(r, m) is defined as RM(r, m) := {(u|u + v) | u ∈ RM(r, m−1), v ∈ RM(r−1, m−1)}, where RM(0, m) := {(0, . . . , 0), (1, . . . , 1)} with code length $2^m$ and RM(m, m) := $\mathbb{F}_2^{2^m}$. This is the well-known Plotkin's construction, and its generator matrix is given by
$$G_{(r,m)} = \begin{bmatrix} G_{(r,m-1)} & G_{(r,m-1)} \\ 0 & G_{(r-1,m-1)} \end{bmatrix},$$
where $G_{(r,m)}$ is the generator matrix of RM(r, m).
Recursive decoding is a soft-decision decoding algorithm that depends on the recursive structure of the RM codes;
it is described in detail in Algorithm 2, where y′ · y″ denotes the component-wise multiplication of the vectors y′ and y″. In recursive decoding, a binary symbol a ∈ {0, 1} is mapped onto $(-1)^a$, and it is assumed that all codewords belong to $\{-1, 1\}^n$.
First, y″ (the second half of the received vector y) is multiplied component-wise by y′ (the first half of the received vector). Then, a codeword from RM(r, m−1) (i.e., u) is removed from y″ as it is both in y′ and y″, and then only v and the error vector remain. This is regarded as a codeword of RM(r−1, m−1) added to an error vector, and its decoding result is referred to as v̂. Using v̂, we can remove the codeword of RM(r−1, m−1) from the second half of the received vector. y′ is then added to y″ · v̂, and the sum is divided by 2. This is regarded as a codeword of RM(r, m−1) added to the error vector, and then decoding is performed. Recursively, the received vector is further divided into sub-vectors of length n/4, n/8, etc. Finally, we reach RM(m, m) or RM(0, m); then the division terminates and the minimum distance (MD) decoding of RM(m, m) or RM(0, m), which is trivial, is performed. The decoding for the entire code is performed by reconstructing these results into (U, U + V) form.

Algorithm 1 CFS Signature Scheme [2]
Key generation:
    H is the parity check matrix of an (n, k) Goppa code
    The error correction capability t is $\frac{n-k}{\log n}$
    S and Q are an (n − k) × (n − k) scrambler matrix and n × n permutation matrix, respectively
    Secret key: H, S, and Q
    Public key: H′ ← SHQ
Signing:
    m is a message to be signed
    i ← 1
    Do
        i ← i + 1
        Find syndrome s ← h(h(m)|i)
        Compute s′ ← $S^{-1}s$
    Until a decodable syndrome s′ is found
    Find an error vector e′ satisfying $He'^T = s'$
    Compute $e^T \leftarrow Q^{-1}e'^T$, and then the signature is (m, e, i)
Verification:
    Check wt(e) ≤ t and $H'e^T = h(h(m)|i)$
    If True, then return ACCEPT; else, return REJECT

III. MODIFIED REED–MULLER CODES AND PROPOSED SIGNATURE SCHEME
In this section, we propose new codes, their decoder, and a signature scheme that uses these codes and decoders. The proposed code essentially has a (U, U + V)-code as its subcode, and recursively, U and V are also (U, U + V)-codes. This recursive structure allows the decoding of any given vector in $\mathbb{F}_2^n$. Then, we can find an error vector with small Hamming weight for any given syndrome corresponding to the received vector. Starting from (U, U + V)-codes, we replace certain rows and append random rows on the generator matrix of (U, U + V)-codes. Thus, these codes are no longer (U, U + V)-codes. However, they have a (U, U + V)-subcode and can use the decoder for (U, U + V)-codes.

A. PARTIAL PERMUTATION OF GENERATOR MATRIX AND MODIFIED REED–MULLER CODES
New codes named modified RM codes are defined in this section. We first present the core of the proposed codes, which is a (U, U + V)-code. Subsequently, we describe which rows are replaced or appended to the generator matrix. The rationale for these operations is provided in Section V.
For a code C, we define its hull by the intersection of the code and its dual, in other words, hull(C) = $C \cap C^\perp$. The proposed (U, U + V)-code is designed to have a high-dimensional hull, where $\dim(U^\perp \cap V)$, the dimension of $U^\perp \cap V$, is large. In general, for a (U, U + V)-code C, a codeword (u|u + v) ∈ hull(C) satisfies $v = u^\perp$ and $u + v = v^\perp$ for some $u^\perp \in U^\perp$ and $v^\perp \in V^\perp$, where u ∈ U and v ∈ V. Hence, when $U^\perp \cap V = \{0\}$, hull(C) has only (u|u) codewords, and this may reveal the secret key. To avoid this, the proposed code is designed so that $\dim(U^\perp \cap V)$ is large.
For convenience, we focus on the generator matrix. First, we construct the generator matrix $G_{(r,m)}$ of an RM code and then permute its submatrices. An example is shown in Figure 1, where $\sigma_p^1$ and $\sigma_p^2$ denote two independent partial permutations that randomly permute only p out of n/4 columns. As will be explained in Section VI-B, p is related to the decoding performance. To generate $\sigma_p^1$ and $\sigma_p^2$, p column indices are randomly selected from the index set {0, 1, . . . , n/4 − 1}, and the selected indices are randomly permuted, whereas the others are not. Then, $\sigma_p^1$ is used to permute the submatrices corresponding to $G_{(r,m-2)}$'s in the first dim(RM(r, m−2)) rows, and $\sigma_p^2$ is used to permute the submatrix corresponding to $G_{(r-2,m-2)}$ in the last

Algorithm 2 Recursive Decoding of RM Code [10]
function RecursiveDecoding(y, r, m)
    if r = 0 then
        Perform MD decoding on RM(0, m)
    else if r = m then
        Perform MD decoding on RM(r, r)
    else
        (y′|y″) ← y
        $y_v$ = y′ · y″
        v̂ ← RecursiveDecoding($y_v$, r − 1, m − 1)
        $y_u$ ← (y′ + y″ · v̂)/2
        û ← RecursiveDecoding($y_u$, r, m − 1)
        Output (û|û · v̂)
    end if
end function
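The recursion of Algorithm 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, ties at zero are broken towards +1 (an assumption), and vectors live in the ±1 domain described for recursive decoding, so addition of binary codewords becomes component-wise multiplication.

```python
from typing import List


def md_repetition(y: List[float]) -> List[int]:
    """MD decoding on RM(0, m): choose the all-(+1) or all-(-1) word closest to y."""
    sign = 1 if sum(y) >= 0 else -1
    return [sign] * len(y)


def md_full_space(y: List[float]) -> List[int]:
    """MD decoding on RM(m, m): every +-1 vector is a codeword, so take signs."""
    return [1 if value >= 0 else -1 for value in y]


def recursive_decoding(y: List[float], r: int, m: int) -> List[int]:
    """Decode y (length 2^m, +-1 domain) to a codeword of RM(r, m)."""
    if r == 0:
        return md_repetition(y)
    if r == m:
        return md_full_space(y)
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    # The product of the halves cancels u and leaves a noisy copy of v.
    y_v = [a * b for a, b in zip(y1, y2)]
    v_hat = recursive_decoding(y_v, r - 1, m - 1)
    # Strip v from the second half and average the two noisy copies of u.
    y_u = [(a + b * v) / 2 for a, b, v in zip(y1, y2, v_hat)]
    u_hat = recursive_decoding(y_u, r, m - 1)
    return u_hat + [u * v for u, v in zip(u_hat, v_hat)]


# A codeword of RM(1, 3) in the +-1 domain: u = (0,1,0,1), v = (1,1,1,1).
codeword = [1, -1, 1, -1, -1, 1, -1, 1]
received = codeword.copy()
received[0] = -received[0]  # flip one symbol
decoded = recursive_decoding(received, 1, 3)
print(decoded == codeword)
```

A noiseless codeword is always returned unchanged, because the product of the two halves recovers v exactly and the averaged vector then equals u.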
FIGURE 1. Generator matrix of partially permuted RM code with parameter (r, m).

rows; i.e., it should not belong to the hull of the code. Furthermore, it should be verified that the hull has codewords with Hamming weight that is not a multiple of four as a result of appending this row. The others are $k_{app}$ random independent vectors including at least one vector of odd Hamming weight. These $k_{app}$ vectors are independent of the partially permuted RM codes and independent of each other.
After all these modifications, the resulting code is called a modified RM code. An example of its generator matrix is given in Figure 2.
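To experiment with how such modifications affect the hull, one can compute the hull dimension of a small binary code directly. The sketch below is an illustration under stated assumptions, not part of the proposed scheme: it relies on the standard identity dim hull(C) = k − rank(G·Gᵀ) over GF(2), which is not stated in this paper, and the helper names `rm_generator`, `gf2_rank`, and `hull_dimension` are hypothetical.

```python
from typing import List


def rm_generator(r: int, m: int) -> List[List[int]]:
    """Generator matrix of RM(r, m) built with the Plotkin (u | u + v) construction."""
    n = 2 ** m
    if r == 0:
        return [[1] * n]
    if r == m:
        return [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    top = [row + row for row in rm_generator(r, m - 1)]
    bottom = [[0] * (n // 2) + row for row in rm_generator(r - 1, m - 1)]
    return top + bottom


def gf2_rank(rows: List[List[int]]) -> int:
    """Rank over GF(2); each row is packed into an integer bit vector."""
    basis: List[int] = []
    for row in rows:
        x = int("".join(map(str, row)), 2)
        for b in basis:
            x = min(x, x ^ b)  # clears the leading bit of b in x if it is set
        if x:
            basis.append(x)
    return len(basis)


def hull_dimension(gen: List[List[int]]) -> int:
    """dim(C ∩ C⊥) computed as k - rank(G · Gᵀ) over GF(2)."""
    k = gf2_rank(gen)
    gram = [[sum(a * b for a, b in zip(u, v)) % 2 for v in gen] for u in gen]
    return k - gf2_rank(gram)


g = rm_generator(1, 3)  # RM(1, 3) is self-dual, so its hull is the whole code
print(hull_dimension(g))
# Appending one odd-weight row enlarges the code but shrinks the hull.
print(hull_dimension(g + [[1, 0, 0, 0, 0, 0, 0, 0]]))
```

The same helper can be applied to a generator matrix before and after appending a row to see whether the row ended up inside or outside the hull.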
permuted RM codes. Let $C_{app}$ be the code spanned by the added $k_{app} + 1$ rows. The number of codewords increases by $2^{k_{app}+1}$ times when rows are appended by adding codewords of $C_{app}$ to each (U, U + V)-codeword. Choosing a codeword of $C_{app}$ (including 0), subtracting it from the received vector r, decoding it, and adding the subtracted codeword back is the decoding process when rows are appended. Thus, the code is decodable even if arbitrary random codes are appended to its generator matrix.
Hence, it suffices to explain the decoding algorithm for the (U, U + V)-subcode of a modified RM code. This decoding basically follows the recursive decoding of RM codes [10]. The difference is the partial permutation and the replacement of RM(r, r). Considering the decoding proposed in [10], we have c = (u|u + v) for all c ∈ RM(r, m), where u ∈ RM(r, m−1) and v ∈ RM(r−1, m−1). RM(r, m−1) and RM(r−1, m−1) are also (U, U + V)-codes, except for r = 0 or r = m. Here, if the code corresponding to u or v is replaced with a code other than the RM code and the decoding of the replaced code can be performed appropriately, the entire code c can also be decoded [15].
When the subcode of the RM code is replaced with its permutation, the entire code can also be decoded by slightly modifying the recursive decoding. Moreover, no decoding failure occurs because the recursion eventually reaches RM(0, m′), RM(r′, r′), or the $(2^r, k_{rep})$ code that replaces RM(r, r), and there exists a polynomial-time MD decoder for these codes. Even the $(2^r, k_{rep})$ random code is MD decodable in constant time because it is a small code. To handle partial permutations, we use the fact that a permuted version of a decodable code is always decodable if the permutation is known. Depermutation and decoding followed by permutation is the decoding process for permuted codes.
In general, the output distribution of decoding is crucial for security. Thus, we also propose a randomized decoding method, the output of which is almost uniformly distributed. Using the algorithm described above, a random decoder can easily be designed. Algorithm 3 summarizes the randomized decoding. It is easy to find a received vector (regardless of its Hamming weight) for any given syndrome; a coset element corresponding to the syndrome is randomly selected. This is given to the decoder as an input. Finally, the decoder finds a different error vector with a small Hamming weight for different inputs.

Algorithm 3 Decoding for Modified RM Code
function Decode(s; H)
    r ← Prange(H, s)
    while True do
        r ← r + random codeword
        c ← ModDec(r, r, m)
        if wt(r + c) ≤ w then
            Output r + c
        end if
    end while
end function

function ModDec(y, r, m)
    y ← $y^{\sigma^{-1}}$
    if r = 0 then
        Output MD decoding on RM(0, m)
    else if r = m then
        Output MD decoding on RM(r, r) or replaced $(2^r, k_{rep})$ code
    else
        (y′|y″) ← y
        $y_v$ = y′ · y″
        v̂ ← ModDec($y_v$, r − 1, m − 1)
        $y_u$ ← (y′ + y″ · v̂)/2
        û ← ModDec($y_u$, r, m − 1)
        y ← (û|û · v̂)
    end if
    Output $y^\sigma$
end function
*σ is $\sigma_p^1$ or $\sigma_p^2$ for a permuted block and the identity otherwise. In ModDec(r, r, m), the first argument is the received vector r and the second is the RM order r.

C. PROPOSED SIGNATURE SCHEME
Herein, the proposed modified pqsigRM signature scheme using the codes in the previous section is presented. Its decoding algorithm is presented in Section III-B.

1) KEY GENERATION
Let G be the generator matrix of a modified (n, k) RM code, and H be the parity check matrix. Let S be an (n − k) × (n − k) random non-singular matrix and Q be an n × n random permutation matrix. Then, the public key is H′ = SHQ, and the secret keys are H, S, and Q.

2) SIGNING
To sign a given message m, we randomly select a coin i from $\{0, 1\}^{\lambda_0}$. A binary vector s = h(h(m|H′)|i) is calculated, where $h : \{0, 1\}^* \to \{0, 1\}^{n-k}$ is a cryptographic hash function. Our goal is to find the error vector e satisfying $H'e^T = SHQe^T = s$. Let $s' = S^{-1}s$.
Performing the decoding as in Algorithm 3, we find an error vector e′ satisfying $He'^T = s'$. If wt(e′) ≤ w, we compute $e^T = Q^{-1}e'^T$, and the signature is then given as (m, e, i).

3) VERIFICATION
If wt(e) ≤ w and $H'e^T = h(h(m|H')|i)$, we return ACCEPT; otherwise, we return REJECT.
The key generation, signing, and verification processes are summarized in Algorithm 4. For simplicity, let H represent all the secrets such as partial permutations $\sigma_p^1$ and $\sigma_p^2$, appended rows, and replaced codes. It should be noted that in the signing process, we choose a random coset element and perform ModDec(·). As ModDec(·) returns different outputs for different inputs even in the same coset, we can achieve randomized decoding. The output distribution of this randomized decoding is analyzed in Section V. We add a salt of length $\lambda_0$ to obtain a tight security proof.

Algorithm 4 Modified pqsigRM Signature Scheme
Key Generation:
    Using $\sigma_p^1$ and $\sigma_p^2$, generate a partially permuted generator matrix G
    Generate H from G
    Generate S and Q
    Compute H′ ← SHQ
    Secret key: H, S, Q
    Public key: H′
Signing:
    m is a message to be signed
    i ↩ $\{0, 1\}^{\lambda_0}$
    Find syndrome s ← h(h(m|H′)|i)
    $s'^T \leftarrow S^{-1}s^T$
    Perform decoding e′ ← Decode(s′; H)
    Compute $e^T \leftarrow Q^{-1}e'^T$, and then the signature is (m, e, i)
Verification:
    Check wt(e) ≤ w ∧ $H'e^T = h(h(m|H')|i)$
    If True, then return ACCEPT; else, return REJECT

A. DECODING ONE OUT OF MANY
Information set decoding is a brute-force attack method that finds an error vector e such that $He^T = s$ and wt(e) ≤ w; Stern improved the attack complexity in [14]. It has been extensively studied, and Dumer's algorithm [38] as well as more involved variants in [39], [40] have been proposed.
In the variants of the CFS signature scheme, there are several hash queries. Therefore, to launch a forgery attack, it suffices to find an error vector with small Hamming weight for any of the syndromes. Hence, the decoding problem DOOM given below is adequate for a tight security proof. The usual FDH proof for existential forgery using syndrome decoding would require a work factor $\ge q_H \cdot 2^\lambda$, where $q_H \le 2^\lambda$ is the number of hash queries. However, with DOOM, the work factor is required to be $\ge 2^\lambda$. Although the work factor of DOOM is greater than that of syndrome decoding, it provides tighter bounds for security.
Problem 1 (DOOM):
Instance: A parity check matrix $H \in \mathbb{F}_2^{(n-k)\times n}$ of an (n, k) linear code, syndromes $s_1, s_2, \cdots, s_q \in \mathbb{F}_2^{n-k}$, and an integer w.
Output: $(e, i) \in \mathbb{F}_2^n \times [1, q]$ such that wt(e) ≤ w and $He^T = s_i^T$.
We consider the case in which the adversary has q instances and $M = \max\left(1, \binom{n}{w}/2^{n-k}\right)$ solutions for each instance. Of course, in our case, w is not small, and thus M is $\binom{n}{w}/2^{n-k}$.
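The number of solutions M can be evaluated on a logarithmic scale without overflowing floating-point arithmetic. The sketch below is illustrative only: the function name is hypothetical and the toy parameters are assumptions, since the scheme's actual parameter table is not reproduced in this section; real values of n, k, and w can be substituted directly.

```python
from math import comb, log2


def log2_expected_solutions(n: int, k: int, w: int) -> float:
    """Return log2(M) for M = max(1, C(n, w) / 2^(n - k))."""
    log2_ratio = log2(comb(n, w)) - (n - k)
    return max(0.0, log2_ratio)


# Toy parameters, assumed for illustration.
print(log2_expected_solutions(8, 4, 1))  # C(8,1)/2^4 < 1, so M = 1
print(log2_expected_solutions(8, 4, 4))  # C(8,4)/2^4 = 70/16
```

Working with log2(M) keeps the computation exact on the binomial side (Python integers) and only converts to floating point at the end.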
if the adversary knows only the public key, we have a strong-key substitution attack. Both polynomial-time weak- and strong-key substitution attacks on the CFS signature scheme were proposed in [21]. A modification of the CFS scheme that resists such attacks was also proposed in [21]. In this modification, the syndrome s is generated by hashing the message, counter, and public key, rather than hashing only the message and counter. It has been demonstrated that this modified CFS signature scheme is secure against key substitution attacks [18]. In the modified pqsigRM, the syndrome is given as s = h(h(m|H′)|i), and thus it is also secure against key substitution attacks.

same space. For all n ≥ 0, we have
$$\rho(\mathcal{D}_1^0 \otimes \cdots \otimes \mathcal{D}_n^0,\ \mathcal{D}_1^1 \otimes \cdots \otimes \mathcal{D}_n^1) \le \sum_{i=1}^{n} \rho(\mathcal{D}_i^0, \mathcal{D}_i^1).$$
Definition 3 (Computational Distance and Indistinguishability): The computational distance between two distributions $\mathcal{D}^0$ and $\mathcal{D}^1$ in time t is
$$\rho_c(\mathcal{D}^0, \mathcal{D}^1) := \frac{1}{2}\max_{|\mathcal{A}| \le t}\left(\mathrm{Adv}^{\mathcal{D}^0, \mathcal{D}^1}(\mathcal{A})\right),$$
where $|\mathcal{A}|$ denotes the running time of $\mathcal{A}$, and $\mathrm{Adv}^{\mathcal{D}^0, \mathcal{D}^1}$ is
TABLE 2. Parameters for each security level.

2) PROOF OF EUF-CMA SECURITY
Let $S_{pqsigRM}$ denote the proposed modified pqsigRM. The following definitions as well as the theorem and its proof are adopted from those in [29], [35].
Definition 5 (Challenger Procedures in the EUF-CMA Game): The challenger procedures in the EUF-CMA game corresponding to $S_{pqsigRM}$ are defined as follows:

proc Init(λ)
    (PK, SK) ← Gen($1^\lambda$)
    H′ ← PK
    (H, S, Q) ← SK
    return H′

proc Hash(m, i)
    return h(m, i)

proc Sign(m)
    i ↩ $\{0, 1\}^{\lambda_0}$
    s ← Hash(m, i)
    e ← Decode($S^{-1}s^T$; H)
    return (eQ, i)

proc Finalize(m, e, i)
    s ← Hash(m, i)
    return $H'e^T = s^T \wedge wt(e) = w$

We note that the procedures in Definition 5 simplify Algorithm 4. We can now modify the security reduction in [29], [35] and prove the EUF-CMA security of the modified pqsigRM as follows.
Theorem 1 (Security Reduction): Let $\mathrm{Succ}^{\mathrm{EUF\text{-}CMA}}_{S_{pqsigRM}}(t, q_H, q_\Sigma)$ be the success probability of the EUF-CMA game corresponding to $S_{pqsigRM}$ for time t when the number of queries to the hash oracle (resp. signing oracle) is $q_H$ (resp. $q_\Sigma$). Then, in the random oracle model, we have for all t
$$\mathrm{Succ}^{\mathrm{EUF\text{-}CMA}}_{S_{pqsigRM}}(t, q_H, q_\Sigma) \le 2\,\mathrm{Succ}^{n,k,q_H,w}_{\mathrm{DOOMHull}}(t_c) + q_H\,\mathbb{E}_{H'}\left(\rho(\mathcal{D}_w^{H'}, \mathcal{U}_s)\right) + q_\Sigma\,\rho(\mathcal{D}_w, \mathcal{U}_w) + \rho_c(\mathcal{D}_{pub}, \mathcal{D}_{rand})(t_c) + \frac{1}{2^\lambda},$$
where $t_c = t + O(q_H \cdot n^2)$, $\mathcal{D}_w^{H'}$ is the distribution of the syndromes $H'e^T$ when e is drawn uniformly from the binary vectors of weight w, $\mathcal{U}_s$ is the uniform distribution over $\mathbb{F}_2^{n-k}$, $\mathcal{D}_w$ is the distribution of the decoding result of Algorithm 3, $\mathcal{U}_w$ is the uniform distribution over the binary vectors of weight w, $\mathcal{D}_{rand}$ is the uniform distribution over the random codes with high-dimensional hull, and $\mathcal{D}_{pub}$ is the uniform distribution over the public keys of modified pqsigRM.
Proof: Let $\mathcal{A}$ be a $(t, q_H, q_\Sigma, \epsilon)$-adversary against $S_{pqsigRM}$, and let $(H_0, s_1, \ldots, s_{q_H})$ be a random instance of DOOM with high-dimensional hull for the parameters n, k, $q_H$, and w. We stress that $s_1, \ldots, s_{q_H}$ are random independent vectors of $\mathbb{F}_2^{n-k}$. Let $P(S_i)$ denote the probability that $\mathcal{A}$ wins Game i.
Game 0 is the EUF-CMA game for $S_{pqsigRM}$.
Game 1 is the same as Game 0 except for the following failure event F: There is a collision in a signature query. From the difference lemma in [41], we have
$$P(S_1) \le P(S_0) + P(F). \quad (1)$$
The following lemma is from [35].
Lemma 2: For $\lambda_0 = \lambda + 2\log_2(q_\Sigma)$, we have $P(F) \le \frac{1}{2^\lambda}$.
Game 2 is obtained from Game 1 by changing Hash and Sign as follows, where $S_w$ denotes the set of vectors with Hamming weight w in $\mathbb{F}_2^n$:

proc Hash(m, i)
    if i ∈ $L_m$
        $e_{m,i}$ ↩ $S_w$
        return $H'e_{m,i}^T$
    else
        j ← j + 1
        return $s_j$

proc Sign(m)
    i ← $L_m$.next()
    s ← Hash(m, i)
    e ← Decode($S^{-1}s^T$; H)
    return (eQ, i)

Index j is initialized to 0 in the Init procedure. We introduce the list $L_m$, which contains $q_H$ random elements of $\mathbb{F}_2^{\lambda_0}$ for each message m. The list is sufficiently large so that all queries are satisfied. The Hash procedure returns $H'e_{m,i}^T$ if and only if i ∈ $L_m$; otherwise, it returns $s_j$. The Sign process is unchanged unless i ∈ $L_m$.
The statistical distance between the syndromes generated by matrix H′ and the uniform distribution over $\mathbb{F}_2^{n-k}$ is $\rho(\mathcal{D}_w^{H'}, \mathcal{U}_s)$. This is the difference between Hash in Game 1 and Game 2 when i ∈ $L_m$. There are at most $q_H$ such instances. Thus, by Proposition 1, it follows that
$$P(S_2) \le P(S_1) + q_H\,\mathbb{E}_{H'}\left(\rho(\mathcal{D}_w^{H'}, \mathcal{U}_s)\right). \quad (2)$$
Game 3 is obtained from Game 2 by replacing Decode with $e_{m,i}$ in the Sign procedure as follows:

Game 3
proc Sign(m)
    i ← $L_m$.next()
    s ← Hash(m, i)
    e ← $e_{m,i}$
    return (e, i)

Game 5
proc Finalize(m, e, i)
    s ← Hash(m, i)
    b ← $H'e^T = s^T \wedge wt(e) = w$
    return b ∧ (i ∉ $L_m$)

e is drawn according to the proposed decoding algorithm Decode in Game 2, whereas it is now drawn according to the uniform distribution $\mathcal{U}_w$. By Proposition 1, we have
$$P(S_3) \le P(S_2) + q_\Sigma\,\rho(\mathcal{D}_w, \mathcal{U}_w). \quad (3)$$
Game 4 is the game in which H′ is replaced with $H_0$. This implies that the adversary is forced to construct a solution for DOOM with high-dimensional hull. Here, if a difference between Game 3 and Game 4 is detected, then this yields a distinguisher between $\mathcal{D}_{pub}$ and $\mathcal{D}_{rand}$. According to [29], the cost to call Hash does not exceed $O(n^2)$, and thus the running
time of the challenger is $t_c = t + O(q_H \cdot n^2)$. Therefore, we have
$$P(S_4) \le P(S_3) + \rho_c(\mathcal{D}_{pub}, \mathcal{D}_{rand})(t_c). \quad (4)$$
Game 5 is modified in Finalize. The success of Game 5 implies i ∉ $L_m$ and the success of Game 4. A valid forgery $m^*$ has never been queried by Sign, and the adversary has never accessed $L_{m^*}$. As there are $q_\Sigma$ signing queries, we have
$$P(S_5) = (1 - 2^{-\lambda_0})^{q_\Sigma} P(S_4).$$
Moreover, $(1 - 2^{-\lambda_0})^{q_\Sigma} \ge \frac{1}{2}$ because we assumed $\lambda_0 = \lambda + 2\log_2(q_\Sigma)$. Thus, this can be simplified to
$$P(S_5) \ge \frac{1}{2} P(S_4). \quad (5)$$
$P(S_5)$ is the probability that $\mathcal{A}$ returns a solution for DOOM with high-dimensional hull, which yields
$$P(S_4) \le 2\,\mathrm{Succ}^{n,k,q_H,w}_{\mathrm{DOOMHull}}(t_c). \quad (6)$$
Combining (1)–(6) concludes the proof.

V. INDISTINGUISHABILITY OF CODE AND SIGNATURE IN THE PROPOSED SCHEME
It is challenging to prove the hardness of distinguishing a public code of a code-based cryptographic algorithm from a random code. For this reason, several cryptographic algorithms are designed by assuming this hardness. In this section, we consider possible attack algorithms and the difficulty of distinguishing the public code. Moreover, the difficulty of distinguishing signatures from random errors is also analyzed.

A. MODIFICATIONS OF PUBLIC CODE
For successful decoding of any received vector, a (U, U + V)-code should be used in the modified RM codes. To resist the attack on (U, U + V)-codes proposed in [29], we design a code with high-dimensional hull. Generally, the expected dimension of the hull of a random code is O(1), and it is smaller than d with probability $\ge 1 - O(2^{-d})$ [25]. This is a difference between random and public codes. However, there is currently no algorithm for solving the syndrome decoding problem by taking advantage of the hull. We consider that a high-dimensional hull is not a significant drawback unless the hull has a certain structure that may reveal the secret. Moreover, in [25], it is demonstrated that there are a large number of codes with the high-dimensional hull. Hence, we can expect the one-wayness of DOOM with the high-dimensional hull as in Definition 4.
Cryptanalysis using hulls is widely used in code-based cryptography. However, this is valid only if the hull has a specific structure that allows information leakage about the secret key. Therefore, using only the fact that the dimension of the hull is large, it is difficult to distinguish whether the code is a public code or a random code with the high-dimensional hull.
The EUF-CMA security proof requires the indistinguishability between public and random codes, i.e., $\rho_c(\mathcal{D}_{pub}, \mathcal{D}_{rand})(t_c)$ should be negligible. We will discuss the design methodology and how these modifications can ensure indistinguishability.
Considering the key recovery attack in [29], a (U, U + V)-code used in code-based crypto-algorithms should have a high-dimensional hull for security. Even though the public code of the proposed signature scheme is not a (U, U + V)-code, it should contain a (U, U + V) subcode for efficient decoding.
The attack on SURF in [29] uses the fact that for any (U, U + V)-code, the hull of the public code is highly probable to have a (u|u) structure when $U^\perp \cap V = \{0\}$ and dim(U) ≥ dim(V). This (u|u) reveals information about the secret permutation Q and enables the attacker to locate the U and U + V codes. To avoid this, we should maintain the high dimension of $U^\perp \cap V$, implying that the public code should have a high-dimensional hull. Hence, we define DOOM with high-dimensional hull and assume that the public code of pqsigRM is indistinguishable from a random code with a hull of the same dimension as that of the public code, rather than any random linear code.
Moreover, $k_{app}$ random rows are appended to the generator matrix, and $2^r$ rows of the generator matrix, that is, the repeated RM(r, r), are replaced by $k_{rep}$ random rows; furthermore, a codeword from the dual code is appended to the generator matrix. These modifications are equivalent to increasing the dimension of the code itself, the hull, and the dual of the code, respectively, by appending random codewords. Moreover, by adding random codewords, the code is no longer a (U, U + V)-code, and thus distinguishing attacks are more difficult to perform.
We now explain the rationale for the aforementioned modifications, which are applied in addition to partial permutation.

1) $k_{app}$ RANDOM ROWS ARE APPENDED TO THE GENERATOR MATRIX
The Hamming weights of a random code are binomially distributed. However, the partially permuted RM code has only codewords with even Hamming weight. This is because the Hamming weights of codewords of RM(r, m) are even numbers, and partial permutations do not affect parity.
By appending a random row with odd Hamming weight to the generator matrix, the Hamming weights of the public code become distributed binomially. The problem is that if only one row with odd Hamming weight is appended, it can easily be extracted. This can be resolved by appending more than one codeword. Hence, we append $k_{app}$ random rows such that at least one has odd Hamming weight. By the nature of the decoding process, it is still possible to decode the resulting code.

2) APPENDING A RANDOM CODEWORD OF THE DUAL CODE TO THE GENERATOR MATRIX
The Hamming weights of the codewords in the hull of the partially permuted RM code are only multiples of four. However,
the Hamming weight of the codewords in the hull of a random code may be an arbitrary even number, not only a multiple of four. As in the previous modification, a random codeword is appended to the hull. Thereby, we force the codewords of the hull of the public code to have arbitrary even Hamming weights. As a row randomly appended to the generator matrix is unlikely to fall in the hull, appending a codeword to the hull is more complicated. The following is the process for appending a random codeword to the hull.

Let hull(C) be the hull of a code C. We define C′ and C″ by C = hull(C) + C′ and C^⊥ = hull(C) + C″, where hull(C), C′, and C″ are linearly independent. We can then generate a code with a hull of dimension dim(hull(C)) + 1 by the following procedure:
i) Find a codeword c_dual ∈ C″ such that c_dual · c_dual = 0. This is easy because any codeword with even Hamming weight satisfies it.
ii) Let C_inc = C + {c_dual} = (hull(C) + {c_dual}) + C′.
iii) As c_dual · (hull(C) + {c_dual}) = {0} and c_dual · C′ = {0}, we have c_dual ∈ C_inc^⊥, where for a vector x and a set of vectors A, x · A is the set of all inner products of x and elements of A.
iv) It can be seen that C_inc ∩ C_inc^⊥ = hull(C) + {c_dual}. Hence, C_inc is a code whose hull has dimension dim(hull(C)) + 1.
If the Hamming weights of the codewords of the hull are only multiples of 4, then another c_dual is selected, and the above process is repeated.

3) REPEATED RM(r,r) IS REPLACED WITH RANDOM (2^r, k_rep) CODES
We note that by replacing the repeated RM(r,r) by random (2^r, k_rep) codes, the dimension of the code is reduced by 2^r − k_rep; this is equivalent to appending 2^r − k_rep rows to the parity check matrix. The dual code of the partially permuted RM code has only codewords of even Hamming weight owing to a subcode of the partially permuted RM code. This can be resolved by replacing this subcode with another random code such that its MD decoder exists. The partially permuted RM code includes (RM(r,r) | ... | RM(r,r)), and the dual code of this has only codewords of even Hamming weight by the proposition below. It is easy to verify that the dual code of the partially permuted RM code is a subset of the dual code of (RM(r,r) | ... | RM(r,r)). That is, (RM(r,r) | ... | RM(r,r)) causes the dual code of the partially permuted RM code to have only codewords of even Hamming weight.

Proposition 2: Let C be a code such that its dual code has only codewords of even Hamming weight. Then, the dual of the concatenated code, {(c|c) | c ∈ C}, has only codewords of even Hamming weight.
Proof: Let h ∈ (C|C)^⊥, where C is an (n, k) code and C|C is a concatenated code given as {(c|c) | c ∈ C}. We define vectors h1 and h2 of length n so that h = (h1|h2). Clearly, if h1 ∈ C^⊥, then h2 ∈ C^⊥, and both have even Hamming weight. If h1 ∉ C^⊥, we have h1 · c + h2 · c = 0, i.e., h1 · c = h2 · c, for all c ∈ C. This implies that h1 + h2 ∈ C^⊥, so wt(h1 + h2) is even. As wt(h) = wt(h1) + wt(h2) ≡ wt(h1 + h2) (mod 2), wt(h) is even.

By replacing the repeated RM(r,r) with a random code such that its dual code has codewords of odd Hamming weight, we can force the dual of the public code to have codewords with odd Hamming weight.
Clearly, the dual code of RM(r,r) is {0}. We replace RM(r,r) with a random (2^r, k_rep) code. We note that the dual code of this (2^r, k_rep) code must have codewords with odd Hamming weight. The generator matrix is modified in this manner, rather than by appending rows to the parity check matrix, to ensure that the entire code is decodable.

B. PUBLIC CODE INDISTINGUISHABILITY
In the EUF-CMA security proof, ρ_c(D_pub, D_rand) is required to be negligible, that is, the modified RM code distinguishing problem should be hard. As it is challenging to find the computational distance between public and random codes, in this section, we study the randomness of the public code and consider possible attacks.

1) PUBLIC CODE IS NOT A (U, U + V)-CODE
After random rows have been appended to the generator matrix of a (U, U + V)-code, the resulting code is unlikely to be a (U, U + V)-code. Considering the following proposition, it can be seen that with probability O(2^(k_U − n/2)), a (U, U + V)-code remains a (U, U + V)-code after a row has been appended to its generator matrix.
Proposition 3: Let C be a (U, U + V)-code. Then, for all codewords (c′|c″) ∈ C, (0|c′ − c″) ∈ C.
It is expected that attacking the modified RM code is difficult because the appended codewords change the algebraic structure of the code (i.e., the (U, U + V) structure), there is considerable randomness, and there is currently no recovery algorithm.

2) DISTINGUISHING USING HULL
When a random row is appended to the generator matrix, it is unlikely to be included in the hull. For it to be included, the appended row should be a codeword of the dual code, and its inner product with itself should be zero. Hence, we append a codeword from the dual code to the generator matrix.
The appended row can be omitted when the attacker collects several independent codewords with Hamming weight 4 from the hull. However, the same process can be applied to any random code with a high-dimensional hull, and finally, there remain only codewords whose Hamming weight is a multiple of 4. Hence, this is not a valid distinguishing attack.
The hull of a random (U, U + V)-code is {0} when k_U < k_V and is highly likely to contain codewords of the form (u|u) when k_U ≥ k_V. However, the hull of an RM code is also an RM code, and in our case, the partial permutation randomizes its hull and retains its large dimension. As shown in Section VI, the hull is neither a subcode of the RM code nor
a (U, U + V)-code. Moreover, most of the hull depends on the secret partial permutations σp1 and σp2.

C. SIGNATURE LEAKS
In the EUF-CMA security proof, it is required that ρ(D_w, U_w) is a negligible function of the security parameter λ. If this is true, then the signature does not leak information. In several signature schemes, such as Durandal, SURF, and Wave, this is achieved and proved. In SURF and Wave, the rejection sampling method is applied to render D_w indistinguishable.
To apply rejection sampling, the distribution of the decoding output should be known. In SURF and Wave, a simple and efficient decoding algorithm is used, and thus it is easy to find the distribution of the decoding output. However, in our case, the decoding output exhibits a high degree of randomness, and the structure of the decoder is complex. Therefore, it is difficult to analyze the distribution of the decoding output.
Instead, we conduct a proof-of-concept implementation of the modified pqsigRM using SageMath. Then, we perform statistical randomness tests under NIST SP 800-22 [42] on the decoding output, and we compare the results with random errors in F_2^n with Hamming weight w. No significant difference is observed. However, it should be noted that the success of a statistical randomness test does not imply indistinguishability. Thus, the indistinguishability of the signature should be rigorously studied as future work.

VI. PARAMETER SELECTION
A. PARAMETER SETS
The constraint here is that n is a power of two. We can numerically find the feasible ranges of w once n and k are determined. If the security level λ is achieved in this range, we accept the value; otherwise, we increase n. Considering DOOM, a smaller value of w implies higher security. If w is so small that a large number of decoding iterations are required, we can reduce the partial permutation parameter p. The parameter p is at most n/4, and the characteristics of the codes are retained when p is lowered to a certain degree. The method for obtaining the minimum values is described in the following subsection. The state-of-the-art algorithm for DOOM discussed earlier is used as a basis for the parameters proposed in Table 2. We set k_app = 2 (the minimum value) and k_rep = 2^r − 2 (the maximum value).
Regarding the key size, the public key is a parity check matrix given in the systematic form and requires (n−k)n bits. The secret key does not include a scrambler matrix S because it can be obtained from H′, Q, and H. Moreover, H can be represented by σp1, σp2, the replacing code, and the appended rows.
The comparison of parameter sets is given in Table 3. The key size of the proposed modified pqsigRM is small compared to other algorithms. We note that this comparison is for reference only; the actual parameter sizes vary with the trade-off against signing complexity and other factors. The security level in parallel-CFS is based on the generalized birthday algorithm [5], and the distinguisher for high-rate Goppa code [4] is not considered. For detailed information, see [3] and [35].

TABLE 3. Comparison of parameter sets of several code-based signature schemes for a given security level.

B. STATISTICAL ANALYSIS FOR DETERMINING NUMBER OF PARTIAL PERMUTATIONS
If w is excessively small, there is a low probability of finding an error vector with Hamming weight less than or equal to w. We present two solutions. One is iterating until an appropriate error vector is obtained, and the other is improving the decoder. The number p of columns permuted in the partial permutation varies from 0 to n/4. From numerical analysis, it is demonstrated that small values of p result in low Hamming weight of the decoding output. However, it should be noted that when p = 0, the (U, U + V) part of the modified RM codes becomes identical to the RM code except that RM(r,r) is replaced. Hence, we propose a lower bound on p that does not affect the randomness of the hull.
Regarding the modified RM code, its hull overlaps with (but is not a subset of) the original RM code. If the hull is a subset of the original RM code, and its dimension is large, the codeword of minimum Hamming weight of the original RM code may be included in the hull. Then, attacks such as the Minder–Shokrollahi attack may be applied using codewords with minimum Hamming weight. Therefore, to prevent attacks, the hull of the public code should not be a subset of the original RM code, and hull(C_pub) \ (RM(r,m) permuted by Q) should occupy a large portion of the hull, where C_pub denotes the public code, and \ denotes the relative complement.
As the permutation Q is not important for determining the parameter p, we ignore it in this subsection, and the term permutation refers to the partial permutations σp1 and σp2. When p = n/4, which implies that σp1 and σp2 are full permutations, the average dimension of the hull and the dimension of hull(C_pub) \ RM(r,m) are given in Table 4. The values may slightly change according to the permutation.
If p is small, the Hamming weight of the errors decreases. Hence, the signing time can be reduced by using partial permutation with p rather than full permutation. The aim is to find a smaller value of p that keeps the dimension of hull(C_pub) \ RM(r,m) as large as that obtained with the full permutation. It can be seen that the average of the dimension of hull(C_pub) \ RM(r,m) tends to increase as p increases, and it saturates when p is above a certain value, as in Figure 3. Specifically, the dimension of hull(C_pub) \ RM(r,m) saturates when p is approximately equal to the average dimension of hull(C_pub) \ RM(r,m) with full permutation. Hence, we determine p as 130, 386, and 562 in Table 2.
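As a rough illustration of the hull dimensions discussed in this section, the following Python sketch computes dim(hull(C)) for a binary code from a generator matrix, using the standard identity dim(hull(C)) = k − rank(G G^T) for a full-rank k × n generator matrix G. It is not the authors' SageMath implementation: the function names (rm_generator, gf2_basis, hull_dimension) and the bitmask representation of rows are assumptions made for this example, and the partial permutations, replaced codes, and appended rows of the modified RM code are not modelled.

```python
from itertools import combinations


def rm_generator(r, m):
    """Generator rows of RM(r, m); each row is an int whose bit j is coordinate j."""
    n = 1 << m
    rows = []
    for degree in range(r + 1):
        for variables in combinations(range(m), degree):
            row = 0
            for point in range(n):
                # Evaluate the monomial prod_{i in variables} x_i at `point`.
                if all((point >> i) & 1 for i in variables):
                    row |= 1 << point
            rows.append(row)
    return rows


def gf2_basis(rows):
    """Return linearly independent rows spanning the same GF(2) row space."""
    basis = {}  # leading-bit position -> row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]  # cancel the leading bit and keep reducing
    return list(basis.values())


def hull_dimension(rows):
    """dim(C ∩ C^⊥) = k - rank(G G^T) for a full-rank generator matrix G."""
    g = gf2_basis(rows)
    k = len(g)
    gram = []
    for a in g:
        gram_row = 0
        for j, b in enumerate(g):
            # Inner product over GF(2): parity of the overlap of a and b.
            if bin(a & b).count("1") % 2:
                gram_row |= 1 << j
        gram.append(gram_row)
    return k - len(gf2_basis(gram))


if __name__ == "__main__":
    for r, m in [(1, 3), (2, 4), (2, 5)]:
        print(f"RM({r},{m}): hull dimension = {hull_dimension(rm_generator(r, m))}")
```

For an unmodified RM(2,4), whose dual is RM(1,4), the sketch reports the dimension of RM(1,4), in line with the statement that the hull of an RM code is again an RM code. Applying the same function to a generator matrix of a partially permuted code would be one way to reproduce the kind of averages reported for different values of p, under the assumption that such a matrix is available as a list of bitmask rows.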
[33] N. Aragon, O. Blazy, P. Gaborit, A. Hauteville, and G. Zémor, "Durandal: A rank metric based signature scheme," in Proc. Annu. Int. Conf. Theory Appl. Cryptograph. Techn. Darmstadt, Germany: Springer, 2019, pp. 728–758.
[34] C. Gentry, C. Peikert, and V. Vaikuntanathan, "Trapdoors for hard lattices and new cryptographic constructions," in Proc. 14th Annu. ACM Symp. Theory Comput. STOC, 2008, pp. 197–206.
[35] T. Debris-Alazard, N. Sendrier, and J.-P. Tillich, "Wave: A new family of trapdoor one-way preimage sampleable functions based on codes," in Proc. Int. Conf. Theory Appl. Cryptol. Inf. Secur. Kobe, Japan: Springer, 2019, pp. 21–51.
[36] P.-L. Cayrel, A. Otmani, and D. Vergnaud, "On Kabatianskii-Krouk-Smeets signatures," in Proc. Int. Workshop Arithmetic Finite Fields. Madrid, Spain: Springer, 2007, pp. 237–251.
[37] A. Otmani and J.-P. Tillich, "An efficient attack on all concrete KKS proposals," in Proc. Int. Workshop Post-Quantum Cryptography. Taipei, Taiwan: Springer, 2011, pp. 98–116.
[38] I. Dumer, "On minimum distance decoding of linear codes," in Proc. 5th Joint Soviet-Swedish Int. Workshop Inform. Theory, 1991, pp. 50–52.
[39] A. Becker, A. Joux, A. May, and A. Meurer, "Decoding random binary linear codes in 2^(n/20): How 1 + 1 = 0 improves information set decoding," in Proc. Annu. Int. Conf. Theory Appl. Cryptograph. Techn. Cambridge, U.K.: Springer, 2012, pp. 520–536.
[40] A. May and I. Ozerov, "On computing nearest neighbors with applications to decoding of binary linear codes," in Proc. Eurocrypt. Sofia, Bulgaria: Springer, 2015, pp. 203–228.
[41] V. Shoup, "Sequences of games: A tool for taming complexity in security proofs," IACR Cryptol. ePrint Arch., vol. 2004, p. 332, 2004.
[42] I. Lawrence et al., "SP 800-22 Rev. 1a. A statistical test suite for random and pseudorandom number generators for cryptographic applications," Nat. Inst. Stand. Technol., Gaithersburg, MD, USA, Tech. Rep. SP 800-22 Rev. 1a, 2010. Accessed: Sep. 14, 2020.

YONGWOO LEE (Graduate Student Member, IEEE) received the B.S. degree in electrical engineering and computer science from the Gwangju Institute of Science and Technology, Gwangju, South Korea, in 2015, and the M.S. degree in electrical and computer engineering from Seoul National University, in 2017, where he is currently pursuing the Ph.D. degree. He is also a submitter for a candidate algorithm (pqsigRM) in the first round for the NIST Post Quantum Cryptography Standardization. His current research interests include homomorphic encryption and code-based cryptography.

WIJIK LEE received the B.S. and Ph.D. degrees in electrical and computer engineering from Seoul National University. He has been with Samsung Electronics, Hwaseong, South Korea, since 2018. He is also a submitter for a candidate algorithm (pqsigRM) in the first round for the NIST Post Quantum Cryptography Standardization. His research interests include post-quantum cryptography and homomorphic encryption.

YOUNG SIK KIM (Member, IEEE) received the B.S., M.S., and Ph.D. degrees in electrical engineering and computer science from Seoul National University, in 2001, 2003, and 2007, respectively. He joined the Semiconductor Division, Samsung Electronics, where he performed research and development of security hardware IPs for various embedded systems, including modular exponentiation hardware accelerator (called Tornado 2MX2) for RSA and elliptic curve cryptography in smart card products and mobile application processors of Samsung Electronics, until 2010. He is currently a Professor with Chosun University, Gwangju, South Korea. He is also a submitter for two candidate algorithms (McNie and pqsigRM) in the first round for the NIST Post Quantum Cryptography Standardization. His research interests include post-quantum cryptography, the IoT security, physical layer security, data hiding, channel coding, and signal design. He is selected as one of 2025's 100 Best Technology Leaders (for Crypto-Systems) by the National Academy of Engineering of Korea.

JONG-SEON NO (Fellow, IEEE) received the B.S. and M.S.E.E. degrees in electronics engineering from Seoul National University, Seoul, South Korea, in 1981 and 1984, respectively, and the Ph.D. degree in electrical engineering from the University of Southern California, Los Angeles, CA, USA, in 1988. He was a Senior MTS with Hughes Network Systems, from 1988 to 1990. He was an Associate Professor with the Department of Electronic Engineering, Konkuk University, Seoul, from 1990 to 1999. He joined the faculty of the Department of Electrical and Computer Engineering, Seoul National University, in 1999, where he is currently a Professor. His research interests include error-correcting codes, cryptography, sequences, LDPC codes, interference alignment, and wireless communication systems. He became an IEEE Fellow through the IEEE Information Theory Society in 2012. He became a member of the National Academy of Engineering of Korea (NAEK), in 2015, where he is currently the Division Chair of Electrical, Electronic, and Information Engineering. He was a recipient of the IEEE Information Theory Society Chapter of the Year Award in 2007. From 1996 to 2008, he has served as the Founding Chair of the Seoul Chapter of the IEEE Information Theory Society. He was the General Chair of Sequence and Their Applications 2004 (SETA 2004), Seoul. He also served as the General Co-Chair of the International Symposium on Information Theory and Its Applications 2006 (ISITA 2006) and the International Symposium on Information Theory 2009 (ISIT 2009), Seoul. He has served as the Co-Editor-in-Chief for the IEEE JOURNAL OF COMMUNICATIONS AND NETWORKS from 2012 to 2013.