Cryptanalysis of a One-Time Code-Based Digital Signature Scheme
Paolo Santini, Marco Baldi, Franco Chiaraluce

TL;DR
This paper demonstrates a practical key recovery attack on a recent one-time code-based digital signature scheme, exploiting signature sparsity and statistical analysis to significantly reduce the security level.
Contribution
It introduces a novel attack method that recovers secret keys from a single intercepted signature, challenging the claimed security of the scheme.
Findings
Successful key recovery with low complexity
Attack exploits signature sparsity and statistical analysis
Security level claims are significantly undermined
Abstract
We consider a one-time digital signature scheme recently proposed by Persichetti and show that a successful key recovery attack can be mounted with limited complexity. The attack we propose exploits a single signature intercepted by the attacker, and relies on a statistical analysis performed over such a signature, followed by information set decoding. We assess the attack complexity and show that a full recovery of the secret key can be performed with a work factor that is far below the claimed security level. The efficiency of the attack is motivated by the sparsity of the signature, which leads to a significant information leakage about the secret key.
Dipartimento di Ingegneria dell’Informazione
Università Politecnica delle Marche
Ancona, Italy
Email: [email protected], {m.baldi, f.chiaraluce}@univpm.it
Index Terms:
Code-based cryptography, cryptanalysis, digital signatures.
I Introduction
Code-based cryptosystems, introduced by McEliece in 1978 [1], rely on the hardness of the Syndrome Decoding Problem (SDP), which has been proven to be NP-complete for general random codes [2]. The best SDP solvers for general codes, known as Information Set Decoding (ISD) algorithms, were first introduced by Prange in 1962 [3] and have been significantly improved over the years (see [4, 5] and references therein). However, all current ISD algorithms have exponential complexity, even when implemented on quantum computers [6]. Since SDP is one of the oldest and most studied hard problems, and no polynomial-time solver is currently known, code-based cryptosystems are among the most promising solutions for post-quantum cryptography [7].
However, designing a secure and efficient digital signature scheme based on coding theory is still an open problem. The main difficulty is that, in these systems, the plaintext and ciphertext domains typically do not coincide. Therefore, applying decryption to a general string, for example one obtained through a hash function, may result in a failure unless special solutions are adopted. Proposals trying to address this issue have proven to be either inefficient or insecure (or both, in the worst cases). Clear evidence of the difficulty of finding efficient code-based digital signature schemes is that no proposal of this type has survived in the National Institute of Standards and Technology (NIST) competition for the standardization of post-quantum primitives [8].
Historically, the first digital signature scheme based on error correcting codes is the Courtois-Finiasz-Sendrier (CFS) scheme [9], which uses high rate Goppa codes and follows a hash-and-sign approach. This scheme is known to be impractical, since it has some security flaws (high rate Goppa codes can be distinguished from random codes [10]) and requires very large public keys and long signature times.
In particular, some schemes might suffer from statistical attacks, i.e., procedures that can break the system through the observation of a sufficiently large number of signatures. In such a case, the attacked systems are reduced to few-signatures schemes or, under the most conservative assumption, to one-time schemes (each key pair is refreshed after just one signature). For instance, the BBC+ scheme proposed in [11], which is based on low-density generator matrix (LDGM) codes, has been cryptanalyzed in [12] with a procedure that allows forging valid signatures after the observation of thousands of signatures, which limits the life of its key pairs [13]. Another recent proposal is Wave [14], based on generalized $(U, U+V)$ codes. A cryptanalysis procedure of Wave based on the statistical analysis of hundreds of signatures has been proposed in [15]. However, such a procedure has been disproved in [16], since it refers to a degraded version of the scheme.
In this paper, we consider a one-time signature scheme that was recently proposed by Persichetti [17]. Such a scheme is obtained as a modification of Stern’s identification protocol [18], and relies on Quasi-Cyclic (QC) codes, which allow for both compact keys and low computational complexity. However, as we show below, this scheme suffers from an attack that leads to a full recovery of the secret key and whose complexity is far below the claimed security level. Our attack is based on a statistical analysis performed on a single signature, combined with an ISD algorithm.
Another attack against the same scheme has been independently developed in [19]. The attack in [19] is based on Bit Flipping (BF) decoding, which has the advantage of being very fast compared to other decoders. However, the success probability of BF decoding cannot be predicted analytically. Moreover, in case of a decoding failure, it is not possible to perform further randomized decoding attempts through BF. Unlike [19], our attack exploits ISD, which permits us to obtain a closed-form formula for the average number of iterations and the corresponding complexity of a successful attack, both of which depend only on the system parameters. So, while the feasibility of the attack in [19] can only be assessed through numerical simulations, we do not rely on simulations: through a theoretical approach, we show that the security of the scheme is reduced to the complexity of an SDP instance, which is far below any reasonable security level. In particular, our analysis shows how the security of the system is related to the hardness of solving an SDP instance in which the weight of the searched vector is particularly low. Through this approach, we can make general statements about the effectiveness of the attack on modified parameter sets, showing that meaningful security levels cannot be achieved even with extreme parameter choices.
The paper is organized as follows. In Section II we introduce the notation used. In Section III we describe the scheme and its design strategy. In Section IV we describe our attack procedure and derive estimates of its complexity. Finally, in Section V we report some conclusions.
II Notation
We denote as the polynomial ring , where is an integer and is a symbolic variable. We use bold letters to denote vectors over , in the form , with . Each can be unambiguously represented as a vector in the form
[TABLE]
where is the -th coefficient of the -th polynomial, , in . Let be the binary field. Given a vector over , we denote as the vector obtained by lifting its entries over the integer domain ; the same notation is used for vectors of polynomials. Operations involving lifted vectors are performed in the integer domain (i.e., ). Given a polynomial , we define its Hamming weight, , as the number of its non-null coefficients. For a vector of polynomials , the Hamming weight corresponds to the sum of the Hamming weights of its elements. The support of a polynomial , denoted as , is the set containing the indexes of the non-null coefficients of . Clearly, the Hamming weight of a polynomial corresponds to the cardinality of its support.
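We denote as $\mathcal{D}_{n,w}$ the uniform distribution of all binary $n$-tuples with weight $w$. Then, the expression $\mathbf{a} \xleftarrow{\$} \mathcal{D}_{n,w}$ means that $\mathbf{a}$ is sampled according to $\mathcal{D}_{n,w}$, i.e., it is picked among all vectors of weight $w$, each with probability $1/\binom{n}{w}$. When $n = 2p$, we write $\mathbf{a}(x) \xleftarrow{\$} \mathcal{D}_{n,w}$ to mean that $\mathbf{a}(x)$ is picked in the same way among all elements of $\mathcal{R}^{2}$ with weight $w$.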
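As an illustration of this notation, the following Python sketch stores a polynomial as the list of its $p$ binary coefficients. It assumes that the ring is $\mathbb{F}_2[x]/(x^p+1)$, the usual choice for quasi-cyclic codes; all function names are ours and are only meant for illustration.

```python
import random

def poly_add(a, b):
    """Coefficient-wise sum over the binary field (XOR)."""
    return [ai ^ bi for ai, bi in zip(a, b)]

def poly_mul(a, b):
    """Product modulo x^p + 1 (assumed ring): exponents wrap around modulo p."""
    p = len(a)
    out = [0] * p
    for i in (i for i, ai in enumerate(a) if ai):
        for j in (j for j, bj in enumerate(b) if bj):
            out[(i + j) % p] ^= 1
    return out

def support(a):
    """Indexes of the non-null coefficients."""
    return {i for i, ai in enumerate(a) if ai}

def weight(a):
    """Hamming weight, i.e. the cardinality of the support."""
    return len(support(a))

def lifted_sum(a, b):
    """Sum after lifting to the integers, so that 1 + 1 gives 2 and not 0."""
    return [ai + bi for ai, bi in zip(a, b)]

def sample_fixed_weight(n, w, rng=random):
    """Draw a binary n-tuple of weight w uniformly at random."""
    v = [0] * n
    for i in rng.sample(range(n), w):
        v[i] = 1
    return v
```

For instance, with $p = 5$ the product of $1 + x$ and $x^4$ is $x^4 + x^5 = 1 + x^4$, since $x^5$ wraps around to $1$.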
III System description
The one-time digital signature scheme we are considering is built upon a public polynomial , that is fixed by the protocol. We denote as the function that takes as input a vector and outputs . The scheme additionally requires a hash function that takes as input and outputs a weight- polynomial. Parameters of the scheme are the integers , , , (with ).
The key generation is shown in Algorithm 1; the signing key (i.e., secret key) is a vector , such that . The verification key (i.e., public key) is obtained through the application of on the secret key. The signature generation and verification are shown, respectively, in Algorithms 2 and 3. The signature verification algorithm returns a boolean variable that is true when the signature is valid and false otherwise.
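Since the three algorithms are not reproduced in this text, the following Python sketch only conveys their general shape, as we understand the scheme in [17]. It rests on several assumptions: the ring is $\mathbb{F}_2[x]/(x^p+1)$; the public map sends a pair $(a_0, a_1)$ to $a_0 + h\,a_1$; signing draws a sparse ephemeral vector $y$, hashes the message together with the image of $y$ to obtain a sparse challenge $c$, and outputs $(c, z)$ with $z = c\,x + y$. The parameter names `w1`, `w2`, `delta` and the hash `hash_to_weight` (built here from SHA-256) are hypothetical stand-ins.

```python
import hashlib
import random

def poly_add(a, b):
    return [ai ^ bi for ai, bi in zip(a, b)]

def poly_mul(a, b):
    """Product modulo x^p + 1 (assumed ring)."""
    p = len(a)
    out = [0] * p
    for i in (i for i, ai in enumerate(a) if ai):
        for j in (j for j, bj in enumerate(b) if bj):
            out[(i + j) % p] ^= 1
    return out

def sample_pair(p, w, rng):
    """Weight-w vector of length 2p, split into two polynomials."""
    v = [0] * (2 * p)
    for i in rng.sample(range(2 * p), w):
        v[i] = 1
    return v[:p], v[p:]

def syndrome_map(h, a):
    """Assumed public map: (a0, a1) -> a0 + h * a1."""
    return poly_add(a[0], poly_mul(h, a[1]))

def hash_to_weight(msg, s_y, delta):
    """Hypothetical hash: SHA-256 seeds a PRNG that picks delta positions."""
    seed = int.from_bytes(hashlib.sha256(msg + bytes(s_y)).digest(), "big")
    c = [0] * len(s_y)
    for i in random.Random(seed).sample(range(len(s_y)), delta):
        c[i] = 1
    return c

def keygen(h, w1, rng):
    x = sample_pair(len(h), w1, rng)          # secret key
    return x, syndrome_map(h, x)              # (secret key, public key)

def sign(msg, x, h, w2, delta, rng):
    y = sample_pair(len(h), w2, rng)          # ephemeral sparse vector
    c = hash_to_weight(msg, syndrome_map(h, y), delta)
    z = tuple(poly_add(poly_mul(c, xk), yk) for xk, yk in zip(x, y))
    return c, z

def verify(msg, sig, s_x, h, w1, w2, delta):
    c, z = sig
    # Reject signatures that are too dense.
    if sum(z[0]) + sum(z[1]) > delta * w1 + w2:
        return False
    # The image of z equals c * s_x + s_y, so s_y can be recomputed.
    s_y = poly_add(syndrome_map(h, z), poly_mul(c, s_x))
    return hash_to_weight(msg, s_y, delta) == c
```

In this sketch, `verify` returns `True` for a valid signature and `False` otherwise. The weight bound `delta * w1 + w2` is the largest weight that `z` can reach when no cancellations occur.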
III-A Security analysis
The security of the scheme is based on the hardness of the SDP that, in the binary case, is defined as follows.
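Syndrome Decoding Problem (SDP)
Given $\mathbf{H} \in \mathbb{F}_2^{(n-k) \times n}$, $\mathbf{s} \in \mathbb{F}_2^{n-k}$ and $w \in \mathbb{N}$, find $\mathbf{e} \in \mathbb{F}_2^{n}$ such that $\mathbf{H}\mathbf{e}^{T} = \mathbf{s}$ and $\mathrm{wt}(\mathbf{e}) \leq w$.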
The SDP is a well-known problem in coding theory, and has been proven to be NP-complete [2]; in particular, the solution of the SDP can be unique only when does not exceed the Gilbert-Varshamov (GV) distance , that is defined as the greatest integer such that . When , the best solvers for SDP are ISD algorithms, whose complexity crucially depends on and on the code rate.
The security of the scheme is based on the fact that the inversion of requires the solution of an SDP instance. Let be the QC matrix obtained by concatenating the identity with the circulant matrix having as first column. Let and denote, respectively, the vectors associated to the secret and the public key: then, the following relation holds
[TABLE]
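The relation between the polynomial map and the QC matrix can be checked numerically on a toy instance. The sketch below assumes that the ring is $\mathbb{F}_2[x]/(x^p+1)$ and that the public map sends $(x_0, x_1)$ to $x_0 + h\,x_1$, so that the matrix is $[\mathbf{I} \mid \mathbf{C}(h)]$, with $\mathbf{C}(h)$ the circulant matrix having $h$ as first column. Function names are illustrative only.

```python
import random

def poly_add(a, b):
    return [ai ^ bi for ai, bi in zip(a, b)]

def poly_mul(a, b):
    """Product modulo x^p + 1 (assumed ring)."""
    p = len(a)
    out = [0] * p
    for i in (i for i, ai in enumerate(a) if ai):
        for j in (j for j, bj in enumerate(b) if bj):
            out[(i + j) % p] ^= 1
    return out

def circulant(h):
    """p x p circulant matrix whose first column is h."""
    p = len(h)
    return [[h[(i - j) % p] for j in range(p)] for i in range(p)]

def qc_matrix(h):
    """Rows of [I | C(h)], each of length 2p."""
    p = len(h)
    C = circulant(h)
    return [[1 if i == j else 0 for j in range(p)] + C[i] for i in range(p)]

def mat_vec(H, v):
    """Matrix-vector product over the binary field."""
    return [sum(hij & vj for hij, vj in zip(row, v)) % 2 for row in H]
```

With these helpers, `mat_vec(qc_matrix(h), x0 + x1)` (where `x0 + x1` is list concatenation) returns the same coefficients as `poly_add(x0, poly_mul(h, x1))`, which is the matrix form of the polynomial relation.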
An opponent trying to recover the secret key must solve an SDP instance; thus, the weight of cannot be smaller than some security threshold value. In the verification procedure, a crucial aspect is represented by the weight of , which has maximum value equal to . Indeed, the authenticity of the signature is guaranteed if there is only one vector such that
[TABLE]
since this proves that has been computed through the signing key. Then, a necessary condition for such a vector to be unique is
[TABLE]
Obviously, the system is fully broken also if the opponent can perform ISD on : then, even cannot be lower than some security threshold value.
Finally, we must take into account that is obtained through linear operations involving sparse polynomials, one of them being , which is part of the signature, and is hence public. In [17], the possibility of attacks exploiting such facts has been considered; for this reason, the scheme has been proposed only for the one-time signature case. However, as we show next, the analysis of a single signature, combined with an ISD algorithm, is enough to recover the secret key.
IV An efficient key recovery attack
We recall that the signature is composed of the pair , with . Let us write
[TABLE]
where contains distinct integers.
An opponent can compute the polynomials , for and for all ; we have
[TABLE]
The opponent can then lift all such polynomials to the integer domain, and compute the sum
[TABLE]
for . We expect high coefficients in to be associated to ones in . In fact, all polynomials are obtained as the sum of with other sparse polynomials that depend on the shift . Hence, if an entry belongs to the support of a large number of polynomials , then it also belongs to the support of with high probability.
The opponent can exploit this fact to estimate the coefficients of . In particular, let be a vector with coefficients
[TABLE]
where is an integer . The vector represents an estimate of , whose accuracy depends on the choice of .
The opponent can then compute
[TABLE]
where . If , then , otherwise ISD can be used to obtain from , and then the secret key can be recovered as .
The complexity of the whole attack crucially depends on the weight of , which is related to the accuracy of the estimate . As shown in the next section, for the system we consider it is always possible to choose such that the weight of has a high probability of being very small.
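The steps above can be sketched in a few lines of Python. The sketch relies on assumptions that are not restated in this section: the ring is $\mathbb{F}_2[x]/(x^p+1)$, the signature is the pair $(c, z)$ with $z = c\,x + y$ for a sparse ephemeral vector $y$, and the public key is $s_x = x_0 + h\,x_1$, as we understand the scheme in [17]. The function names and the threshold argument `b` are illustrative.

```python
def poly_add(a, b):
    return [ai ^ bi for ai, bi in zip(a, b)]

def poly_mul(a, b):
    """Product modulo x^p + 1 (assumed ring)."""
    p = len(a)
    out = [0] * p
    for i in (i for i, ai in enumerate(a) if ai):
        for j in (j for j, bj in enumerate(b) if bj):
            out[(i + j) % p] ^= 1
    return out

def shift(a, s):
    """Multiply a by x^s modulo x^p + 1, i.e. cyclically shift by s."""
    p = len(a)
    return [a[(i - s) % p] for i in range(p)]

def shift_and_sum(c, z):
    """For every i in the support of c, shift each block of z by -i,
    lift it to the integers and accumulate the result."""
    counters = [[0] * len(zk) for zk in z]
    for i in (i for i, ci in enumerate(c) if ci):
        for k, zk in enumerate(z):
            for j, bit in enumerate(shift(zk, -i)):
                counters[k][j] += bit
    return counters

def estimate_key(c, z, b):
    """Set a coefficient of the estimate when its counter reaches b."""
    return [[1 if v >= b else 0 for v in block] for block in shift_and_sum(c, z)]

def residual_syndrome(h, x_est, s_x):
    """Image of the estimate plus the public key: the syndrome of the
    difference between the estimate and the secret key."""
    return poly_add(poly_add(x_est[0], poly_mul(h, x_est[1])), s_x)
```

If `residual_syndrome` is all-zero, the estimate already coincides with the secret key; otherwise its output is the syndrome handed to ISD. A larger `b` yields fewer wrongly set coefficients but may miss some correct ones, which is the trade-off governing the weight of the residual error.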
IV-A Attack complexity
Let us denote as and the weights of and , respectively. A specific weights partition is uniquely determined by and , as and . The probability to have this partition is
[TABLE]
Recall (IV), and let us define
[TABLE]
from which . Let be the probability that a particular coefficient in the sum is null. We can assume that each is a random polynomial with weight , and define
[TABLE]
such that can be estimated as
[TABLE]
Each null coefficient in results in a match between and ; thus, the probability that a set coefficient in is also set in can be estimated as
[TABLE]
Similarly, the probability that a null coefficient in is set in can be obtained as
[TABLE]
Let us denote as and as the number of coefficients that are correctly and incorrectly set in ; then, we have
[TABLE]
Let us define
[TABLE]
The probability that has weight results in
[TABLE]
where . Let , then
[TABLE]
Through the probability distribution of , we can estimate the effectiveness and the complexity of our cryptanalysis. The first part of the attack consists in the computation of : since it only involves a limited number of shifts, multiplications and sums, we can neglect the complexity of this step. If , then the opponent has already fully recovered the secret key. In all the other cases, the opponent applies ISD on , in order to determine the vector , whose weight is unknown and is distributed according to Eq. (IV-A). For the sake of simplicity we consider the Lee-Brickell ISD algorithm [20], which takes as input an integer and, at each iteration, picks an information set and tests all patterns having a maximum of ones in the selected positions: an iteration is successful if the selected information set contains a maximum of errors. In particular, the complexity of each iteration can be estimated as
[TABLE]
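For concreteness, the sketch below implements the plainest form of ISD, namely Prange's algorithm [3], which corresponds to Lee-Brickell when no errors are allowed in the information set. It is a toy illustration with hypothetical helper names, not the procedure whose cost is estimated above: each iteration picks a random column permutation, assumes that the last positions (the information set) are error-free, and solves for the remaining ones by Gaussian elimination over the binary field.

```python
import random

def syndrome(H, e):
    """Matrix-vector product over the binary field."""
    return [sum(hij & ej for hij, ej in zip(row, e)) % 2 for row in H]

def prange(H, s, t, rng, max_iter=10000):
    """Look for e with H e^T = s and weight at most t; None on failure."""
    r, n = len(H), len(H[0])
    for _ in range(max_iter):
        perm = list(range(n))
        rng.shuffle(perm)
        # Permuted parity-check matrix with the syndrome as extra column.
        rows = [[H[i][j] for j in perm] + [s[i]] for i in range(r)]
        invertible = True
        for col in range(r):
            pivot = next((i for i in range(col, r) if rows[i][col]), None)
            if pivot is None:
                invertible = False      # selected columns are dependent
                break
            rows[col], rows[pivot] = rows[pivot], rows[col]
            for i in range(r):
                if i != col and rows[i][col]:
                    rows[i] = [a ^ b for a, b in zip(rows[i], rows[col])]
        if not invertible:
            continue
        # Candidate error restricted to the first r permuted positions.
        candidate = [row[n] for row in rows]
        if sum(candidate) <= t:
            e = [0] * n
            for j in range(r):
                if candidate[j]:
                    e[perm[j]] = 1
            return e
    return None
```

When the target weight is very low, as is the case for the residual error in this attack, a random information set avoids all error positions with high probability, so only a few iterations are needed.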
Let denote the probability of success for a single iteration. Then, we have
[TABLE]
where is a sufficiently large integer. The average complexity of ISD can then be estimated as
[TABLE]
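These quantities can be evaluated numerically. The sketch below uses the textbook success probability of one Lee-Brickell iteration, under the usual simplifying assumption that a uniformly random set of $k$ positions is an information set; it is not guaranteed to coincide term by term with the expressions above. The argument `weight_pmf` is a hypothetical input standing for the distribution of the weight of the vector searched by ISD.

```python
from math import comb

def lee_brickell_success(n, k, t, j):
    """Probability that a random set of k positions (out of n) contains
    at most j of the t error positions (hypergeometric tail)."""
    good = sum(comb(t, i) * comb(n - t, k - i) for i in range(min(j, t, k) + 1))
    return good / comb(n, k)

def average_success(n, k, j, weight_pmf):
    """Success probability averaged over the weight distribution.
    weight_pmf maps each weight t to its probability."""
    return sum(prob * lee_brickell_success(n, k, t, j)
               for t, prob in weight_pmf.items())

def expected_iterations(n, k, j, weight_pmf):
    """Average number of iterations, as the reciprocal of the success rate."""
    return 1.0 / average_success(n, k, j, weight_pmf)
```

For a code of length $n = 2p$ and dimension $k = p$, a vector of weight 1 is found in one iteration with probability about $1/2$ already for `j = 0`, and with probability 1 for `j = 1`, which gives an idea of why very low weights make ISD inexpensive.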
As we show next, for all instances proposed in [17] we can determine a value of for which holds with high probability or applying ISD on has extremely low complexity. In particular, these statements are motivated by the fact that, with overwhelming probability, has an extremely low weight, such that finding it through an ISD algorithm requires just a small number of iterations.
IV-B Results
In Fig. 1 we report the distribution of the weights of for two instances proposed in [17]. The empirical distributions have been obtained through numerical simulations on pairs of verification keys and signatures, and have been compared with the theoretical ones expressed by (IV-A), showing excellent agreement everywhere. As we can see, the weight of assumes very low values with high probability. This is clear evidence of the system's weakness against the attack.
In Table I we have considered the applicability of the attack on the instances proposed in [17]; as we can see, all the instances can be completely broken. Indeed, always has high values: thus, with non-negligible probability, the secret key can be fully recovered without invoking ISD. When , it is highly probable that has an extremely low weight: this results in having very high values, only slightly influenced by the choice of (i.e., choosing is already enough to guarantee ). This means that the application of ISD normally requires a very limited number of operations.
We can also show that changing the system parameters is not enough to significantly raise the security level of the scheme. To provide evidence of this fact, we have considered the case of , for which , and tested different values of , and . The results are reported in Table II. As we can see, there are no significant changes in the security of the system. In particular, the last three instances in the table have been designed with a maximum weight of that is close to . This choice is clearly extreme since, as explained in Section III, the uniqueness of the signature is then no longer achievable. One might think of modifying the protocol so that the signature verification algorithm also takes this possibility into account. However, our results should discourage the attempt.
V Conclusion
We have discussed a serious weakness of a recently proposed one-time digital signature scheme. Our analysis shows that the secret key can be fully recovered with very low complexity, and that changes in the system parameters are not able to restore meaningful security levels. We point out that, with a few modifications, our attack procedure can be applied to structures different from the QC one. This is because it exploits the sparsity of the signature. As this is an inherent feature of the considered scheme, restoring its security might require deep and structural changes.
References
- [1] R. J. McEliece, “A public-key cryptosystem based on algebraic coding theory,” DSN Progress Report, pp. 114–116, 1978.
- [2] E. Berlekamp, R. McEliece, and H. van Tilborg, “On the inherent intractability of certain coding problems,” IEEE Trans. Inf. Theory, vol. 24, no. 3, pp. 384–386, May 1978.
- [3] E. Prange, “The use of information sets in decoding cyclic codes,” IRE Trans. Inf. Theory, vol. 8, no. 5, pp. 5–9, 1962.
- [4] J. Stern, “A method for finding codewords of small weight,” in Coding Theory and Applications, ser. Lecture Notes in Computer Science, G. Cohen and J. Wolfmann, Eds. Springer Verlag, 1989, vol. 388, pp. 106–113.
- [5] A. Becker, A. Joux, A. May, and A. Meurer, “Decoding random binary linear codes in $2^{n/20}$: How 1 + 1 = 0 improves information set decoding,” in Advances in Cryptology - EUROCRYPT 2012, ser. Lecture Notes in Computer Science, D. Pointcheval and T. Johansson, Eds. Springer Verlag, 2012, vol. 7237, pp. 520–536.
- [6] D. J. Bernstein, “Grover vs. McEliece,” in Post-Quantum Cryptography, N. Sendrier, Ed. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 73–80.
- [7] L. Chen, Y.-K. Liu, S. Jordan, D. Moody, R. Peralta, R. Perlner, and D. Smith-Tone, “Report on post-quantum cryptography,” National Institute of Standards and Technology, Tech. Rep. NISTIR 8105, 2016.
- [8] National Institute of Standards and Technology. (2016, Dec.) Post-quantum crypto project. [Online]. Available: http://csrc.nist.gov/groups/ST/post-quantum-crypto/
