TL;DR
This paper enhances the security and speed of CSIDH, a quantum-resistant cryptographic primitive, by fixing vulnerabilities in constant-time implementations and proposing a dummy-free variant suitable for embedded hardware.
Contribution
It identifies and repairs oversights in existing constant-time CSIDH algorithms, introduces the fastest constant-time version using Edwards arithmetic, and proposes a dummy-free variant for fault injection resistance.
Findings
Fastest constant-time CSIDH achieved to date.
Dummy-free CSIDH variant offers improved fault injection resistance.
Performance remains within a small factor of less-protected versions.
| Implementation | CSIDH Algorithm | M | S | A | Ratio |
|---|---|---|---|---|---|
| Castryck et al. [6] | unprotected, unmodified | 0.252 | 0.130 | 0.348 | 0.26 |
| Meyer–Campos–Reith [21] | unmodified | 1.054 | 0.410 | 1.053 | 1.00 |
| Onuki et al. [26] | unmodified | 0.733 | 0.244 | 0.681 | 0.67 |
| This work | MCR-style | 0.901 | 0.309 | 0.965 | 0.83 |
| This work | OAYT-style | 0.657 | 0.210 | 0.691 | 0.59 |
| This work | No-dummy | 1.319 | 0.423 | 1.389 | 1.19 |
Stronger and Faster Side-Channel Protections for CSIDH
Daniel Cervantes-Vázquez (1)
Mathilde Chenu (2, 3)
Jesús-Javier Chi-Domínguez (1)
Luca De Feo (4)
Francisco Rodríguez-Henríquez (1)
Benjamin Smith (3, 2)
(1) CINVESTAV - Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Mexico City, Mexico
(2) École polytechnique, Institut Polytechnique de Paris, Palaiseau, France
(3) Inria, équipe-projet GRACE, Université Paris–Saclay, France
(4) Université Paris Saclay – UVSQ, Versailles, France
Abstract
CSIDH is a recent quantum-resistant primitive based on the difficulty of finding isogeny paths between supersingular curves. Recently, two constant-time versions of CSIDH have been proposed: first by Meyer, Campos and Reith, and then by Onuki, Aikawa, Yamazaki and Takagi. While both offer protection against timing attacks and simple power consumption analysis, they are vulnerable to more powerful attacks such as fault injections. In this work, we identify and repair two oversights in these algorithms that compromised their constant-time character. By exploiting Edwards arithmetic and optimal addition chains, we produce the fastest constant-time version of CSIDH to date. We then consider the stronger attack scenario of fault injection, which is relevant for the security of CSIDH static keys in embedded hardware. We propose and evaluate a dummy-free CSIDH algorithm. While these CSIDH variants are slower, their performance is still within a small constant factor of less-protected variants. Finally, we discuss derandomized CSIDH algorithms.
Note:
A previous version of this article incorrectly claimed that a test in Algorithm 2 leaks information through the timing channel. We are grateful to Prof. Onuki for pointing out our mistake.
1 Introduction
Isogeny-based cryptography was introduced by Couveignes [10], who defined a key exchange protocol similar to Diffie–Hellman based on the action of an ideal class group on a set of ordinary elliptic curves. Couveignes’ protocol was independently rediscovered by Rostovtsev and Stolbunov [27, 28], who were the first to recognize its potential as a post-quantum candidate. Recent efforts to make this system practical have put it back at the forefront of research in post-quantum cryptography [13]. A major breakthrough was achieved by Castryck, Lange, Martindale, Panny, and Renes with CSIDH [6], a reinterpretation of Couveignes’ system using supersingular curves defined over a prime field.
The first implementation of CSIDH completed a key exchange in less than 0.1 seconds, and its performance has been further improved by Meyer and Reith [22]. However, both [6] and [22] recognized the difficulty of implementing CSIDH with constant-time algorithms, that is, algorithms whose running time, sequence of operations, and memory access patterns do not depend on secret data. The implementations of [6] and [22] are thus vulnerable to simple timing attacks.
The first attempt at implementing CSIDH in constant-time was realized by Bernstein, Lange, Martindale, and Panny [3], but their goal was to obtain a fully deterministic reversible circuit implementing the class group action, to be used in quantum cryptanalyses. The distinct problem of efficient CSIDH implementation with side-channel protection was first tackled by Jalali, Azarderakhsh, Mozaffari Kermani, and Jao [16], and independently by Meyer, Campos, and Reith [21], whose work was improved by Onuki, Aikawa, Yamazaki, and Takagi [26].
The approach of Jalali et al. is similar to that of [3], in that they achieve a stronger notion of constant time (running time independent of all inputs), at the cost of allowing the algorithm to fail with a small probability. To make the failure probability sufficiently low, they introduce a large number of useless operations, which make the performance significantly worse than that of the original CSIDH algorithm. This poor performance and the possibility of failure reduce the interest of this implementation; we will not analyze it further here.
Meyer et al. take a different path: the running time of their algorithm is independent of the secret key, but not of the output of an internal random number generator. They claim a speed only times slower than the unprotected algorithm in [22]. Onuki et al. introduced new improvements, claiming a speed-up of over Meyer et al., i.e., a net slow-down factor of compared to [22].
Our contribution.
In this work we take a new look at side-channel protected implementations of CSIDH. We start by reviewing the implementations in [21] and [26]. We highlight a flaw that makes their constant-time claims disputable, and propose a fix for it. Since this fix introduces some minor slow-downs, we report on the performance of the revised algorithms.
Then, we introduce new optimizations to make both [21] and [26] faster: we improve isogeny formulas for the twisted Edwards model, and we introduce the use of optimal addition chains in the scalar multiplications. With these improvements, we obtain a version of CSIDH protected against timing and some simple power analysis (SPA) attacks that is 39% more efficient than [21].
Then, we shift our focus to stronger security models. All constant-time versions of CSIDH presented so far use so-called “dummy operations”, i.e., computations whose result is not used, but whose role is to hide the conditional structure of the algorithm from timing and SPA attacks that read the sequence of operations performed from a single power trace. However, this countermeasure is easily defeated by fault-injection attacks, where the adversary may modify values during the computation. We propose a new constant-time variant of CSIDH without dummy operations as a first-line defence. The new version is only twice as slow as the simple constant-time version.
We conclude with a discussion of derandomized variants of CSIDH. The versions discussed previously are “constant-time” in the sense that their running time is uncorrelated to the secret key; however, it depends on some (necessarily secret) seed to a PRNG. While this notion of “constant-time” is usually considered good enough for side-channel protection, one may object that a compromise of the PRNG or the seed generation would put the security of the implementation at risk, even if the secret was securely generated beforehand (with an uncompromised PRNG) as part of a long-term or static keypair. We observe that this dependence on additional randomness is not necessary: a simple modification of CSIDH, already considered in isogeny-based signature schemes [11, 14], can easily be made constant-time and free of randomness. Unfortunately, this modification requires substantially increasing the size of the base field, and is thus considerably slower and not compatible with the original version. On the positive side, the increased field size makes it much more resistant to quantum attacks, a non-negligible asset in a context where the quantum security of CSIDH is still unclear; it can thus be seen as a CSIDH variant for the paranoid.
Organization.
In §2 we briefly recall ideas, algorithms and parameters from CSIDH [6]. In §3 we highlight a shortcoming in [21] and [26] and propose a way to fix it. In §4 we introduce new optimizations compatible with all previous versions of CSIDH. In §5 we introduce a new algorithm for evaluating the CSIDH group action that is resistant against timing and some simple power analysis attacks, while providing protection against some fault injections. Finally, in §6 we discuss a more costly variant of CSIDH with stronger security guarantees.
Notation.
M, S, and A denote the cost of computing a single multiplication, squaring, and addition (or subtraction) in $\mathbb{F}_p$, respectively. We assume that a constant-time equality test is defined, returning $1$ if its two arguments are equal and $0$ otherwise. We also assume that a constant-time conditional swap is defined, exchanging its two arguments if a control bit is $1$ (and not if the bit is $0$).
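The two primitives assumed above can be sketched as follows. This is only an illustration of the branch-free masking pattern: the names `isequal` and `cswap` and the 512-bit bound `NBITS` are assumptions made for this example, and Python's big-integer arithmetic is not itself constant-time, so a real implementation would apply the same idea to fixed-width machine words.

```python
NBITS = 512  # assumed upper bound on the bit length of the operands

def isequal(x: int, y: int) -> int:
    """Return 1 if x == y and 0 otherwise, without branching on x or y.

    Both inputs are assumed to satisfy 0 <= x, y < 2**NBITS.
    """
    d = x ^ y                          # zero exactly when x == y
    nonzero = ((d | -d) >> NBITS) & 1  # sign of (d | -d): 1 iff d != 0
    return 1 ^ nonzero

def cswap(x: int, y: int, b: int):
    """Return (y, x) if b == 1 and (x, y) if b == 0, using a mask."""
    mask = -b                 # b = 1 gives all ones, b = 0 gives zero
    t = mask & (x ^ y)        # x ^ y when swapping, 0 otherwise
    return x ^ t, y ^ t
```

The same mask trick extends to pairs of projective coordinates by applying `cswap` to each coordinate with the same control bit.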
2 CSIDH
CSIDH is an isogeny based primitive, similar to Diffie–Hellman, that can be used for key exchange and encapsulation [6], signatures [11, 14, 4], and other more advanced protocols. Compared to the other main isogeny-based primitive SIDH [17, 12], CSIDH is slower. On the positive side, CSIDH has smaller public keys, is based on a better understood security assumption, and supports an easy key validation procedure, making it better suited than SIDH for CCA-secure encryption, static-dynamic and static-static key exchange. In this work we will use the jargon of key exchange when we refer to cryptographic concepts.
CSIDH works over a finite field $\mathbb{F}_p$, where $p$ is a prime of the special form
$$p := 4\prod_{i=1}^{n} \ell_i - 1$$
with $\ell_1, \dots, \ell_n$ a set of small odd primes. Concretely, the original CSIDH article [6] defined a 511-bit $p$ with $\ell_1, \dots, \ell_{n-1}$ the first 73 odd primes, and $\ell_n = 587$.
The set of public keys in CSIDH is a subset of all supersingular elliptic curves defined over $\mathbb{F}_p$, in Montgomery form $y^2 = x^3 + Ax^2 + x$, where $A \in \mathbb{F}_p$ is called the $A$-coefficient of the curve. (Footnote 1: Following [8], we represent $A = A'/C'$ as a projective point $(A' : C')$; see §4.1.1.)
The endomorphism rings of these curves are isomorphic to orders in the imaginary quadratic field $\mathbb{Q}(\sqrt{-p})$. Castryck et al. [6] choose to restrict the public keys to the horizontal isogeny class of the curve with $A = 0$, so that all endomorphism rings are isomorphic to $\mathbb{Z}[\sqrt{-p}]$.
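As a concrete check of the parameter shape described above, the following sketch rebuilds the CSIDH-512 prime from its factorization pattern. The helper `first_odd_primes` is a hypothetical name introduced for this example, and the snippet only checks the size and residue class of $p$, not its primality.

```python
import math

def first_odd_primes(count):
    """Return the first `count` odd primes by trial division."""
    primes, cand = [], 3
    while len(primes) < count:
        # cand is odd, so only odd prime divisors up to sqrt(cand) matter
        if all(cand % q for q in primes if q * q <= cand):
            primes.append(cand)
        cand += 2
    return primes

# l_1, ..., l_73 are the first 73 odd primes; l_74 = 587
ELLS = first_odd_primes(73) + [587]
p = 4 * math.prod(ELLS) - 1

print(len(ELLS), "small primes; p has", p.bit_length(), "bits; p mod 8 =", p % 8)
```

Because $p + 1 = 4\prod \ell_i$, every $\ell_i$ divides the group order $p + 1$ of a supersingular curve over $\mathbb{F}_p$, which is what makes points of order $\ell_i$ available over the base field.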
2.1 The class group action
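Let $E/\mathbb{F}_p$ be an elliptic curve with $\operatorname{End}(E) \cong \mathbb{Z}[\sqrt{-p}]$. If $\mathfrak{a}$ is a nonzero ideal in $\mathbb{Z}[\sqrt{-p}]$, then it defines a finite subgroup $E[\mathfrak{a}] = \bigcap_{\alpha \in \mathfrak{a}} \ker(\alpha)$, where we identify each $\alpha$ with its image in $\operatorname{End}(E)$. We then have a quotient isogeny $\phi : E \to E' = E/E[\mathfrak{a}]$ with kernel $E[\mathfrak{a}]$; this isogeny and its codomain are well-defined up to isomorphism. If $\mathfrak{a} = (\alpha)$ is principal, then $\phi \cong \alpha$ and $E/E[\mathfrak{a}] \cong E$. Hence, we get an action of the ideal class group $\operatorname{Cl}(\mathbb{Z}[\sqrt{-p}])$ on the set of isomorphism classes of elliptic curves $E$ over $\mathbb{F}_p$ with $\operatorname{End}(E) \cong \mathbb{Z}[\sqrt{-p}]$; this action is faithful and transitive. We write $\mathfrak{a} * E$ for the image of (the class of) $E$ under the action of $\mathfrak{a}$, which is (the class of) $E/E[\mathfrak{a}]$ above.
For CSIDH, we are interested in computing the action of small prime ideals. Consider one of the primes $\ell_i$ dividing $p + 1$; the principal ideal $(\ell_i) \subset \mathbb{Z}[\sqrt{-p}]$ splits into two primes, namely $\mathfrak{l}_i = (\ell_i, \pi - 1)$ and $\bar{\mathfrak{l}}_i = (\ell_i, \pi + 1)$, where $\pi = \sqrt{-p}$ is the element of $\mathbb{Z}[\sqrt{-p}]$ mapping to the Frobenius endomorphism of the curves. Since $\bar{\mathfrak{l}}_i \mathfrak{l}_i = (\ell_i)$ is principal, we have $\bar{\mathfrak{l}}_i = \mathfrak{l}_i^{-1}$ in $\operatorname{Cl}(\mathbb{Z}[\sqrt{-p}])$, and hence
$$\bar{\mathfrak{l}}_i * (\mathfrak{l}_i * E) = \mathfrak{l}_i * (\bar{\mathfrak{l}}_i * E) = E$$
for all $E/\mathbb{F}_p$ with $\operatorname{End}(E) \cong \mathbb{Z}[\sqrt{-p}]$.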
2.2 The CSIDH algorithm
At the heart of CSIDH is an algorithm that evaluates the class group action described above on any supersingular curve over $\mathbb{F}_p$. Cryptographically, this plays the same role as modular exponentiation in classic Diffie–Hellman.
The input to the algorithm is an elliptic curve $E : y^2 = x^3 + Ax^2 + x$, represented by its $A$-coefficient, and an ideal class $\mathfrak{a} = \prod_{i=1}^{n} \mathfrak{l}_i^{e_i}$ represented by its list of exponents $(e_1, \dots, e_n) \in \mathbb{Z}^n$. The output is the ($A$-coefficient of the) elliptic curve $\mathfrak{a} * E$.
The isogenies corresponding to $\mathfrak{l}_i = (\ell_i, \pi - 1)$ can be efficiently computed using Vélu’s formulæ and their generalizations: exploiting the fact that $\#E(\mathbb{F}_p) = p + 1 = 4\prod \ell_i$, one looks for a point $R$ of order $\ell_i$ in $E(\mathbb{F}_p)$ (i.e., a point that is in the kernels of both the multiplication-by-$\ell_i$ map and $\pi - 1$), computes the isogeny $\phi : E \to E/\langle R \rangle$ with kernel $\langle R \rangle$, and sets $\mathfrak{l}_i * E = E/\langle R \rangle$. Iterating this procedure lets us compute $\mathfrak{l}_i^{e} * E$ for any exponent $e \ge 0$.
The isogenies corresponding to $\bar{\mathfrak{l}}_i = \mathfrak{l}_i^{-1}$ are computed in a similar fashion: this time one looks for a point $R$ of order $\ell_i$ in the kernel of $\pi + 1$, i.e., a point in $E(\mathbb{F}_{p^2})$ of the form $(x, iy)$ where both $x$ and $y$ are in $\mathbb{F}_p$ (since $i = \sqrt{-1}$ is in $\mathbb{F}_{p^2} \setminus \mathbb{F}_p$ and satisfies $i^p = -i$). Then one proceeds as before, setting $\mathfrak{l}_i^{-1} * E = E/\langle R \rangle$.
In the sequel we assume that we are given an algorithm QuotientIsogeny which, given a curve $E/\mathbb{F}_p$ and a point $R$ in $E(\mathbb{F}_{p^2})$, computes the quotient isogeny $\phi : E \to E' \cong E/\langle R \rangle$, and returns the pair $(\phi, E')$. We refer to this operation as isogeny computation. Algorithm 1, taken from the original CSIDH article [6], computes the class group action.
For cryptographic purposes, the exponent vectors $(e_1, \dots, e_n)$ must be taken from a space of size at least $2^{2\lambda}$, where $\lambda$ is the (classical) security parameter. The CSIDH-512 parameters in [6] take $n = 74$, and all $e_i$ in the interval $[-5, 5]$, so that $74 \log_2(2 \cdot 5 + 1) \approx 255.998$, consistent with the NIST-1 security level. With this choice, the implementation of [6] computes one class group action in 40 ms on average. Meyer and Reith [22] further improved this to 36 ms on average. Neither implementation is constant-time.
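The key-space arithmetic for the CSIDH-512 parameters can be reproduced in a few lines. The function name `sample_secret` is an assumption made for this sketch; it simply draws each exponent uniformly from a centered interval, using the standard-library `secrets` module as a stand-in for whatever randomness source a real implementation would use.

```python
import math
import secrets

N_PRIMES = 74   # number of small primes l_i in CSIDH-512
BOUND = 5       # exponents e_i are drawn from [-BOUND, BOUND]

# log2 of the number of exponent vectors: n * log2(2m + 1)
keyspace_bits = N_PRIMES * math.log2(2 * BOUND + 1)

def sample_secret(n=N_PRIMES, m=BOUND):
    """Draw (e_1, ..., e_n) with each e_i uniform in [-m, m]."""
    return [secrets.randbelow(2 * m + 1) - m for _ in range(n)]

print(f"key space: about 2^{keyspace_bits:.3f} exponent vectors")
print("example secret:", sample_secret()[:8], "...")
```

With $2\lambda \approx 256$, this matches a classical security parameter of about 128 bits.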
2.3 The Meyer–Campos–Reith constant-time algorithm
As Meyer, Campos and Reith observe in [21], Algorithm 1 performs fewer scalar multiplications when the key has the same number of positive and negative exponents than it does in the unbalanced case where these numbers differ. Algorithm 1 thus leaks information about the distribution of positive and negative exponents under timing attacks. Besides this, analysis of power traces would reveal the cost of each isogeny computation, and the number of such isogenies computed, which would leak the exact exponents of the private key.
In view of this vulnerability, Meyer, Campos and Reith proposed in [21] a constant-time CSIDH algorithm whose running time does not depend on the private key (though, unlike [16], it still varies due to randomness). The essential differences between the algorithm of [21] and classic CSIDH are as follows. First, to address the vulnerability to timing attacks, they choose to use only positive exponents in $[0, 10]$ for each $\ell_i$, instead of $[-5, 5]$ in the original version, while keeping the same prime $p$. To mitigate power consumption analysis attacks, their algorithm always computes the maximal number of isogenies allowed by the exponent, using dummy isogeny computations if needed.
Since these modifications generally produce more costly group action computations, the authors also provide several optimizations that limit the slow-down in their algorithm to a factor of compared to [22]. These include the Elligator 2 map of [2] and [3], multiple batches for isogeny computation (SIMBA), and sample the exponents from intervals of different sizes depending on .
2.4 The Onuki–Aikawa–Yamazaki–Takagi constant-time algorithm
Still assuming that the attacker can perform only power consumption analysis and timing attacks, Onuki, Aikawa, Yamazaki and Takagi proposed a faster constant-time version of CSIDH in [26].
The key idea is to use two points to evaluate the action of an ideal, one in $\ker(\pi - 1)$ (i.e., in $E(\mathbb{F}_p)$) and one in $\ker(\pi + 1)$ (i.e., in $E(\mathbb{F}_{p^2})$ with $x$-coordinate in $\mathbb{F}_p$). This allows them to avoid timing attacks, while keeping the same primes and exponent range as in the original CSIDH algorithm. Their algorithm also employs dummy isogenies to mitigate some power analysis attacks, as in [21]. With these improvements, they achieve a speed-up of compared to [21].
We include pseudo-code for the algorithm of [26] in Algorithm 2, to serve both as a reference for a discussion of a subtle leak in §3 and also as a departure point for our dummy-free algorithm in §5.
3 Repairing constant-time versions based on Elligator
Both [21] and [26] use the Elligator 2 map to sample a random point on the current curve $E_A$ in step 2 of Algorithm 2. Elligator takes as input a random field element $u$ and the Montgomery $A$-coefficient of the current curve and returns a pair of points in $E_A[\pi - 1]$ and $E_A[\pi + 1]$ respectively.
To avoid a costly inversion of $u^2 - 1$, instead of sampling $u$ randomly, Meyer, Campos and Reith follow [3] and precompute a set of ten pairs $(u, (u^2 - 1)^{-1})$; they try them in order until one that produces a point passing the test in Step 2 is found. (Footnote 2: Presumably, Onuki et al. do the same; however, their exposition is not clear on this point, and we do not have access to their code.) When this happens, the algorithm moves to the next curve, and Elligator can keep on using the next precomputed value of $u$, going back to the first value when the tenth has been reached. This is a major departure from [3], where all precomputed values of $u$ are tried for each isogeny computation, and the algorithm succeeds if at least one passes the test. And indeed the implementation of [21] leaks information on the secret via the timing channel: since Elligator uses no randomness for $u$, its output only depends on the $A$-coefficient of the current curve, which itself depends on the secret key; but the running time of the algorithm varies and, not being correlated to $u$, it is necessarily correlated to $A$ and thus to the secret. (Footnote 3: The Elligator optimization is described in §5.3 of [21]. The unoptimized constant-time version described in Algorithm 2 therein is not affected by this problem.)
Fortunately this can be easily fixed by (re)introducing randomness in the input to Elligator. To avoid field inversions, we use a projective variant: given $u$ and assuming $A \neq 0$, we write $V = (A : u^2 - 1)$, and we want to determine whether $V$ is the abscissa of a projective point on $E_A$. Plugging $V$ into the homogeneous equation
$$E_A : Y^2 Z^2 = X^3 Z + A X^2 Z^2 + X Z^3$$
gives
$$Y^2 (u^2 - 1)^2 = \big(A^2 u^2 + (u^2 - 1)^2\big)\, A\, (u^2 - 1).$$
We can test the existence of a solution for $Y$ by computing the Legendre symbol of the right hand side: if it is a square, the points with projective $XZ$-coordinates
$$T_+ = (A : u^2 - 1), \qquad T_- = (-A u^2 : u^2 - 1)$$
are in $E_A[\pi - 1]$ and $E_A[\pi + 1]$ respectively, otherwise their roles are swapped.
We are left with the case $A = 0$. Following [3], Meyer, Campos and Reith precompute once and for all a pair of generators $T_+, T_-$ of $E_0[\pi - 1]$ and $E_0[\pi + 1]$, and output those instead of random points. This choice suffers from a similar issue to the previous one: because the points are output in a deterministic way, the running time of the whole algorithm will be correlated to the number of times the curve $E_0$ is encountered during the isogeny walk.
In practice, $E_0$ is unlikely to ever be encountered in a random isogeny walk, except as the starting curve in the first phase of a key exchange, thus this flaw seems hard to exploit. Nevertheless, we find it not significantly more expensive to use a different approach, also suggested in [3]: with $u$ random, only on $E_0$, we define the output of Elligator as $T_+ = (u : 1)$, $T_- = (-u : 1)$ when $u^3 + u$ is a square, and we swap the points when $u^3 + u$ is not a square.
We summarize our implementation of Elligator in Algorithm 3, generalizing it to the case of Montgomery curves represented by projective coefficients (see also Section 4.1.1).
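The projective variant described in this section can be sketched as below. This is an illustration under stated assumptions, not the paper's Algorithm 3: the function names are hypothetical, the curve constant is taken affine rather than projective, and the `if` statements branch on values where a hardened implementation would use constant-time selection. The formulas follow the standard Elligator 2 construction with $-1$ as the non-square, which requires $p \equiv 3 \pmod 4$.

```python
def is_square(a, p):
    """Euler's criterion: True when a is a nonzero square modulo the odd prime p."""
    return pow(a % p, (p - 1) // 2, p) == 1

def elligator(A, u, p):
    """Return (T_plus, T_minus) as projective x-coordinates (X, Z).

    T_plus is the x-coordinate of a point in E_A(F_p); T_minus is the
    x-coordinate of a point whose y-coordinate lies outside F_p.
    Assumes p = 3 mod 4 and u not in {0, 1, -1}.
    """
    A, u = A % p, u % p
    if A != 0:
        z = (u * u - 1) % p
        # Right-hand side of Y^2 z^2 = (A^2 u^2 + z^2) * A * z
        rhs = A * z * (A * A * u * u + z * z) % p
        t_plus, t_minus = (A, z), (-A * u * u % p, z)
    else:
        # Special case A = 0: candidate abscissas u and -u on y^2 = x^3 + x
        rhs = (u ** 3 + u) % p
        t_plus, t_minus = (u, 1), (-u % p, 1)
    if not is_square(rhs, p):
        t_plus, t_minus = t_minus, t_plus
    return t_plus, t_minus
```

For example, with the toy prime $419 = 4 \cdot 3 \cdot 5 \cdot 7 - 1$, calling `elligator(6, 2, 419)` returns one abscissa for which $x^3 + 6x^2 + x$ is a square and one for which it is not.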
We can now prove that Algorithm 2 runs in time independent from the secret vector . First, assume that Elligator always returns points of order , then it is clear that the condition in Line 2 is always true, and thus that the outer loop runs exactly times. The only branching depending on secrets is then the one at Line 2, however the two branches take exactly the same time, thanks to the dummy computations.
In general, we cannot assume that Elligator always returns full-order points; however, under reasonable heuristics experimentally verified in [3],
$$\Pr\big(\ell_i \mid \operatorname{ord}(T_+)\big) = \Pr\big(\ell_i \mid \operatorname{ord}(T_-)\big) = \frac{\ell_i - 1}{\ell_i}$$
for any prime $\ell_i$ and any curve $E_A$. (Footnote 4: Note that the joint probability of $T_+$ and $T_-$ having order divisible by $\ell_i$ is not independent of , however this will not be a problem in our algorithms.) Then, even though the point selected in Line 2 depends on the sign of the secret exponent $e_i$, the probability that the test in Line 2 passes is independent of all secrets.
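The value $(\ell - 1)/\ell$ is what one gets for a uniformly random element of a cyclic group whose order is divisible by $\ell$ exactly once. The sketch below checks this by exhaustive enumeration in the additive group $\mathbb{Z}/420\mathbb{Z}$, standing in for a group of order $p + 1$ with the toy prime $p = 419$. Modelling Elligator outputs as uniform group elements is the heuristic assumption here, and the function name is hypothetical.

```python
from fractions import Fraction
from math import gcd

def fraction_with_order_divisible_by(ell, group_order):
    """Fraction of g in Z/group_order whose additive order is divisible by ell."""
    hits = sum(
        1
        for g in range(group_order)
        # the order of g in Z/N is N / gcd(g, N)
        if (group_order // gcd(g, group_order)) % ell == 0
    )
    return Fraction(hits, group_order)

GROUP_ORDER = 4 * 3 * 5 * 7  # p + 1 for the toy prime p = 419
for ell in (3, 5, 7):
    print(ell, fraction_with_order_divisible_by(ell, GROUP_ORDER))
```

Under this model the expected number of Elligator calls needed before an $\ell$-isogeny can be computed is $\ell/(\ell - 1)$, which is largest for the smallest primes.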
4 Optimizing constant-time implementations
In this section we propose several optimizations that are compatible with both non-constant-time and constant-time implementations of CSIDH.
4.1 Isogeny and point arithmetic on twisted Edwards curves
In this subsection, we present efficient formulas in twisted-Edwards coordinates for four fundamental operations: point addition, point doubling, isogeny computation (as presented in [25]; cf. §2.2), and isogeny evaluation (i.e. computing the image of a point under an isogeny). Our approach obtains a modest but still noticeable improvement with respect to previous proposals based on Montgomery representation, or hybrid strategies that propound combinations of Montgomery and twisted-Edwards representations [5, 18, 19, 20, 23].
Castryck, Galbraith, and Farashahi [5] proposed using a hybrid representation to reduce the cost of point doubling on certain Montgomery curves, by exploiting the fact that converting between Montgomery and twisted Edwards models can be done at almost no cost. In [23], Meyer, Reith and Campos considered using twisted Edwards formulas for computing isogeny and elliptic curve arithmetic, but concluded that a pure twisted-Edwards-only approach would not be advantageous in the context of SIDH. Bernstein, Lange, Martindale, and Panny observed in [3] that the conversion from Montgomery XZ coordinates to twisted Edwards YZ coordinates occurs naturally during the Montgomery ladder. Kim, Yoon, Kwon, Park, and Hong presented a hybrid model in [19] using Edwards and Montgomery models for isogeny computations and point arithmetic, respectively; in [18] and [20], they suggested computing isogenies using a modified twisted Edwards representation that introduces a fourth coordinate $w$.
To the best of our knowledge, the quest for more efficient elliptic curve and isogeny arithmetic than that offered by pure Montgomery and twisted-Edwards-Montgomery representations remains an open problem. As a step forward in this direction, Moody and Shumow [25] showed that when dealing with isogenies of odd degree with twisted Edwards representation offers a cheaper formulation for isogeny computation than the corresponding one using Montgomery curves; nevertheless, they did not address the problem of getting a cheaper twisted Edwards formulation for the isogeny evaluation operation.
4.1.1 Montgomery curves
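A Montgomery curve [24] is defined by the equation $E_{A,B} : By^2 = x^3 + Ax^2 + x$, such that $B \neq 0$ and $A^2 \neq 4$ (we often write $E_A$ for $E_{A,1}$). We refer to [9] for a survey on Montgomery curves. When performing isogeny computations and evaluations, it is often more convenient to represent the constant $A$ in the projective space $\mathbb{P}^1$ as $(A' : C')$ such that $A = A'/C'$. Montgomery curves are attractive because they are exceptionally well-suited to performing the differential point addition operation, which computes $x(P + Q)$ from $x(P)$, $x(Q)$, and $x(P - Q)$. Equations (1) and (2) describe the differential point doubling and addition operations proposed by Montgomery in [24], where $A_{24p} := A' + 2C'$ and $C_{24} := 4C'$:
$$\begin{aligned} X_{[2]P} &= C_{24}\,(X_P + Z_P)^2 (X_P - Z_P)^2, \\ Z_{[2]P} &= \big((X_P + Z_P)^2 - (X_P - Z_P)^2\big)\big(C_{24}(X_P - Z_P)^2 + A_{24p}\big((X_P + Z_P)^2 - (X_P - Z_P)^2\big)\big); \end{aligned} \tag{1}$$
$$\begin{aligned} X_{P+Q} &= Z_{P-Q}\big[(X_P - Z_P)(X_Q + Z_Q) + (X_P + Z_P)(X_Q - Z_Q)\big]^2, \\ Z_{P+Q} &= X_{P-Q}\big[(X_P - Z_P)(X_Q + Z_Q) - (X_P + Z_P)(X_Q - Z_Q)\big]^2. \end{aligned} \tag{2}$$
Montgomery curves can be used to efficiently compute isogenies using Vélu’s formulas [30]. Suppose we want the image of a point $P$ under an $\ell$-isogeny $\phi$, where $\ell = 2k + 1$. For each $1 \le i \le k$ we let $(X_i : Z_i) = x([i]R)$, where $\langle R \rangle = \ker \phi$. Equation (3) computes $(X' : Z') = x(\phi(P))$ from $(X_P : Z_P) = x(P)$.
$$\begin{aligned} X' &= X_P \Big(\prod_{i=1}^{k} \big[(X_P - Z_P)(X_i + Z_i) + (X_P + Z_P)(X_i - Z_i)\big]\Big)^2, \\ Z' &= Z_P \Big(\prod_{i=1}^{k} \big[(X_P - Z_P)(X_i + Z_i) - (X_P + Z_P)(X_i - Z_i)\big]\Big)^2. \end{aligned} \tag{3}$$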
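The standard $x$-only Montgomery doubling and differential addition can be exercised on a toy curve as follows. This is a sketch under assumptions: the function names `xdbl`, `xadd` and `ladder`, the toy prime 419, and the use of plain Python integers are choices made for the example, and the ladder branches on the scalar bits, which is acceptable here only because the scalar is public.

```python
def xdbl(P, A24p, C24, p):
    """x-only doubling on a Montgomery curve with A24p = A + 2C, C24 = 4C."""
    X, Z = P
    t0 = (X + Z) ** 2 % p
    t1 = (X - Z) ** 2 % p
    d = (t0 - t1) % p                      # equals 4 * X * Z
    return C24 * t0 * t1 % p, d * (C24 * t1 + A24p * d) % p

def xadd(P, Q, D, p):
    """x-only differential addition: x(P + Q) from x(P), x(Q), D = x(P - Q)."""
    (XP, ZP), (XQ, ZQ), (XD, ZD) = P, Q, D
    u = (XP - ZP) * (XQ + ZQ) % p
    v = (XP + ZP) * (XQ - ZQ) % p
    return ZD * (u + v) ** 2 % p, XD * (u - v) ** 2 % p

def ladder(k, P, A, C, p):
    """Montgomery ladder computing x([k]P) for a public scalar k >= 1."""
    A24p, C24 = (A + 2 * C) % p, 4 * C % p
    R0, R1 = P, xdbl(P, A24p, C24, p)      # invariant: R1 - R0 = P
    for bit in bin(k)[3:]:                 # bits after the leading 1
        if bit == "1":
            R0, R1 = xadd(R0, R1, P, p), xdbl(R1, A24p, C24, p)
        else:
            R0, R1 = xdbl(R0, A24p, C24, p), xadd(R0, R1, P, p)
    return R0

# Toy check: E_0 : y^2 = x^3 + x over F_419 is supersingular with 420 points,
# so [420]P should be the point at infinity (Z = 0) for any starting x.
X, Z = ladder(420, (2, 1), A=0, C=1, p=419)
print("Z-coordinate of [420]P:", Z)
```

The same check is how one recognises supersingularity in practice: multiplying by $p + 1$ must kill every point, whether it lies on the curve or on its quadratic twist.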
4.1.2 Twisted Edwards curves
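In [1] we see that every Montgomery curve $E_{A,B} : By^2 = x^3 + Ax^2 + x$ is birationally equivalent to a twisted Edwards curve $E_{a,d} : ax^2 + y^2 = 1 + dx^2y^2$; the curve constants are related by
$$(A, B) = \left(\frac{2(a + d)}{a - d}, \frac{4}{a - d}\right) \quad\text{and}\quad (a, d) = \left(\frac{A + 2}{B}, \frac{A - 2}{B}\right),$$
and the rational maps $\phi : E_{a,d} \to E_{A,B}$ and $\psi : E_{A,B} \to E_{a,d}$ are defined by
$$\phi : (x, y) \mapsto \left(\frac{1 + y}{1 - y}, \frac{1 + y}{(1 - y)x}\right), \qquad \psi : (x, y) \mapsto \left(\frac{x}{y}, \frac{x - 1}{x + 1}\right). \tag{4}$$
Rewriting this relationship for Montgomery curves with projective constants, $E_{a,d}$ is equivalent to the Montgomery curve $E_{(A' : C')}$ with constants
$$A_{24p} := A' + 2C' = a, \qquad A' - 2C' = d, \qquad C_{24} := 4C' = a - d.$$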
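To avoid notational ambiguities, we write $(Y_P : T_P)$ for the projection of the $y$-coordinate of the point $P \in E_{a,d}$. Let $P = (X_P : Z_P) \in E_{(A' : C')}$. In projective coordinates, the map $\psi$ of (4) becomes
$$\psi : (X_P : Z_P) \mapsto (Y_P : T_P) = (X_P - Z_P : X_P + Z_P). \tag{5}$$
Comparing (5) with (1) reveals that $Y_P = X_P - Z_P$ and $T_P = X_P + Z_P$ appear in the doubling formula, so we can substitute them at no cost. Replacing $A_{24p}$ and $C_{24}$ with their twisted Edwards equivalents $a$ and $e = a - d$, respectively, we obtain a doubling formula for twisted Edwards $YT$ coordinates:
$$\begin{aligned} Y_{[2]P} &= e\,Y_P^2 T_P^2 - (T_P^2 - Y_P^2)\big(e\,Y_P^2 + a\,(T_P^2 - Y_P^2)\big), \\ T_{[2]P} &= e\,Y_P^2 T_P^2 + (T_P^2 - Y_P^2)\big(e\,Y_P^2 + a\,(T_P^2 - Y_P^2)\big). \end{aligned} \tag{6}$$
Similarly, the coordinates $(Y_P : T_P)$ and $(Y_Q : T_Q)$ appear in (2), and thus we derive differential addition formulas for twisted Edwards $YT$ coordinates:
$$\begin{aligned} Y_{P+Q} &= (T_{P-Q} - Y_{P-Q})(Y_P T_Q + T_P Y_Q)^2 - (T_{P-Q} + Y_{P-Q})(Y_P T_Q - T_P Y_Q)^2, \\ T_{P+Q} &= (T_{P-Q} - Y_{P-Q})(Y_P T_Q + T_P Y_Q)^2 + (T_{P-Q} + Y_{P-Q})(Y_P T_Q - T_P Y_Q)^2. \end{aligned} \tag{7}$$
The computational costs of doubling and differential addition are $4\mathbf{M} + 2\mathbf{S} + 4\mathbf{A}$ (the same as evaluating (1)) and $4\mathbf{M} + 2\mathbf{S} + 6\mathbf{A}$ (the same as (2)), respectively.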
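The substitution argument can be checked numerically: mapping a Montgomery $XZ$ point to $(X - Z : X + Z)$ and doubling there should give the same result as doubling in $XZ$ and mapping afterwards. The sketch below does this for one way of writing the $YT$ doubling, obtained by direct substitution into the Montgomery formula; the function names, the toy prime 419, and the exact arrangement of the formula are assumptions of this example rather than the paper's implementation.

```python
def xdbl(P, A24p, C24, p):
    """Montgomery x-only doubling with A24p = A + 2C and C24 = 4C."""
    X, Z = P
    t0, t1 = (X + Z) ** 2 % p, (X - Z) ** 2 % p
    d = (t0 - t1) % p
    return C24 * t0 * t1 % p, d * (C24 * t1 + A24p * d) % p

def to_yt(P, p):
    """Map Montgomery (X : Z) to twisted Edwards (Y : T) = (X - Z : X + Z)."""
    X, Z = P
    return (X - Z) % p, (X + Z) % p

def yt_dbl(Q, a, e, p):
    """Doubling directly on (Y : T), with Edwards constants a and e = a - d."""
    Y, T = Q
    y2, t2 = Y * Y % p, T * T % p          # 2S
    d = (t2 - y2) % p                      # 1A
    ey2 = e * y2 % p                       # 1M
    x_part = ey2 * t2 % p                  # 1M
    z_part = d * (ey2 + a * d) % p         # 2M + 1A
    return (x_part - z_part) % p, (x_part + z_part) % p   # 2A

p, A, C = 419, 6, 1
a, e = (A + 2 * C) % p, 4 * C % p
P = (5, 1)
print(to_yt(xdbl(P, a, e, p), p), yt_dbl(to_yt(P, p), a, e, p))
```

The operation count in the comments adds up to 4 multiplications, 2 squarings and 4 additions, since the sums and differences that the Montgomery formula computes on entry are already available as the $YT$ coordinates and are instead spent on producing the output pair.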
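The Moody–Shumow formulas for isogeny computation [25] are given in terms of twisted Edwards $YT$-coordinates. It remains to derive a twisted Edwards $YT$-coordinate isogeny-evaluation formula for $\ell$-isogenies where $\ell = 2k + 1$. We do this by applying the map in (5) to (3), which yields
$$\begin{aligned} Y' &= (T_P + Y_P)\Big(\prod_{i=1}^{k} (T_P Y_i + Y_P T_i)\Big)^2 - (T_P - Y_P)\Big(\prod_{i=1}^{k} (T_P Y_i - Y_P T_i)\Big)^2, \\ T' &= (T_P + Y_P)\Big(\prod_{i=1}^{k} (T_P Y_i + Y_P T_i)\Big)^2 + (T_P - Y_P)\Big(\prod_{i=1}^{k} (T_P Y_i - Y_P T_i)\Big)^2, \end{aligned}$$
where $(Y_i : T_i) = (X_i - Z_i : X_i + Z_i)$ are the $YT$-coordinates of the kernel points $[i]R$.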
The main advantage of the approach outlined here is that by only using points given in coordinates, we can compute point doubling, point addition and isogeny construction and evaluation at a lower computational cost. Indeed, isogeny evaluation in costs , whereas the above coordinate formula costs , thus saving field additions.
4.2 Addition chains for a faster scalar multiplication
Since the coefficients in CSIDH scalar multiplications are always known in advance (they are essentially system parameters), there is no need to hide them by using constant-time scalar multiplication algorithms such as the classical Montgomery ladder. Instead, we can use shorter differential addition chains. (Footnote 5: A differential addition chain is an addition chain such that for every chain element $c$ computed as $a + b$, the difference $a - b$ is already present in the chain.)
In the CSIDH group action computation, any given scalar $k$ is the product of a subset of the collection of the $n$ small primes $\ell_i$ dividing $(p + 1)/4$. We can take advantage of this structure to use shorter differential addition chains than those we might derive for general scalars of a comparable size. First, we precompute the shortest differential addition chains for each one of the $n$ small primes $\ell_i$. One then computes the scalar multiplication operation $[k]P$ as the composition of the differential addition chains for each prime $\ell_i$ dividing $k$.
Power analysis on the coefficient computation might reveal the degree of the isogeny that is currently being computed, but, since we compute exactly one $\ell_i$-isogeny for each $\ell_i$ per loop, this does not leak any secret information.
This simple trick allows us to compute scalar multiplications using differential addition chains of length roughly . This yields a saving of about 25% compared with the cost of the classical Montgomery ladder.
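To make the notion of a shortest differential addition chain concrete, the sketch below finds one by iterative-deepening search for small primes and compares its length with the operation count of a plain Montgomery ladder. The search routine and its name are assumptions made for this illustration (the paper does not say how its chains were computed), and the exhaustive search is only practical for small targets.

```python
def shortest_differential_chain(target, max_len=16):
    """Search for a shortest increasing differential addition chain 1, ..., target.

    Each new element is a + b for earlier elements a >= b, where either
    a == b (a doubling) or a - b already belongs to the chain.
    """
    def dfs(chain, limit):
        top = chain[-1]
        if top == target:
            return list(chain)
        steps_left = limit - (len(chain) - 1)
        # Prune: even doubling at every remaining step cannot reach target
        if steps_left == 0 or (top << steps_left) < target:
            return None
        members = set(chain)
        candidates = {a + b for a in chain for b in chain
                      if b <= a and (a == b or a - b in members)}
        for c in sorted(candidates, reverse=True):
            if top < c <= target:
                chain.append(c)
                found = dfs(chain, limit)
                chain.pop()
                if found:
                    return found
        return None

    for limit in range(max_len + 1):
        found = dfs([1], limit)
        if found:
            return found
    return None

def ladder_operations(k):
    """One initial doubling, then a doubling and an addition per remaining bit."""
    return 2 * (k.bit_length() - 1) + 1

for ell in (3, 5, 7, 11, 13):
    chain = shortest_differential_chain(ell)
    print(ell, chain, "chain ops:", len(chain) - 1, "ladder ops:", ladder_operations(ell))
```

Composing the per-prime chains, one prime factor of the scalar after another, then gives a chain for the full product without ever searching over large scalars.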
5 Removing dummy operations for fault-attack resistance
The use of dummy operations in the previous constant-time algorithms implies that the attacker can obtain information on the secret key by injecting faults into variables during the computation. If the final result is correct, then she knows that the fault was injected in a dummy operation; if it is incorrect, then the operation was real. For example, if one of the values in Line 2 of Algorithm 2 is modified without affecting the final result, then the adversary learns whether the corresponding exponent was zero at that point.
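The attack principle can be illustrated with a deliberately simplified model. In the sketch below, multiplication by 3 modulo a small prime stands in for one isogeny step, the secret is a single non-negative exponent, and a fault flips the low bit of one step's output; none of this is CSIDH itself, and all names are hypothetical. The point is only that a fault in a dummy step leaves the final result unchanged, which the attacker can observe.

```python
MOD = 419  # small prime used only for this toy model

def toy_action(secret_e, m, fault_at=None):
    """Run m steps; the first secret_e are real, the remaining ones are dummies."""
    state = 1
    for k in range(m):
        out = state * 3 % MOD          # stand-in for one isogeny step
        if fault_at == k:
            out ^= 1                   # injected fault corrupts this step's output
        is_real = k < secret_e
        state = out if is_real else state   # dummy result is discarded
    return state

def recover_exponent(oracle, m):
    """Find the first step whose corruption does not change the output."""
    correct = oracle(None)
    for k in range(m):
        if oracle(k) == correct:       # fault had no effect: step k was a dummy
            return k
    return m

secret, bound = 4, 10
recovered = recover_exponent(lambda k: toy_action(secret, bound, fault_at=k), bound)
print("recovered exponent:", recovered)
```

In this model a single scan over the step positions recovers the exponent exactly, which is why removing dummy steps altogether, so that every fault changes the output, is an attractive first-line defence.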
Fault injection attacks have been considered in the context of SIDH ([15], [29]), but to the best of our knowledge, they have not been studied yet on dummy operations in the context of CSIDH. Below we propose an approach to constant-time CSIDH without dummy computations, making every computation essential for a correct final result. This gives us some natural resistance to faults, at the cost of approximately a twofold slowdown.
Our approach to avoiding fault-injection attacks is to change the format of secret exponent vectors $(e_1, \dots, e_n)$. In both the original CSIDH and the Onuki et al. variants, the exponents $e_i$ are sampled from an integer interval $[-m_i, m_i]$ centered in $0$. For naive CSIDH, evaluating the action of $\mathfrak{l}_i^{e_i}$ requires evaluating between $0$ and $m_i$ isogenies, corresponding to either the ideal $\mathfrak{l}_i$ (for positive $e_i$) or $\mathfrak{l}_i^{-1}$ (for negative $e_i$). If we follow the approach of [26], then we must also compute $m_i - |e_i|$ dummy $\ell_i$-isogenies to ensure a constant-time behaviour.
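For our new algorithm, the exponents $e_i$ are uniformly sampled from sets
$$\mathcal{S}(m_i) = \{\, e \;:\; e \equiv m_i \pmod 2 \text{ and } |e| \le m_i \,\},$$
i.e., centered intervals containing only even or only odd integers. The interesting property of these sets is that a vector drawn from $\mathcal{S}(m)^n$ can always be rewritten (in a non-unique way) as a sum of $m$ vectors with entries $\pm 1$ (i.e., vectors in $\mathcal{S}(1)^n$). But the action of a vector drawn from $\mathcal{S}(1)^n$ can clearly be implemented in constant-time without dummy operations: for each coefficient $e_i$, we compute and evaluate the isogeny associated to $\mathfrak{l}_i$ if $e_i = 1$, or the one associated to $\mathfrak{l}_i^{-1}$ if $e_i = -1$. Thus, we can compute the action of vectors drawn from $\mathcal{S}(m)^n$ by repeating $m$ times this step.
More generally, we want to evaluate the action of vectors drawn from $\mathcal{S}(m_1) \times \cdots \times \mathcal{S}(m_n)$. Algorithm 4 achieves this in constant-time and without using dummy operations. The outer loop at line 4 is repeated exactly $\max(m_i)$ times, but the inner “if” block at line 4 is only executed $m_i$ times for each $i$; it is clear that this flow does not depend on secrets. Inside the “if” block, the coefficients $e_i$ are implicitly interpreted as
$$e_i = \operatorname{sign}(e_i)\,\underbrace{(1 + 1 + \cdots + 1)}_{|e_i| \text{ terms}} + \underbrace{(1 - 1) + (1 - 1) + \cdots + (1 - 1)}_{m_i - |e_i| \text{ terms}},$$
i.e., the algorithm starts by acting by $\mathfrak{l}_i^{\operatorname{sign}(e_i)}$ for $|e_i|$ iterations, then alternates between $\mathfrak{l}_i$ and $\mathfrak{l}_i^{-1}$ for $m_i - |e_i|$ iterations. We assume that the $\operatorname{sign} : \mathbb{Z} \to \{\pm 1\}$ operation is implemented in constant time, and that $\operatorname{sign}(0) = 1$. If one is careful to implement the isogeny evaluations in constant-time, then it is clear that the full algorithm is also constant-time.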
However, Algorithm 4 is only an idealized version of the CSIDH group action algorithm. Indeed, like in [21, 26], it may happen in some iterations that Elligator outputs points of order not divisible by $\ell_i$, and thus the action of $\mathfrak{l}_i$ or $\mathfrak{l}_i^{-1}$ cannot be computed in that iteration. In this case, we simply skip the loop and retry later: this translates into the variable $m_i$ not being decremented, so the total number of iterations may end up being larger than $\max(m_i)$. Like in Section 3, if the input value $u$ fed to Elligator is random, its output is uncorrelated to secret values (Footnote 6: assuming the usual heuristic assumptions on the distribution of the output of Elligator, see [21]), and thus the fact that an iteration is skipped does not leak information on the secret. The resulting algorithm is summarized in Algorithm 5.
To maintain the security of standard CSIDH, the bounds $m_i$ must be chosen so that the key space is at least as large. For example, the original implementation [6] samples secrets in $[-5, 5]^{74}$, which gives a key space of size $11^{74}$; hence, to get the same security we would need to sample secrets in $\mathcal{S}(10)^{74}$. But a constant-time version of CSIDH à la Onuki et al. only needs to evaluate five isogeny steps per prime $\ell_i$, whereas the present variant would need to evaluate ten isogeny steps. We thus expect an approximately twofold slowdown for this variant compared to Onuki et al., which is confirmed by our experiments.
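The exponent format and its dummy-free schedule can be sketched as follows. The helper names `S` and `step_signs` are hypothetical, and the code only models which direction ($\mathfrak{l}_i$ or $\mathfrak{l}_i^{-1}$) is taken at each of the $m_i$ steps for a single prime; it does not compute any isogeny, and it ignores the retries caused by Elligator points of unsuitable order.

```python
def sign(e):
    """Sign with the convention sign(0) = 1."""
    return 1 if e >= 0 else -1

def S(m):
    """Integers of absolute value at most m with the same parity as m."""
    return [e for e in range(-m, m + 1) if (e - m) % 2 == 0]

def step_signs(e, m):
    """Directions (+1 for l_i, -1 for its inverse) of the m real steps for e in S(m)."""
    assert e in S(m)
    head = [sign(e)] * abs(e)                       # |e| steps in the direction of e
    tail = [1 if k % 2 == 0 else -1                 # then alternate +1, -1, ...
            for k in range(m - abs(e))]             # m - |e| is even, so these cancel
    return head + tail

print("S(10) =", S(10))
print("schedule for e = -4, m = 10:", step_signs(-4, 10))
```

Every exponent in $\mathcal{S}(m)$ yields exactly $m$ steps whose directions sum to the exponent, so no step is ever discarded; for $m = 10$ the set has 11 elements, giving the same key space size $11^{74}$ as exponents in $[-5, 5]$.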
6 Derandomized CSIDH algorithms
As we stressed in Section 3, all of the algorithms presented here depend on the availability of high-quality randomness for their security. Indeed, the input to Elligator must be randomly chosen to ensure that the total running time is uncorrelated to the secret key. Typically, this would imply the use of a PRNG seeded with high quality true randomness that must be kept secret. An attack scenario where the attacker may know the output of the PRNG, or where the quality of PRNG output is less than ideal, therefore degrades the security of all algorithms. This is true even when the secret was generated with a high-quality PRNG if the keypair is static, and the secret key is then used by an algorithm with low-quality randomness.
We can avoid this issue completely if points of order , where is the maximum possible exponent (in absolute value) for , are available from the start. Unfortunately this is not possible with standard CSIDH, because such points are defined over field extensions of exponential degree.
Instead, we suggest modifying CSIDH as follows. First, we take a prime such that , where is a security parameter, and we restrict to exponents of the private key sampled from . Then, we compute two points of order on the starting public curve, one in and the other in , where is the Frobenius endomorphism. This computation involves no secret information and can be implemented in variable-time; furthermore, if the starting curve is the initial curve with , or a public curve corresponding to a long term secret key, these points can be precomputed offline and attached to the system parameters or the public key. We also remark that even for ephemeral public keys, a point of order must be computed anyway for key validation purposes, and thus this computation only slows down key validation by a factor of two.
Since we have restricted exponents to , every -isogeny in Algorithm 2 can be computed using only (the images of) the two precomputed points. There is no possibility of failure in the test of Line 2, and no need to sample any other point.
We note that this algorithm still uses dummy operations. If fault-injection attacks are a concern, the exponents can be further restricted to , and the group action evaluated as in (a stripped down form of) Algorithm 5. However this further increases the size of , as must now be equal to .
This protection comes at a steep price: at the 128-bit security level, the prime grows from 511 bits to almost 1500 bits. The resulting field arithmetic would be considerably slower, although the increase in overall running time would be slightly offset by the smaller number of isogenies to evaluate.
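To get a feel for why the prime grows, one can count how many small primes are needed once each exponent has only a handful of possible values. The sketch below is an assumption-laden illustration, not the authors' parameter selection: it assumes the target key space is 2^(2λ) for security parameter λ, that the variant with dummy operations allows three values per exponent, and that the dummy-free variant allows two. The function name is hypothetical.

```python
from math import ceil, log2

def primes_needed(choices_per_exponent: int, security_bits: int) -> int:
    """Smallest n with choices_per_exponent ** n >= 2 ** (2 * security_bits).

    Assumes the key space must reach 2^(2 * security_bits).
    """
    return ceil(2 * security_bits / log2(choices_per_exponent))

LAMBDA = 128  # security level discussed in the text

with_dummies = primes_needed(3, LAMBDA)  # assumed three values per exponent
dummy_free = primes_needed(2, LAMBDA)    # assumed two values per exponent

print(f"three values per exponent: {with_dummies} primes")
print(f"two values per exponent:   {dummy_free} primes")
```

Every additional small prime is a further factor of the group order, so a larger count translates directly into a larger field.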
On the positive side, the resulting system would have much stronger quantum security. Indeed, the best known quantum attacks are exponential in the size of the key space ( here), but only subexponential in (see [7, 13, 6]). Since our modification more than doubles the size of without changing the size of the key space, quantum security is automatically increased. For this same reason, for security levels beyond NIST-1 (64 quantum bits of security), the size of increases more than linearly in , and the variant proposed here becomes natural. Finally, parameter sets with a similar imbalance between the size of and the security parameter have already been considered in the context of isogeny based signatures [11], where they provide tight security proofs in the QROM.
Hence, while at the moment this costly modification of CSIDH may seem overkill, we believe further research is necessary to try and bridge the efficiency gap between it and the other side-channel protected implementations of CSIDH.
7 Experimental results
Tables 1 and 2 summarize our experimental results, and compare our algorithms with those of [6], [21], and [26]. Table 1 compares algorithms in terms of elementary field operations, while Table 2 compares cycle counts of C implementations. All of our experiments were run on an Intel(R) Core(TM) i7-6700K CPU 4.00GHz machine with 16GB of RAM. Turbo boost was disabled. The software environment was the Ubuntu 16.04 operating system and gcc version 5.5.
In all of the algorithms considered here (except the original [6]), the group action is evaluated using the SIMBA method (Splitting Isogeny computations into Multiple BAtches) proposed by Meyer, Campos, and Reith in [21]. Roughly speaking, SIMBA-m-μ partitions the set of primes into m disjoint subsets (batches) of approximately the same size. SIMBA-m-μ proceeds by computing isogenies for each batch; after μ steps, the unreached primes from each batch are merged.
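As a rough sketch of the batching idea only, the partition and the final merge can be pictured as follows. The round-robin assignment, the function names, the toy list of primes, and the counters of remaining isogenies are all assumptions made for illustration; the SIMBA implementation of [21] may assign primes to batches differently.

```python
def simba_batches(primes, num_batches):
    """Split primes into disjoint batches of nearly equal size (round-robin)."""
    return [primes[i::num_batches] for i in range(num_batches)]

def merge_unreached(batches, remaining):
    """Collect, from every batch, the primes that still have isogenies left."""
    return [p for batch in batches for p in batch if remaining[p] > 0]

# Toy example: the first twelve odd primes split into three batches.
primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
batches = simba_batches(primes, 3)

# Hypothetical state after the batched rounds: only 5 and 29 are unfinished.
remaining = {p: 0 for p in primes}
remaining[5] = 1
remaining[29] = 2
leftover = merge_unreached(batches, remaining)

print(batches)
print(leftover)
```

Working on one batch at a time shortens the scalar multiplications needed to obtain a point of the right order, which is the motivation for splitting in the first place.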
Castryck et al.
We used the reference CSIDH implementation made available for download by the authors of [6]. None of our countermeasures or algorithmic improvements were applied.
Meyer–Campos–Reith.
We used the software library freely available from the authors of [21]. This software batches isogenies using SIMBA-5-11. The improvements we describe in §3 and §4 were not applied.
Onuki et al.
Unfortunately, the source code for the implementation in [26] was not freely available, so direct comparison with our implementation was impossible. Table 1 includes their field operation counts for their unmodified algorithm using SIMBA-3-8. We did not apply the optimizations of §4 here. (We do not replicate the cycle counts from [26] in Table 2, since they may have been obtained using turbo boost, thus rendering any comparison invalid.)
Our implementations.
We implemented three constant-time CSIDH algorithms, using the standard primes with the exponent bounds from [26, §5.2].
MCR-style
This is essentially our version of Meyer–Campos–Reith (with one torsion point and dummy operations, batching isogenies with SIMBA-5-11), but applying the techniques of §3 and §4.
OAYT-style
This is essentially our version of Onuki et al. (using two torsion points and dummy operations, batching isogenies with SIMBA-3-8), but applying the techniques of §3 and §4.
No-dummy
This is Algorithm 5 (with two torsion points and no dummy operations), batching isogenies using SIMBA-5-11.
In each case, the improvements and optimizations of §3 and §4 are applied, including projective Elligator, short differential addition chains, and twisted Edwards arithmetic and isogenies. Our software library is freely available from https://github.com/JJChiDguez/csidh.
The field arithmetic is based on the Meyer–Campos–Reith software library [21]; since the underlying arithmetic is essentially identical, the performance comparisons below reflect differences in the CSIDH algorithms.
Results.
We see in Table 2 that the techniques we introduced in §3 and §4 produce substantial savings compared with the implementation of [21]. In particular, our OAYT-style implementation yields a 39% improvement over [21]. Since the implementations use the same underlying field arithmetic library, these improvements are entirely due to the techniques introduced in this paper. While our no-dummy variant is (unsurprisingly) slower, we see that the performance penalty is not prohibitive: it is less than twice as slow as our fastest dummy-operation algorithm, and only 22% slower than [21].
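The percentages above refer to the cycle counts of Table 2. The field-operation counts of Table 1 can be compared in a similar way. As a sketch, the Ratio column of Table 1 appears to be reproducible by summing multiplications and squarings and normalizing to the figures for [21]; this weighting is inferred from the numbers rather than stated in the text, and the dictionary keys and function name below are only illustrative.

```python
# (M, S, A) field-operation counts as listed in Table 1.
TABLE_1 = {
    "Castryck et al. [6]": (0.252, 0.130, 0.348),
    "Meyer-Campos-Reith [21]": (1.054, 0.410, 1.053),
    "Onuki et al. [26]": (0.733, 0.244, 0.681),
    "This work, MCR-style": (0.901, 0.309, 0.965),
    "This work, OAYT-style": (0.657, 0.210, 0.691),
    "This work, No-dummy": (1.319, 0.423, 1.389),
}

def ratio(name, baseline="Meyer-Campos-Reith [21]"):
    """(M + S) of one row relative to the baseline row (assumed weighting)."""
    m, s, _ = TABLE_1[name]
    base_m, base_s, _ = TABLE_1[baseline]
    return (m + s) / (base_m + base_s)

for name in TABLE_1:
    print(f"{name:26s} {ratio(name):.2f}")
```

Additions are ignored in this weighting, so the ratios are only a coarse proxy for running time.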
8 Conclusion and perspectives
We studied side-channel protected implementations of the isogeny-based primitive CSIDH. Previous implementations failed to be constant-time because of a subtle mistake. We fixed the problem and proposed new improvements, achieving the most efficient version of CSIDH protected against timing and simple power analysis attacks to date. All of our algorithms were implemented in C, and the source code was made publicly available online.
We also studied the security of CSIDH in stronger attack scenarios. We proposed a protection against some fault-injection and timing attacks that comes at the cost of only a twofold slowdown. We also sketched an alternative version of CSIDH “for the paranoid”, with much stronger security guarantees. At the moment, however, this version seems too costly for the security benefits; more work is required to make it competitive with the original definition of CSIDH.
References
- [1] Daniel J. Bernstein, Peter Birkner, Marc Joye, Tanja Lange, and Christiane Peters. Twisted Edwards curves. In Serge Vaudenay, editor, Progress in Cryptology - AFRICACRYPT 2008, volume 5023 of Lecture Notes in Computer Science, pages 389–405. Springer, 2008.
- [2] Daniel J. Bernstein, Mike Hamburg, Anna Krasnova, and Tanja Lange. Elligator: elliptic-curve points indistinguishable from uniform random strings. In 2013 ACM SIGSAC Conference on Computer and Communications Security, CCS’13, Berlin, Germany, November 4-8, 2013, pages 967–980, 2013.
- [3] Daniel J. Bernstein, Tanja Lange, Chloe Martindale, and Lorenz Panny. Quantum circuits for the CSIDH: optimizing quantum evaluation of isogenies. In Advances in Cryptology - EUROCRYPT 2019 - 38th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Darmstadt, Germany, May 19-23, 2019, Proceedings, Part II, pages 409–441, 2019.
- [4] Ward Beullens, Thorsten Kleinjung, and Frederik Vercauteren. CSI-FiSh: Efficient isogeny based signatures through class group computations. IACR Cryptology ePrint Archive, 2019:498, 2019.
- [5] Wouter Castryck, Steven D. Galbraith, and Reza Rezaeian Farashahi. Efficient arithmetic on elliptic curves using a mixed Edwards–Montgomery representation. Cryptology ePrint Archive, 2008:218, 2008.
- [6] Wouter Castryck, Tanja Lange, Chloe Martindale, Lorenz Panny, and Joost Renes. CSIDH: an efficient post-quantum commutative group action. In Advances in Cryptology - ASIACRYPT 2018 - 24th International Conference on the Theory and Application of Cryptology and Information Security, Brisbane, QLD, Australia, December 2-6, 2018, Proceedings, Part III, pages 395–427, 2018.
- [7] Andrew M. Childs, David Jao, and Vladimir Soukharev. Constructing elliptic curve isogenies in quantum subexponential time. J. Mathematical Cryptology, 8(1):1–29, 2014.
- [8] Craig Costello, Patrick Longa, and Michael Naehrig. Efficient algorithms for supersingular isogeny Diffie–Hellman. In Advances in Cryptology - CRYPTO 2016 - 36th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 14-18, 2016, Proceedings, Part I, pages 572–601, 2016.
