Algebraic aspects of solving Ring-LWE, including ring-based improvements in the Blum-Kalai-Wasserman algorithm

Katherine E. Stange

arXiv:1902.07140·cs.CR·July 14, 2020

TL;DR

This paper introduces a reduction technique for Ring-LWE problems using ring-structured samples and proposes Ring-BKW, a ring-aware variant of the BKW algorithm, enhancing efficiency and parallelization for cryptographic applications.

Contribution

It presents a novel reduction of Ring-LWE to subring problems and introduces Ring-BKW, a ring-structured BKW algorithm that improves efficiency and parallelization.

Findings

01. Reduction of Ring-LWE to subring problems using restricted samples

02. Ring-BKW algorithm respects ring structure and enables parallelization

03. Exploits symmetry to reduce computational resources

Abstract

We provide a reduction of the Ring-LWE problem to Ring-LWE problems in subrings, in the presence of samples of a restricted form (i.e. $(a, b)$ such that $a$ is restricted to a multiplicative coset of the subring). To create and exploit such restricted samples, we propose Ring-BKW, a version of the Blum-Kalai-Wasserman algorithm which respects the ring structure. Off-the-shelf BKW dimension reduction (including coded-BKW and sieving) can be used for the reduction phase. Its primary advantage is that there is no need for back-substitution, and the solving/hypothesis-testing phase can be parallelized. We also present a method to exploit symmetry to reduce table sizes, samples needed, and runtime during the reduction phase. The results apply to two-power cyclotomic Ring-LWE with parameters proposed for practical use (including all splitting types).

Tables

Table 1. The term “Ring-blind” refers to Algorithm 2 but with $j=0$ to $0$ in line 3, i.e. without rotating any initial samples. A fixed list of samples was generated pseudorandomly for each experiment; ‘Initial Samples’ refers to how many were used from the beginning of the list. ‘Reduced Samples’ refers to the number of samples eventually contained in the last table. ‘Runtime’ refers to the wall time as measured in Sage Mathematics Software. ‘Table Size’ refers to the total number of rows stored, not counting the final samples. ‘OD’ (One Difference) refers to the algorithms as presented in the paper. ‘AD’ (All Differences) refers to a modification in which every sample that matches a row is also stored in that row, and when a match is found, the differences with everything in the row are passed on.

$n = 2^{3}$ , $B = 2^{2}$ , $q = 211$	Ring-blind	Algorithm 2	Algorithm 3
Initial Samples	$4000 \cdot 2^{3}$	$4000$	$4000$
OD Table Size	$31999$	$31996$	$7999$
OD Reduced Samples	$1$	$1$	$1$
OD Runtime	$1.43$ s	$1.80$ s	$2.42$ s
AD Table Size	$31999$	$31996$	$7999$
AD Reduced Samples	$1$	$1$	$1$
AD Runtime	$1.73$ s	$1.92$ s	$2.46$ s
$n = 2^{4}$ , $B = 2^{2}$ , $q = 17$	Ring-blind	Algorithm 2	Algorithm 3
Initial Samples	$2000 \cdot 2^{4}$	$2000$	$2000$
OD Table Size	$31985$	$31988$	$7997$
OD Reduced Samples	$15$	$3$	$3$
OD Runtime	$6.13$ s	$6.24$ s	$4.69$ s
AD Table Size	$36448$	$36623$	$9163$
AD Reduced Samples	$81$	$21$	$21$
AD Runtime	$8.43$ s	$9.73$ s	$6.27$ s
$n = 2^{5}$ , $B = 2^{2}$ , $q = 7$	Ring-blind	Algorithm 2	Algorithm 3
Samples	$200 \cdot 2^{5}$	$200$	$200$
OD Table Size	$6368$	$6386$	$1596$
OD Reduced Samples	$31$	$13$	$4$
OD Runtime	$7.39$ s	$8.07$ s	$4.23$ s
$n = 2^{6}$ , $B = 2^{3}$ , $q = 3$	Ring-blind	Algorithm 2	Algorithm 3
Samples	$250 \cdot 2^{6}$	$250$	$250$
OD Table Size	$15988$	$15993$	$1998$
OD Reduced Samples	$12$	$7$	$2$
OD Runtime	$27.0$ s	$29.0$ s	$10.2$ s

Equations

$$\langle\alpha,\beta\rangle=\sum_{\sigma\text{ real}}\sigma(\alpha\beta)+\frac{1}{2}\sum_{\sigma\text{ complex}}\operatorname{Re}\left(\sigma(\alpha)\overline{\sigma(\beta)}\right).$$

$$\rho_{r}:K_{\mathbb{R}}\rightarrow(0,1],\quad\rho_{r}(\mathbf{x})=\exp(-\pi||\mathbf{x}||^{2}/r^{2}).$$

$$\rho_{r}(\mathcal{L})=\sum_{\lambda\in\mathcal{L}}\rho_{r}(\lambda)$$

$$\frac{\rho_{r}(\lambda)}{\rho_{r}(\mathcal{L})}.$$

$$R_{q}\cong\bigoplus_{i=1}^{g}R/\mathfrak{q}_{i}^{e_{i}}.$$

$$K,\ R,\ q,\ R_{q},\ n$$

$$m,\ \zeta:=\zeta_{m},\ \chi,\ \chi_{0},\ E_{\chi_{0}}$$

$$R=\mathbb{Z}[\zeta_{2n}]=\mathbb{Z}[x]/(x^{n}+1).$$

$$1,\zeta_{m},\zeta_{m}^{2},\ldots,\zeta_{m}^{n-1}.$$

$$R_{q}=R/qR\cong(\mathbb{Z}/q\mathbb{Z})[x]/(x^{n}+1),$$

$$R/\mathfrak{a}\cong\mathbb{F}_{q}[x]/(g(x))$$

$$\rho:R_{q}\rightarrow R/\mathfrak{a}.$$

$$\chi_{0}^{\prime}=\sum_{i=0}^{n/k-1}\rho(\zeta_{m}^{k})^{i}\chi_{0}.$$

$$\operatorname{ord}_{2}(q^{2^{i}}-1)=r+i$$

$$\rho(\zeta_{m}^{ik+j})=\rho(\zeta_{m}^{k})^{i}\rho(\zeta_{m}^{j})=\rho(\zeta_{m}^{k})^{i}\zeta_{m}^{j}.$$

$$\chi_{0}^{\prime}=\chi_{0}+\rho(\zeta_{m}^{k})\chi_{0}.$$

$$\operatorname{Tr}^{R}_{S}(x)\bmod qS=\operatorname{Tr}^{R_{q}}_{S_{q}}(x\bmod qR).$$

$$s^{\prime}=\frac{T(a_{0}s)}{T(a_{0})}.$$

$$T(as)=a^{\prime}T(a_{0}s).$$

$$(T(a),T(as+e))=\left(a^{\prime}T(a_{0}),\ a^{\prime}T(a_{0})\left(\frac{T(a_{0}s)}{T(a_{0})}\right)+T(e)\right)$$

$$\begin{aligned}R_{q}&=(\mathbb{Z}+\zeta_{k}\mathbb{Z}+\dots+\zeta_{k}^{k/2-1}\mathbb{Z})+\zeta_{m}(\mathbb{Z}+\zeta_{k}\mathbb{Z}+\dots+\zeta_{k}^{k/2-1}\mathbb{Z})\\&\quad+\dots+\zeta_{m}^{m/k-1}(\mathbb{Z}+\zeta_{k}\mathbb{Z}+\dots+\zeta_{k}^{k/2-1}\mathbb{Z})\\&=S_{q}+\zeta_{m}S_{q}+\dots+\zeta_{m}^{m/k-1}S_{q}.\end{aligned}$$

$$\begin{aligned}\operatorname{Tr}^{R_{q}}_{S_{q}}(\zeta_{m}^{i})&=\zeta_{m}^{i}\sum_{a=0}^{m/k-1}\zeta_{m}^{iak}\\&=\begin{cases}0&i\not\equiv 0~(\textup{mod}~\frac{m}{k})\\\frac{m}{k}\zeta_{m}^{i}&i\equiv 0~(\textup{mod}~\frac{m}{k})\end{cases}.\end{aligned}$$

$$\frac{1}{2}T^{R_{q}}_{S_{q}}(\zeta_{m}^{i})=\begin{cases}0&i\equiv 1~(\textup{mod}~2)\\\zeta_{m}^{i}&i\equiv 0~(\textup{mod}~2)\end{cases}.$$

$$\frac{T(a_{0}\zeta^{i}s)}{T(a_{0})}$$

$$\left(\frac{k}{m}T(a),\ \frac{k}{m}T(\zeta^{i}b)\right)$$

$$\begin{aligned}\left(\frac{k}{m}T(a),\ \frac{k}{m}T(a\zeta^{i}s+\zeta^{i}e)\right)&=\left(a^{\prime}\frac{k}{m}T(a_{0}),\ a^{\prime}\frac{k}{m}T(a_{0})\cdot\left(\frac{T(a_{0}\zeta^{i}s)}{T(a_{0})}\right)+\frac{k}{m}T(\zeta^{i}e)\right),\end{aligned}$$

$$c_{j}:=\frac{T(a_{0}\zeta^{j}s)}{T(a_{0})}.$$

$$T(a_{0}\zeta^{j}s)=c_{j}T(a_{0}),\quad j=0,\dots,m/k-1.$$

Full text

Algebraic aspects of solving Ring-LWE, including ring-based improvements in the Blum-Kalai-Wasserman algorithm

Katherine E. Stange

Department of Mathematics, University of Colorado, Campus Box 395, Boulder, Colorado 80309-0395

[email protected]

Key words and phrases:

Ring learning with errors, Ring-LWE, Blum-Kalai-Wasserman, post-quantum cryptography, cyclotomic field

2010 Mathematics Subject Classification:

Primary: 94A60, 11T71, 11R18

This research was supported by NSF-CAREER CNS-1652238 and NSF EAGER DMS-1643552.

1. Introduction

Ring Learning with Errors (Ring-LWE) [24] [25], and Learning with Errors (LWE) [27] more generally, are leading candidates for post-quantum cryptography. The cryptographic hard problem (Search Ring-LWE) is formally similar to discrete logarithm problems, so that protocols can be transferred from the latter context to the former. But it also allows for new applications, such as homomorphic encryption [8]. Ring-LWE is also fortunate in having security reductions from other lattice problems.

Ring-LWE is distinguished from Learning with Errors (LWE) by the use of lattices from number fields. This injection of number-theoretical structure leads to performance improvements, but may add vulnerabilities. So far, the number-theoretical structure has been only weakly exploited for attacks. The ring structure plays a role in security when the error distribution is skewed [9] [10] [11] [15] [16], or the secret is chosen from a subring or other ring-related non-uniform distribution [7]. In the related NTRU cryptosystem, the norm and trace maps to subfields play a role in attacks [1, 12, 17, 23].

However, the best known attacks on Ring-LWE parameters suggested for implementation are still generic attacks for LWE, e.g. [3]. The Blum-Kalai-Wasserman (BKW) algorithm is one such attack, which proceeds (in the first phase) combinatorially to create new samples in a linear subspace of the original problem, while controlling error expansion [5]. BKW has the drawback of requiring exponentially many samples, unless sample amplification is used [21]. Nevertheless, its performance has been of significant interest: for analysis and recent improvements, see [2] [14] [18] [19] [20] [22]. (Note that sample amplification does not immediately transfer from the LWE to the Ring-LWE setting, at least if one wishes the amplified samples to have Ring-LWE, and not just underlying LWE, format; the analogue would be the ‘sample rotation’ described below.)

This paper focuses on two-power-cyclotomic unital (but equivalently, dual [13] [26]) Search Ring-LWE, with no restriction on the splitting behaviour of the prime $q$ . The core of the paper is a reduction from higher-dimensional Ring-LWE problems with samples of a restricted form, to lower-dimensional Ring-LWE problems with the same error width, which is given in Theorem 5.2. The restricted form is as follows: samples $(a,b)$ such that $a$ lies in a cyclotomic subring, or a fixed multiplicative coset of such a subring. In the context of these theorems, it is natural to ask about creating samples of this restricted form using a ring variant of the Blum-Kalai-Wasserman algorithm.

One thus obtains a Ring-BKW algorithm, which uses the reduction phase of BKW, including all known speedups, to reduce the Ring-LWE problem to a subring. Then, the symmetry of the ring structure allows us to engineer an entire suite of subring problems in polynomially more time, whose solutions collectively solve the original Ring-LWE problem, again in polynomial time. Thus, the ‘hypothesis testing’ phase of BKW is parallelized, and the exponential ‘back-substitution’ phase is eliminated (Theorem 5.2). State-of-the-art off-the-shelf code for the BKW reduction phase and hypothesis testing phase may be used. Note that the reduction phase of BKW dominates the runtime, and hypothesis testing is typically polynomial. The now-eliminated back-substitution phase also runs in exponential time, differing from the reduction phase only by a smaller polynomial factor; hence the overall runtime savings is a polynomial factor. In Section 8, we describe the Ring-BKW algorithm.

The paper also addresses the use of symmetry to reduce the table sizes in BKW, here termed advanced keying in Section 9. Compared to a BKW reduction phase completely blind to the ring structure, this reduces the table size and samples needed by a factor of the block size, as well as reducing runtime, but requires that block sizes be taken to be a (possibly varying) power of $2$ .

We also discuss a square-root speedup over exhaustive search (which may be used, for example, in hypothesis testing); see Corollary 5.3.

See Section 10 for more discussion of practical runtime.

The key theoretical properties which are potentially advantageous (to an attacker) of Ring-LWE vs. plain LWE, are:

(1) Ring homomorphisms into smaller instances of the problem (the main tool of [10] [11] [15] [16]).

(2) The ability to rotate samples, e.g. replacing $(a,b)$ with $(\zeta a,\zeta b)$ or $(a,\zeta b)$, which are different but related Ring-LWE samples (see notation in Section 2); these represent symmetries of the lattice (previously used in lattice sieving [6] [28]; more generally, manipulation of samples by multiplication was exploited in [4]).

(3) The existence of subrings as linear subspaces (which is important in [7]).

(4) More generally, the multiplicative structure of certain linear subspaces.

(5) In the case of 2-power cyclotomics, the orthogonality of the lattice of the ring of integers and the orthogonal nature of the trace.

For us, all five of these attributes play an important role. It is a secondary purpose of this paper to lay out these advantages in a clear manner, to facilitate future analysis of the security of ring aspects of Ring-LWE. See Section 4.

Finally, it is also a secondary purpose of this paper to provide a treatment of the Ring-LWE problem which is inviting to the mathematical community.

Code demonstrating the correctness of the algorithm is available at:

https://math.katestange.net/code/ring-bkw/.

Acknowledgements

First, I would like to thank the anonymous referees on an earlier draft of this paper, who pointed out an important simplification. Second, I would like to thank my mother, Ursula Stange, and my husband, Jonathan Wise, without whose childcare help in the face of snowstorms, viruses, cancellations and fender-benders, this paper simply would not have been completed. To mathematician moms (and dads) everywhere: take heart.

2. Background and Setup for Ring-LWE

It is typical to set notation for Ring-LWE as in, for example, [7]; here we briefly review this notation in our context, and define the Ring-LWE problems.

2.1. Number field $K$ and ring $R$

Let $K$ be a number field over the rationals, of degree $n$ . Then $K$ is equipped with a bilinear form given by a modification of the trace pairing,

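$$\langle\alpha,\beta\rangle=\sum_{\sigma\text{ real}}\sigma(\alpha\beta)+\frac{1}{2}\sum_{\sigma\text{ complex}}\operatorname{Re}\left(\sigma(\alpha)\overline{\sigma(\beta)}\right).\tag{1}$$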

Here the sums are over real and complex embeddings, respectively (note that including both elements of each pair of conjugate complex embeddings necessitates the factor of $\frac{1}{2}$). This gives an isomorphism of $K_{\mathbb{R}}:=\mathbb{R}\otimes_{\mathbb{Q}}K$ with $\mathbb{R}^{n}$, taking the pairing above to the standard inner product, and (1) is chosen in such a way that the isomorphism is exactly that arising from the Minkowski or canonical embedding of algebraic number theory. We denote the associated norm by $||\mathbf{x}||=\sqrt{\langle\mathbf{x},\mathbf{x}\rangle}$.

The ring of integers $R$ of $K$ forms a lattice in $K_{\mathbb{R}}$ .

2.2. Gaussian distribution

Having geometry (in particular a norm $||\cdot||$) on $K_{\mathbb{R}}$ allows us to define Gaussian distributions. For a Gaussian parameter $r>0$, we write

$$\rho_{r}:K_{\mathbb{R}}\rightarrow(0,1],\quad\rho_{r}(\mathbf{x})=\exp(-\pi||\mathbf{x}||^{2}/r^{2}).$$

Normalizing this to obtain a probability distribution function $r^{-n}\rho_{r}$ , we obtain the continuous Gaussian probability distribution of width $r$ on $K_{\mathbb{R}}$ , denoted $D_{r}$ .

Note that, when considered with respect to an orthonormal basis, such a distribution is the sum of independent distributions in each coordinate, each having width $r$ . In this paper, we are concerned exclusively with this case.

With this normalization, the variance is $r^{2}/2\pi$, and one standard deviation is $r/\sqrt{2\pi}$. It is a sum of independent Gaussians in each coordinate for which $r$ equals $\sqrt{2\pi}\approx 2.51$ standard deviations (equivalently, $r/2$ corresponds to $\sqrt{\pi/2}\approx 1.25$ standard deviations).

In practice, the tails of the Gaussian may be cut off, so that the number of possible values in each coordinate is finite.

One may discretize a Gaussian distribution to obtain a distribution $\mathcal{D}_{r}$ on a lattice $\mathcal{L}\subset K_{\mathbb{R}}$ . That is, one takes

$$\rho_{r}(\mathcal{L})=\sum_{\lambda\in\mathcal{L}}\rho_{r}(\lambda)$$

and one samples an element $\lambda\in\mathcal{L}$ with probability

$$\frac{\rho_{r}(\lambda)}{\rho_{r}(\mathcal{L})}.$$

If $\mathcal{L}$ has an orthonormal basis, then again this distribution consists of independent distributions on the coefficients of the basis.
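The discretization can be made concrete in one coordinate. The following is a minimal sketch, assuming the lattice $\mathbb{Z}$ as a stand-in for a single coordinate of $\mathcal{L}$ and a tail cut at $10r$; the names `rho` and `discrete_gaussian_pmf` are hypothetical and not taken from the paper's code. It tabulates the probabilities $\rho_{r}(\lambda)/\rho_{r}(\mathcal{L})$ and compares the resulting variance with $r^{2}/2\pi$.

```python
import math


def rho(x, r):
    """Gaussian function rho_r(x) = exp(-pi * x^2 / r^2) for a real number x."""
    return math.exp(-math.pi * x * x / (r * r))


def discrete_gaussian_pmf(r, tail=10):
    """Probabilities rho_r(k) / rho_r(L) on L = Z, truncated to |k| <= tail * r."""
    bound = int(math.ceil(tail * r))
    support = range(-bound, bound + 1)
    total = sum(rho(k, r) for k in support)  # approximates rho_r(Z)
    return {k: rho(k, r) / total for k in support}


r = 5.0  # illustrative width, not a parameter from the paper
pmf = discrete_gaussian_pmf(r)
variance = sum(k * k * p for k, p in pmf.items())  # the mean is 0 by symmetry
print(variance, r * r / (2 * math.pi))
```

For widths that are not very small, the discrete variance is numerically indistinguishable from the continuous value; the tail cut only removes negligible mass.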

2.3. Prime $q$ and quotient ring $R_{q}$

Let $qR$ be the ideal generated by $q$ in $R$ . The fundamental setting of the Ring-LWE problem is the ring $R_{q}:=R/qR$ .

Letting $q=\mathfrak{q}_{1}^{e_{1}}\cdots\mathfrak{q}_{g}^{e_{g}}$ be the unique decomposition of $q$ into distinct prime ideals $\mathfrak{q}_{i}$ in $R$ , the Chinese remainder theorem gives

$$R_{q}\cong\bigoplus_{i=1}^{g}R/\mathfrak{q}_{i}^{e_{i}}.$$

If $q$ is unramified (which is typically the case), then $e_{i}=1$ for all $i$ . If $K$ is Galois (also typically the case in the cryptographic setting), then the Galois group acts transitively on the $\mathfrak{q}_{i}$ and they all have the same residue degree (the residue degree is the dimension of the quotient field $R/\mathfrak{q}_{i}$ as an $\mathbb{F}_{q}$ -vector space).
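For cyclotomic fields the splitting data can be computed directly. The sketch below relies on the standard fact that an odd prime $q$ has residue degree equal to the multiplicative order of $q$ modulo $m$ in the $m$-th cyclotomic field; it specializes to $m=2n$ with $n$ a power of two (the case treated later in the paper) and uses the $(n,q)$ pairs from Table 1. The function names are hypothetical.

```python
def multiplicative_order(q, m):
    """Smallest f >= 1 with q^f = 1 (mod m); assumes gcd(q, m) == 1 and m > 1."""
    f, power = 1, q % m
    while power != 1:
        power = (power * q) % m
        f += 1
    return f


def splitting_type(n, q):
    """Return (f, g) for an odd prime q in the 2n-th cyclotomic ring, n a power of two.

    f is the residue degree (the order of q modulo m = 2n) and g = n / f is the
    number of prime ideals above q.
    """
    f = multiplicative_order(q, 2 * n)
    return f, n // f


# The (n, q) pairs used in Table 1.
for n, q in [(8, 211), (16, 17), (32, 7), (64, 3)]:
    f, g = splitting_type(n, q)
    print(f"n={n}, q={q}: residue degree f={f}, number of primes g={g}")
```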

2.4. Ring-LWE distributions

For any $s\in R_{q}$ (the secret), and any distribution $\psi$ over $R_{q}$ (the error distribution), we write $A_{s,\psi}$ for the associated Ring-LWE distribution for secret $s$ over $R_{q}\times R_{q}$ , given by sampling $a$ uniformly over $R_{q}$ , sampling $e$ from $\psi$ , and outputting $(a,b:=as+e)$ .

Such outputs $(a,b)$ are called samples, and in a cryptographic application, these are observed publicly, while the secret is not meant to be exposed.

For the error distribution, we wish to define a ‘small’ distribution on $R_{q}$ , i.e. concentrated near the origin (in comparison to $q$ , which is large). It is typical to choose for the error distribution a discretized Gaussian distribution as described above (considered post factum modulo $qR$ ). This is the context in which security reductions apply. In implementations, it is sometimes suggested to approximate this by a uniform distribution on a box around the origin, etc.

2.5. Ring-LWE problems

The two fundamental Ring-LWE problems are (a) search: to compute the secret, upon observing sufficiently many samples; or (b) decision: to determine if the samples are hiding a secret at all, as opposed to being random noise. We state them more formally as follows.

Definition 2.1.

The search Ring-LWE problem, for error distribution $\psi$ and secret distribution $\varphi$ , is as follows: Given an error distribution $\psi$ over $R_{q}$ and a secret distribution $\varphi$ over $R_{q}$ , and some number of samples drawn from the distribution $A_{s,\psi}$ for some fixed $s$ drawn from $\varphi$ , compute $s$ .

Definition 2.2.

The decisional Ring-LWE problem, for error distribution $\psi$ and secret distribution $\varphi$ , is as follows: Given an error distribution $\psi$ over $R_{q}$ and a secret distribution $\varphi$ over $R_{q}$ , distinguish with non-negligible advantage, between

(1) samples drawn from the distribution $A_{s,\psi}$ for some fixed $s$ drawn from $\varphi$; and

(2) samples drawn uniformly from $R_{q}\times R_{q}$.

We remark that Ring-LWE is frequently defined in the context of the dual $R^{\vee}$ (the inverse of the different ideal). However, in the case that $K$ is a $2^{N}$ -th cyclotomic field, $R\cong 2^{N-1}R^{\vee}$ and this isomorphism is realized as a scaling in the canonical embedding, and thus preserves the error distribution up to scaling, so we can interchange the dual version with the simpler ‘unital’ version considered here [24].

Search-to-decision reductions are known in a variety of contexts [24]. This paper concerns both problems, but especially the search problem.

The Ring-LWE problem is formally similar to the discrete logarithm problem, which could be phrased in terms of samples $(a,a^{s})$ in a finite field: given $(a,a^{s})$ , find $s$ . In the ring $R_{q}$ , solving for $s$ given $(a,as)$ can be accomplished using linear algebra (Gaussian elimination), or by multiplication by $a^{-1}$ in the ring. By introducing a small error $e$ , so we have $(a,as+e)$ , multiplication by $a^{-1}$ is no longer helpful, and Gaussian elimination becomes useless, as it amplifies the errors to the point of washing out all useful information. From another perspective, the security stems from the fact that addition of an error value is somehow unpredictably mixing with respect to the multiplicative structure.

Another consequence of this setup is that given just one sample $(a,b)$, one has as many solutions $s$ to $b=as+e$ as there are possible values for $e$. In fact, the problem only has a unique solution once we have enough samples. If the samples are not Ring-LWE samples at all, then with sufficiently many samples, it becomes overwhelmingly likely that there are no values of $s$ so that $b_{i}-a_{i}s$ is in the support of the error distribution for all samples $(a_{i},b_{i})$. If the samples are Ring-LWE, this is the point at which the true secret is the only solution, with overwhelming probability.
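The narrowing of the solution set can be watched on a toy instance. This is a sketch under stated assumptions, not the paper's method: it takes the ring $(\mathbb{Z}/q\mathbb{Z})[x]/(x^{n}+1)$ with the deliberately tiny values $n=2$, $q=17$, error coefficients uniform in $\{-1,0,1\}$, and exhaustive search over all $q^{n}$ secrets. All function names are hypothetical.

```python
import itertools
import random


def negacyclic_mul(a, b, q):
    """Multiply a and b in (Z/qZ)[x]/(x^n + 1); coefficients are listed from degree 0."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] = (c[i + j] + ai * bj) % q
            else:
                c[i + j - n] = (c[i + j - n] - ai * bj) % q  # x^n = -1
    return c


def sample(s, q, errors, rng):
    """Draw one sample (a, b = a*s + e) with uniform a and coefficient-wise error."""
    n = len(s)
    a = [rng.randrange(q) for _ in range(n)]
    e = [rng.choice(errors) % q for _ in range(n)]
    b = [(x + y) % q for x, y in zip(negacyclic_mul(a, s, q), e)]
    return a, b


def consistent(candidate, a, b, q, support):
    """True if every coefficient of b - a*candidate lies in the error support."""
    prod = negacyclic_mul(a, candidate, q)
    return all((bi - pi) % q in support for bi, pi in zip(b, prod))


n, q = 2, 17                      # toy parameters, far below practical sizes
errors = [-1, 0, 1]               # assumed coefficient support
support = {e % q for e in errors}
rng = random.Random(0)
secret = [rng.randrange(q) for _ in range(n)]

candidates = [list(c) for c in itertools.product(range(q), repeat=n)]
counts = []
for _ in range(6):
    a, b = sample(secret, q, errors, rng)
    candidates = [c for c in candidates if consistent(c, a, b, q, support)]
    counts.append(len(candidates))
print(counts)  # number of secrets still consistent after each sample
```

The true secret always survives the filter; the list of surviving candidates can only shrink as samples are added.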

3. Specializing to $2$ -power cyclotomic Ring-LWE

We will now specialize to the $2$ -power cyclotomic case, fixing values for the variables

$$K,\ R,\ q,\ R_{q},\ n$$

from the last section, and defining

$$m,\ \zeta:=\zeta_{m},\ \chi,\ \chi_{0},\ E_{\chi_{0}}$$

for the $2$-power cyclotomic case. Whenever we refer to $2$-power cyclotomic Ring-LWE, we refer to all the conventions in this section.

3.1. Ring $R$

We let $K$ and $R$ be the $2n$ -th cyclotomic field and ring of integers, respectively, where $n$ is a power of two. This is of dimension $n$ (note that $\varphi(2n)=n$ ), and can be presented as

$$R=\mathbb{Z}[\zeta_{2n}]=\mathbb{Z}[x]/(x^{n}+1).$$

We will use the notation $m=2n$ and $\zeta_{m}$ for a primitive $m$ -th root of unity in $R$ and for its image in quotients of this ring.

3.2. The $\zeta$ -basis for $R$ and its quotients

A basis for $R$ is

$$1,\zeta_{m},\zeta_{m}^{2},\ldots,\zeta_{m}^{n-1}.$$

This will be called the $\zeta$ -basis. We have the relation $\zeta_{m}^{n}+1=0$ in $R$ and in all its quotients (this is the $2n$ -th cyclotomic polynomial evaluated at $\zeta_{m}$ ), but the minimal polynomial for $\zeta_{m}$ varies in these quotients, and may be a proper divisor of this cyclotomic polynomial. Nevertheless, in all quotients of $R$ , we still obtain a $\zeta$ -basis, i.e. a power basis in terms of $\zeta:=\zeta_{m}$ .

3.3. Prime $q$

Let $q$ be an odd prime, unramified in $R$ .

3.4. Ring $R_{q}$ and further quotients

We consider the quotient ring

$$R_{q}=R/qR\cong(\mathbb{Z}/q\mathbb{Z})[x]/(x^{n}+1),$$

which is an $\mathbb{F}_{q}$ -vector space of dimension $n$ . We may use the same $\zeta$ -basis for this ring (to be explicit, the images of the $\zeta$ basis for $R$ under the reduction modulo $q$ ).

We may also consider further quotients $R/\mathfrak{a}$ for $\mathfrak{a}\mid qR$ . We may also use a $\zeta$ -basis for these rings, although it may be of lower dimension over $\mathbb{F}_{q}$ (so fewer powers required). We have

$$R/\mathfrak{a}\cong\mathbb{F}_{q}[x]/(g(x))$$

where $g(x)\mid x^{n}+1$ . In particular, identifying $\zeta\in R/qR$ with its image in $R/\mathfrak{a}$ , the latter has an $\mathbb{F}_{q}$ -basis $1,\zeta,\zeta^{2},\ldots,\zeta^{\deg(g)-1}$ .

3.5. Error distribution $\chi$ , coefficient distribution $\chi_{0}$ and coefficient support $E_{\chi_{0}}$

We will denote the error distribution by $\chi$ . If this error distribution is formed using independent identically distributed coefficients on the $\zeta$ -basis, with coefficient distribution $\chi_{0}$ supported on a subset $E_{\chi_{0}}\subseteq\mathbb{F}_{q}$ , then we say that $\chi$ is formed on the $\zeta$ -basis with coefficients distributed according to $\chi_{0}$ . This is true, for example, of a discrete Gaussian distribution on two-power cyclotomics, or a distribution formed by choosing coefficients uniformly from some subset of $\mathbb{F}_{q}$ . For the former observation, the relevant fact is the following: the power basis associated to $\zeta_{m}$ is orthonormal (after scaling) in the canonical embedding. To see this, use (1) and observe that if $\zeta_{m}^{a}$ has order $2^{\ell}\geq 2$ , then $-\overline{\zeta_{m}}^{a}$ does also, hence the real parts of the complex embeddings of roots of unity form a collection symmetrical about zero. For this paper, we will concern ourselves exclusively with this case.
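The orthogonality claim can be checked numerically. The sketch below assumes the reading of pairing (1) in which, for $n\geq 2$, all $n$ embeddings are complex and are given by sending $\zeta_{m}$ to the primitive $m$-th roots of unity; the helper names are hypothetical. Under that reading the Gram matrix of the $\zeta$-basis should be $\frac{n}{2}$ times the identity, so the basis is orthonormal after scaling by $\sqrt{2/n}$.

```python
import cmath


def embeddings(n):
    """Images of zeta_m (m = 2n) under the n complex embeddings."""
    m = 2 * n
    return [cmath.exp(2j * cmath.pi * t / m) for t in range(1, m, 2)]


def pairing(a_exp, b_exp, n):
    """Pairing (1) applied to zeta^a_exp and zeta^b_exp (no real embeddings for n >= 2)."""
    return 0.5 * sum(
        (z ** a_exp * (z ** b_exp).conjugate()).real for z in embeddings(n)
    )


n = 8
gram = [[pairing(i, j, n) for j in range(n)] for i in range(n)]
for row in gram:
    print([round(entry, 6) for entry in row])
```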

3.6. Secret distribution

We will not make any particular assumption on the secret distribution. It may be taken to be uniform on $R_{q}$ . Note, however, that the method of [4, Section 3, Targeting $e_{i}$ ] could be used to manipulate the samples so the secret can be taken from the error distribution, preserving the Ring-LWE structure of the samples.

4. Key theoretical properties

In this section we highlight several key aspects of Ring-LWE absent in LWE.

4.1. Ring homomorphisms

If a Ring-LWE problem is presented in $R_{q}$ , then for any $\mathfrak{a}\mid qR$ , we have a ring homomorphism

$$\rho:R_{q}\rightarrow R/\mathfrak{a}.$$

This transports samples distributed according to $A_{s,\chi}$ to samples distributed according to $A_{\rho(s),\rho(\chi)}$ .

In general, the effect of $\rho$ on $\chi$ is problematic, i.e. it spreads out the error widely. As an illustration, we give a proposition governing the behaviour of $\rho$ on $\chi$ in the $2$ -power cyclotomic case, when $q\equiv 1~{}(\textup{mod}~{}4)$ .

Proposition 4.1.

Suppose we are in the $2$ -power cyclotomic case, and $R/\mathfrak{a}\cong\mathbb{F}_{q^{k}}$ , and $q\equiv 1~{}(\textup{mod}~{}4)$ . If, in $R_{q}$ , the error distribution $\chi$ is formed on the $\zeta$ -basis in $R_{q}$ with coefficients drawn from $\chi_{0}$ on $\mathbb{F}_{q}$ , then $\chi^{\prime}:=\rho(\chi)$ is formed on the $\zeta$ -basis in $\mathbb{F}_{q^{k}}$ with coefficients drawn from $\chi_{0}^{\prime}$ on $\mathbb{F}_{q}$ , where $\rho(\zeta_{m}^{k})\in\mathbb{F}_{q}$ and

$$\chi_{0}^{\prime}=\sum_{i=0}^{n/k-1}\rho(\zeta_{m}^{k})^{i}\chi_{0}.$$

Proof.

Define $r=\operatorname{ord}_{2}(q-1)$ , meaning that $2^{r}\mid q-1$ but $2^{r+1}\nmid q-1$ . Since $q\equiv 1~{}(\textup{mod}~{}4)$ , we have $r\geq 2$ . Furthermore, $q^{i}+1\equiv 2~{}(\textup{mod}~{}4)$ for all $i$ , so that $\operatorname{ord}_{2}(q^{2}-1)=\operatorname{ord}_{2}((q-1)(q+1))=r+1$ and, by induction

$$\operatorname{ord}_{2}(q^{2^{i}}-1)=r+i$$

for all $i\geq 1$ . As $k$ is defined as the embedding degree of the $2n$ -th roots of unity, we obtain $k=\frac{2n}{2^{r}}$ .

The element $\rho(\zeta_{m}^{k})$ satisfies $\rho(\zeta_{m}^{k})^{2n/k}=1$ in $R/\mathfrak{a}$ . Hence it is itself a primitive $2n/k$ -th root of unity, i.e. $2^{r}$ -th root of unity. Hence $\rho(\zeta_{m}^{k})\in\mathbb{F}_{q}$ by the definition of $r$ .

The main statement now follows from the fact that $1,\zeta_{m},\ldots,\zeta_{m}^{k-1}$ is an $\mathbb{F}_{q}$ -basis of $\mathbb{F}_{q^{k}}$ , that $\rho(\zeta_{m}^{k})\in\mathbb{F}_{q}$ and that for $0\leq j<k$ and $0\leq i<n/k$ , we have

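$$\rho(\zeta_{m}^{ik+j})=\rho(\zeta_{m}^{k})^{i}\rho(\zeta_{m}^{j})=\rho(\zeta_{m}^{k})^{i}\zeta_{m}^{j}.$$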

∎
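The two arithmetic facts used in the proof are easy to test numerically. The sketch below is only a sanity check with hypothetical helper names: it verifies $\operatorname{ord}_{2}(q^{2^{i}}-1)=r+i$ for small $i$ and compares $k=2n/2^{r}$ with the multiplicative order of $q$ modulo $2n$. It assumes $2^{r}\leq 2n$, so that the quotient $2n/2^{r}$ is an integer.

```python
def ord2(x):
    """2-adic valuation: the largest v with 2^v dividing the positive integer x."""
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v


def multiplicative_order(q, m):
    """Smallest f >= 1 with q^f = 1 (mod m); assumes gcd(q, m) == 1 and m > 1."""
    f, power = 1, q % m
    while power != 1:
        power = (power * q) % m
        f += 1
    return f


def check(q, n, max_i=5):
    """Return (r, valuations_ok, k from the formula, k as an order) for q = 1 (mod 4)."""
    assert q % 4 == 1
    r = ord2(q - 1)
    valuations_ok = all(ord2(q ** (2 ** i) - 1) == r + i for i in range(max_i + 1))
    k_formula = (2 * n) // (2 ** r)      # assumes 2^r <= 2n
    k_order = multiplicative_order(q, 2 * n)
    return r, valuations_ok, k_formula, k_order


for q, n in [(5, 8), (13, 16), (17, 16), (29, 8)]:
    print(q, n, check(q, n))
```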

For example, in the case that $k=n/2$ , we obtain

$$\chi_{0}^{\prime}=\chi_{0}+\rho(\zeta_{m}^{k})\chi_{0}.$$

This means the coefficients of $\chi^{\prime}$ are chosen from a sum of two Gaussian distributions with different coefficients. This is less controlled than twice a single Gaussian, since twice a Gaussian is simply a wider Gaussian, and the size of its support grows by approximately $\sqrt{2}$. However, in an uneven linear combination the size of the support $E_{\chi^{\prime}}$ is approximately the square of the size of $E_{\chi}$. (To be explicit, since $\chi_{0}$ is discrete, $c\chi_{0}$ is “spaced out” into isolated spikes, and each spike of support is transformed into a small Gaussian by the addition of $\chi_{0}$ to form $c\chi_{0}+\chi_{0}$.) This is a symptom of the protective property of these ring homomorphisms: they transform the error to something less amenable to attack. In fact, very quickly the image of a Gaussian error approaches uniform in the image ring as the dimension of the image ring decreases. And Ring-LWE samples with uniform error are informationless.
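The growth of the support can be seen on a small instance. The following sketch uses illustrative values that are not from the paper: $q=13$ (so $r=\operatorname{ord}_{2}(q-1)=2$ and $n/k=2$), the element $c=5$, which is a primitive fourth root of unity modulo $13$ and plays the role of $\rho(\zeta_{m}^{k})$, and a hypothetical coefficient support $\{-1,0,1\}$. The function name is also hypothetical.

```python
def support_of_sum(support, c, q):
    """Support of e1 + c*e2 (mod q) for independent e1, e2 drawn from `support`."""
    return {(e1 + c * e2) % q for e1 in support for e2 in support}


q = 13                 # q = 5 (mod 8), so ord_2(q - 1) = 2 and n/k = 2
c = 5                  # 5^2 = 25 = -1 (mod 13): a primitive 4th root of unity
support = [-1, 0, 1]   # hypothetical coefficient support E_{chi_0}

even = support_of_sum(support, 1, q)    # support of chi_0 + chi_0
uneven = support_of_sum(support, c, q)  # support of chi_0 + c * chi_0
print(len(even), len(uneven))           # prints: 5 9
```

Here the even sum has support of size $5$, while the uneven combination reaches $9=3^{2}$ values, the square of the original support size.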

4.2. Rotating samples

The ring structure allows us to generate new (but not independent) samples from old.

Proposition 4.2.

Suppose $\chi$ is invariant under multiplication by $\zeta$. If $(a,b)$ is distributed according to $A_{s,\chi}$, then

(1) $(\zeta a,\zeta b)$ is also distributed according to $A_{s,\chi}$,

(2) $(a,\zeta b)$ is distributed according to $A_{\zeta s,\chi}$.

In particular, in the $2$ -power cyclotomic case, a discrete Gaussian is invariant under multiplication by $\zeta_{m}$ and all its powers.

We call these rotated samples. One could also rotate by other small values, e.g. $1+\zeta_{m}$ in the $2$ -power cyclotomic case, at a small cost in changing the error distribution. (This may allow for adapting the notion of sample amplification to the Ring-LWE case; see [21].)
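In the presentation $(\mathbb{Z}/q\mathbb{Z})[x]/(x^{n}+1)$, multiplication by $\zeta$ is a shift of the coefficient vector with a sign change on the wrapped entry. The sketch below, with hypothetical function names and illustrative parameters, checks both statements of Proposition 4.2 on one sample: the rotated pairs are valid samples for the secrets $s$ and $\zeta s$, and in both cases the error is $\zeta e$, whose coefficients are those of $e$ up to order and sign.

```python
import random


def negacyclic_mul(a, b, q):
    """Multiply a and b in (Z/qZ)[x]/(x^n + 1); coefficients are listed from degree 0."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] = (c[i + j] + ai * bj) % q
            else:
                c[i + j - n] = (c[i + j - n] - ai * bj) % q  # x^n = -1
    return c


def rotate(a, q):
    """Multiply by zeta = x: shift up by one and negate the wrapped coefficient."""
    return [(-a[-1]) % q] + a[:-1]


def sub(u, v, q):
    """Coefficient-wise difference modulo q."""
    return [(x - y) % q for x, y in zip(u, v)]


def centered(x, q):
    """Representative of x modulo q in the interval (-q/2, q/2]."""
    return x - q if x > q // 2 else x


n, q = 8, 211
rng = random.Random(1)
s = [rng.randrange(q) for _ in range(n)]
a = [rng.randrange(q) for _ in range(n)]
e = [rng.choice([-2, -1, 0, 1, 2]) % q for _ in range(n)]  # assumed small error
b = [(x + y) % q for x, y in zip(negacyclic_mul(a, s, q), e)]

# (zeta*a, zeta*b) is a sample for the same secret s, with error zeta*e.
err1 = sub(rotate(b, q), negacyclic_mul(rotate(a, q), s, q), q)
# (a, zeta*b) is a sample for the secret zeta*s, again with error zeta*e.
err2 = sub(rotate(b, q), negacyclic_mul(a, rotate(s, q), q), q)

print([centered(x, q) for x in e])
print([centered(x, q) for x in err1])
```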

4.3. Subrings and trace maps

If considering Ring-LWE in $R_{q}$ , where $R$ is the ring of integers of a number field $K$ , then any subfield $L\subseteq K$ gives rise to a subring $S\subseteq R$ (i.e., the ring of integers of $L$ ) and, modulo $q$ , to a subring $S_{q}\subseteq R_{q}$ . Then $S_{q}$ is an $\mathbb{F}_{q}$ -vector subspace of $R_{q}$ , and $R_{q}$ has a module structure over $S_{q}$ . The dimensions of $K$ over $L$ , $R$ over $S$ and $R_{q}$ over $S_{q}$ agree.

There is a linear map $T:=\operatorname{Tr}^{R_{q}}_{S_{q}}:R_{q}\rightarrow S_{q}$ satisfying the following relationship to the usual trace map from $R$ to $S$ :

$$\operatorname{Tr}^{R}_{S}(x)\bmod qS=\operatorname{Tr}^{R_{q}}_{S_{q}}(x\bmod qR).$$

To see this, remark that $qS$ is elementwise fixed by the Galois group of $K/L$ and $qR$ is the extension of $qS$ to $R$ , so the Galois group takes $qR$ to itself. Therefore the Galois group acts on $R_{q}$ fixing $S_{q}$ . Therefore we may define $\operatorname{Tr}^{R_{q}}_{S_{q}}(x)$ to be the sum of $\sigma(x)$ for $\sigma$ in the Galois group of $K/L$ , and the relationship above holds.

The ring $R$ is always an $S$ -module, but the reader is cautioned that in a general number field, $R$ may not be a free module over $S$ .

4.4. Multiplicative cosets of subrings

The set $a_{0}S_{q}$ , for any invertible $a_{0}\in R_{q}$ , is an $\mathbb{F}_{q}$ -vector subspace of $R_{q}$ of dimension equal to the dimension of $S_{q}$ . Distinct such subspaces intersect only at subspaces consisting of non-invertible elements of $R_{q}$ , and $R_{q}^{*}$ (the invertible elements of $R_{q}$ ) lie in the union of all such subspaces.

Let us write $A_{a_{0}S_{q},s,\chi}$ for the distribution on $a_{0}S_{q}\times R_{q}$ given by choosing $a$ uniformly in $a_{0}S_{q}$ , choosing $e$ according to error distribution $\chi$ and outputting $(a,b:=as+e)$ .

Proposition 4.3.

If $(a,b)$ is distributed according to $A_{a_{0}S_{q},s,\chi}$ where $\chi$ is invariant under multiplication by $\zeta$, then

(1) $(\zeta a,\zeta b)$ is distributed according to $A_{\zeta a_{0}S_{q},s,\chi}$, and

(2) $(a,\zeta b)$ is distributed according to $A_{a_{0}S_{q},\zeta s,\chi}$.

The multiplicative coset structure gives rise to another type of sample reduction, beyond ring homomorphism. We have

Proposition 4.4.

Suppose $s\in R_{q}$ is fixed. Define $T:=Tr^{R_{q}}_{S_{q}}$ , the trace map described above. Consider a collection of samples distributed according to $A_{a_{0}S_{q},s,\chi}$ , where $a_{0}\in R_{q}^{*}$ is fixed and $T(a_{0})$ is invertible. Then $T$ maps such samples to samples distributed according to $A_{s^{\prime},T(\chi)}$ in $S_{q}$ , where

$$s^{\prime}=T(a_{0})^{-1}\,T(a_{0}s).$$

Proof.

For $a=a_{0}a^{\prime}\in a_{0}S_{q}$ , since $T$ is $S_{q}$ -linear, we have

$$T(a)=a^{\prime}\,T(a_{0})\qquad\text{and}\qquad T(as)=a^{\prime}\,T(a_{0}s).$$

This implies that

$$T(b)=T(as)+T(e)=T(a)\,T(a_{0})^{-1}\,T(a_{0}s)+T(e).$$

This proves the proposition. ∎

4.5. Trace maps for two-power cyclotomics

The final piece of the puzzle is the behaviour of the trace map $T$ of the previous section. In the case of the $2$-power cyclotomics, the trace map is particularly well-behaved in terms of its effect on the error distribution. In fact, it takes very many of the basis elements $\zeta_{m}^{i}$ to zero. This is a feature of the orthogonality of the basis $1,\zeta_{m},\ldots,\zeta_{m}^{n-1}$, and it may be proved with reference to basic algebraic number theory, as follows.

Using the notation of Section 4.3 in the case of the $2$ -power $m$ -th cyclotomics $K$ , let $L$ be the $k$ -th cyclotomic subfield. One may take $\zeta_{k}=\zeta_{m}^{m/k}$ and $S_{q}$ has a basis $1,\zeta_{k},\ldots,\zeta_{k}^{k/2-1}$ over $\mathbb{F}_{q}$ . We collect terms to write

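$$\sum_{i=0}^{n-1}x_{i}\zeta_{m}^{i}=\sum_{j=0}^{m/k-1}\zeta_{m}^{j}\left(\sum_{\ell=0}^{k/2-1}x_{j+\ell m/k}\,\zeta_{k}^{\ell}\right),\qquad x_{i}\in\mathbb{F}_{q}.$$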

In other words, $R_{q}$ has a $\zeta$ -basis over $S_{q}$ .

The elements of the Galois group of $K/L$ are given by $\zeta_{m}\mapsto\zeta_{m}^{a}$ for $a\in(\mathbb{Z}/m\mathbb{Z})^{*}$ satisfying $a\equiv 1~{}(\textup{mod}~{}k)$ , and so

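$$\operatorname{Tr}^{K}_{L}(\zeta_{m}^{i})=\sum_{a\equiv 1~(\textup{mod}~k)}\zeta_{m}^{ia}=\begin{cases}\frac{m}{k}\,\zeta_{m}^{i}&\text{if }\frac{m}{k}\mid i,\\ 0&\text{otherwise.}\end{cases}$$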

In particular, for the trace to the index two subfield, we have:

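$$\operatorname{Tr}^{K}_{L}(\zeta_{m}^{i})=\begin{cases}2\,\zeta_{m}^{i}&\text{if }i\text{ is even},\\ 0&\text{if }i\text{ is odd.}\end{cases}$$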

This special case can be seen directly by observing that if $i$ is even, then $\zeta_{m}^{i}\in S$ , while if $i$ is odd, then $\zeta_{m}^{i}$ is the square root of something in $S$ , i.e. it satisfies the minimal polynomial $x^{2}-\zeta_{m}^{2i}$ , and hence has trace zero. An alternate proof of the general case then follows by application of the special case $\log_{2}(m/k)$ times.
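The behaviour of the trace on powers of $\zeta_{m}$ can be checked by summing Galois conjugates directly. The sketch below is a minimal illustration, assuming the model $R_{q}=\mathbb{Z}_{q}[x]/(x^{n}+1)$ with $\zeta_{m}=x$ and $k\geq 2$; the parameters and the helper names `apply_automorphism`, `trace` and `basis_element` are hypothetical.

```python
def apply_automorphism(f, a, q):
    """Apply zeta -> zeta^a (a odd) to f in Z_q[x]/(x^n + 1)."""
    n = len(f)
    out = [0] * n
    for i, c in enumerate(f):
        t = (i * a) % (2 * n)          # zeta has multiplicative order m = 2n
        if t < n:
            out[t] = (out[t] + c) % q
        else:                          # zeta^n = -1
            out[t - n] = (out[t - n] - c) % q
    return out

def trace(f, k, q):
    """Sum of the conjugates of f under zeta -> zeta^a with a = 1 (mod k)."""
    n = len(f)
    total = [0] * n
    for a in range(1, 2 * n, k):       # m/k values of a, all odd because k is even
        image = apply_automorphism(f, a, q)
        total = [(x + y) % q for x, y in zip(total, image)]
    return total

def basis_element(i, n):
    return [1 if j == i else 0 for j in range(n)]

n, q, k = 8, 17, 4                     # illustrative: m = 16, subring of degree k/2 = 2
d = (2 * n) // k                       # d = m/k, the degree of the extension
traces = [trace(basis_element(i, n), k, q) for i in range(n)]

# Expected: zeta^i maps to d * zeta^i when d divides i, and to 0 otherwise.
expected = [[d if (j == i and i % d == 0) else 0 for j in range(n)]
            for i in range(n)]

# Index-two case (k = m/2 = n): even powers are doubled, odd powers vanish.
index_two = [trace(basis_element(i, n), n, q) for i in range(n)]
```

Dividing by $m/k$ modulo $q$ (for example with `pow(d, -1, q)` in Python 3.8 or later) turns this map into the coordinate projection that keeps exactly the exponents divisible by $m/k$.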

In summary then, the trace map preserves the error distribution up to small factors. The following proposition, which is now immediate, makes this explicit.

Proposition 4.5.

Suppose we are in the two-power cyclotomic case as in Section 3, where in particular $R$ is the ring of integers of the $m$ -th cyclotomics, with $m$ a power of two. Let $S$ be the subring of integers of the $k$ -th cyclotomics (hence $k$ is also a power of two). Write $T:=Tr^{R_{q}}_{S_{q}}$ for the trace map described in Section 4.3. Suppose that $\chi$ is an error distribution formed on the $\zeta$ -basis of $R_{q}$ with coefficients chosen according to $\chi_{0}$ . Then $\frac{k}{m}T$ takes values in $S_{q}$ and $\frac{k}{m}T(\chi)$ is the error distribution formed on the $\zeta$ -basis of $S_{q}$ with coefficients from $\chi_{0}$ .

The efficacy of the trace map with respect to the error distribution is due to its being an orthogonal projection to the space spanned by a subset of an orthonormal basis.

5. Reducing to a smaller ring

We demonstrate that if one can find sufficiently many samples whose $a$ values are restricted to a fixed multiplicative coset of a subring, then we can reduce the Ring-LWE problem to multiple independent Ring-LWE instances in the subring, without error inflation.

For this section, we are in the two-power cyclotomic case. Let $R$ be the ring of $m$-th cyclotomic integers, where $m$ is a power of two (so $R$ has dimension $n$, where $m=2n$), and let $S$ be the ring of $k$-th cyclotomic integers, where $k\mid m$. Then we have an extension of rings $S\subseteq R$ of degree $m/k$. Suppose that the rational prime $q$ is unramified in $R$.

Proposition 5.1.

Consider a Ring-LWE instance in $R_{q}$ with secret $s$ and error distribution $\chi$ . Let $a_{0}\in R_{q}$ be a fixed invertible element. Let $T:=\operatorname{Tr}^{R_{q}}_{S_{q}}$ , and suppose that $T(a_{0})$ is invertible.

Let $i$ be an integer. Then in time polynomial in $n$ and $\log q$ , one can reduce a Ring-LWE sample from distribution $A_{a_{0}S_{q},s,\chi}$ to a Ring-LWE sample in $S_{q}$ drawn according to secret

$$\frac{k}{m}\,T(\zeta^{i}a_{0}s)$$

and error distribution $\frac{k}{m}T(\zeta^{i}\chi)\subseteq S_{q}$ .

In particular, by Proposition 4.5, coefficient distributions of a $\zeta$ -invariant $\chi$ and its resulting distribution $\frac{k}{m}T(\zeta^{i}\chi)$ are of the same size; it is in this sense that the errors do not inflate.

Proof.

Consider the sample $(a,b)$ where $b=as+e$ . Multiplying the second coordinate of the sample by $\zeta^{i}$ and taking the trace $\frac{k}{m}T$ , we obtain as in Proposition 4.4, a sample

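$$\left(a^{\prime},\ \tfrac{k}{m}T(\zeta^{i}b)\right)=\left(a^{\prime},\ a^{\prime}\cdot\tfrac{k}{m}T(\zeta^{i}a_{0}s)+\tfrac{k}{m}T(\zeta^{i}e)\right),$$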

where $a^{\prime}:=aa_{0}^{-1}\in S_{q}$ .

Multiplication in the ring, and taking the trace, are polynomial in the ring size. ∎

The following is the main theorem of the paper.

Theorem 5.2.

Suppose $R$ is the ring of $m$ -th cyclotomic integers, for $m=2n$ a power of two, and $S$ is the ring of $k$ -th cyclotomic integers, where $k\mid m$ , so the extension $S\subseteq R$ is of degree $m/k$ . Suppose that the rational prime $q$ is unramified in $R$ .

Consider a Ring-LWE instance in $R_{q}$ with secret $s$ and error distribution $\chi$ which is invariant under multiplication by $\zeta=\zeta_{m}$ , a primitive $m$ -th root of unity. Let $a_{0}\in R_{q}$ be a fixed invertible element. Let $T:=\operatorname{Tr}^{R_{q}}_{S_{q}}$ as defined in Section 4.3, and suppose that $T(a_{0})$ is invertible.

Suppose one obtains $N$ samples $(a,b)$ distributed according to $A_{a_{0}S_{q},s,\chi}$ (notation from Section 4.4).

Then in time linear in the number of samples $N$ , and polynomial in $n$ and $\log q$ , one can reduce the computation of the secret $s\in R_{q}$ to the solution of $m/k$ Search Ring-LWE problems in $S_{q}$ with error distribution $\frac{k}{m}T(\chi)$ , having $N$ samples each. These $m/k$ problems are independent in the sense that setting up any one of them does not require having solved any other one.

Furthermore, if $\chi$ is formed on the $\zeta$ -basis from coefficient distribution $\chi_{0}$ on $\mathbb{F}_{q}$ (see Section 3.5 for definition), then so is $\frac{k}{m}T(\chi)$ .

Proof.

Set $i=j$ in Proposition 5.1 for each $j$ in the range $j=0,\ldots,m/k-1$ , to obtain $N$ samples having secret

$$c_{j}:=\frac{k}{m}\,T(\zeta^{j}a_{0}s).$$

Using an oracle that solves Search Ring-LWE in $S_{q}$ , obtain $c_{j}$ .

Collecting all the values $c_{j}$ , we have a linear system of $m/k$ equations over $S_{q}$ , whose indeterminates are the coefficients of $s$ (expressed in terms of a basis for $R_{q}$ over $S_{q}$ ), of the form

$$\frac{k}{m}\,T(\zeta^{j}a_{0}s)=c_{j},\qquad j=0,\ldots,m/k-1.$$

The linear equations are independent provided that $\{a_{0}\zeta^{j}\}$ is a set of $S_{q}$ -independent vectors in $R_{q}$ . We saw above that $\{\zeta^{j}\}_{j=0,\ldots,m/k-1}$ is a basis for $R_{q}$ over $S_{q}$ . Thus independence is guaranteed by the fact that $a_{0}$ is invertible. Note that we can consider this system to consist of $n$ independent linear equations over $\mathbb{F}_{q}$ . The system can be solved by Gaussian elimination to recover $s$ .

All the field operations concerned are polynomial in the size of the ring. We must apply the trace to $N$ samples $m/k$ times, and we must carry out Gaussian elimination of dimension $n=m/2$ over $\mathbb{F}_{q}$ , which is polynomial in $m$ and $\log q$ . ∎
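To make the reduction concrete, here is a small Python sketch of the special case $a_{0}=1$. It is an illustration under several assumptions that are not in the paper: $R_{q}$ is modelled as $\mathbb{Z}_{q}[x]/(x^{n}+1)$, the normalised trace $\frac{k}{m}T$ is implemented as the coordinate projection onto exponents divisible by $d=m/k$, the subring oracle is replaced by computing $c_{j}$ directly from $s$, and the final linear system is inverted by the shortcut $s=\sum_{j}\zeta^{-j}c_{j}$ (valid for $a_{0}=1$) rather than by Gaussian elimination. All names and parameters are hypothetical.

```python
import random

def mul(f, g, q):
    """Product in Z_q[x]/(x^n + 1)."""
    n = len(f)
    out = [0] * n
    for i, fi in enumerate(f):
        if fi == 0:
            continue
        for j, gj in enumerate(g):
            t = i + j
            if t < n:
                out[t] = (out[t] + fi * gj) % q
            else:
                out[t - n] = (out[t - n] - fi * gj) % q
    return out

def rot(f, h, q):
    """Multiply f by zeta^h for any integer h (zeta^n = -1, zeta^(2n) = 1)."""
    n = len(f)
    out = [0] * n
    for i, c in enumerate(f):
        t = (i + h) % (2 * n)
        if t < n:
            out[t] = c % q
        else:
            out[t - n] = (-c) % q
    return out

def proj(f, d):
    """Normalised trace on the zeta-basis: keep exponents divisible by d = m/k."""
    return [c if i % d == 0 else 0 for i, c in enumerate(f)]

n, q, d = 8, 17, 4                     # illustrative: S_q has dimension n/d = 2
rng = random.Random(1)
s = [rng.randrange(q) for _ in range(n)]

def sample_in_subring():
    a = [rng.randrange(q) if i % d == 0 else 0 for i in range(n)]   # a in S_q
    e = [rng.choice([-1, 0, 1]) % q for _ in range(n)]
    b = [(x + y) % q for x, y in zip(mul(a, s, q), e)]
    return a, b, e

samples = [sample_in_subring() for _ in range(5)]

# The d reduced secrets; in practice each would come from a subring solver.
c = [proj(rot(s, j, q), d) for j in range(d)]

reduced_samples_valid = True
errors_stay_small = True
for a, b, e in samples:
    for j in range(d):
        b_j = proj(rot(b, j, q), d)    # reduced second coordinate
        e_j = proj(rot(e, j, q), d)    # reduced error, still ternary
        rhs = [(x + y) % q for x, y in zip(mul(a, c[j], q), e_j)]
        reduced_samples_valid = reduced_samples_valid and b_j == rhs
        errors_stay_small = errors_stay_small and all(x in (0, 1, q - 1) for x in e_j)

# Reassemble s from the subring secrets (shortcut specific to a_0 = 1).
recovered = [0] * n
for j in range(d):
    recovered = [(x + y) % q for x, y in zip(recovered, rot(c[j], -j, q))]
```

The $d=m/k$ reduced problems share the same $a$ values but are set up independently of one another, which is what allows them to be solved in parallel.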

As a small corollary, note that in any small Ring-LWE situation where exhaustive search may apply, it is equally possible to use the above for a square-root speedup, provided many samples are available. As an example, if we have a coefficient distribution with support not including all of $\mathbb{F}_{q}$ , then the following statement demonstrates the approach.

Corollary 5.3.

Consider a Ring-LWE instance in $R_{q}$ with secret $s$ and error distribution $\chi$ formed on a $\zeta$ -basis with coefficient distribution having support strictly smaller than $\mathbb{F}_{q}$ .

There is an algorithm to solve this problem, with success probability $1/2$ , in time and number of samples $q^{n/2}$ times factors polynomial in $n\log q$ , using space polynomial in $n\log q$ .

Proof.

Note that the hypotheses guarantee $\chi$ is invariant under multiplication by $\zeta$ . Let $S_{q}$ be the ring of index two in $R_{q}$ (i.e. $n$ -th roots of unity). Collect samples, discarding all but those with $a\in S_{q}$ . In time $O(Nq^{n/2})$ we can accumulate $N$ samples with $a\in S_{q}$ . Apply Theorem 5.2 to reduce to two Ring-LWE problems in $S_{q}$ with $N$ samples each. The error distribution $\chi$ on $R_{q}$ gives an error distribution $\frac{1}{2}T^{R_{q}}_{S_{q}}(\chi)$ on $S_{q}$ . If $\chi$ is formed on a $\zeta$ -basis with coefficients supported in $E_{\chi}\subsetneq\mathbb{F}_{q}$ , then $\frac{1}{2}T^{R_{q}}_{S_{q}}(\chi)$ is formed on a $\zeta$ -basis with coefficients supported in $E_{\chi}\subsetneq\mathbb{F}_{q}$ . Therefore, if the number of samples is sufficient, the reduced Ring-LWE problems are solvable using exhaustive search through possible $s$ values.

In our case, we need $N$ large enough so that a Ring-LWE problem in $S_{q}$ with $N$ samples has a unique solution with probability $1/\sqrt{2}$. Although $N$ depends upon $|E_{\chi}|$, for the worst case $|E_{\chi}|=q-1$, $N$ is still polynomial in $n\log q$. Solve the reduced problems by exhaustive search; each search takes time $O(q^{n/2})$ and succeeds with probability $1/\sqrt{2}$. ∎

6. Background on the Blum-Kalai-Wasserman algorithm

First, we will give a very brief overview of the Blum-Kalai-Wasserman (BKW) algorithm in the context of LWE [5]. It is a combinatorial algorithm in which samples are collected and stored so as to facilitate the creation of new samples, as iterated sums and differences of established ones. The goal is to create new samples for which $a$ is restricted to a linear subspace. This is the reduction phase of the full BKW algorithm.

In BKW, after reduction, there is a hypothesis testing phase, in which one solves a lower-dimensional Ring-LWE problem (that given by restricting $a$ to the subspace) by exhaustive search over possible secrets. And then there is a back-substitution phase, where the small piece of the secret recovered in hypothesis testing is used to rework the problem to prepare the next small piece for hypothesis testing.

One can think of BKW as a sort of controlled Gaussian elimination on a matrix whose rows are samples, in which one wants to obtain as much simplification as possible using just one sum or difference of rows. By keeping the coefficients of the linear combinations small, we prevent the error ‘blow-up’ that occurs with regular Gaussian elimination. The cost is in needing many more matrix rows (samples) in order to be able to choose good linear combinations. The back-substitution phase is analogous to the eponymous phase of Gaussian elimination, with the recovered portion of the secret taking the role of the free variable. From another point of view, BKW reduction is a sort of iterated birthday attack, in which one searches for and exploits collisions which eliminate entries of the vectors, reducing to a subspace, where one searches again for collisions, and so on.

Now let us be more precise. During the reduction phase, only the $a$-value of a sample matters, considered as a vector in a vector space $V$, and the goal is to create samples with $a\in W$, a linear subspace of $V$. Suppose, for the sake of explanation, that $W$ is defined by the first $r$ coefficients of its vectors being $0$. One generates an ordered list of the first $r$ entries of all the vectors $a$ which are observed. Whenever a new vector $a$ is observed, it is compared to the ordered list. If it is not already present, it is added. Otherwise, we have discovered two samples $(a,b)$ and $(a^{\prime},b^{\prime})$ for which $(a-a^{\prime},b-b^{\prime})$ is a new sample for which $a-a^{\prime}$ lies in $W$. The penalty is that the error distribution of these new samples is widened. We begin a new table of such vectors as they are generated. In this way, we produce a large number of samples in a smaller subspace at the cost of inflating the error widths.
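The single reduction step just described can be sketched in a few lines of Python. This is a minimal illustration for plain LWE, assuming a dictionary in place of the ordered list, a ternary error, and toy parameters; passing through samples whose first $r$ entries are already zero is a choice made here for convenience. The function name `bkw_reduce_block` is hypothetical.

```python
import random

def bkw_reduce_block(samples, r, q):
    """One-difference BKW step: emit samples whose first r entries of a are 0."""
    table = {}                         # key: first r entries of a stored sample
    reduced = []
    for a, b in samples:
        key = tuple(a[:r])
        if not any(key):               # already in W: pass it straight on
            reduced.append((a, b))
        elif key in table:             # collision: the difference lies in W
            a2, b2 = table[key]
            reduced.append(([(x - y) % q for x, y in zip(a, a2)], (b - b2) % q))
        else:
            table[key] = (a, b)
    return reduced, table

def centered(x, q):
    """Representative of x mod q in the interval around zero."""
    return ((x + q // 2) % q) - q // 2

n, q, r = 6, 11, 2                     # illustrative parameters only
rng = random.Random(2)
s = [rng.randrange(q) for _ in range(n)]
samples = []
for _ in range(400):
    a = [rng.randrange(q) for _ in range(n)]
    e = rng.choice([-1, 0, 1])
    samples.append((a, (sum(x * y for x, y in zip(a, s)) + e) % q))

reduced, table = bkw_reduce_block(samples, r, q)

# Errors of the reduced samples: differences of two ternary errors, so at most 2.
residuals = [centered(b - sum(x * y for x, y in zip(a, s)), q) for a, b in reduced]
```

The residuals show the widening of the error: a difference of two samples carries the difference of their errors.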

Instead of performing this reduction all at once, one chooses an appropriate block size $\beta$ for BKW (which is fixed throughout in the naïve implementation), which is to say, the codimension of $W$ as a subspace of $V$ . Once we have produced enough samples in $W$ , we can use these to perform another BKW reduction to a subspace $W^{\prime}\subseteq W$ of codimension $\beta$ in $W$ . The cost of a reduction step is exponential in $\beta$ , so we keep $\beta$ as small as possible. We perform block reductions until the samples are all taken from a small enough subspace to run an exhaustive search or other strategy to finish off the problem. The limiting factor on shrinking $\beta$ is an upper limit on the number of blocks used overall. Each reduction into codimension $\beta$ has a cost in error-inflation. We have a limit on the total error inflation (because hypothesis testing will fail if the error is so inflated as to appear uniform), which limits the total number of blocks.

The BKW algorithm has been improved in recent years, including using coding theory to reduce the number of values that need to be stored and compared, sieving at each step, allowing the block size to vary, using the Fourier transform to speed up hypothesis testing; see [2] [14] [18] [19] [20] [22].

7. Reduction using BKW

In this section, we address the problem of finding sufficiently many samples $(a,b)$ having $a$ from an appropriate subring $S_{q}\subseteq R_{q}$ , so that Theorem 5.2 will apply. For this, we use the reduction phase of the BKW algorithm. We emphasize that it is possible, once the samples have been given in an appropriate basis, to use an off-the-shelf BKW reduction algorithm, including coded BKW with sieving etc., for the reduction phase. The window size may be chosen at will, for example, and need not depend upon the ring structure. Then, Theorem 5.2, which is polynomial time, replaces all the other phases of BKW.

The only adaptor necessary to connect BKW to Theorem 5.2 is an attention to the basis used. In order to perform the reduction, we begin with the $\zeta$ -basis of $R_{q}$ over $\mathbb{F}_{q}$ , namely

$$1,\ \zeta,\ \zeta^{2},\ \ldots,\ \zeta^{n-1},$$

and then reorder it to produce a prioritized basis. The most important property we desire for our purposes is that if one of $\zeta^{i}$ and $\zeta^{j}$ has lower multiplicative order than the other, then it comes later than the other. One computationally convenient way to accomplish this is to take the bit-reversal permutation on $n$ elements (i.e. $a$ maps to $b$ if the binary representation of $a$ in $\log_{2}(n)$ bits, read backwards, is $b$), then reverse the order. For concreteness, the prioritized basis (in part) is as follows:

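$$\zeta^{n-1},\ \zeta^{n/2-1},\ \zeta^{3n/4-1},\ \zeta^{n/4-1},\ \ldots,\ \zeta^{3n/4},\ \zeta^{n/4},\ \zeta^{n/2},\ 1.$$

For example, when $n=8$ this is $\zeta^{7},\zeta^{3},\zeta^{5},\zeta,\zeta^{6},\zeta^{2},\zeta^{4},1$.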
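The ordering described above (bit-reversal followed by reversal) is easy to compute. The following Python sketch, with hypothetical function names, lists the exponents of the prioritized basis and checks that multiplicative orders never increase along it.

```python
from math import gcd

def bit_reverse(i, bits):
    """Reverse the lowest `bits` binary digits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def prioritized_exponents(n):
    """Exponents e such that zeta^e runs through the prioritized basis (n a power of 2)."""
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)][::-1]

def order_of_power(e, n):
    """Multiplicative order of zeta^e, where zeta is a primitive 2n-th root of unity."""
    return (2 * n) // gcd(e, 2 * n)

exps = prioritized_exponents(8)
orders = [order_of_power(e, 8) for e in exps]
```

For $n=8$ the last $1$, $2$ and $4$ exponents are $\{0\}$, $\{4,0\}$ and $\{6,2,4,0\}$, which span the subrings of dimension $1$, $2$ and $4$ respectively.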

Using any type of BKW reduction, one now reduces, with respect to this basis. To be precise, one seeks to eliminate the earlier coefficients of the elements $a$ , as expressed in this basis. At the end, at most the last $2^{k}$ coefficients are non-zero, for some small $k$ . For example, one may reduce until only the last $1$ , $2$ , $4$ or $8$ coefficients are possibly non-zero.

The varying block sizes during the reduction algorithm itself need not respect any restrictions, and improvements such as coded-BKW with sieving may be used. For example, coded-BKW, under the assumption the secret $s$ is small, associates to each $a$ a codeword $c$ from a linear code. Then the sample $(a,as+e)$ is replaced with $(c,as+e)$, which is a valid sample with a larger error, before it is fed to the BKW tables. The tables then have fewer rows because their rows are chosen from codewords. In sieving, imagine that one has stored the original $a$ along with each new sample $(c,as+e)$. The difference between $a$ and $c$ measures the error inflation introduced by coding. A collision between $(c,a_{1}s+e)$ and $(c,a_{2}s+e)$ being passed to another table has an $a=a_{1}-a_{2}$ that is not actually $0$ in the first few entries, only small. Among the vectors being fed from one table to the next, one can pause to sieve them, creating vectors whose $a$’s are somewhat smaller. This reduces the error inflation introduced by the coding process.

The important thing is that, whatever technique is used, after reduction, one has obtained samples with $a\in S_{q}=S/qS$ for some $S$ of dimension $2^{k}$ . One then applies Theorem 5.2.

8. The Ring-BKW algorithm

In this section we summarize the Ring-BKW algorithm for completeness. In short, one uses an off-the-shelf BKW reduction algorithm on samples with respect to a particular choice of basis, then applies Theorem 5.2. The important point is that the back-substitution phase of BKW is no longer needed, and the hypothesis-testing phase can be parallelized. The hypothesis-testing phase can also be off-the-shelf, including recent improvements using the Fourier transform etc. [14]. However, we will elaborate somewhat.

Ring-BKW algorithm

Choose a subring $S\subseteq R$ of dimension $B$ over $\mathbb{Z}$ (corresponding to a lower-degree $2$ -power cyclotomic field), to which we wish to reduce. Define $R_{q}$ and $S_{q}$ as before. The Ring-BKW Algorithm is given as Algorithm 1.

The ring structure is not relevant in step (1); one uses BKW reduction as for any LWE problem (in particular, the window size can be chosen without regard to the ring structure). In fact, any reduction algorithm to obtain values $a\in S_{q}$ will do as well.

The following theorem relates any reduction algorithm to the solution of Search Ring-LWE. For the following, we consider Gaussian error with a well-defined width; an expansion factor refers to a multiplicative factor on the width.

Theorem 8.1.

Suppose that $\mathcal{B}$ is an algorithm which, given a Ring-LWE problem of dimension $n$ over $\mathbb{F}_{q}$ , produces $N$ Ring-LWE samples of dimension $B$ with error expansion factor of $f$ , in time $t_{\mathcal{B}}(n,B,f,N)$ , and using $r_{\mathcal{B}}(n,B,f,N)$ original samples.

Suppose that $\mathcal{R}$ is an algorithm which solves Ring-LWE in dimension $B$ over $\mathbb{F}_{q}$ in time $t_{\mathcal{R}}(B)$ , given error width less than or equal to $w$ and at least $N$ samples.

Then, there is an algorithm $\mathcal{A}$ which solves Ring-LWE in $R_{q}$ having width $\sigma$ in time

$$t_{\mathcal{B}}(n,B,w/\sigma,N)+\frac{n}{B}\,t_{\mathcal{R}}(B)+N\cdot\operatorname{poly}(n\log q)$$

using $r_{\mathcal{B}}(n,B,w/\sigma,N)$ samples.

Proof.

We will use Algorithm 1. We will set $f=w/\sigma$ . The time to run the reduction phase is $t_{\mathcal{B}}(n,B,w/\sigma,N)$ . The time to create the smaller Ring-LWE problems is linear in $N$ and polynomial in $n\log q$ from Theorem 5.2. Solving the $\frac{n}{B}$ smaller Ring-LWE problems (guaranteed to succeed by the choice of $f$ ) takes time $t_{\mathcal{R}}(B)$ each. Then reconstructing the secret (as in Theorem 5.2) again takes polynomial time. ∎

9. Advanced Keying

In the previous section, one uses BKW on LWE to perform reduction, say with block size $B$ . Given a Ring-LWE sample, there are in fact $n$ rotated samples one could feed into the reduction:

$$(a,b),\ (\zeta a,\zeta b),\ (\zeta^{2}a,\zeta^{2}b),\ \ldots,\ (\zeta^{n-1}a,\zeta^{n-1}b).$$

Naïvely, one may include them all, or include the first one. Probably the best course of action is to include them all, to increase the number of collisions located amongst the available samples (since the number of samples needed is the downside to BKW in general). By including all rotations, one catches all collisions of the form $a_{1}\pm\zeta^{i}a_{2}$ for some $i$ . These are all perfectly useful collisions for the algorithm, if the error term is $\zeta$ -invariant. In this section we propose a space-saving approach based on symmetries, which is equivalent, in terms of collisions obtained per sample, to storing all rotations of the samples. (If one chooses to compare to running BKW without rotating samples at all, i.e. ring-blind, it will both reduce storage and require fewer samples.)

In the discussion that follows, the reduction algorithm described in Section 6 will be called traditional BKW reduction to distinguish it from the advanced keying BKW reduction proposed in this section. There are a variety of modern speedups and alternatives (such as coded-BKW and sieving) which could also be combined with advanced keying, but for purposes of clarity we will ignore these until later in this section. In particular, in traditional BKW reduction, when a collision is recorded, nothing is added to the current table, but the difference is passed to the next table. (Later, it will prove helpful to call this one-difference and compare it to all-differences where new samples are stored as well as passed on, to increase the number of collisions.)

Our proposal in this section is an analogue of the space-saving technique used in traditional BKW, wherein for each sample $(a,b)$ we may derive two samples $(a,b)$ and $(-a,-b)$: we choose one canonically (where the first non-zero coefficient of $a$ is in $\left\{1,\ldots,\frac{q-1}{2}\right\}$, say), and save only this one. By doing so, we will catch all collisions between samples where their sum or their difference vanishes, and save half the table rows in the process. More precisely, the number of rows of the table for each block never exceeds $(q^{B}-1)/2$, since the possible non-zero vectors come in pairs of which we store at most one. Furthermore, this is also a time efficiency issue. If instead one simply included $(a,b)$ and $(-a,-b)$ among the incoming samples, then without this trick, the collisions $a_{1}+a_{2}$ and $-a_{1}-a_{2}$ are both sent on to the next table, both are multiplied by $-1$ thereafter, and we actually end up with repeat samples that must be weeded out at a later stage. For reference, traditional BKW reduction, with this space-saving technique, is given explicitly in Algorithm 2.
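The sign trick can be illustrated as follows. This Python sketch is an assumption-laden toy, not the paper's Algorithm: the samples are random placeholders (the $b$ values carry no secret), the table is a dictionary, and the names `canonical_sign` and `reduce_block_signed` are hypothetical. It only demonstrates the canonical choice and the resulting bound on table rows.

```python
import random

def canonical_sign(a, b, q):
    """Return whichever of (a, b), (-a, -b) has its first non-zero entry in 1..(q-1)/2."""
    for x in a:
        if x % q != 0:
            if x % q <= (q - 1) // 2:
                return [y % q for y in a], b % q
            return [(-y) % q for y in a], (-b) % q
    return [0] * len(a), b % q         # a = 0: nothing to normalise

def reduce_block_signed(samples, r, q):
    """One-difference reduction on the first r entries, storing canonical signs only."""
    table, reduced = {}, []
    for a, b in samples:
        a, b = canonical_sign(a, b, q)
        key = tuple(a[:r])
        if not any(key):
            reduced.append((a, b))
        elif key in table:             # catches both a1 - a2 = 0 and a1 + a2 = 0
            a2, b2 = table[key]
            reduced.append(([(x - y) % q for x, y in zip(a, a2)], (b - b2) % q))
        else:
            table[key] = (a, b)
    return reduced, table

n, q, r = 4, 7, 1                      # illustrative parameters only
rng = random.Random(3)
samples = [([rng.randrange(q) for _ in range(n)], rng.randrange(q))
           for _ in range(50)]
reduced, table = reduce_block_signed(samples, r, q)
max_rows = (q ** r - 1) // 2           # the bound (q^B - 1)/2 with B = r
```

With $q=7$ and block size $1$, the only possible keys are $1$, $2$ and $3$, so the table never holds more than three rows.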

The fundamental observation is that the prioritized basis proposed in the last section is particularly well-suited to this type of strategy, because of the resulting ‘negacyclic permutation’ effect of multiplication by $\zeta$ . It results in a savings of $1/2B$ instead of $1/2$ and is completely analogous to the trick above in both space and efficiency savings. It requires that the block size $B$ be a power of $2$ .

Write $\mathbf{a}\in\mathbb{F}_{q}^{n}$ for the vector of coefficients of $a$ in the prioritized basis. The action of $\zeta^{h}$ (taking $a$ to $\zeta^{h}a$) on such a vector permutes the entries, and swaps the sign on some of them (since $\zeta^{n}=-1$). Suppose $h$ is exactly divisible by $2^{\ell}$ (i.e. $\operatorname{ord}_{2}(h)=\ell$). With regards to the permutation only (ignoring the signs), the permutation has the property that it stabilizes each consecutive block of length $n/2^{\ell}$ throughout (that is, it permutes each block individually). For fixed $\ell$, there are exactly $n/2^{\ell}$ such integers $h$ (note that $h$ is taken modulo $2n$, for $h=2n$ results in the identity permutation). The following consequence is key:

Property 1.

Let $B\mid n$ denote block size. Then applying $\zeta^{n/B}$ preserves the property that $\mathbf{a}$ has first block (or series of any number of first blocks) consisting of zero entries.

This property will allow us to rotate samples by any of the $B$ quantities $1,\zeta^{n/B},\zeta^{2n/B},\ldots,\zeta^{(B-1)n/B}$ during BKW reduction with block size $B$ .
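Property 1 can be verified experimentally. The sketch below is an illustration with hypothetical names and toy parameters; it assumes the prioritized basis is obtained by bit-reversal followed by reversal, as described in Section 7, and represents $a$ by its coefficient vector in that basis.

```python
def bit_reverse(i, bits):
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def prioritized_exponents(n):
    bits = n.bit_length() - 1          # n is a power of two
    return [bit_reverse(i, bits) for i in range(n)][::-1]

def rotate_prioritized(vec, h, q, exps):
    """Coefficients (in the prioritized basis) of zeta^h * a, given those of a."""
    n = len(vec)
    pos = {e: p for p, e in enumerate(exps)}
    out = [0] * n
    for p, c in enumerate(vec):
        t = (exps[p] + h) % (2 * n)
        if t < n:
            out[pos[t]] = c % q
        else:                          # zeta^n = -1 flips the sign
            out[pos[t - n]] = (-c) % q
    return out

n, q, B = 16, 17, 4                    # illustrative parameters only
exps = prioritized_exponents(n)

# A vector whose first block is zero and whose other entries are non-zero.
vec = [0] * B + list(range(1, n - B + 1))
rotated = rotate_prioritized(vec, n // B, q, exps)

first_block_still_zero = all(x == 0 for x in rotated[:B])

# Up to sign, each block of length B is only permuted within itself.
blocks_preserved = all(
    sorted(min(x, q - x) for x in rotated[i:i + B])
    == sorted(min(x, q - x) for x in vec[i:i + B])
    for i in range(0, n, B))
```

Since $(\zeta^{n/B})^{B}=\zeta^{n}=-1$, the $B$ rotations together with an overall sign give $2B$ equivalent forms of each block, which is the source of the $1/2B$ saving.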

Next, one must specify a canonical choice of representative from the set of possible rotations $\{a,\zeta^{n/B}a,\ldots,\zeta^{(B-1)n/B}a\}$ , depending only on the first non-zero block of entries, up to an overall sign. A possible canonical choice is the ordering which has smallest first entry (in absolute value), together with some tie-breaking conventions, e.g. smallest second entry, etc., and if all entries are equal in absolute value, then some appropriate convention on sign changes between $\mathbf{a}$ and $|\mathbf{a}|$ , etc. However, any ordering of the possible length- $B$ vectors modulo overall sign, will do. It is not possible to break a tie if the first $B$ entries of the two rotations actually agree up to overall sign under one of the rotations. However, in this case we have found a “self-match,” meaning that two of the rotations have a difference which has all zero in the block under consideration, and so at most one of the two rotations need be stored, and the difference is sent to the following block, as with any collision, as in a traditional BKW algorithm.
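One possible canonical choice can be sketched as follows. This is only an illustration: it uses the lexicographically smallest block over all $2B$ signed rotations as the table key, which is one admissible ordering among many and not necessarily the convention used in the paper's implementation. The function names and parameters are hypothetical, and the prioritized basis is assumed to be bit-reversal followed by reversal.

```python
import random

def bit_reverse(i, bits):
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def prioritized_exponents(n):
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)][::-1]

def rotate_prioritized(vec, h, q, exps):
    """Coefficients (in the prioritized basis) of zeta^h * a, given those of a."""
    n = len(vec)
    pos = {e: p for p, e in enumerate(exps)}
    out = [0] * n
    for p, c in enumerate(vec):
        t = (exps[p] + h) % (2 * n)
        if t < n:
            out[pos[t]] = c % q
        else:
            out[pos[t - n]] = (-c) % q
    return out

def canonical_key(vec, block, B, q, exps):
    """Smallest image of the given block over the 2B signed rotations, with its rotation index."""
    n = len(vec)
    best = None
    for t in range(2 * B):             # zeta^(n/B) has order 2B; t >= B gives the negatives
        rot = rotate_prioritized(vec, t * (n // B), q, exps)
        candidate = tuple(rot[block * B:(block + 1) * B])
        if best is None or candidate < best[0]:
            best = (candidate, t)
    return best

n, q, B = 16, 17, 4                    # illustrative parameters only
exps = prioritized_exponents(n)
rng = random.Random(4)
vec = [rng.randrange(q) for _ in range(n)]
key, t_best = canonical_key(vec, 0, B, q, exps)

# Every rotation of the same sample is filed under the same key.
keys_agree = all(
    canonical_key(rotate_prioritized(vec, h * (n // B), q, exps), 0, B, q, exps)[0] == key
    for h in range(2 * B))
```

Because the key is a minimum over a full orbit of the rotation group, it does not depend on which rotation of the sample arrives at the table, so one stored row stands in for all $2B$ forms.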

The advanced keying BKW reduction is given in Algorithm 3, and for comparison purposes, the traditional BKW reduction using all rotations of each sample is given in Algorithm 2.

Correctness of Algorithm 3 is a consequence of Property 1. Furthermore, Algorithms 2 and 3 catch the same collisions in the following heuristic sense. For each collision $\zeta^{i}a_{1}-\zeta^{j}a_{2}$ , there will be another collision at $\zeta^{i+k}a_{1}-\zeta^{j+k}a_{2}$ for any $k\equiv 0~{}(\textup{mod}~{}n/B)$ . In Algorithm 2, all $B$ of these collisions are passed on to the next table after storing $B$ new rows in the current table. But any one of the samples sent on can generate the others via rotation, so only one of them is actually needed at the next table. In Algorithm 3, only one of them is stored and only one is sent onward (but only one is needed). However, there is some difference in the final output because we are only keeping one sample per row, and the order of input samples to a given table may differ, resulting in a different table entry. If one uses the all-differences variation, this difference disappears and the output of the two algorithms will be the same.

The following is immediate from Algorithm 3.

Proposition 9.1.

Each table in Algorithm 3 has at most $\frac{q^{B}-1}{2B}$ rows in total.

Finally, we will remark again that BKW reduction improvements for LWE, such as coded-BKW and sieving, may also be adapted to use the advanced keying demonstrated here, provided block sizes can be maintained to be powers of $2$ (varying them is acceptable). As some modern algorithms vary block size, this may be an impediment. The naïve way to do this would be to code samples first, then choose a canonical rotation of each codeword. Perhaps better, one could also code each rotation and choose the one with smallest error, which may introduce a significant improvement to the error inflation, depending on the choice of code. (Note that, for those familiar with coded-BKW, the notion of advanced keying is not so different from coding, as it provides a sort of ‘codeword’ for each sample, without an error inflation.)

Algorithms 2 and 3, as well as a completely ring-blind version of BKW reduction were coded in Python in Sage Mathematics Software for comparison purposes. Some example results are given in Table 1. In short, the advanced keying did reduce table sizes and samples needed as described, and had a faster overall runtime. A few remarks are in order:

(1) The experiments were chosen to represent a range of small parameter sets, where timings were in the range of seconds or minutes on a Lenovo X1 laptop.

(2) After parameters were chosen, the number of samples was chosen to be a round number where the final table began to have a few samples on average; the timing therefore roughly represents the time until the final table begins to populate.

(3) To compare meaningfully, the ring-blind algorithm uses $n$ times as many initial samples, which is equal to the total number of rotations of incoming samples for the other algorithms. The fact that the final table is populated but not full in all cases is evidence that the number of samples needed by Algorithms 3 and 2 is $1/n$ of those needed naïvely.

(4) For some smaller parameter sets, we also tested a version of the algorithm (labelled AD = ‘All Differences’) in which every sample encountered is stored (so each row of the table can contain multiple samples) and every difference is passed on (i.e. the new sample is compared to everything already in its row). The purpose of this is to demonstrate that the advanced keying will still find the same number of samples. However, the AD version is significantly slower in all cases, so it was only implemented for some of the smaller parameter sets in the table.

(5) Algorithms 2 and 3 are pseudocode; the implementation necessarily addressed details not covered in the pseudocode presentation. For example, some moderate attention was given to efficiency in the rotation of samples. For example, when only certain coefficients of the rotation were needed, only those were computed.

Some experimental observations:

(1) The table sizes observed in Algorithm 3 are very close to $1/B$ of the number observed in Algorithm 2, as expected.

(2) The faster runtime of Algorithm 3 is a result of the fact that fewer samples are handled ($1/B$ as many are fed to the first table compared to Algorithm 2), although they must be handled in more detail, so the speedup is less than a $1/B$ factor.

(3) Algorithms 2 and 3 use the exact same starting data, and it is reassuring that the reduced sample counts are similar, and the same in the AD version.

(4) Algorithm 2 tends to find more samples than Algorithm 3. The difference is in which matches are found when more than two samples collide in a row in the table, and therefore is more pronounced as the number of rows grows.

10. In practice

It is evident that the runtime of Ring-BKW is expected to be better than that of standard BKW (in any of its current forms), since the reduction and hypothesis testing phases may be taken to be the same, but the back-substitution phase is no longer required. Furthermore, the smaller Ring-LWE problems of hypothesis testing can be solved in parallel.

Albrecht et al. computed the runtime for BKW [2]. This work has been rendered out of date by many of the modern speedups mentioned in the introduction, but it is likely safe to say a few things that still hold true about modern BKW runtimes. First, the reduction phase is the dominant cost. Second, however, the back-substitution phase differs from the reduction phase by a polynomial factor, so eliminating it can be expected to give a polynomial factor speedup.

Advanced keying also offers a visible benefit when compared to a ring-blind implementation of BKW: table sizes are reduced to $1/B$ of their former size and the number of samples used is reduced to approximately $1/n$ as many. Each sample must be treated rather more carefully, however: it is rotated and a canonical choice made. Even so, experiments indicate increasing runtime gains with dimension, even against traditional BKW with every sample rotated before beginning. Nevertheless, advanced keying requires block sizes to be a power of $2$, and therefore may or may not be useful or extendable in view of the changing block sizes sometimes employed in BKW reduction.

The Ring-LWE Challenges [13] are in the form of Tweaked Ring-LWE, which refers to dual Ring-LWE transferred to the unital version (see [13, §2.3]), so that the parameter assumptions in this paper apply to the two-power cyclotomic challenges included therein. It would be very interesting to test these algorithms on those parameters, but it is beyond the scope of this paper.

Bibliography

[1] Albrecht, M., Bai, S., Ducas, L.: A subfield lattice attack on overstretched NTRU assumptions: cryptanalysis of some FHE and graded encoding schemes. In: Advances in cryptology—CRYPTO 2016. Part I, Lecture Notes in Comput. Sci., vol. 9814, pp. 153–178. Springer, Berlin (2016), https://doi.org/10.1007/978-3-662-53018-4_6
[2] Albrecht, M.R., Cid, C., Faugère, J.C., Fitzpatrick, R., Perret, L.: On the complexity of the BKW algorithm on LWE. Des. Codes Cryptogr. 74(2), 325–354 (2015), https://doi.org/10.1007/s10623-013-9864-x
[3] Alkim, E., Ducas, L., Pöppelmann, T., Schwabe, P.: Post-quantum key exchange—a new hope. In: 25th USENIX Security Symposium (USENIX Security 16). pp. 327–343. USENIX Association, Austin, TX (2016), https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/alkim
[4] Bernstein, D.J., Lange, T.: Never trust a bunny. In: Radio Frequency Identification. Security and Privacy Issues. RFIDSec 2012, Lecture Notes in Comput. Sci., vol. 7739, pp. 137–148. Springer, Berlin (2013), https://doi.org/10.1007/978-3-662-49890-3_6
[5] Blum, A., Kalai, A., Wasserman, H.: Noise-tolerant learning, the parity problem, and the statistical query model. J. ACM 50(4), 506–519 (2003), https://doi.org/10.1145/792538.792543
[6] Bos, J.W., Naehrig, M., van de Pol, J.: Sieving for shortest vectors in ideal lattices: a practical perspective. Int. J. Appl. Cryptogr. 3(4), 313–329 (2017), https://doi.org/10.1504/IJACT.2017.089353
[7] Brakerski, Z., Perlman, R.: Order-LWE and the hardness of Ring-LWE with entropic secrets. Cryptology ePrint Archive, Report 2018/494 (2018), https://eprint.iacr.org/2018/494
[8] Brakerski, Z., Vaikuntanathan, V.: Fully homomorphic encryption from ring-LWE and security for key dependent messages. In: Advances in cryptology—CRYPTO 2011, Lecture Notes in Comput. Sci., vol. 6841, pp. 505–524. Springer, Heidelberg (2011), https://doi.org/10.1007/978-3-642-22792-9_29
