Faster provable sieving algorithms for the Shortest Vector Problem and the Closest Vector Problem on lattices in $\ell_p$ norm

Priyanka Mukhopadhyay

arXiv:1907.04406·cs.DS·December 21, 2021


TL;DR

This paper introduces faster provable sieving algorithms for SVP and CVP on lattices in all $\ell_p$ norms, significantly improving time complexity over previous methods.

Contribution

It presents a new linear sieving procedure applicable to all $\ell_p$ norms and a mixed sieving method that enhances efficiency, especially in the $\ell_2$ norm.

Findings

01

Achieves time complexity of $2^{2.751n+o(n)}$ for SVP and CVP.

02

Improves $\ell_2$ norm sieving to $2^{2.25n+o(n)}$.

03

Provides approximation algorithms with time complexity $2^{2.001n+o(n)}$.


Tables

Table 1: Comparison of the performance of various sieving algorithms in different $\ell_{p}$ norms. In the last row, DGS stands for Discrete Gaussian Sampling based sieve.

$p$	Ref.	Type of sieve	Time complexity	Space complexity
$1 \leq p \leq \infty$	[47]	Quadratic	$2^{3.849 n + o (n)}$	$2^{2.023 n + o (n)}$
$1 \leq p \leq \infty$	This work	Linear	$2^{2.751 n + o (n)}$	$2^{2.751 n + o (n)}$
$p = \infty$	[49]	Linear	$2^{2.443 n + o (n)}$	$2^{2.443 n + o (n)}$
$p = 2$	[26]	Quadratic	$2^{2.571 n + o (n)}$	$2^{1.407 n + o (n)}$
$p = 2$	This work	Linear	$2^{2.49 n + o (n)}$	$2^{2.49 n + o (n)}$
$p = 2$	[27, 26]	Quadratic	$2^{2.465 n + o (n)}$	$2^{1.233 n + o (n)}$
$p = 2$	This work	Mixed	$2^{2.25 n + o (n)}$	$2^{2.25 n + o (n)}$
$p = 2$	[20]	DGS	$2^{n + o (n)}$	$2^{n + o (n)}$

Equations

$\mathcal{L}=\mathcal{L}(\mathbf{b}_{1},\dots,\mathbf{b}_{n}):=\Big\{\sum_{i=1}^{n}z_{i}\mathbf{b}_{i}:z_{i}\in\mathbb{Z}\Big\}.$

$\|\mathbf{x}\|_{p}$ and $\|\mathbf{x}\|_{\infty}$

$B_{n}^{(p)}(\mathbf{x},r)=\{\mathbf{y}\in\mathbb{R}^{n}:\|\mathbf{y}-\mathbf{x}\|_{p}\leq r\}$

$\text{bd}(B_{n}^{(p)}(\mathbf{x},r))=\{\mathbf{y}\in\mathbb{R}^{n}:\|\mathbf{y}-\mathbf{x}\|_{p}=r\}.$

$\mathcal{L}=\mathcal{L}(\mathbf{B})=\Big\{\sum_{i=1}^{n}x_{i}\mathbf{b}_{i}:x_{i}\in\mathbb{Z}\quad\text{ for }\quad 1\leq i\leq n\Big\}$

$\mathscr{P}(\mathbf{B})=\{\mathbf{B}\mathbf{x}:\mathbf{x}\in[0,1)^{n}\}$

$\lambda_{i}^{(p)}(\mathcal{L})=\inf\{r:\dim(\text{span}(\mathcal{L}\cap B_{n}^{(p)}(r)))\geq i\}$

$\lambda_{1}^{(p)}(\mathcal{L})=\min\{\|\mathbf{v}\|_{p}:\mathbf{v}\in\mathcal{L}\setminus\{\mathbf{0}\}\}$

$N_{h}\cdot\text{vol}(L)\leq\text{vol}(K+2L)\leq\Big(1+\frac{2rn^{1/p}}{R}\Big)^{n}\text{vol}(K)$

$\dots,[-5r,-3r),[-3r,-r),[-r,r),[r,3r),[3r,5r),\dots$

$R_{k}$

$\|\mathbf{y}-\mathbf{e}\|_{p}$

$\sigma(\mathbf{e})=\begin{cases}\mathbf{e}+\mathbf{u}&\text{if }\mathbf{e}\in D_{1}\\ \mathbf{e}-\mathbf{u}&\text{if }\mathbf{e}\in D_{2}\\ \mathbf{e}&\text{else}\end{cases}$

$q$

$c_{space}(BN)$ and $c_{time}(BN)$

$c_{space}(BN^{\prime})$ and $c_{time}(BN^{\prime})$

$\mathcal{L}^{\prime}=\mathscr{L}\Big(\{(\mathbf{v},0):\mathbf{v}\in\mathcal{L}\}\cup\{(\mathbf{t},\tau d/2)\}\Big).$



Full text

Faster Provable Sieving Algorithms for the Shortest Vector Problem and the Closest Vector Problem on Lattices in $\ell_{p}$ Norm

Priyanka Mukhopadhyay

Institute for Quantum Computing, University of Waterloo, Waterloo ON, Canada

Dept. of Combinatorics and Optimization, University of Waterloo, Waterloo ON, Canada

Abstract

In this work, we give provable sieving algorithms for the Shortest Vector Problem (SVP) and the Closest Vector Problem (CVP) on lattices in the $\ell_{p}$ norm ($1\leq p\leq\infty$). The running time we obtain is better than that of existing provable sieving algorithms. We give a new linear sieving procedure that works for all $\ell_{p}$ norms ($1\leq p\leq\infty$). The main idea is to divide the space into hypercubes such that each vector can be mapped efficiently to a sub-region. We achieve a time complexity of $2^{2.751n+o(n)}$, which is much less than the $2^{3.849n+o(n)}$ complexity of the previous best algorithm. We also introduce a mixed sieving procedure, where a point is mapped to a hypercube within a ball and then a quadratic sieve is performed within each hypercube. This improves the running time, especially in the $\ell_{2}$ norm, where we achieve a time complexity of $2^{2.25n+o(n)}$, while the List Sieve Birthday algorithm has a running time of $2^{2.465n+o(n)}$. We adapt our sieving techniques to approximation algorithms for SVP and CVP in the $\ell_{p}$ norm ($1\leq p\leq\infty$) and show that our algorithm has a running time of $2^{2.001n+o(n)}$, while previous algorithms have a time complexity of $2^{3.169n+o(n)}$.

1 Introduction

A lattice $\mathcal{L}$ is the set of all integer combinations of linearly independent vectors $\mathbf{b}_{1},\dots,\mathbf{b}_{n}\in\mathbb{R}^{d}$ ,

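$\mathcal{L}=\mathcal{L}(\mathbf{b}_{1},\dots,\mathbf{b}_{n}):=\Big\{\sum_{i=1}^{n}z_{i}\mathbf{b}_{i}:z_{i}\in\mathbb{Z}\Big\}.$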

We call $n$ the rank of the lattice and $d$ the dimension of the lattice. The matrix $\mathbf{B}=(\mathbf{b}_{1},\dots,\mathbf{b}_{n})$ is called a basis of $\mathcal{L}$ . A lattice is said to be full-rank if $n=d$ . In this work, we only consider full-rank lattices unless otherwise stated.

The two most important computational problems on lattices are the Shortest Vector Problem (SVP) and the Closest Vector Problem (CVP). Given a basis for a lattice $\mathcal{L}\subseteq\mathbb{R}^{d}$ , the goal of SVP is to compute the shortest non-zero vector in $\mathcal{L}$ , while the goal of CVP is to compute a lattice vector at a minimum distance to a given target vector $\mathbf{t}$ . Typically, the length/distance is defined in terms of the $\ell_{p}$ norm, which is given by

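$\|\mathbf{x}\|_{p}:=\Big(\sum_{i=1}^{d}|x_{i}|^{p}\Big)^{1/p}$ for $1\leq p<\infty$, and $\|\mathbf{x}\|_{\infty}:=\max_{1\leq i\leq d}|x_{i}|$.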

These lattice problems have been mostly studied in the Euclidean norm ( $p=2$ ). Starting with the seminal work of [1], algorithms for solving these problems either exactly or approximately have been studied intensely. These algorithms have found applications in various fields, such as factoring polynomials over rationals [1], integer programming [2, 3, 4, 5], cryptanalysis [6, 7, 8], checking the solvability by radicals [9], and solving low-density subset-sum problems [10]. More recently, many powerful cryptographic primitives have been constructed whose security is based on the worst-case hardness of these or related lattice problems [11, 12, 13, 14, 15, 16, 17, 18, 19].

1.1 Prior Work

The lattice algorithms that have been developed to solve SVP and CVP are either based on sieving techniques [21, 20], enumeration methods [22, 3], basis reduction [1, 23], or Voronoi cell-based deterministic computation [24, 4, 25]. The fastest of these run in a time of $2^{cn}$ , where $n$ is the rank of the lattice and $c$ is some constant. Since the aim of this paper is to improve time complexity of sieving algorithms, we mainly focus on these. For an overview of the other types of algorithms, interested readers can refer to the survey by Hanrot et al. [26].

1.1.1 Sieving Algorithms in the Euclidean Norm

The first algorithm to solve SVP in time exponential in the dimension of the lattice was given by Ajtai, Kumar, and Sivakumar [21], who devised a method based on “randomized sieving”, whereby exponentially many randomly generated lattice vectors are iteratively combined to create increasingly short vectors, eventually resulting in the shortest vector in the lattice. The time complexity of this algorithm was shown to be $2^{3.4n+o(n)}$ by Micciancio and Voulgaris [27]. This was later improved by Pujol and Stehlé [28], who analyzed it with the birthday paradox and gave a time complexity of $2^{2.571n+o(n)}$. In [27] the authors introduced List Sieve, which was modified in [28] (List Sieve Birthday) to give a time complexity of $2^{2.465n+o(n)}$. The current fastest provable algorithm for exact SVP runs in time $2^{n+o(n)}$ [20, 29], and the fastest algorithm that gives a large constant approximation runs in time $2^{0.802n+o(n)}$ [30].

To make lattice sieving algorithms more practical for implementation, heuristic variants were introduced in [31, 27]. Efforts have been made to decrease the asymptotic time complexity at the cost of using more space [32, 33, 34, 35] and to study the trade-offs in reducing the space complexity [35, 36, 37, 38]. Attempts have been made to make these algorithms competitive in high-performance computing environments [39, 40, 41, 42, 43]. The theoretically fastest heuristic algorithm that is conjectured to solve SVP runs in a time of $2^{0.29n+o(n)}$ [33] (LDSieve).

The CVP is considered to be a harder problem than SVP since there is a simple dimension and approximation-factor preserving reduction from SVP to CVP [44]. Based on a technique due to Kannan [3], Ajtai, Kumar, and Sivakumar [45] gave a provable sieving based algorithm that gives a $1+\alpha$ approximation of CVP in time $(2+1/\alpha)^{O(n)}$ . Later, exact exponential time algorithms for CVP were discovered [24, 46]. The current fastest algorithm for CVP runs in a time of $2^{n+o(n)}$ and is due to [46].

1.1.2 Algorithms in Other $\ell_{p}$ Norms

Blömer and Naewe [47] and then Arvind and Joglekar [48] generalized the AKS algorithm [21] to give exact provable algorithms for SVP that run in time $2^{O(n)}$. Additionally, [47] gave a $(1+\varepsilon)$-approximation algorithm for CVP for all $\ell_{p}$ norms that runs in time $(2+1/\varepsilon)^{O(n)}$. For the special case $p=\infty$, Eisenbrand et al. [5] gave a $2^{O(n)}\cdot(\log(1/\varepsilon))^{n}$ algorithm for $(1+\varepsilon)$-approximate CVP. Aggarwal and Mukhopadhyay [49] gave an algorithm for SVP and approximate CVP in the $\ell_{\infty}$ norm using a linear sieving technique that significantly improves the overall running time. In fact, for a large constant approximation factor, they achieved a running time of $3^{n}$ for SVP. The authors have argued that it is not possible for any of the above-mentioned algorithms to achieve this running time in the $\ell_{\infty}$ norm.

1.1.3 Hardness Results

The first NP-hardness result for CVP in all $\ell_{p}$ norms and SVP in the $\ell_{\infty}$ norm was given by Van Emde Boas [50]. Ajtai [51] proved that SVP is NP-hard under randomized reductions. Micciancio [52] showed that SVP is NP-hard to approximate within some constant approximation factor. Subsequently, it was shown that approximating CVP in any $\ell_{p}$ norm and SVP in the $\ell_{\infty}$ norm up to a factor of $n^{c/\log\log n}$ is NP-hard [53, 54]. This hardness-of-approximation factor has been improved to $n^{c}$ in [55], assuming the Projection Games Conjecture [56]. Furthermore, hardness of SVP up to a factor of $2^{\log^{1-\epsilon}n}$ has been obtained assuming $\text{NP}\nsubseteq\text{RTIME}(n^{\text{poly}(\log n)})$ [57, 58]. Recently, [59] showed that for almost all $p\geq 1$, CVP in the $\ell_{p}$ norm cannot be solved in $2^{n(1-\varepsilon)}$ time under the strong exponential time hypothesis. A similar hardness result has also been obtained for SVP in the $\ell_{p}$ norm [60].

1.2 Our Results and Techniques

In this paper, we adopt the framework of [21, 45] and give sieving algorithms for SVP and CVP in $\ell_{p}$ norm for $1\leq p\leq\infty$ . The primary difference between our sieving algorithm and the previous AKS-style algorithms such as those in [21, 45, 47, 48] is in the sieving procedure—ours is a linear sieve, while theirs is a quadratic sieve. This results in an improvement in the overall running time.

Before describing our idea, we give an informal description of the sieving procedure of [21, 45, 47, 48]. The algorithm starts by randomly generating a set $S$ of $N=2^{O(n)}$ lattice vectors with a length of at most $R=2^{O(n)}$ . It then runs a sieving procedure a polynomial number of times. In the $i^{th}$ iteration, the algorithm starts with a list $S$ of lattice vectors of a length of at most $R_{i-1}\approx\gamma^{i-1}R$ , for some parameter $\gamma\in(0,1)$ . The algorithm maintains and updates a list of “centers” $C$ , which is initialized to be the empty set. Then, for each lattice vector $\mathbf{y}$ in the list, the algorithm checks whether there is a center $\mathbf{c}$ at a distance of at most $\gamma\cdot R_{i-1}$ from this vector. If there exists such a center, then the vector $\mathbf{y}$ is replaced in the list by $\mathbf{y}-\mathbf{c}$ , and otherwise it is deleted from $S$ and added to $C$ . This results in $N_{i-1}-|C|$ lattice vectors which have a length of at most $R_{i}\approx\gamma R_{i-1}$ , where $N_{i-1}$ is the number of lattice vectors at the end of $i-1$ sieving iterations. We would like to mention here that this description hides many details and in particular, in order to show that this algorithm succeeds eventually in obtaining the shortest vector, we need to add a small perturbation to the lattice vectors to start with. The details of this can be found in Section 3.
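The informal description above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: vectors are plain tuples of floats, the perturbation step mentioned above is ignored, and the names `lp_norm` and `quadratic_sieve_step` are hypothetical helpers that do not come from [21, 45, 47, 48].

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def lp_norm(v: Sequence[float], p: float) -> float:
    """l_p norm for 1 <= p <= infinity (use float('inf') for the max norm)."""
    if p == float("inf"):
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def quadratic_sieve_step(
    vectors: List[Vector], radius: float, gamma: float, p: float
) -> Tuple[List[Vector], List[Vector]]:
    """One AKS-style pass: returns (shortened vectors, centers)."""
    centers: List[Vector] = []
    survivors: List[Vector] = []
    for y in vectors:
        # Compare y with every center found so far (the quadratic part).
        for c in centers:
            diff = tuple(a - b for a, b in zip(y, c))
            if lp_norm(diff, p) <= gamma * radius:
                survivors.append(diff)  # y is replaced by y - c
                break
        else:
            centers.append(y)  # no close center: y becomes a new center
    return survivors, centers
```

Every vector is compared with every center found so far, so one pass costs at most the number of vectors times the number of centers.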

A crucial step in this algorithm is to find a vector $\mathbf{c}$ from the list of centers that is close to $\mathbf{y}$ . This problem is called the nearest neighbor search (NNS) problem and has been well studied, especially in the context of heuristic algorithms for SVP (see [33] and the references therein). A trivial bound on the running time for this is $|S|\cdot|C|$ , but much effort has been dedicated to improving this bound under heuristic assumptions (see Section 1.1.1 for some references). Since they require heuristic assumptions, such improved algorithms for the NNS have not been used to improve the provable algorithms for SVP.

One can also view such sieving procedures as a division of the “ambient” geometric space (consisting of all the vectors in the current list). In the $i^{th}$ iteration, the space of all vectors with a length of at most $R_{i-1}$ is divided into a number of sub-regions such that in each sub-region the vectors are within a distance of at most $\gamma R_{i-1}$ from a center. In the previous provable sieving algorithms such as those in [21, 47, 48, 27] or even the heuristic ones, these sub-regions have been an $\ell_{p}$ ball of certain radius (if the algorithm is in $\ell_{p}$ norm) or some sections of it (spherical cap, etc). Given a vector, one has to compare it with all the centers (and hence sub-regions formed so far) to determine in which of these sub-regions it belongs. If none is found, we make it a center and associate a new sub-region with it. Note that such a division of space depends on the order in which the vectors are processed.

The basic idea behind our sieving procedure (let us call it Linear Sieve) is similar to that used in [49, 61] in the special case of the $\ell_{\infty}$ norm. In fact, our procedure is a generalization of this method to all $\ell_{p}$ norms ($1\leq p\leq\infty$). We select these sub-regions as hypercubes and divide the ambient geometric space a priori (before we start processing the vectors in the current list), considering only the maximum length of a vector in the list. A diagrammatic representation of such a division of space in two dimensions is given in Figure 1. It must be noted that in this figure (for ease of illustration), the radius of the small hypercube (square) is the same for the $\ell_{1}$, $\ell_{2}$, and $\ell_{\infty}$ balls (circles). However, in our algorithm, this radius depends on the norm. The advantage we obtain is that we can map a vector to a sub-region efficiently, in $O(n)$ time; i.e., in a sense we obtain a better “decodability” property. If the vector’s hypercube (sub-region) does not contain a center, we select this point as the center; otherwise, we subtract this vector from the center to obtain a shorter lattice vector. Thus, the time complexity of each sieving procedure is linear in the number of sampled vectors. Overall, we obtain an improved time complexity at the cost of increased space complexity compared to previous algorithms [48, 47, 26]. A more detailed explanation can be found in Section 3.1.
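A minimal sketch of one such linear pass is given below. Several details are assumptions made for illustration and are not taken from the paper: the cube side is chosen as $\gamma R/n^{1/p}$ (so that any two points in the same cube are within $\ell_{p}$ distance $\gamma R$), cells are indexed with a floor-based grid, perturbations are ignored, and the names `cube_side` and `linear_sieve_step` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def cube_side(radius: float, gamma: float, n: int, p: float) -> float:
    """Assumed side length: the l_p diameter of a cube of this side is gamma * radius."""
    if p == math.inf:
        return gamma * radius
    return gamma * radius / n ** (1.0 / p)


def linear_sieve_step(
    vectors: List[Vector], radius: float, gamma: float, p: float
) -> Tuple[List[Vector], Dict[Tuple[int, ...], Vector]]:
    """One pass that maps each vector to its hypercube with O(n) work per vector."""
    if not vectors:
        return [], {}
    side = cube_side(radius, gamma, len(vectors[0]), p)
    centers: Dict[Tuple[int, ...], Vector] = {}
    survivors: List[Vector] = []
    for y in vectors:
        # The cell index is computed coordinate by coordinate; no search over centers.
        key = tuple(math.floor(x / side) for x in y)
        c = centers.get(key)
        if c is None:
            centers[key] = y  # first vector seen in this cube becomes its center
        else:
            survivors.append(tuple(a - b for a, b in zip(y, c)))
    return survivors, centers
```

The dictionary lookup replaces the scan over all centers, which is why a pass is linear in the number of vectors; the price is that the table may hold one center per occupied cube.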

Specifically, we obtain the following result.

Theorem 3.2 in Section 3.3.

Let $\gamma\in(0,1)$, and let $\xi>1/2$. Given a full-rank lattice $\mathcal{L}\subset\mathbb{Q}^{n}$, there is a randomized algorithm for $\textsf{SVP}^{(p)}$ with a success probability of at least $1/2$, space complexity of at most $2^{c_{space}n+o(n)}$, and running time of at most $2^{c_{time}n+o(n)}$, where $c_{space}=c_{s}+\max(c_{c},c_{b}/2)$ and $c_{time}=\max(c_{space},c_{b})$, with $c_{c}=\log\left(2+\frac{2}{\gamma}\right)$, $c_{s}=-\log\Big(0.5-\frac{1}{4\xi}\Big)$, and $c_{b}=\log\left(1+\frac{2\xi(2-\gamma)}{1-\gamma}\right)$.
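To see how the exponents in the statement depend on the parameters, the formulas can be evaluated directly. The sketch below only plugs a given pair $(\gamma,\xi)$ into the expressions as stated; it does not reproduce any parameter optimization, and the function name `linear_sieve_exponents` is a hypothetical helper. Logarithms are taken to base 2.

```python
import math
from typing import Dict


def linear_sieve_exponents(gamma: float, xi: float) -> Dict[str, float]:
    """Evaluate c_c, c_s, c_b, c_space and c_time for given gamma and xi."""
    if not (0.0 < gamma < 1.0):
        raise ValueError("gamma must lie in (0, 1)")
    if xi <= 0.5:
        raise ValueError("xi must be greater than 1/2")
    c_c = math.log2(2.0 + 2.0 / gamma)
    c_s = -math.log2(0.5 - 1.0 / (4.0 * xi))
    c_b = math.log2(1.0 + 2.0 * xi * (2.0 - gamma) / (1.0 - gamma))
    c_space = c_s + max(c_c, c_b / 2.0)
    c_time = max(c_space, c_b)
    return {"c_c": c_c, "c_s": c_s, "c_b": c_b,
            "c_space": c_space, "c_time": c_time}


if __name__ == "__main__":
    # Sample parameters chosen arbitrarily for illustration.
    for name, value in linear_sieve_exponents(gamma=0.5, xi=1.0).items():
        print(f"{name} = {value:.3f}")
```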

A mixed sieving algorithm

In an attempt to gain as many advantages as possible, we introduce a mixed sieving procedure (let us call it Mixed Sieve). Here, we divide a hyperball into larger hypercubes so that we can map each point efficiently to a hypercube. Within a hypercube, we perform a quadratic sieving procedure such as AKS with the vectors in that region. This improves both time and space complexity, especially in the Euclidean norm.
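The two-level structure described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure analyzed in Section 4: the side of the larger hypercubes is left as a caller-supplied parameter `cube_side`, perturbations are ignored, and `mixed_sieve_step` is a hypothetical name.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def lp_norm(v: Sequence[float], p: float) -> float:
    """l_p norm for 1 <= p <= infinity."""
    if p == math.inf:
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def mixed_sieve_step(
    vectors: List[Vector], radius: float, gamma: float, p: float, cube_side: float
) -> Tuple[List[Vector], List[Vector]]:
    """Bucket vectors into large hypercubes, then sieve quadratically per bucket."""
    buckets: Dict[Tuple[int, ...], List[Vector]] = defaultdict(list)
    for y in vectors:
        # Linear step: O(n) work to find the hypercube of y.
        buckets[tuple(math.floor(x / cube_side) for x in y)].append(y)

    survivors: List[Vector] = []
    centers: List[Vector] = []
    for bucket in buckets.values():
        local_centers: List[Vector] = []
        for y in bucket:
            # Quadratic step, restricted to the centers of this hypercube.
            for c in local_centers:
                diff = tuple(a - b for a, b in zip(y, c))
                if lp_norm(diff, p) <= gamma * radius:
                    survivors.append(diff)
                    break
            else:
                local_centers.append(y)
        centers.extend(local_centers)
    return survivors, centers
```

Each vector is compared only with centers inside its own hypercube, so the quadratic cost is paid per bucket rather than over the whole list.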

Approximation algorithms for $\textsf{SVP}^{(p)}$ and $\textsf{CVP}^{(p)}$

We have adapted our sieving techniques to approximation algorithms for $\textsf{SVP}^{(p)}$ and $\textsf{CVP}^{(p)}$. The idea is quite similar to that described in [49, 61] (where it was shown to work only for the $\ell_{\infty}$ norm). In Section 5.1, we show that our approximation algorithms are faster than those of [48, 47], but again they require more space.

Remark 1.1.

It is quite straightforward to extend our algorithm to the Subspace Avoiding Problem (SAP), or Generalized Shortest Vector Problem (GSVP) [47, 48]: replace the quadratic sieve by any one of the faster sieves described in this paper. We thus obtain a similar improvement in running time. By Theorem 3.4 in [47], there are polynomial-time reductions from other lattice problems, such as the Successive Minima Problem (SMP) and the Shortest Independent Vector Problem (SIVP) with approximation factor $1+\epsilon$, to GSVP with approximation factor $1+\epsilon$. Here, given a lattice $\mathcal{L}$ of rank $n$, SMP asks for $n$ linearly independent vectors $\mathbf{v}_{1},\ldots,\mathbf{v}_{n}\in\mathcal{L}$ such that $\|\mathbf{v}_{i}\|_{p}\leq c\lambda_{i}^{(p)}(\mathcal{L})$, while SIVP asks for $n$ linearly independent vectors $\mathbf{v}_{1},\ldots,\mathbf{v}_{n}\in\mathcal{L}$ such that $\|\mathbf{v}_{i}\|_{p}\leq c\lambda_{n}^{(p)}(\mathcal{L})$, where $c$ is the approximation factor. The definition of $\lambda^{(p)}_{i}$ (and hence $\lambda^{(p)}_{n}$) is given in Section 2 (Definition 2.5). Thus, we can obtain a similar improvement in running time for both these problems. Since in this paper we focus mainly on SVP and CVP, we do not delve into further details for these other problems.

Remark 1.2.

Our algorithm (and, for that matter, any sieving algorithm) is quite different from deterministic algorithms such as those in [4, 62]. They reduce the problem in any norm to the $\ell_{2}$ norm and compute an approximation of the shortest vector length (or the distance of the closest lattice point to a target in the case of CVP) using the Voronoi cell-based deterministic algorithm in [27]. Then, they enumerate all lattice points within a convex region to find the shortest one. By constructing ellipsoidal coverings, it has been shown that the lattice points within a convex body can be computed in time proportional to the maximum number of lattice points that the body can contain in any translation of an ellipsoid. Note that for the $\ell_{p}$ norm, any smaller $\ell_{q}$ ball (where $p=q$ or $p\neq q$) can serve this purpose, and the bound on the number of translates comes from standard packing arguments. For these deterministic algorithms, the goal is to choose a shape so that the upper bound (packing bound) on the number of translates can be reduced. Thus, the authors chose small $\ell_{p}$ balls to cover a larger $\ell_{p}$ ball.

In contrast, in our sieving algorithm, we aim to map each lattice point efficiently to a sub-region. Thus, we divide an arbitrary $\ell_{p}$ ball into smaller hypercubes. The result is an increase in space complexity, but due to the efficient mapping, we reduce the running time. To the best of our knowledge, this kind of subdivision has not been used before in any sieving algorithm. The focus of our paper is to develop randomized sieving algorithms. Thus, we will not delve further into the details of the above-mentioned deterministic algorithms. Clearly, these are different procedures.

1.3 Organization of the Paper

In Section 2, we give some preliminary definitions and results that are useful for this paper. In Section 3, we introduce the linear sieving technique, while in Section 4, we describe the mixed sieving technique. In Section 5, we discuss how to extend our sieving methods to approximation algorithms.

2 Preliminaries

2.1 Notations

We write $\log_{q}$ to represent the logarithm to the base $q$ , and simply $\log$ when the base is $q=2$ . We denote the natural logarithm by $\ln$ .

We use bold lowercase letters (e.g., $\mathbf{v}^{n}$) for vectors and bold uppercase letters for matrices (e.g., $\mathbf{M}^{m\times n}$). We may drop the dimension in the superscript whenever it is clear from the context. Sometimes, we represent a matrix as a vector of column vectors (e.g., $\mathbf{M}^{m\times n}=[\mathbf{m}_{1}\mathbf{m}_{2}\ldots\mathbf{m}_{n}]$, where each $\mathbf{m}_{i}$ is a vector of length $m$). The $i^{th}$ co-ordinate of $\mathbf{v}$ is denoted by $v_{i}$.

Given a vector $\mathbf{x}=\sum_{i=1}^{n}x_{i}\mathbf{m}_{i}$ with $x_{i}\in\mathbb{Q}$ , the representation size of $\mathbf{x}$ with respect to $\mathbf{M}$ is the maximum of $n$ and the binary lengths of the numerators and denominators of the coefficients $x_{i}$ .

We denote the volume of a geometric body $A$ by $\text{vol}(A)$ .

2.2 $\ell_{p}$ Norm and Ball

Definition 2.1.

The $\ell_{p}$ norm of a vector $\mathbf{v}\in\mathbb{R}^{n}$ is defined by

$\|\mathbf{v}\|_{p}=\Big(\sum_{i=1}^{n}|v_{i}|^{p}\Big)^{1/p}$ for $1\leq p<\infty$ and $\|\mathbf{v}\|_{\infty}=\max\{|v_{i}|:i=1,\ldots,n\}$ for $p=\infty$.

Fact 2.1.

For $\mathbf{x}\in\mathbb{R}^{n}$, we have $\|\mathbf{x}\|_{p}\leq\|\mathbf{x}\|_{2}\leq\sqrt{n}\|\mathbf{x}\|_{p}$ for $p\geq 2$, and

$\frac{1}{\sqrt{n}}\|\mathbf{x}\|_{p}\leq\|\mathbf{x}\|_{2}\leq\|\mathbf{x}\|_{p}$ for $1\leq p<2$.
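As a quick numerical illustration of Definition 2.1 and Fact 2.1, the following sketch computes $\ell_{p}$ norms and checks the stated inequalities for a given vector. The function names `lp_norm` and `satisfies_fact_2_1` are hypothetical, and the small tolerance is an assumption added only to absorb floating-point rounding.

```python
import math
from typing import Sequence


def lp_norm(v: Sequence[float], p: float) -> float:
    """l_p norm for 1 <= p <= infinity (use math.inf for the max norm)."""
    if p == math.inf:
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def satisfies_fact_2_1(x: Sequence[float], p: float, tol: float = 1e-12) -> bool:
    """Check the comparison between the l_p and l_2 norms of x."""
    n = len(x)
    norm_p, norm_2 = lp_norm(x, p), lp_norm(x, 2)
    slack = tol * max(1.0, norm_p, norm_2)  # rounding tolerance only
    if p >= 2:
        return (norm_p <= norm_2 + slack
                and norm_2 <= math.sqrt(n) * norm_p + slack)
    return (norm_p / math.sqrt(n) <= norm_2 + slack
            and norm_2 <= norm_p + slack)
```

For example, for $\mathbf{x}=(3,-4)$ the $\ell_{1}$, $\ell_{2}$, and $\ell_{\infty}$ norms are $7$, $5$, and $4$, respectively.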

Definition 2.2.

A ball is the set of all points within a fixed distance or radius (defined by a metric) from a fixed point or center. More precisely, we define the (closed) ball centered at $\mathbf{x}\in\operatorname{\mathbb{R}}^{n}$ with radius $r$ as

$B_{n}^{(p)}(\mathbf{x},r)=\{\mathbf{y}\in\mathbb{R}^{n}:\|\mathbf{y}-\mathbf{x}\|_{p}\leq r\}.$

The boundary of $B^{(p)}_{n}(\mathbf{x},r)$ is the set

$\text{bd}(B_{n}^{(p)}(\mathbf{x},r))=\{\mathbf{y}\in\mathbb{R}^{n}:\|\mathbf{y}-\mathbf{x}\|_{p}=r\}.$

We may drop the first argument when the ball is centered at the origin $\mathbf{0}$ and drop both arguments for a unit ball centered at the origin. Let

$B^{(p)}_{n}(\mathbf{x},r_{1},r_{2})=B^{(p)}_{n}(\mathbf{x},r_{2})\setminus B^{(p)}_{n}(\mathbf{x},r_{1})=\{\mathbf{y}\in\operatorname{\mathbb{R}}^{n}:r_{1}<\|\mathbf{y}-\mathbf{x}\|_{p}\leq r_{2}\}$ . We drop the first argument if the spherical shell or corona is centered at the origin.

Fact 2.2.

$|B^{(p)}_{n}(\mathbf{x},c\cdot r)|=c^{n}\cdot|B^{(p)}_{n}(\mathbf{x},r)|$ for all $c>0$ .

Fact 2.3.

$\text{vol}(B^{(p)}_{n}(R))=\frac{\Big{(}2\Gamma\left(\frac{1}{p}+1\right)R\Big{)}^{n}}{\Gamma\left(\frac{n}{p}+1\right)}$ . Specifically $\text{vol}(B^{(\infty)}_{n}(R))=(2R)^{n}$ .
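The volume formula in Fact 2.3 is easy to evaluate numerically. The sketch below uses Python's standard `math.gamma`; the function name `lp_ball_volume` is a hypothetical helper, and for large $n$ the Gamma function overflows a float, so this direct form is only meant for small dimensions.

```python
import math


def lp_ball_volume(n: int, p: float, radius: float = 1.0) -> float:
    """Volume of the n-dimensional l_p ball of the given radius (Fact 2.3)."""
    if p == math.inf:
        return (2.0 * radius) ** n  # the l_infinity ball is a cube of side 2R
    return (2.0 * math.gamma(1.0 / p + 1.0) * radius) ** n / math.gamma(n / p + 1.0)
```

In two dimensions with unit radius this gives $\pi$ for $p=2$ (a disk) and $2$ for $p=1$ (a square with diagonal $2$), and scaling the radius by $c$ scales the volume by $c^{n}$, as in Fact 2.2.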

The algorithm of Dyer, Frieze, and Kannan [63] almost uniformly selects a point in any convex body in polynomial time if a membership oracle is given [64]. For the sake of simplicity, we ignore the implementation detail and assume that we are able to uniformly select a point in $B^{(p)}_{n}(\mathbf{x},r)$ in polynomial time.

2.3 Lattice

Definition 2.3.

A lattice $\mathcal{L}$ is a discrete additive subgroup of $\operatorname{\mathbb{R}}^{d}$ . Each lattice has a basis $\mathbf{B}=[\mathbf{b}_{1},\mathbf{b}_{2},\ldots\mathbf{b}_{n}]$ , where $\mathbf{b}_{i}\in\operatorname{\mathbb{R}}^{d}$ and

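$\mathcal{L}=\mathcal{L}(\mathbf{B})=\Big\{\sum_{i=1}^{n}x_{i}\mathbf{b}_{i}:x_{i}\in\mathbb{Z}\quad\text{ for }\quad 1\leq i\leq n\Big\}.$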

For algorithmic purposes, we can assume that $\mathcal{L}\subseteq\mathbb{Q}^{d}$ . We call $n$ the rank of $\mathcal{L}$ and $d$ the dimension. If $d=n$ , the lattice is said to be full-rank. Though our results can be generalized to arbitrary lattices, in the rest of the paper, we only consider full-rank lattices.

Definition 2.4.

For any lattice basis $\mathbf{B}$ , we define the fundamental parallelepiped as

$\mathscr{P}(\mathbf{B})=\{\mathbf{B}\mathbf{x}:\mathbf{x}\in[0,1)^{n}\}.$

If $\mathbf{y}\in\mathscr{P}(\mathbf{B})$ , then $\|\mathbf{y}\|_{p}\leq n\|\mathbf{B}\|_{p}$ , as can be easily seen by triangle inequality. For any $\mathbf{z}\in\operatorname{\mathbb{R}}^{n}$ , there exists a unique $\mathbf{y}\in\mathscr{P}(\mathbf{B})$ such that $\mathbf{z}-\mathbf{y}\in\mathcal{L}(\mathbf{B})$ . This vector is denoted by $\mathbf{y}\equiv\mathbf{z}\mod\mathbf{B}$ and it can be computed in polynomial time given $\mathbf{B}$ and $\mathbf{z}$ .
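One way to compute $\mathbf{z}\bmod\mathbf{B}$ for a full-rank basis is to solve $\mathbf{B}\mathbf{c}=\mathbf{z}$ and keep the fractional parts of the coefficients. The sketch below does this with exact rational arithmetic; the basis is passed as a list of column vectors, and the names `solve_in_basis` and `reduce_mod_basis` are hypothetical. It assumes the basis is square and nonsingular.

```python
from fractions import Fraction
from math import floor
from typing import List, Sequence


def solve_in_basis(columns: Sequence[Sequence], z: Sequence) -> List[Fraction]:
    """Solve B c = z by Gauss-Jordan elimination over the rationals."""
    n = len(columns)
    # Augmented matrix [B | z]; row i holds the i-th coordinates of b_1..b_n and z.
    a = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(z[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        inv = 1 / a[col][col]
        a[col] = [x * inv for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] for i in range(n)]


def reduce_mod_basis(columns: Sequence[Sequence], z: Sequence) -> List[Fraction]:
    """Return the unique y in P(B) with z - y in the lattice generated by B."""
    n = len(columns)
    coeffs = solve_in_basis(columns, z)
    frac = [c - floor(c) for c in coeffs]  # fractional parts lie in [0, 1)
    return [sum(frac[j] * columns[j][i] for j in range(n)) for i in range(n)]
```

For instance, with $\mathbf{b}_{1}=(2,0)$, $\mathbf{b}_{2}=(1,3)$, and $\mathbf{z}=(7,-4)$, the coefficients are $(25/6,-4/3)$, their fractional parts are $(1/6,2/3)$, and the reduced vector is $(1,2)$; indeed $\mathbf{z}-(1,2)=(6,-6)=4\mathbf{b}_{1}-2\mathbf{b}_{2}$.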

Definition 2.5.

For $i\in[n]$, the $i^{th}$ successive minimum is defined as the smallest real number $r$ such that $\mathcal{L}$ contains $i$ linearly independent vectors with a length of at most $r$:

$\lambda_{i}^{(p)}(\mathcal{L})=\inf\{r:\dim(\text{span}(\mathcal{L}\cap B_{n}^{(p)}(r)))\geq i\}.$

Thus, the first successive minimum of a lattice is the length of the shortest non-zero vector in the lattice:

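$\lambda_{1}^{(p)}(\mathcal{L})=\min\{\|\mathbf{v}\|_{p}:\mathbf{v}\in\mathcal{L}\setminus\{\mathbf{0}\}\}.$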
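For a toy lattice, the first successive minimum can be explored by brute force, which also shows that it depends on the norm. The sketch below is not one of the algorithms of this paper: it enumerates integer coefficient vectors in a box $[-\text{bound},\text{bound}]^{n}$, so it returns a shortest vector only if one has coefficients within that box, and its cost is exponential in $n$. The names `lp_norm` and `shortest_vector_bounded` are hypothetical.

```python
import itertools
import math
from typing import Optional, Sequence, Tuple


def lp_norm(v: Sequence[float], p: float) -> float:
    """l_p norm for 1 <= p <= infinity."""
    if p == math.inf:
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def shortest_vector_bounded(
    columns: Sequence[Sequence[float]], p: float, bound: int
) -> Optional[Tuple[float, Tuple[float, ...]]]:
    """Shortest non-zero combination with integer coefficients in [-bound, bound]."""
    n, d = len(columns), len(columns[0])
    best = None
    for z in itertools.product(range(-bound, bound + 1), repeat=n):
        if not any(z):
            continue  # skip the zero vector
        v = tuple(sum(z[j] * columns[j][i] for j in range(n)) for i in range(d))
        length = lp_norm(v, p)
        if best is None or length < best[0]:
            best = (length, v)
    return best
```

For the lattice generated by $(1,1)$ and $(1,-1)$, this search with `bound=2` reports lengths $2$, $\sqrt{2}$, and $1$ in the $\ell_{1}$, $\ell_{2}$, and $\ell_{\infty}$ norms, respectively.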

We consider the following lattice problems. In all the problems defined below, $c\geq 1$ is some arbitrary approximation factor (usually specified as subscript), which can be a constant or a function of any parameter of the lattice (usually rank). For exact versions of the problems (i.e., $c=1$ ), we drop the subscript.

Definition 2.6 (Shortest Vector Problem ( $\textsf{SVP}_{c}^{(p)}$ )).

Given a lattice $\mathcal{L}$ , find a vector $\mathbf{v}\in\mathcal{L}\setminus\{\mathbf{0}\}$ such that $\|\mathbf{v}\|_{p}\leq c\|\mathbf{u}\|_{p}$ for any other $\mathbf{u}\in\mathcal{L}\setminus\{\mathbf{0}\}$ .

Definition 2.7 (Closest Vector Problem ( $\textsf{CVP}_{c}^{(p)}$ )).

Given a lattice $\mathcal{L}$ with rank $n$ and a target vector $\mathbf{t}\in\operatorname{\mathbb{R}}^{n}$ , find $\mathbf{v}\in\mathcal{L}$ such that $\|\mathbf{v}-\mathbf{t}\|_{p}\leq c\|\mathbf{w}-\mathbf{t}\|_{p}$ for all other $\mathbf{w}\in\mathcal{L}$ .

Lemma 2.1 ([61]).

The LLL algorithm [1] can be used to solve ${\textsf{SVP}_{2^{n}}^{(p)}}$ in polynomial time.

The following result shows that in order to solve $\textsf{SVP}_{1+\epsilon}^{(p)}$ , it is sufficient to consider the case when $2\leq\lambda^{(p)}_{1}(\mathcal{L})<3$ . This is done by appropriately scaling the lattice.

Lemma 2.2 (Lemma 4.1 in [47]).

For all $\ell_{p}$ norms, if there is an algorithm $A$ that for all lattices $\mathcal{L}$ with $2\leq\lambda^{(p)}_{1}(\mathcal{L})<3$ solves $\textsf{SVP}_{1+\epsilon}^{(p)}$ in time $T=T(n,b,\epsilon)$ , then there is an algorithm $A^{\prime}$ that solves $\textsf{SVP}_{1+\epsilon}^{(p)}$ for all lattices in time $O(nT+n^{4}b)$ .

Thus, henceforth, we assume $2\leq\lambda^{(p)}_{1}(\mathcal{L})<3$ .

2.4 Some Useful Definitions and Results

In this section, we give some results and definitions which are useful for our analysis later.

Definition 2.8.

Let $P$ and $Q$ be two point sets in $\mathbb{R}^{n}$. The Minkowski sum of $P$ and $Q$, denoted by $P\oplus Q$, is the point set $\{p+q:p\in P,q\in Q\}$.

Lemma 2.3.

Let $B_{1}=B^{(p)}_{n}(\mathbf{0},a)$ and $B_{2}=B^{(p)}_{n}(\mathbf{v},a)$ such that $\|\mathbf{v}\|_{p}=\lambda^{(p)}_{1}$ and $\lambda^{(p)}_{1}<2a$ . Let $D=B_{1}\cap B_{2}$ .

If $|D|$ and $|B_{1}|$ are the volumes of $D$ and $B_{1}$, respectively, then

1. [65] $\frac{|D|}{|B_{1}|}\geq 2^{-n}\Big(1-\frac{\lambda^{(p)}_{1}}{2a}\Big)^{n}$ if $1\leq p<\infty$.

2. [26] When $p=2$, further optimization can be done such that we get $\frac{|D|}{|B_{1}|}\geq\Big[1-\Big(\frac{\lambda^{(2)}_{1}}{2a}\Big)^{2}\Big]^{n/2}$.

3. [49] When $p=\infty$, $\frac{|D|}{|B_{1}|}\geq\Big(1-\frac{\lambda^{(\infty)}_{1}}{2a}\Big)^{n}$.

Theorem 2.1 (Kabatiansky and Levenshtein [66]).

Let $E\subseteq\operatorname{\mathbb{R}}^{n}\setminus\{\mathbf{0}\}$ . If there exists $\phi_{0}>0$ such that for any $\mathbf{u},\mathbf{v}\in E$ , we have $\phi_{\mathbf{u},\mathbf{v}}\geq\phi_{0}$ , then $|E|\leq 2^{cn+o(n)}$ with $c=-\frac{1}{2}\log[1-\cos(\min(\phi_{0},62.99^{\circ}))]-0.099$ .

Here, $\phi_{\mathbf{u},\mathbf{v}}$ is the angle between the vectors $\mathbf{u}$ and $\mathbf{v}$ .
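The exponent in Theorem 2.1 can be evaluated directly. The sketch below is a plain transcription of the formula with the angle given in degrees and the logarithm taken to base 2; the function name `kl_exponent` is hypothetical.

```python
import math


def kl_exponent(phi0_degrees: float) -> float:
    """Exponent c in the bound |E| <= 2^{cn + o(n)} for minimum angle phi_0."""
    phi = min(phi0_degrees, 62.99)  # the bound is capped at 62.99 degrees
    return -0.5 * math.log2(1.0 - math.cos(math.radians(phi))) - 0.099
```

For $\phi_{0}=60^{\circ}$ we have $\cos\phi_{0}=1/2$, so $c=-\frac{1}{2}\log\frac{1}{2}-0.099=0.401$, which agrees with the additive constant $0.401$ in the $\ell_{2}$ bounds of Lemma 2.4 and Corollary 2.1.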

Below, we give some bounds which work for all $\ell_{p}$ norms. We especially mention the bounds obtained for the $\ell_{2}$ norm where some optimization has been performed using Theorem 2.1.

Lemma 2.4.

1. [47] Let $c_{c}=\log(1+\frac{2}{\gamma})$. If $\mathcal{C}$ is a set of points in $B^{(p)}_{n}(R)$ such that the distance between two points is at least $\gamma R$, then $|\mathcal{C}|\leq 2^{c_{c}n+o(n)}$.

2. [27, 26] When $p=2$, we can have $|\mathcal{C}^{(2)}|\leq 2^{c_{c}^{(2)}n+o(n)}$, where $c_{c}^{(2)}=-\log\gamma+0.401$.

Since the distance between two distinct lattice vectors is at least $\lambda^{(p)}_{1}(\mathcal{L})$, we obtain the following corollary.

Corollary 2.1.

Let $\mathcal{L}$ be a lattice and $R$ be a real number greater than the length of the shortest vector in the lattice.

1. [65] $|B^{(p)}_{n}(R)\cap\mathcal{L}|\leq 2^{c_{b}n}$, where $c_{b}=\log\Big(1+\frac{2R}{\lambda^{(p)}_{1}}\Big)$.

2. [28, 26] $|B^{(2)}_{n}(R)\cap\mathcal{L}|\leq 2^{c_{b}^{(2)}n+o(n)}$, where $c_{b}^{(2)}=\log\frac{R}{\lambda^{(2)}_{1}}+0.401$.

3 A Faster Provable Sieving Algorithm in $\ell_{p}$ Norm

In this section, we present an algorithm for $\textsf{SVP}^{(p)}$ that uses the framework of the AKS algorithm [21] but uses a different sieving procedure that yields a faster running time. Using Lemma 2.1, we can obtain an estimate $\lambda^{*}$ of $\lambda^{(p)}_{1}(\mathcal{L})$ such that $\lambda^{(p)}_{1}(\mathcal{L})\leq\lambda^{*}\leq 2^{n}\cdot\lambda^{(p)}_{1}(\mathcal{L})$. Thus, if we try polynomially many different values of $\lambda=(1+1/n)^{-i}\lambda^{*}$, for $i\geq 0$, then for one of them, we have $\lambda^{(p)}_{1}(\mathcal{L})\leq\lambda\leq(1+1/n)\cdot\lambda^{(p)}_{1}(\mathcal{L})$. For the rest of this section, we assume that we know an estimate $\lambda$ of the length of the shortest vector in $\mathcal{L}$ that is correct up to a factor of $1+1/n$.
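The guessing step can be made concrete with a short sketch. It lists the candidates $\lambda=(1+1/n)^{-i}\lambda^{*}$ down to $\lambda^{*}/2^{n}$, which is the smallest value $\lambda^{(p)}_{1}(\mathcal{L})$ can take given the guarantee on $\lambda^{*}$. The function name `lambda_guesses` is hypothetical, and floating-point numbers stand in for the exact rationals an implementation would likely use.

```python
from typing import List


def lambda_guesses(lambda_star: float, n: int) -> List[float]:
    """Candidates (1 + 1/n)^(-i) * lambda_star that are at least lambda_star / 2^n."""
    ratio = 1.0 + 1.0 / n
    lower = lambda_star / 2.0 ** n
    guesses: List[float] = []
    i = 0
    while True:
        lam = lambda_star * ratio ** (-i)
        if lam < lower:
            break
        guesses.append(lam)
        i += 1
    return guesses
```

The number of candidates is about $n/\log(1+1/n)$, which is $O(n^{2})$, and consecutive candidates differ by a factor of $1+1/n$, so the smallest candidate that is at least $\lambda^{(p)}_{1}(\mathcal{L})$ overshoots it by a factor of at most $1+1/n$.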

The AKS algorithm (or its $\ell_{p}$ norm generalization in [48, 47]) initially uniformly samples a large number of perturbation vectors, $\mathbf{e}\in B^{(p)}_{n}(d)$ , where $d\in\operatorname{\mathbb{R}}_{>0}$ , and for each such perturbation vector, it maintains a vector $\mathbf{y}$ close to the lattice ( $\mathbf{y}$ is such that $\mathbf{y}-\mathbf{e}\in\mathcal{L}$ ). Thus, initially, we have a set $S$ of many such pairs $(\mathbf{e},\mathbf{y})\in B^{(p)}_{n}(d)\times B^{(p)}_{n}(R)$ for some $R\in 2^{O(n)}$ . The desired situation is that after a polynomial number of such sieving iterations, we are left with a set of vector pairs $(\mathbf{e}^{\prime\prime},\mathbf{y}^{\prime\prime})$ such that $\mathbf{y}^{\prime\prime}-\mathbf{e}^{\prime\prime}\in\mathcal{L}\cap B^{(p)}_{n}(O(\lambda^{(p)}_{1}(\mathcal{L})))$ . Finally, we take the pair-wise differences of the lattice vectors corresponding to these vector pairs and output the one with the smallest non-zero norm. It was shown in [21, 48, 47] that, with overwhelming probability, this is the shortest vector in the lattice.

One of the main and usually the most expensive steps in this algorithm is the sieving procedure, where given a list of vector pairs $(\mathbf{e},\mathbf{y})\in B^{(p)}_{n}(d)\times B^{(p)}_{n}(R)$ in each iteration, it outputs a list of vector pairs $(\mathbf{e}^{\prime},\mathbf{y}^{\prime})\in B^{(p)}_{n}(d)\times B^{(p)}_{n}(\gamma R)$ where $\gamma\in\operatorname{\mathbb{R}}_{(0,1)}$ . In each sieving iteration, a number of vector pairs (usually exponential in $n$ ) are identified as “center pairs”. The second element of each such center pair is referred to as the “center”. By a well-defined map, each of the remaining vector pairs is associated to a “center pair” such that after certain operations (such as subtraction) on the vectors, we obtain a pair with a vector difference yielding a lattice vector with a norm less than $R^{\prime}$ . If we start an iteration with say $N^{\prime}$ vector pairs and identify $|\mathcal{C}|$ number of center pairs, then the output consists of $N^{\prime}-|\mathcal{C}|$ vector pairs. An illustration is given in Figure 2. In [21] and most other provable variants or generalizations such as [48, 47], the running time of this sieving procedure, which is the dominant part of the total running time of the algorithm, is roughly quadratic in the number of sampled vectors.

Here, we propose a different sieving approach to reduce the overall time complexity of the algorithm. This can be thought of as a generalization of the sieving method introduced in [49] for the $\ell_{\infty}$ norm. We divide the space such that each lattice vector can be mapped efficiently into some desired division. In the following subsection, we explain this sieving procedure, whose running time is linear in the number of sampled vectors.

3.1 Linear Sieve

In the initial AKS algorithm [21, 45] as well as in all its variants thereafter [47, 48, 27], in the sieving sub-routine, a space $B^{(p)}_{n}(R)$ has been divided into sub-regions such that each sub-region is associated with a center. Then, given a vector, we map it to a sub-region and subtract it from the center so that we get a vector of length at most $\gamma R$ . We must aim to select these sub-regions such that we can (i) map a vector efficiently to a sub-region (ii) without increasing the number of centers “too much”. The latter factor is determined by the number of divisions of $B^{(p)}_{n}(R)$ into these sub-regions and directly contributes to the space (and hence time) complexity.

In all the previous provable sieving algorithms, the sub-regions were small hyperballs (or parts of them) in $\ell_{p}$ norm. In this paper, our sub-regions are hypercubes. The choice of this particular sub-region makes the mapping very efficient. First, let us note that, in contrast with the previous algorithms (except [49]), we divide the space a priori. This can be done by dividing each co-ordinate axis into intervals of length $\frac{\gamma R}{n^{1/p}}$ so that the distance between any two vectors in the resulting hypercube is at most $\gamma R$ . In an ordered list, we store an appropriate index (say, co-ordinates of one corner) of only those hypercubes which have a non-zero intersection with $B^{(p)}_{n}(R)$ . We can map a vector to a hypercube in $O(n)$ time simply by looking at the interval in which each of its co-ordinates lies. If the hypercube already contains a center, then we subtract the center from the vector and store the difference; otherwise, we assign this vector as the center. An illustration is given in Figure 3.

The following lemma gives a bound on the number of hypercubes or centers we obtain by this process. Such a volumetric argument can be found in [67].

Lemma 3.1.

Let $\gamma\in(0,1),R\in\operatorname{\mathbb{R}}_{\geq 1},1\leq p\leq\infty$ and $r=\frac{\gamma R}{2n^{1/p}}$ . The number of translates of $B^{(\infty)}_{n}(r)$ required to cover $B^{(p)}_{n}(R)$ is at most $O\left(\left(2+\frac{2}{\gamma}\right)^{n}\right)$ .

Proof.

Let $N_{h}$ be the number of translates of $L=B^{(\infty)}_{n}(r)$ required to cover $K=B^{(p)}_{n}(R)$ . These translates are all within $K\oplus 2L$ . In addition, noting that $L\subseteq\frac{rn^{1/p}}{R}K$ , we have

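$N_{h}\cdot\text{vol}(L)\leq\text{vol}(K\oplus 2L)\leq\text{vol}\left(\left(1+\frac{2rn^{1/p}}{R}\right)K\right)=\left(1+\frac{2rn^{1/p}}{R}\right)^{n}\text{vol}(K).$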

Plugging in the value of $r$ , we have $N_{h}\leq(1+\gamma)^{n}\frac{\text{vol}(K)}{\text{vol}(L)}$ .

Using Fact 2.3, we have $N_{h}\in O\left(\left(2+\frac{2}{\gamma}\right)^{n}\right)$ . ∎

Note that the above lemma implies a sub-division where one hypercube is centered at the origin. Thus, along each axis, we can have the following $2r$ -length intervals:

$\ldots,\ [-5r,-3r),\ [-3r,-r),\ [-r,r),\ [r,3r),\ [3r,5r),\ \ldots$

We do not know whether this is the optimal way of sub-dividing $B^{(p)}_{n}(R)$ into smaller hypercubes. In [49], it has been shown that if we divide $[-R,R]$ from one corner (i.e., place one small hypercube at one corner of the larger hypercube $B^{(\infty)}_{n}(R)$ ), then $O\left(\left(\Big{\lceil}\frac{2}{\gamma}\Big{\rceil}\right)^{n}\right)$ copies of hypercubes of radius $r$ suffice.

Suppose in one sieving iteration, we have a set $S$ of lattice vectors of length at most $R$ ; i.e., they all lie in $B^{(p)}_{n}(R)$ (Figure 3(a)). We would like to combine points so that we are left with vectors in $B^{(p)}_{n}(\gamma R)$ . We divide each axis into intervals of length $y=\frac{\gamma R}{n^{1/p}}$ and store in an ordered set ( $\mathcal{I}$ ) co-ordinates of one corner of the resulting hypercubes that have a non-zero intersection with $B^{(p)}_{n}(R)$ (Figure 3(b)). Note that this can be done in a time of $O(nN_{h})$ , where $N_{h}$ is the maximum number of hypercube translates as described in Lemma 3.1.

We maintain a list $\mathcal{C}$ of pairs, where the first entry of each pair is an $n$ -tuple in $\mathcal{I}$ (let us call it “index-tuple”) and the second one, initialized as empty set, is for storing a center pair. Given $\mathbf{y}$ , we map it to its index-tuple $I_{\mathbf{y}}$ as follows: we calculate the interval in which each of its co-ordinates belong (steps 10-13 in Algorithm 2). This can be done in $O(n)$ time. This is equivalent to storing information about the hypercube (in Figure 3(b)) in which it belongs or is mapped to. We can access $\mathcal{C}[I_{\mathbf{y}}]$ in constant time. For each $(\mathbf{e},\mathbf{y})\in S$ , if there exists a $(\mathbf{e}_{\mathbf{c}},\mathbf{c})\in\mathcal{C}[I_{\mathbf{y}}]$ —i.e., $I_{\mathbf{y}}=I_{\mathbf{c}}$ (implying $\|\mathbf{y}-\mathbf{c}\|_{p}\leq\gamma R$ )—then we add $(\mathbf{e},\mathbf{y}-\mathbf{c}+\mathbf{e}_{\mathbf{c}})$ to the output set $S^{\prime}$ (Figure 3(c)). Otherwise, we add vector pair $(\mathbf{e},\mathbf{y})$ to $\mathcal{C}[I_{\mathbf{y}}]$ as a center pair. This implies that if there exists a center in the hypercube, then we perform subtraction operations to obtain a shorter vector. Otherwise, we make $(\mathbf{e},\mathbf{y})$ the center for its hypercube. Finally, we return $S^{\prime}$ .

More details of this sieving procedure (Linear Sieve) can be found in Algorithm 2.
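To make the description concrete, the following is a minimal sketch of one linear sieving pass. It is not a transcription of Algorithm 2. As simplifying assumptions, it aligns the grid at the origin (cell boundaries at integer multiples of the side length) and replaces the pre-computed ordered list $\mathcal{I}$ by a hash map that is filled lazily. The names `index_tuple` and `linear_sieve` are ours.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]
Pair = Tuple[Vector, Vector]  # (perturbation e, perturbed vector y)


def index_tuple(y: Vector, side: float) -> Tuple[int, ...]:
    # Interval index of every co-ordinate: O(n) work per vector.
    return tuple(math.floor(yi / side) for yi in y)


def linear_sieve(S: List[Pair], gamma: float, R: float, p: float) -> List[Pair]:
    """One sieving pass: vectors sharing a hypercube are reduced by its center."""
    if not S:
        return []
    n = len(S[0][1])
    # Side length gamma*R / n^(1/p); for p = infinity this is gamma*R.
    side = gamma * R if math.isinf(p) else gamma * R / n ** (1.0 / p)
    centers: Dict[Tuple[int, ...], Pair] = {}  # assumption: hash map, not a list
    out: List[Pair] = []
    for e, y in S:
        key = index_tuple(y, side)
        if key in centers:
            e_c, c = centers[key]
            # Output (e, y - c + e_c); y - c has l_p norm at most gamma*R.
            out.append((e, tuple(yi - ci + eci for yi, ci, eci in zip(y, c, e_c))))
        else:
            # First vector seen in this hypercube becomes its center pair.
            centers[key] = (e, y)
    return out
```

Each input pair is touched once and costs $O(n)$ arithmetic operations plus one table lookup, which is the source of the linear running time in the number of sampled vectors.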

3.2 AKS Algorithm with a Linear Sieve

Algorithm 1 describes an exact algorithm for $\textsf{SVP}^{(p)}$ with a linear sieving procedure (Linear Sieve) (Algorithm 2).

Lemma 3.2.

Let $\gamma\in\operatorname{\mathbb{R}}_{(0,1)}$ . The number of center pairs in Algorithm 2 always satisfies $|\mathcal{C}|\leq 2^{c_{c}n+o(n)}$ where $c_{c}=\log\left(2+\frac{2}{\gamma}\right)$ .

Proof.

This follows from Lemma 3.1 in Section 3.1.

∎

Claim 3.1.

The following two invariants are maintained in Algorithm 1:

1. $\forall(\mathbf{e},\mathbf{y})\in S,\quad\mathbf{y}-\mathbf{e}\in\mathcal{L}$
2. $\forall(\mathbf{e},\mathbf{y})\in S,\quad\|\mathbf{y}\|_{p}\leq R$ .

Proof.

1. The first invariant is maintained at the beginning of the sieving iterations in Algorithm 1 due to the choice of $\mathbf{y}$ at step 1 of Algorithm 1.

Since each center pair $(\mathbf{e}_{\mathbf{c}},\mathbf{c})$ once belonged to $S$ , $\mathbf{c}-\mathbf{e}_{\mathbf{c}}\in\mathcal{L}$ . Thus, at step 2 of the sieving procedure (Algorithm 2), we have $(\mathbf{e}-\mathbf{y})+(\mathbf{c}-\mathbf{e}_{\mathbf{c}})\in\mathcal{L}$ .

2. The second invariant is maintained in steps 1–1 of Algorithm 1 because $\mathbf{y}\in\mathscr{P}(\mathbf{B})$ and hence $\|\mathbf{y}\|_{p}\leq\sum_{i=1}^{n}\|\mathbf{b}_{i}\|_{p}\leq n\max_{i}\|\mathbf{b}_{i}\|_{p}=R$ .

We claim that this invariant is also maintained in each iteration of the sieving procedure.

Consider a pair $(\mathbf{e},\mathbf{y})\in S$ and let $I_{\mathbf{y}}$ be its index-tuple. Let $(\mathbf{e}_{\mathbf{c}},\mathbf{c})$ be its associated center pair. By Algorithm 2, we have $I_{\mathbf{y}}=I_{\mathbf{c}}$ ; i.e., $\|\mathbf{y}-\mathbf{c}\|_{p}^{p}=\sum_{i=1}^{n}|y_{i}-c_{i}|^{p}\leq\sum_{i=1}^{n}\frac{\gamma^{p}R^{p}}{n}\leq\gamma^{p}R^{p}$ . Thus, $\|\mathbf{y}-\mathbf{c}\|_{p}\leq\gamma R$ and hence $\quad\|\mathbf{y}-\mathbf{c}+\mathbf{e}_{\mathbf{c}}\|_{p}\leq\|\mathbf{y}-\mathbf{c}\|_{p}+\|\mathbf{e}_{\mathbf{c}}\|_{p}\leq\gamma R+\xi\lambda.$

The claim follows by the re-assignment of variable $R$ at step 1 in Algorithm 1.

∎

In the following lemma, we bound the length of the remaining lattice vectors after all the sieving iterations are over. The proof is similar to that given in [61], so we write it briefly.

Lemma 3.3.

At the end of $k$ iterations in Algorithm 1, every remaining pair $(\mathbf{e},\mathbf{y})$ satisfies $\|\mathbf{y}-\mathbf{e}\|_{p}\leq\frac{\xi(2-\gamma)\lambda}{1-\gamma}+\frac{\gamma\xi}{n(1-\gamma)}=:R^{\prime}$ .

Proof.

Let $R_{k}$ be the value of $R$ after $k$ iterations, where

$\log_{\gamma}\Big{(}\frac{\xi}{nR(1-\gamma)}\Big{)}\leq k\leq\log_{\gamma}\Big{(}\frac{\xi}{nR(1-\gamma)}\Big{)}+1$ .

Then,

[TABLE]

Thus, after $k$ iterations, $\|\mathbf{y}\|_{p}\leq R_{k}$ , and hence after $k$ iterations,

[TABLE]

∎
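The way the radius shrinks across iterations can be checked numerically. The sketch below iterates the update $R\leftarrow\gamma R+\xi\lambda$ implied by the bound $\|\mathbf{y}-\mathbf{c}+\mathbf{e}_{\mathbf{c}}\|_{p}\leq\gamma R+\xi\lambda$ and compares it with the closed form $R_{i}=\gamma^{i-1}R+\xi\lambda\frac{1-\gamma^{i-1}}{1-\gamma}$ used in Section 3.3. The function names and the sample parameters are ours and are chosen only for illustration.

```python
from typing import List


def radius_schedule(R: float, gamma: float, xi: float, lam: float, k: int) -> List[float]:
    """Return [R_1, ..., R_{k+1}] with R_1 = R and R_{i+1} = gamma*R_i + xi*lam."""
    radii = [R]
    for _ in range(k):
        radii.append(gamma * radii[-1] + xi * lam)
    return radii


def closed_form(R: float, gamma: float, xi: float, lam: float, i: int) -> float:
    """R_i = gamma^(i-1) * R + xi*lam * (1 - gamma^(i-1)) / (1 - gamma), for i >= 1."""
    g = gamma ** (i - 1)
    return g * R + xi * lam * (1.0 - g) / (1.0 - gamma)
```

The radii decrease geometrically towards the fixed point $\frac{\xi\lambda}{1-\gamma}$ , which is why a number of iterations logarithmic in $R$ is enough.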

Using Corollary 2.1 and assuming $\lambda\approx\lambda^{(p)}_{1}$ , we obtain an upper bound on the number of lattice vectors of length at most $R^{\prime}$ ; i.e.,

$|B^{(p)}_{n}(R^{\prime})\cap\mathcal{L}|\leq 2^{c_{b}n+o(n)}$ , where $c_{b}=\log\left(1+\frac{2\xi(2-\gamma)}{1-\gamma}\right)$ .

The above lemma along with the invariants implies that at the beginning of step 1 in Algorithm 1, we have “short” lattice vectors; i.e., vectors with a norm bounded by $R^{\prime}$ . We want to start with a “sufficient number” of vector pairs so that we do not end up with all zero vectors at the end of the sieving iterations. For this, we work with the following conceptual modification proposed by Regev [68].

Let $\mathbf{u}\in\mathcal{L}$ such that $\|\mathbf{u}\|_{p}=\lambda^{(p)}_{1}(\mathcal{L})\approx\lambda$ (where $2<\lambda^{(p)}_{1}(\mathcal{L})\leq 3$ ), $D_{1}=B^{(p)}_{n}(\xi\lambda)\cap B^{(p)}_{n}(-\mathbf{u},\xi\lambda)$ and $D_{2}=B^{(p)}_{n}(\xi\lambda)\cap B^{(p)}_{n}(\mathbf{u},\xi\lambda)$ . Define a bijection $\sigma$ on $B^{(p)}_{n}(\xi\lambda)$ that maps $D_{1}$ to $D_{2}$ , $D_{2}$ to $D_{1}$ and $B^{(p)}_{n}(\xi\lambda)\setminus(D_{1}\cup D_{2})$ to itself :

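$\sigma(\mathbf{e})=\begin{cases}\mathbf{e}+\mathbf{u}&\text{if }\mathbf{e}\in D_{1}\\ \mathbf{e}-\mathbf{u}&\text{if }\mathbf{e}\in D_{2}\\ \mathbf{e}&\text{otherwise}\end{cases}$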

For the analysis of the algorithm, we assume that for each perturbation vector $\mathbf{e}$ chosen by our algorithm, we replace $\mathbf{e}$ by $\sigma(\mathbf{e})$ with probability $1/2$ and that it remains unchanged with probability $1/2$ . We call this procedure tossing the vector $\mathbf{e}$ . This does not change the distribution of the perturbation vectors $\{\mathbf{e}\}$ . Further, we assume that this replacement of the perturbation vectors happens at the step where this has any effect on the algorithm for the first time. In particular, at step 2 in Algorithm 2, after we have identified a center pair $(\mathbf{e}_{c},\mathbf{c})$ , we apply $\sigma$ on $\mathbf{e}_{c}$ with probability $1/2$ . Then, at the beginning of step 1 in Algorithm 1, we apply $\sigma$ to $\mathbf{e}$ for all pairs $(\mathbf{e},\mathbf{y})\in S$ . The distribution of $\mathbf{y}$ remains unchanged by this procedure because $\mathbf{y}\equiv\mathbf{e}\equiv\sigma(\mathbf{e})\mod\mathscr{P}(\mathbf{B})$ and $\mathbf{y}-\mathbf{e}\in\mathcal{L}$ . A somewhat more detailed explanation of this can be found in the following result of [47].

Lemma 3.4 (Theorem 4.5 in [47] (re-stated)).

The modification outlined above does not change the output distribution of the actual procedure.

Note that since this is just a conceptual modification intended for ease in analysis, we should not be concerned with the actual running time of this modified procedure. Even the fact that we need a shortest vector to begin the mapping $\sigma$ does not matter.
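For intuition only, the map $\sigma$ and the tossing step can be sketched as below. We assume here that $\sigma$ translates $D_{1}$ by $+\mathbf{u}$ and $D_{2}$ by $-\mathbf{u}$ , and that $D_{1}$ and $D_{2}$ are disjoint (which holds when $\xi\lambda<\|\mathbf{u}\|_{p}$ ); the input $\mathbf{e}$ is assumed to lie in $B^{(p)}_{n}(\xi\lambda)$ already. The function names are ours, and the real algorithm never evaluates $\sigma$ , since it does not know $\mathbf{u}$ .

```python
import math
import random
from typing import Tuple

Vector = Tuple[float, ...]


def lp_norm(v: Vector, p: float) -> float:
    if math.isinf(p):
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def sigma(e: Vector, u: Vector, radius: float, p: float) -> Vector:
    """Swap D1 and D2 by translating with +u / -u; fix every other point.

    Assumptions: ||e||_p <= radius, and D1, D2 are disjoint.
    """
    plus = tuple(a + b for a, b in zip(e, u))
    minus = tuple(a - b for a, b in zip(e, u))
    if lp_norm(plus, p) <= radius:   # e lies in B(-u, radius), i.e. e is in D1
        return plus
    if lp_norm(minus, p) <= radius:  # e lies in B(u, radius), i.e. e is in D2
        return minus
    return e


def toss(e: Vector, u: Vector, radius: float, p: float, rng: random.Random) -> Vector:
    # Replace e by sigma(e) with probability 1/2, keep it otherwise.
    return sigma(e, u, radius, p) if rng.random() < 0.5 else e
```

Under the disjointness assumption, applying `sigma` twice returns the original point, so the map is a bijection on the ball and tossing leaves the uniform distribution unchanged.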

The following lemma will help us to estimate the number of vector pairs to sample at the beginning of the algorithm.

Lemma 3.5 (Lemma 4.7 in [47]).

Let $N\in\operatorname{\mathbb{N}}$ and $q$ denote the probability that a random point in $B^{(p)}_{n}(\xi\lambda)$ is contained in $D_{1}\cup D_{2}$ . If $N$ points $\mathbf{x}_{1},\ldots\mathbf{x}_{N}$ are chosen uniformly at random in $B^{(p)}_{n}(\xi\lambda)$ , then with a probability larger than $1-\frac{4}{qN}$ , there are at least $\frac{qN}{2}$ points $\mathbf{x}_{i}\in\{\mathbf{x}_{1},\ldots\mathbf{x}_{N}\}$ with the property $\mathbf{x}_{i}\in D_{1}\cup D_{2}$ .

From Lemma 2.3, we have

[TABLE]

Thus, with a probability of at least $1-\frac{4}{qN}$ , we have at least $2^{-c_{s}n}N$ pairs $(\mathbf{e}_{i},\mathbf{y}_{i})$ before the sieving iterations such that $\mathbf{e}_{i}\in D_{1}\cup D_{2}$ .

Lemma 3.6.

If $N\geq\frac{2}{q}(k|\mathcal{C}|+2^{c_{b}n}+1)$ , then with a probability of at least $1/2$ , Algorithm 1 outputs a shortest non-zero vector in $\mathcal{L}$ with respect to $\ell_{p}$ norm for $1\leq p\leq\infty$ .

Proof.

Of the $N$ vector pairs $(\mathbf{e},\mathbf{y})$ sampled in steps 1–1 of Algorithm 1, we consider those such that $\mathbf{e}\in(D_{1}\cup D_{2})$ . We have already seen there are at least $\frac{qN}{2}$ such pairs with a probability of at least $1-\frac{4}{qN}$ . We remove $|\mathcal{C}|$ vector pairs in each of the $k$ sieve iterations. Thus, at step 1 of Algorithm 1, we have $N^{\prime}\geq 2^{c_{b}n}+1$ pairs $(\mathbf{e},\mathbf{y})$ to process.

By Lemma 3.3, each of them is contained within a ball of radius $R^{\prime}$ which can have at most $2^{c_{b}n}$ lattice vectors. Thus, there exists at least one lattice vector $\mathbf{w}$ for which the perturbation is in $D_{1}\cup D_{2}$ , and it appears twice in $S$ at the beginning of step 1. With a probability of $1/2$ , it remains $\mathbf{w}$ , or with the same probability, it becomes either $\mathbf{w}+\mathbf{u}$ or $\mathbf{w}-\mathbf{u}$ . Thus, after taking pair-wise difference at step 1 with a probability of at least $1/2$ , we find the shortest vector. ∎

Theorem 3.1.

Let $\gamma\in(0,1)$ , and let $\xi>1/2$ . Given a full rank lattice $\mathcal{L}\subset\mathbb{Q}^{n}$ , there is a randomized algorithm for $\textsf{SVP}^{(p)}$ with a success probability of at least $1/2$ , a space complexity of at most $2^{c_{space}n+o(n)}$ , and running time of at most $2^{c_{time}n+o(n)}$ , where $c_{space}=c_{s}+\max(c_{c},c_{b})$ and $c_{time}=\max(c_{space},2c_{b})$ , where

$c_{c}=\log\left(2+\frac{2}{\gamma}\right),\quad c_{s}=-\log\Big{(}0.5-\frac{1}{4\xi}\Big{)}$ and $c_{b}=\log\left(1+\frac{2\xi(2-\gamma)}{1-\gamma}\right)$ .

Proof.

If we start with $N$ pairs (as stated in Lemma 3.6), then the space complexity is at most $2^{c_{space}n+o(n)}$ with $c_{space}=c_{s}+\max(c_{c},c_{b})$ .

In each iteration of the sieving Algorithm 2, it takes at most $O(nN_{h})$ time to initialize and index $\mathcal{C}$ (Lemmas 3.1 and 3.2). For each vector pair $(\mathbf{e},\mathbf{y})\in S$ , it takes a time of at most $n$ to calculate its index-tuple $I_{\mathbf{y}}$ . Thus, the time taken to process each vector pair is at most $(n+1)$ , and the total time taken per iteration of Algorithm 2 is at most $O(n(N_{h}+N))$ , which is at most $2^{c_{space}n+o(n)}$ , and there are at most $\text{poly}(n)$ such iterations.

If $N^{\prime}\geq 2^{c_{b}n}+1$ , then the time complexity for the computation of the pairwise differences is at most $(N^{\prime})^{2}\in 2^{2c_{b}n+o(n)}$ .

Thus, the overall time complexity is at most $2^{c_{time}n+o(n)}$ where

$c_{time}=\max(c_{space},2c_{b})$ . ∎
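The final step of the algorithm, whose cost is the $(N^{\prime})^{2}$ term above, can be sketched as a plain quadratic scan over the surviving lattice vectors $\mathbf{y}-\mathbf{e}$ . The function name is ours, and the sketch assumes the lattice vectors have already been formed from the pairs.

```python
import math
from itertools import combinations
from typing import List, Optional, Tuple

Vector = Tuple[float, ...]


def lp_norm(v: Vector, p: float) -> float:
    if math.isinf(p):
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def shortest_nonzero_difference(vectors: List[Vector], p: float) -> Optional[Vector]:
    """Return a shortest non-zero pairwise difference, or None if there is none."""
    best: Optional[Vector] = None
    best_norm = math.inf
    # Quadratic scan: every unordered pair is examined once.
    for v, w in combinations(vectors, 2):
        d = tuple(a - b for a, b in zip(v, w))
        nd = lp_norm(d, p)
        if 0.0 < nd < best_norm:
            best, best_norm = d, nd
    return best
```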

3.3 Improvement Using the Birthday Paradox

We can obtain a better running time and space complexity if we use the birthday paradox to decrease the number of sampled vectors but obtain at least two vector pairs corresponding to the same lattice vector after the sieving iterations [28, 26]. For this, we have to ensure that the vectors are independent and identically distributed before step 1 of Algorithm 1. Thus, we incorporate the following modification, as discussed in [26]. Very briefly, the trick is to set aside many uniformly distributed vector pairs as centers for each sieving step, even before the sieving iterations begin. In each sieving iteration, the probability that a vector pair is not within the required distance of any center pair decreases. Now, if we sample enough vectors, then with a good probability at step 1, we have at least two vectors whose perturbation is in $D_{1}\bigcup D_{2}$ , implying that with a probability of at least 1/2, we obtain the shortest vector.

In the analysis of [26], the authors simply stated that the required center pairs can be sampled uniformly at the beginning. In our linear sieving algorithm, we have an advantage. Unlike the AKS-style algorithms, in which the center pairs are selected and then the space is divided, in our case, we can divide the space a priori. We take advantage of this and conduct a number of random divisions of the space. Since in each iteration, the length of the vectors decreases, the size of the hypercubes also decreases, and this can be calculated. Thus, for each iteration we have a number of divisions of the space into hypercubes of a certain size. For this, we need to divide the axes into intervals of a fixed size. Simply by shifting the intervals in each axis, we can make this division random. Then, among the uniformly sampled vectors, we select a center for each hypercube.
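A random division of the space obtained by shifting the intervals can be sketched as follows. This assumes one independent uniform offset per axis, drawn from $[0,\text{side}]$ ; the text above only says that the intervals are shifted, so the exact distribution and the function names are our assumptions.

```python
import math
import random
from typing import Tuple

Vector = Tuple[float, ...]


def random_shift(n: int, side: float, rng: random.Random) -> Vector:
    # One offset per axis; shifting by a full side length reproduces the same grid.
    return tuple(rng.uniform(0.0, side) for _ in range(n))


def shifted_index_tuple(y: Vector, side: float, shift: Vector) -> Tuple[int, ...]:
    # Same O(n) interval lookup as before, applied to the shifted co-ordinates.
    return tuple(math.floor((yi - si) / side) for yi, si in zip(y, shift))
```

Drawing a fresh shift for each sibling list gives a different division of the same ball without changing the hypercube size.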

Assume we start with $N\geq\frac{2}{q}(n^{3}k|\mathcal{C}|+n2^{\frac{c_{b}}{2}n})$ sampled pairs. After the initial sampling, for each of the $k$ sieving iterations, we fix $\Omega\Big{(}\frac{2n^{3}}{q}|\mathcal{C}|\Big{)}$ pairs to be used as center pairs in the following way.

Let $R=\max_{i\in[N]}\|\mathbf{y}_{i}\|_{p}$ . We maintain $k$ lists of pairs, $\mathcal{C}_{1},\mathcal{C}_{2},\ldots,\mathcal{C}_{k}$ , where each list is similar to ( $\mathcal{C}$ ), as described in Algorithm 2. In the $i^{th}$ list, we store the indices (co-ordinates of a corner) of translates of $B^{(\infty)}_{n}(r_{i})$ that have a non-zero intersection with $B^{(p)}_{n}(R_{i})$ where $R_{i}=\gamma^{i-1}R+\xi\lambda\frac{1-\gamma^{i-1}}{1-\gamma}$ and $r_{i}=\frac{\gamma R_{i}}{2n^{1/p}}$ . For such a division, we can obtain $O(|\mathcal{C}|)$ center pairs in each list. To meet our requirement, we maintain $O(n^{3})$ such lists for each $i$ . We call these $O(n^{3})$ lists the “sibling lists” of $\mathcal{C}_{i}$ .
For each $(\mathbf{e},\mathbf{y})\in S$ (where $S$ is the set of sampled pairs), we first calculate $\|\mathbf{y}\|_{p}$ to check in which list group it can potentially belong, say $\mathcal{C}_{j}$ . That is, $\mathcal{C}_{j}$ corresponds to the smallest hyperball containing $\mathbf{y}$ . Then, we map it to its index-tuple $I_{\mathbf{y}}$ , as has already been described before. We add $(\mathbf{e},\mathbf{y})$ to a list in $\mathcal{C}_{j}$ or any of its sibling lists if it was empty before. Since we sampled uniformly, this ensures we obtain the required number of (initially) fixed centers, and no other vector can be used as a center throughout the algorithm.

Having set aside the centers, now we repeat the following sieving operations $k$ times. For each vector pair $(\mathbf{e}_{1},\mathbf{y}_{1})\in S$ , we can check which list (or its sibling lists) it can belong to from $\|\mathbf{y}_{1}\|_{p}$ . Then, if a center pair is found, we subtract as in step 2 of Algorithm 2. Otherwise, we discard it and consider it “lost”.

Let us call this modified sieving procedure LinearSieveBirthday. We obtain the following improvement in the running time.

Theorem 3.2.

Let $\gamma\in(0,1)$ , and let $\xi>1/2$ . Given a full rank lattice $\mathcal{L}\subset\mathbb{Q}^{n}$ , there is a randomized algorithm for $\textsf{SVP}^{(p)}$ with a success probability of at least $1/2$ , a space complexity of at most $2^{c_{space}n+o(n)}$ , and running time of at most $2^{c_{time}n+o(n)}$ , where $c_{space}=c_{s}+\max(c_{c},\frac{c_{b}}{2})$ and $c_{time}=\max(c_{space},c_{b})$ , where

$c_{c}=\log\left(2+\frac{2}{\gamma}\right),\quad c_{s}=-\log\left(0.5-\frac{1}{4\xi}\right)$ and $c_{b}=\log\left(1+\frac{2\xi(2-\gamma)}{1-\gamma}\right)$ .

Proof.

This analysis has been taken from [26]. At the beginning of the algorithm, among the pairs set aside as centers for the first step, there are $\Omega\left(n^{3}|\mathcal{C}|\right)$ pairs such that the perturbation is in $D_{1}\bigcup D_{2}$ with high probability (Lemma 3.5). We call them good pairs. After fixing these pairs as centers, the probability that the distance between the next perturbed vector and the closest center is more than $\gamma R$ decreases. The sum of these probabilities is bounded from above by $|\mathcal{C}|$ . As a consequence, once all centers have been processed, the probability for any of the subsequent pairs to be lost is $O\left(\frac{1}{n^{3}}\right)$ . By induction, it can be proved that the same proportion of pairs is lost at each step of the sieve with high probability. As a consequence, no more than $1-\left(1-\frac{1}{n^{3}}\right)^{O(n^{2})}=O\left(\frac{1}{n}\right)$ pairs are lost during the whole algorithm. This means that in the final ball, there are $\Omega\left(n2^{\frac{c_{b}}{2}n}\right)$ probabilistically independent lattice points corresponding to good pairs with high probability. As in the proof of Lemma 3.6 this implies that the algorithm returns a shortest vector with a probability of at least 1/2. ∎

Comparison of Linear Sieve with provable sieving algorithms [21, 45, 47, 48]

For $1\leq p\leq\infty$ , the number of centers obtained by [47] is

$|\mathcal{C}(BN)|\leq 2^{c_{c}(BN)n}$ , where $c_{c}(BN)=\log(1+\frac{2}{\gamma})$ (Lemma 2.4). If we conducted a similar analysis for their algorithm, we would obtain space and time complexities of $2^{c_{space}(BN)n+o(n)}$ and $2^{c_{time}(BN)n+o(n)}$ , respectively, where

[TABLE]

We can incorporate modifications to apply the birthday paradox, as has been done in [26] (for $\ell_{2}$ norm). This would improve the exponents to

[TABLE]

Clearly, the running time of our algorithm is less since $\left(1+\frac{2}{\gamma}\right)^{2}>\left(2+\frac{2}{\gamma}\right)$ for all $\gamma<1$ . In [47], the authors did not specify the constant in the exponent of running time. However, using the above formulae, we found out that their algorithm can achieve a time complexity of $2^{3.849n+o(n)}$ and space complexity of $2^{2.023n+o(n)}$ at parameters $\gamma=0.78,\xi=1.27$ (without the birthday paradox, the algorithm in [47] can achieve time and space complexities of $2^{5.179n+o(n)}$ and $2^{3.01n+o(n)}$ , respectively, at parameters $\gamma=0.572,\xi=0.742$ ). In comparison, our algorithm can achieve a time and space complexity of $2^{2.751n+o(n)}$ at parameters $\gamma=0.598,\xi=0.82$ .

For $p=2$ , we can use Theorem 2.1 to obtain a better bound on the number of lattice vectors that remain after all sieving iterations. This is reflected in the quantity $c_{b}$ , which is then given by $c_{b}^{(2)}=0.401+\log\Big{(}\frac{2\xi(2-\gamma)}{1-\gamma}\Big{)}$ (Corollary 2.1). Furthermore, $c_{s}^{(2)}=-0.5\log\Big{(}1-\frac{1}{4\xi^{2}}\Big{)}$ (Lemma 2.3). At parameters $\gamma=0.693$ and $\xi=0.99$ , we obtain $c_{time}^{(2)}=c_{space}^{(2)}=2.49$ . The AKS algorithm with the birthday paradox manages to achieve a time complexity of $2^{2.571n+o(n)}$ and space complexity of $2^{1.407n+o(n)}$ when $\gamma=0.589$ and $\xi=0.9365$ [26]. Thus, our algorithm achieves a better time complexity at the cost of more space.

For $p=\infty$ , we can reduce the space complexity by using the sub-division mentioned in Section 3.1 and achieve a space and time complexity of $2^{2.443n+o(n)}$ at parameters $\gamma=0.501,\xi=0.738$ . In [49], the authors reported a time and space complexity of $2^{2.82n+o(n)}$ in the $\ell_{\infty}$ norm; we obtain a slightly better running time by using $c_{b}$ as defined in this paper. Again, this is better than the time complexity of [47] (which is for all $\ell_{p}$ norms).

4 A Mixed Sieving Algorithm

The main advantage in dividing the space (hyperball) into hypercubes (as we did in Linear Sieve) is the efficient “decodability” in the sense that a vector can be mapped to a sub-region (and thus be associated with a center) in $O(n)$ time. However, the price we pay is in space complexity, because the number of hypercubes required to cover a hyperball is greater than the number of centers required if we used smaller hyperballs like in [21, 47, 48]. To reduce the space complexity, we perform a mixed sieving procedure. Double sieving techniques have been used for heuristic algorithms as in [32], where the rough idea is the following. There are two sets of centers: the first set consists of centers of larger radius balls, and for each such center, there is another set of centers of smaller radius balls within the respective large ball. In each sieving iteration, each non-center vector is mapped to the larger balls by comparing with the centers in the first set. Then, they are mapped to a smaller ball by comparing with the second set of centers. Thus, in both levels, a quadratic sieve is applied.

In our mixed sieving, the primary difference is the fact that in the two levels, we use two types of sieving methods: a linear sieve in the first level and then a quadratic sieve such as AKS in the next level. The overall outline of the algorithm is the same as in Algorithm 1, except at step 1, where we apply the following sieving procedure, which we call Mixed Sieve. An illustration is given in Figure 4.

The input to Mixed Sieve is a set of vectors of length $R$ , and the output is a set of smaller vectors of length $\gamma R$ .

1. We divide the whole space into large hypercubes of side length $\frac{A\gamma R}{n^{1/p}}$ , where $A$ is some constant. In $O(n)$ time, we map a vector to a large hypercube by comparing its co-ordinates. This has been explained in Section 3.1. We do not assign centers yet and do not perform any vector operation at this step. The distance between any two vectors mapped to the same hypercube is at most $A\gamma R$ (Figure 4b).

2. Next, we perform the AKS sieving procedure within each hypercube. For each hypercube, we maintain a set of centers, initially empty. When a vector is mapped to a hypercube, we check whether it is within distance $\gamma R$ of any center (within that hypercube). If so, we subtract the center from it and add the resulting shorter vector to the output set. If not, we add this vector to the set of centers (Figure 4c).
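The two levels can be sketched together as below. This is an illustrative sketch rather than the paper's pseudocode: the grid is aligned at the origin, the hypercubes are kept in a hash map, and the output follows the same convention as the linear sieve, $(\mathbf{e},\mathbf{y}-\mathbf{c}+\mathbf{e}_{\mathbf{c}})$ . All function names are ours.

```python
import math
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Vector = Tuple[float, ...]
Pair = Tuple[Vector, Vector]  # (perturbation e, perturbed vector y)


def lp_norm(v: Vector, p: float) -> float:
    if math.isinf(p):
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def mixed_sieve(S: List[Pair], gamma: float, R: float, p: float, A: float) -> List[Pair]:
    """Level 1: O(n) map to a large hypercube. Level 2: AKS-style scan inside it."""
    if not S:
        return []
    n = len(S[0][1])
    scale = 1.0 if math.isinf(p) else n ** (1.0 / p)
    side = A * gamma * R / scale
    cells: DefaultDict[Tuple[int, ...], List[Pair]] = defaultdict(list)
    out: List[Pair] = []
    for e, y in S:
        key = tuple(math.floor(yi / side) for yi in y)
        # Quadratic part: compare only with the centers of this hypercube.
        for e_c, c in cells[key]:
            diff = tuple(yi - ci for yi, ci in zip(y, c))
            if lp_norm(diff, p) <= gamma * R:
                out.append((e, tuple(d + eci for d, eci in zip(diff, e_c))))
                break
        else:
            # No center within distance gamma*R: y becomes a new center here.
            cells[key].append((e, y))
    return out
```

Larger $A$ means fewer hypercubes but more centers to scan inside each one, which is the trade-off between $c^{\prime}$ and $c_{p}$ analysed next.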

Using the same kind of counting method as in Section 3.1, we can say we need $2^{c^{\prime}n}$ large hypercubes, where $c^{\prime}=\log\left(2+\frac{2}{A\gamma}\right)$ . The maximum distance between any two vectors in each hypercube is $A\gamma R$ , and we want to get vectors of length at most $\gamma R$ by applying the AKS sieve. Thus, the number of centers (let us call these “AKS sieve-centers”) within each hypercube is $2^{c_{p}n+o(n)}$ where $c_{p}=\log(1+A)$ (in the special case of Euclidean norm, we have $c_{2}=0.401-\log\left(\frac{2}{A}\right)$ ). $c_{p}$ (and $c_{2}$ ) are obtained by applying Lemma 2.4. Note that the value of $A$ must ensure the non-negativity of $c_{2}$ . Thus, the total number of centers is $2^{c^{(p)}n+o(n)}$ where $c^{(p)}=c^{\prime}+c_{p}$ .

To use the birthday paradox, we apply similar methods as given in Section 3.3 and [26]. Assume that we initially sample $N\geq\frac{2}{q}(n^{3}k2^{c^{(p)}n+o(n)}+n2^{\frac{c_{b}}{2}n})$ vectors. Then, using similar arguments as in Section 3, we can conclude that, with high probability, we end up with the shortest vector in the lattice. We are not re-writing the proof since it is similar to that in Theorem 3.2. The only thing that is slightly different is the number of center pairs set aside at the beginning of the sieving iterations. As in Section 3, we randomly divide the space $n^{3}$ times into $2^{c^{\prime}n}$ hypercubes. Then, among the uniformly sampled vectors, we set aside $2^{c_{p}n}$ vector pairs as centers for each hypercube. Thus, in Theorem 3.2, we replace $|\mathcal{C}|$ by $2^{c^{(p)}n+o(n)}$ .

Thus, space complexity is $2^{c_{space}n+o(n)}$ where $c_{space}=c_{s}+\max(c^{(p)},c_{b}/2)$ . It takes $O(n)$ time to map each vector to a large hypercube, and then at most $2^{c_{p}n+o(n)}$ time to compare it with the “AKS sieve-centers” within each hypercube. Thus, the time complexity is $2^{c_{time}n+o(n)}$ where $c_{time}=\max(c_{space}+c_{p},c_{b})$ .

Theorem 4.1.

Let $\gamma\in(0,1),\xi>1/2$ and $A$ be some constant. Given a full-rank lattice $\mathcal{L}\subset\mathbb{Q}^{n}$ , there is a randomized algorithm for $\textsf{SVP}^{(p)}$ with a success probability of at least $1/2$ , a space complexity of at most $2^{c_{space}n+o(n)}$ , and a running time of at most $2^{c_{time}n+o(n)}$ . Here, $c_{space}=c_{s}+\max(c^{(p)},\frac{c_{b}}{2})$ and $c_{time}=\max(c_{space}+c_{p},c_{b})$ . $c_{s}=-\log\left(0.5-\frac{1}{4\xi}\right)$ , $c_{b}=\log\left(1+\frac{2\xi(2-\gamma)}{1-\gamma}\right)$ , $c_{p}=\log(1+A)$ and $c^{(p)}=\log\left(2+\frac{2}{A\gamma}\right)+c_{p}$ .

In the Euclidean norm, we have $c_{2}=0.401-\log\left(\frac{2}{A}\right)$ ,

$c^{(2)}=\log\left(2+\frac{2}{A\gamma}\right)+c_{2}$ , $c_{s}^{(2)}=-0.5\log\left(1-\frac{1}{4\xi^{2}}\right)$ and $c_{b}^{(2)}=0.401+\log\Big{(}\frac{2\xi(2-\gamma)}{1-\gamma}\Big{)}$ .

Comparison with previous provable sieving algorithms [27, 28, 20]

In the Euclidean norm with parameters $\gamma=0.645,\xi=0.946$ and $A=2^{0.599}$ , we obtain a space and time complexity of $2^{2.25n+o(n)}$ , while the List Sieve Birthday [26, 28] has space and time complexities of $2^{1.233n+o(n)}$ and $2^{2.465n+o(n)}$ , respectively. We can also use a different sieve in the second level, such as List Sieve [27], etc., which works in $\ell_{2}$ norm and is faster than the AKS sieve. We can therefore expect to achieve a better running time.

The Discrete Gaussian-based sieving algorithm of Aggarwal et al. [20] with a time complexity of $2^{n+o(n)}$ performs better than both our sieving techniques. However, their algorithm works for the Euclidean norm and, to the best of our knowledge, it has not been generalized to any other norm.

5 Approximation Algorithms for $\textsf{SVP}^{(p)}$ and $\textsf{CVP}^{(p)}$

In this section, we show how to adapt our sieving techniques to obtain approximation algorithms for $\textsf{SVP}^{(p)}$ and $\textsf{CVP}^{(p)}$ . The analysis and explanations are similar to those given in [61]. For completeness, we give a brief outline.

5.1 Algorithm for Approximate $\textsf{SVP}^{(p)}$

We note that at the end of the sieving procedure in Algorithm 1, we obtain lattice vectors of length at most $R^{\prime}=\frac{\xi(2-\gamma)\lambda}{1-\gamma}+O(\lambda/n)$ . Thus, if we can ensure that one of the vectors obtained at the end of the sieving procedure is non-zero, we obtain a $\tau=\frac{\xi(2-\gamma)}{1-\gamma}+o(1)$ -approximation of the shortest vector. Consider a new algorithm ${\mathcal{A}}$ (let us call it Approx-SVP) that is identical to Algorithm 1, except that Step 1 is replaced by the following:

• Find a non-zero vector $\mathbf{v}_{0}$ in $\{(\mathbf{y}_{i}-\mathbf{e}_{i}):(\mathbf{e}_{i},\mathbf{y}_{i})\in S\}$ .

We now show that if we start with sufficiently many vectors, we must obtain a non-zero vector.

Lemma 5.1.

If $N\geq\frac{2}{q}(k|\mathcal{C}|+1)$ , then with a probability of at least $1/2$ , Algorithm $\mathcal{A}$ outputs a non-zero vector in $\mathcal{L}$ of length at most $\frac{\xi(2-\gamma)\lambda}{1-\gamma}+O(\lambda/n)$ with respect to the $\ell_{p}$ norm.

Proof.

Of the $N$ vector pairs $(\mathbf{e},\mathbf{y})$ sampled in steps 1-1 of Algorithm $\mathcal{A}$ , we consider those such that $\mathbf{e}\in(D_{1}\cup D_{2})$ . We have already seen there are at least $\frac{qN}{2}$ such pairs. We remove $|\mathcal{C}|$ vector pairs in each of the $k$ sieve iterations. Thus, at step 1 of Algorithm 1, we have $N^{\prime}\geq 1$ pairs $(\mathbf{e},\mathbf{y})$ to process.
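The counting step can be written out explicitly, using only the hypothesis on $N$ from the lemma and the two facts just stated (at least $\frac{qN}{2}$ good pairs, and at most $|\mathcal{C}|$ pairs removed in each of the $k$ iterations):

$$N^{\prime}\;\geq\;\frac{qN}{2}-k|\mathcal{C}|\;\geq\;\frac{q}{2}\cdot\frac{2}{q}\big(k|\mathcal{C}|+1\big)-k|\mathcal{C}|\;=\;1.$$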

With a probability of $1/2$ , $\mathbf{e}$ , and hence $\mathbf{w}=\mathbf{y}-\mathbf{e}$ , is replaced by either $\mathbf{w}+\mathbf{u}$ or $\mathbf{w}-\mathbf{u}$ . Thus, the probability that this vector is the zero vector is at most $1/2$ . ∎

We thus obtain the following result.

Theorem 5.1.

Let $\gamma\in(0,1)$ , $\xi>1/2$ and $\tau=\frac{\xi(2-\gamma)}{1-\gamma}+o(1)$ . Assume we are given a full-rank lattice $\mathcal{L}\subset\mathbb{Q}^{n}$ . There is a randomized algorithm that $\tau$ -approximates $\textsf{SVP}^{(p)}$ with a success probability of at least $1/2$ and a space and time complexity of $2^{(c_{s}+c_{c})n+o(n)}$ , where $c_{c}=\log\left(2+\frac{2}{\gamma}\right)$ and $c_{s}=-\log\left(0.5-\frac{1}{4\xi}\right)$ .
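The theorem couples the approximation factor $\tau$ and the exponent $c_{s}+c_{c}$ through the same two parameters. The sketch below evaluates both as stated, under the assumptions that all logarithms are base 2 and that the $o(1)$ term in $\tau$ is dropped; the function name `approx_svp_linear_sieve` is hypothetical.

```python
import math


def approx_svp_linear_sieve(gamma: float, xi: float) -> tuple:
    """Return (tau, exponent) as stated in Theorem 5.1.

    Assumptions: logs are base 2; the o(1) term in tau is ignored.
    """
    if not (0 < gamma < 1) or xi <= 0.5:
        raise ValueError("need 0 < gamma < 1 and xi > 1/2")
    tau = xi * (2 - gamma) / (1 - gamma)
    c_s = -math.log2(0.5 - 1 / (4 * xi))
    c_c = math.log2(2 + 2 / gamma)
    return tau, c_s + c_c


# Illustrative sweep: moving gamma towards 1 shrinks c_c but inflates tau.
for gamma in (0.5, 0.9, 0.99):
    tau, exponent = approx_svp_linear_sieve(gamma, xi=1.0)
    print(f"gamma={gamma}: tau ~ {tau:.2f}, exponent ~ {exponent:.3f}")
```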

Note that in presenting the above theorem, we assumed that the Linear Sieve is used in Algorithm 1. We can also use the Mixed Sieve procedure described in Section 4. We then obtain space and time complexities of $2^{(c_{s}+c^{(p)})n+o(n)}$ and $2^{(c_{s}+c^{(p)}+c_{p})n+o(n)}$ , respectively, where $c^{(p)}=\log\left(2+\frac{2}{A\gamma}\right)+c_{p}$ and $c_{p}=\log(1+A)$ (in the Euclidean norm, the parameters are as described in Theorem 4.1).

Comparison with provable approximation algorithms [30, 47, 48]

We have mentioned in Section 1 that [48, 47] gave approximation algorithms for lattice problems that work for all $\ell_{p}$ norms and use the quadratic sieving procedure (as has been described before). In our notation, the space and time complexities of their approximation algorithms are $2^{c_{space}(BN)n+o(n)}$ and $2^{c_{time}(BN)n+o(n)}$ , respectively, where

[TABLE]

The authors did not mention any explicit value of the constant in the exponent. Using the above formulae, we conclude that [48] and [47] can achieve time and space complexities of $2^{3.169n+o(n)}$ and $2^{1.586n+o(n)}$ , respectively, at parameters $\gamma=0.99,\xi=10.001$ with a large constant approximation factor. In comparison, we can achieve a space and time complexity of $2^{2.001n+o(n)}$ with a large constant approximation factor at the same parameters.

In $\ell_{2}$ norm, using the mixed sieving procedure, we obtain a time and space complexity of $2^{1.73n+o(n)}$ and a large constant approximation factor at parameters $\gamma=0.999$ , $\xi=1$ . In [30], the best running time reported is $2^{0.802n}$ for a large approximation factor.

Using a similar linear sieve, a time and space complexity of $3^{n}$ , i.e., $2^{1.585n+o(n)}$ , can be achieved for the $\ell_{\infty}$ norm for a large constant approximation factor [49].

5.2 Algorithm for Approximate $\textsf{CVP}^{(p)}$

Given a lattice $\mathcal{L}$ and a target vector $\mathbf{t}$ , let $d$ denote the distance of the closest vector in $\mathcal{L}$ to $\mathbf{t}$ . Just as in Section 3.2, we assume that we know the value of $d$ within a factor of $1+1/n$ . We can get rid of this assumption by using Babai’s algorithm [69] to guess the value of $d$ within a factor of $2^{n}$ and then running our algorithm for polynomially many values of $d$ .
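The guessing step can be sketched as a geometric grid of candidate values. This is an illustration under stated assumptions, not the paper's procedure: `d_hat` is a hypothetical estimate (for instance from Babai's algorithm) assumed to satisfy $d\leq\hat{d}\leq 2^{n}d$ , and the function name `candidate_distances` is invented here. Because consecutive candidates differ by a factor of $1+1/n$ , at least one of them lies in $[d,(1+1/n)d]$ , and the grid has roughly $n^{2}\ln 2$ entries.

```python
import math
from typing import List


def candidate_distances(d_hat: float, n: int) -> List[float]:
    """Geometric grid of guesses for the unknown distance d.

    Assumption: d_hat satisfies d <= d_hat <= 2**n * d, so the true d
    lies in the interval [d_hat / 2**n, d_hat].
    """
    ratio = 1 + 1 / n
    lowest = d_hat / 2 ** n
    # Smallest number of steps for which lowest * ratio**steps reaches d_hat.
    steps = math.ceil(n * math.log(2) / math.log(ratio))
    return [lowest * ratio ** j for j in range(steps + 1)]


# Illustrative use: run the sieve once per candidate and keep the best answer.
guesses = candidate_distances(d_hat=1024.0, n=10)
print(len(guesses), "candidate values of d")
```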

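For $\tau>0$ , define the following $(n+1)$ -dimensional lattice $\mathcal{L}^{\prime}$ :

$\mathcal{L}^{\prime}=\{(\mathbf{v}+k\mathbf{t},\,k\tau d/2):\mathbf{v}\in\mathcal{L},\,k\in\mathbb{Z}\}$ .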

Let $\mathbf{z}^{*}\in\mathcal{L}$ be the lattice vector closest to $\mathbf{t}$ .

Then $\mathbf{u}=(\mathbf{z}^{*}-\mathbf{t},-\tau d/2)\in\mathcal{L}^{\prime}\setminus(\mathcal{L}-k^{\prime}\mathbf{t},0)$ for some $k^{\prime}\in\mathbb{Z}$ .

We sample $N$ vector pairs $(\mathbf{e},\mathbf{y})\in B^{(p)}_{n}(\xi d)\times\mathscr{P}(\mathbf{B}^{\prime})$ (3–3 of Algorithm 3), where $\mathbf{B}^{\prime}=[(\mathbf{b}_{1},0),\ldots,(\mathbf{b}_{n},0),(\mathbf{t},\tau d/2)]$ is a basis for $\mathcal{L}^{\prime}$ . Next, we run a number of iterations of the sieving Algorithm 2 to obtain a number of vector pairs such that $\|\mathbf{y}\|_{p}\leq R=\frac{\xi d}{1-\gamma}+o(1)$ . Further details can be found in Algorithm 3. Note that in the algorithm, $\mathbf{v}|_{[n]}$ is the $n$ -dimensional vector $\mathbf{v}^{\prime}$ obtained by restricting $\mathbf{v}$ to the first $n$ co-ordinates (with respect to the computational basis).
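The embedding into $\mathcal{L}^{\prime}$ and the restriction $\mathbf{v}|_{[n]}$ are simple to state in code. The sketch below is illustrative only: the helper names `embed_basis`, `combine` and `restrict` are hypothetical, basis vectors are stored as rows, and exact rationals are used since $\mathcal{L}\subset\mathbb{Q}^{n}$ .

```python
from fractions import Fraction


def embed_basis(B, t, tau, d):
    """Rows (b_i, 0) for each basis vector, followed by (t, tau*d/2)."""
    rows = [list(b) + [0] for b in B]
    rows.append(list(t) + [tau * d / 2])
    return rows


def combine(rows, coeffs):
    """Integer combination sum_i coeffs[i] * rows[i]."""
    width = len(rows[0])
    return [sum(c * r[j] for c, r in zip(coeffs, rows)) for j in range(width)]


def restrict(v, n):
    """v|_[n]: keep only the first n coordinates."""
    return list(v[:n])


# Toy instance (made-up numbers): a 2-dimensional lattice and a target t.
B = [[2, 0], [1, 3]]
t = [Fraction(1, 2), 1]
B_prime = embed_basis(B, t, tau=Fraction(3), d=Fraction(2))

# The vector (z - t, -tau*d/2) with z = b_1 uses coefficient -1 on the last row.
u = combine(B_prime, [1, 0, -1])
print(u, restrict(u, 2))
```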

From Lemma 3.3, we have seen that after $\lceil\log_{\gamma}\Big{(}\frac{\xi}{nR_{0}(1-\gamma)}\Big{)}\rceil$ iterations (where $R_{0}=n\cdot\max_{i}\|\mathbf{b}_{i}\|_{p}$ ), $R\leq\frac{\xi\gamma}{n(1-\gamma)}+\frac{\xi d}{1-\gamma}\Big{[}1-\frac{\xi}{nR_{0}(1-\gamma)}\Big{]}$ . Thus, after the sieving iterations, the set $S^{\prime}$ consists of vector pairs such that the corresponding lattice vector $\mathbf{v}$ has $\|\mathbf{v}\|_{p}\leq\frac{\xi d}{1-\gamma}+\xi d+c=\frac{\xi(2-\gamma)d}{1-\gamma}+o(1)$ .
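The iteration count quoted from Lemma 3.3 is easy to evaluate for a concrete basis. The following sketch assumes the formula exactly as stated above, with $R_{0}=n\cdot\max_{i}\|\mathbf{b}_{i}\|_{p}$ ; the helper names `lp_norm`, `initial_radius` and `sieve_iterations` are hypothetical.

```python
import math


def lp_norm(v, p):
    """l_p norm of v; p may be math.inf for the max norm."""
    if p == math.inf:
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1 / p)


def initial_radius(B, p):
    """R_0 = n * max_i ||b_i||_p, with basis vectors stored as rows of B."""
    return len(B) * max(lp_norm(b, p) for b in B)


def sieve_iterations(gamma, xi, n, R0):
    """ceil(log_gamma(xi / (n * R0 * (1 - gamma)))), clamped at zero."""
    target = xi / (n * R0 * (1 - gamma))
    return max(0, math.ceil(math.log(target) / math.log(gamma)))


# Toy basis (made-up numbers) in the Euclidean norm.
B = [[3, 4], [1, 0]]
R0 = initial_radius(B, p=2)
print(R0, sieve_iterations(gamma=0.5, xi=1.0, n=len(B), R0=R0))
```

Since $R_{0}$ enters only through a logarithm, the number of iterations is polynomial in the bit size of the basis.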

Selecting $\xi<\frac{(1-\gamma)\tau}{2-\gamma}-o(1)$ ensures that our sieving algorithm does not return vectors from $(\mathcal{L},0)-(k^{\prime}\mathbf{t},k^{\prime}\tau d/2)$ for some $k^{\prime}$ such that $|k^{\prime}|\geq 2$ . Then, every vector has $\|\mathbf{v}\|_{p}<\tau d$ , and so either $\mathbf{v}=\pm(\mathbf{z}^{\prime}-\mathbf{t},0)$ or $\mathbf{v}=\pm(\mathbf{z}-\mathbf{t},-\tau d/2)$ for some lattice vectors $\mathbf{z},\mathbf{z}^{\prime}\in\mathcal{L}$ .
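The role of the bound on $\xi$ can be spelled out in one line. The sketch uses only the length bound derived above and the fact that the $\ell_{p}$ norm of a vector is at least the absolute value of any one of its coordinates:

$$\|\mathbf{v}\|_{p}\;\leq\;\frac{\xi(2-\gamma)d}{1-\gamma}+o(1)\;<\;\tau d,\qquad\text{whereas}\qquad\big\|(\mathbf{z}-k^{\prime}\mathbf{t},\,-k^{\prime}\tau d/2)\big\|_{p}\;\geq\;\frac{|k^{\prime}|\,\tau d}{2}\;\geq\;\tau d\quad\text{for }|k^{\prime}|\geq 2.$$

Hence only the cosets with $k^{\prime}\in\{0,\pm 1\}$ can contain a returned vector.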

By arguments similar to those in [61] (using the tossing argument outlined in Section 3.2), we can conclude that with some non-zero probability we have at least one vector in $\mathcal{L}^{\prime}\setminus(\mathcal{L}\pm\mathbf{t},0)$ after the sieving iterations.

Thus, we obtain the following result.

Theorem 5.2.

*Let $\gamma\in(0,1)$ , and for any $\tau>1$ let $\xi>\max(1/2,\tau/4)$ . Given a full-rank lattice $\mathcal{L}\subset\mathbb{Q}^{n}$ , there is a randomized algorithm that, for $\tau=\frac{\xi(2-\gamma)}{1-\gamma}+o(1)$ , approximates $\textsf{CVP}^{(p)}$ with a success probability of at least $1/2$ and a space and time complexity of $2^{(c_{s}+c_{c})n+o(n)}$ , where $c_{c}=\log\left(2+\frac{2}{\gamma}\right)$ and $c_{s}=-\log\left(0.5-\frac{1}{4\xi}\right)$ .*

Again, using the Mixed Sieve in Algorithm 1, we obtain space and time complexities of $2^{(c_{s}+c^{(p)})n+o(n)}$ and $2^{(c_{s}+c^{(p)}+c_{p})n+o(n)}$ , respectively, where $c^{(p)}=\log\left(2+\frac{2}{A\gamma}\right)+c_{p}$ and $c_{p}=\log(1+A)$ (in the Euclidean norm, the parameters are as described in Theorem 4.1).

6 Discussions

In this paper, we have designed new sieving algorithms that work for any $\ell_{p}$ norm. A comparative performance evaluation has been given in Table 1. For every $1\leq p\leq\infty$ , we achieve a better time complexity than previous provable sieving algorithms, at the cost of a worse space complexity. The one exception is the Discrete Gaussian-based sieving algorithm of [20], which has better space and time complexity in the Euclidean norm. To the best of our knowledge, this algorithm does not work for any other norm.

6.1 Future work

An obvious direction for further research would be to design heuristic algorithms based on this kind of sieving technique and to study whether they can be adapted to other computing environments, such as parallel computing.

The major difference between our algorithm and others such as [21, 47] is in the choice of the shape of the sub-regions into which we divide the ambient space (as has already been explained before). Due to this, we get superior “decodability”, in the sense that a vector can be efficiently mapped to a sub-region, at the cost of inferior space complexity, as described before. It might be interesting to study what other shapes of sub-regions could be considered and what trade-offs they offer.

It might be possible to improve the bound on the number of hypercubes required to cover the hyperball. At least in the $\ell_{\infty}$ norm we have seen that the number of hypercubes may depend on the initial position of the smaller hypercube, whose translates cover the bigger hyperball. In fact it might be possible to get some lower bound on the complexity of this kind of approach.

Acknowledgement

The author would like to acknowledge the anonymous reviewers for their helpful comments that have helped to improve the manuscript significantly. Research at IQC was supported in part by the Government of Canada through Innovation, Science and Economic Development Canada, Public Works and Government Services Canada and Canada First Research Excellence Fund.

Bibliography


[1] Lenstra, A.K.; Lenstra, H.W., Jr.; Lovász, L. Factoring polynomials with rational coefficients. Math. Ann. 1982, 261, 515–534.
[2] Lenstra, H.W., Jr. Integer programming with a fixed number of variables. Math. Oper. Res. 1983, 8, 538–548.
[3] Kannan, R. Minkowski’s convex body theorem and integer programming. Math. Oper. Res. 1987, 12, 415–440.
[4] Dadush, D.; Peikert, C.; Vempala, S. Enumerative lattice algorithms in any norm via m-ellipsoid coverings. In Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, Palm Springs, CA, USA, 22–25 October 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 580–589.
[5] Eisenbrand, F.; Hähnle, N.; Niemeier, M. Covering cubes and the closest vector problem. In Proceedings of the Twenty-Seventh Annual Symposium on Computational Geometry, Paris, France, 13–15 June 2011; ACM: New York, NY, USA, 2011; pp. 417–423.
[6] Odlyzko, A.M. The rise and fall of knapsack cryptosystems. Cryptol. Comput. Number Theory 1990, 42, 75–88.
[7] Joux, A.; Stern, J. Lattice reduction: A toolbox for the cryptanalyst. J. Cryptol. 1998, 11, 161–185.
[8] Nguyen, P.Q.; Stern, J. The two faces of lattices in cryptology. In Cryptography and Lattices; Springer: Berlin/Heidelberg, Germany, 2001; pp. 146–180.
