Generalization of the Ball-Collision Algorithm

Carmelo Interlando; Karan Khathuria; Nicole Rohrer; Joachim Rosenthal; Violetta Weger

arXiv:1812.10955·cs.IT·December 31, 2018

TL;DR

This paper extends the Ball-Collision Algorithm from binary fields to general finite fields, providing a complexity analysis and comparison with other decoding algorithms.

Contribution

It generalizes the Ball-Collision Algorithm to finite fields beyond binary, offering new insights into its complexity and performance.

Findings

01 Algorithm successfully generalized to finite fields

02 Complexity analysis provided and compared with existing algorithms

03 Shows potential advantages in non-binary decoding scenarios

Abstract

In this paper we generalize the Ball-Collision Algorithm by Bernstein, Lange, Peters from the binary field to a general finite field. We also provide a complexity analysis and compare the asymptotic complexity to other generalized information set decoding algorithms.

Tables

Table 1. Comparison of the asymptotic complexities over $\mathbb{F}_{q}$. The values for $q$-Stern and $q$-Stern-MO are from [17, Table 1], and those for $q$-BJMM-MO are from [16, Table 3].

$q$	$q$-Stern	$q$-Stern-MO	$q$-BJMM-MO	$q$-Ball-collision
2	$0.05563$	$0.05498$	$0.04730$	$0.055573$
3	$0.05217$	$0.05242$	$0.04427$	$0.052145$
4	$0.04987$	$0.05032$	$0.04294$	$0.049846$
5	$0.04815$	$0.04864$	$0.03955$	$0.048140$
7	$0.04571$	$0.04614$	$0.03706$	$0.045697$
8	$0.04478$	$0.04519$	$0.03593$	$0.044770$
11	$0.04266$	$0.04299$	$0.03335$	$0.042656$

Equations

$$(UH=)\;H=\begin{bmatrix}A_{1}&\mathbf{1}_{\ell_{1}+\ell_{2}}&0\\ A_{2}&0&\mathbf{1}_{\ell_{3}}\end{bmatrix},\qquad(Us=)\;s=\begin{bmatrix}s_{1}\\ s_{2}\end{bmatrix}.$$

$$\binom{k}{p}\binom{n}{t}^{-1}.$$

$$L(n,t)=\sum_{i=1}^{t}\binom{n}{i}.$$

$$\bar{L}(n,t)=\sum_{i=1}^{t}\binom{n}{i}(q-1)^{i}.$$

$$UHe^{\top}=\begin{bmatrix}A_{1}&\mathbf{1}_{\ell_{1}+\ell_{2}}&0\\ A_{2}&0&\mathbf{1}_{\ell_{3}}\end{bmatrix}\begin{bmatrix}e_{1}\\ e_{2}\\ e_{3}\end{bmatrix}=\begin{bmatrix}s_{1}\\ s_{2}\end{bmatrix},$$

$$\begin{aligned}A_{1}e_{1}+e_{2}&=s_{1},\\ A_{2}e_{1}+e_{3}&=s_{2}.\end{aligned}$$

$$\begin{aligned}e_{1}&=\pi_{I}(x_{1}+x_{2}),\\ e_{2}&=\pi_{Y_{1}\cup Y_{2}}(y_{1}+y_{2}),\\ e_{3}&=-A_{2}(\pi_{I}(x_{1}+x_{2}))+s_{2},\end{aligned}$$

$$A_{1}(\pi_{I}(x_{1}))+\pi_{Y_{1}\cup Y_{2}}(y_{1})=s_{1}-A_{1}(\pi_{I}(x_{2}))-\pi_{Y_{1}\cup Y_{2}}(y_{2}).$$

$$\begin{aligned}UHe^{\top}&=\begin{bmatrix}A_{1}(\pi_{I}(x_{1}+x_{2}))+\pi_{Y_{1}\cup Y_{2}}(y_{1}+y_{2})\\ A_{2}(\pi_{I}(x_{1}+x_{2}))-A_{2}(\pi_{I}(x_{1}+x_{2}))+s_{2}\end{bmatrix}\\ &=\begin{bmatrix}A_{1}(\pi_{I}(x_{1}+x_{2}))+\pi_{Y_{1}\cup Y_{2}}(y_{1}+y_{2})\\ s_{2}\end{bmatrix}.\end{aligned}$$

$$A_{1}(\pi_{I}(x_{1}))+\pi_{Y_{1}\cup Y_{2}}(y_{1})=-A_{1}(\pi_{I}(x_{2}))+s_{1}-\pi_{Y_{1}\cup Y_{2}}(y_{2})$$

$$\binom{n}{t}^{-1}\binom{\ell_{3}}{t-p_{1}-p_{2}-q_{1}-q_{2}}\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}.$$

$$\begin{aligned}&(q-1)(\ell_{1}+\ell_{2})k_{1}\mathcal{M}+(\bar{L}(k_{1},p_{1})-k_{1}(q-1))(\ell_{1}+\ell_{2})\mathcal{A}\\ &+\binom{k_{1}}{p_{1}}\bar{L}(\ell_{1},q_{1})(q-1)^{p_{1}}\mathcal{A}.\end{aligned}$$

$$\begin{aligned}&(q-1)(\ell_{1}+\ell_{2})k_{2}(\mathcal{M}+\mathcal{A})+(\bar{L}(k_{2},p_{2})-k_{2}(q-1))(\ell_{1}+\ell_{2})\mathcal{A}\\ &+\binom{k_{2}}{p_{2}}\bar{L}(\ell_{2},q_{2})(q-1)^{p_{2}}\mathcal{A}.\end{aligned}$$

$$\frac{\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}(q-1)^{p_{1}+p_{2}+q_{1}+q_{2}}}{q^{\ell_{1}+\ell_{2}}}.$$

$$\frac{q}{q-1}(t-p_{1}-p_{2}-q_{1}-q_{2}+1)((p_{1}+p_{2}+1)\mathcal{A}+(p_{1}+p_{2})\mathcal{M}).$$

$$\begin{aligned}&(n-k)(n+1)(n-k-1)(\mathcal{A}+\mathcal{M})\\ &+(\ell_{1}+\ell_{2})\big[(q-1)((k_{1}+k_{2})\mathcal{M}+k_{2}\mathcal{A})\\ &\qquad+(\bar{L}(k_{1},p_{1})-k_{1}(q-1))\mathcal{A}+(\bar{L}(k_{2},p_{2})-k_{2}(q-1))\mathcal{A}\big]\\ &+\binom{k_{1}}{p_{1}}\bar{L}(\ell_{1},q_{1})(q-1)^{p_{1}}\mathcal{A}+\binom{k_{2}}{p_{2}}\bar{L}(\ell_{2},q_{2})(q-1)^{p_{2}}\mathcal{A}\\ &+\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}(q-1)^{p_{1}+p_{2}+q_{1}+q_{2}}q^{-(\ell_{1}+\ell_{2})}\\ &\qquad\cdot\frac{q}{q-1}(t-p_{1}-p_{2}-q_{1}-q_{2}+1)((p_{1}+p_{2}+1)\mathcal{A}+(p_{1}+p_{2})\mathcal{M}).\end{aligned}$$

$$\begin{aligned}&\binom{n}{t}\left(\binom{\ell_{3}}{t-p_{1}-p_{2}-q_{1}-q_{2}}\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}\right)^{-1}\\ &\cdot\Big[(n-k)(n+1)(n-k-1)(\mathcal{A}+\mathcal{M})\\ &\quad+(\ell_{1}+\ell_{2})\big[(q-1)((k_{1}+k_{2})\mathcal{M}+k_{2}\mathcal{A})\\ &\qquad+(\bar{L}(k_{1},p_{1})-k_{1}(q-1))\mathcal{A}+(\bar{L}(k_{2},p_{2})-k_{2}(q-1))\mathcal{A}\big]\\ &\quad+\binom{k_{1}}{p_{1}}\bar{L}(\ell_{1},q_{1})(q-1)^{p_{1}}\mathcal{A}+\binom{k_{2}}{p_{2}}\bar{L}(\ell_{2},q_{2})(q-1)^{p_{2}}\mathcal{A}\\ &\quad+\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}(q-1)^{p_{1}+p_{2}+q_{1}+q_{2}}q^{-(\ell_{1}+\ell_{2})}\\ &\qquad\cdot\frac{q}{q-1}(t-p_{1}-p_{2}-q_{1}-q_{2}+1)((p_{1}+p_{2}+1)\mathcal{A}+(p_{1}+p_{2})\mathcal{M})\Big].\end{aligned}$$

$$-T\log_{q}(T)-(1-T)\log_{q}(1-T)\leq 1-R<1.$$

$$0\leq T-2P-2Q\leq 1-R-2L.$$

$$\lim_{n\to\infty}\frac{1}{n}\log_{q}\binom{(\alpha+o(1))n}{(\beta+o(1))n}=\alpha\log_{q}(\alpha)-\beta\log_{q}(\beta)-(\alpha-\beta)\log_{q}(\alpha-\beta).$$

$$S(P,Q,L)$$

Full text

Generalization of the Ball-Collision Algorithm

Carmelo Interlando, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182-7720

Karan Khathuria, Institute of Mathematics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland

Nicole Rohrer, Institute of Mathematics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland

Joachim Rosenthal, Institute of Mathematics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland

Violetta Weger, Institute of Mathematics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland

Abstract.

In this paper we generalize the Ball-Collision Algorithm by Bernstein, Lange, Peters from the binary field to a general finite field. We also provide a complexity analysis and compare the asymptotic complexity to other generalized information set decoding algorithms.

Key words and phrases:

Coding Theory; ISD; Ball-Collision.

2010 Mathematics Subject Classification:

The fourth author is thankful to Swiss National Science Foundation grant number 169510.

1. Introduction

Since 1978 it has been known that decoding a random linear code is an NP-complete problem; this was shown by Berlekamp, McEliece and van Tilborg in [7]. This raises the task of determining the complexity of decoding a random linear code with the best algorithms available. To date, two main methods for decoding have been proposed: information set decoding (ISD) and the generalized birthday algorithm (GBA). ISD is more efficient if the decoding problem has only a small number of solutions, whereas GBA is efficient when there are many solutions. Other ideas, such as statistical decoding [1], gradient decoding [2] and supercode decoding [5], have also been proposed but fail to outperform ISD algorithms. An ISD algorithm is given a corrupted codeword and recovers the message or, equivalently, finds the error vector. ISD algorithms are often formulated via the parity check matrix, since it is enough to find a vector of a certain weight which has the same syndrome as the corrupted codeword; this problem is also referred to as the syndrome decoding problem. ISD algorithms are based on a decoding algorithm proposed by Prange [29] in 1962, and their structure has not changed much from the original: as a first step one chooses an information set, then Gaussian elimination brings the parity check matrix into a standard form. Assuming that the errors lie outside of the information set, these row operations applied to the syndrome reveal the error vector, provided its weight does not exceed the given error correction capacity.

The problem of decoding random linear codes has recently gained prominence with the proposal of using code-based public key cryptosystems for an upcoming post-quantum cryptographic public key standard. The idea of using linear codes in public key cryptography was first formulated by Robert McEliece [25]. Since McEliece's publication a large amount of research has been done; the interested reader will find more information in a recent survey [9].

If the secret code is hidden well enough, an adversary who wants to break a code-based cryptosystem encounters the decoding problem of a random linear code. It is therefore of crucial importance to understand the complexity of the best algorithms capable of decoding a general linear code.

The ISD algorithms were often considered when proposing a variant of the McEliece cryptosystem, to find the key size needed for a given security level. ISD algorithms hence do not break a code-based cryptosystem but they determine the choice of secure parameters. Since some of the new proposals (for example [3, 4, 19]) involve codes over general finite fields, having efficient ISD algorithms generalized to $\mathbb{F}_{q}$ is an essential problem.

Bernstein, Lange and Peters found a clever improvement of the ISD algorithm which they called ball-collision decoding [8]. The algorithm of Bernstein et al. was presented for random binary linear codes. The main contribution of our paper is a generalization of the ball-collision decoding algorithm to arbitrary finite fields.

The paper is structured as follows: in Section 2 we discuss the previous work on ISD algorithms focusing on those which have been generalized to an arbitrary finite field. In Section 3 we describe the ball-collision algorithm over the binary field and the notations and concepts involved in the algorithm. In Section 4 we present the ball-collision algorithm over $\mathbb{F}_{q}$ and in Section 5 we perform the complexity analysis of our algorithm including numerical parameter optimization and asymptotic analysis.

2. Related work

Many improvements have been suggested to Prange’s simplest form of ISD (see for example [10, 12, 13, 14, 20, 22, 31]). They can be split into two types: improvements of the Gaussian elimination step, and more probable and more elaborate weight distributions of the error vector. The former includes the work of Canteaut and Chabaud [11], who show that after an unsuccessful iteration the information set should not be chosen at random again; rather, a part of the previous information set should be reused, so that a part of the Gaussian elimination step is already performed. Finiasz and Sendrier [15] showed that a complete Gaussian elimination is not necessary. Both of these improvements bring down the cost of the Gaussian elimination step.

Now we focus on the second type of improvements, which were first proposed for codes over the binary field and later generalized to an arbitrary finite field. The first improvement of Prange’s ISD was by Lee and Brickell [21] in 1988, who assume $p$ errors in the information set and $t-p$ errors outside. In 1993 Stern [30] proposed to partition the information set into two sets and to ask for $p$ errors in each part and $t-2p$ errors outside the information set. The generalization of both Lee-Brickell’s and Stern’s algorithms to a general finite field $\mathbb{F}_{q}$ was performed by Peters [28] in 2010.

Niebuhr, Persichetti, Cayrel, Bulygin and Buchmann [27] in 2010 improved the performance of ISD algorithms over $\mathbb{F}_{q}$ based on the idea of Finiasz-Sendrier [15] to allow the errors to overlap in the information set.

In the past 10 years many other improvements were proposed for ISD over $\mathbb{F}_{2}$. Among them is the ball-collision algorithm by Bernstein, Lange and Peters [8] from 2011, which splits the information set into two sets containing $p_{1}$ and $p_{2}$ errors, and also splits the remaining positions into three disjoint sets containing $q_{1}$, $q_{2}$ and $t-p_{1}-p_{2}-q_{1}-q_{2}$ errors respectively. The algorithm’s name comes from a collision check, which forms the most crucial part of the algorithm.

Later in 2011 May, Meurer and Thomae [23] proposed an improvement using the representation technique introduced by Howgrave-Graham and Joux [18]. To this algorithm Becker, Joux, May and Meurer [6] (BJMM) in 2012 introduced further improvements. In the same year Meurer in his dissertation [26] proposed a new generalized ISD algorithm based on these two papers.

In 2015, May-Ozerov [24] used the nearest neighbor algorithm to improve the BJMM version of ISD. In 2016, Hirose [17] generalized the nearest neighbor algorithm over $\mathbb{F}_{q}$ and applied it to the generalized Stern algorithm. Later in 2017, this was applied to generalized BJMM algorithm by Gueye, Klamti and Hirose [16].

In this paper we provide the missing generalization of the ball-collision algorithm. The order of the complexities of ISD algorithms over $\mathbb{F}_{2}$ is consistent also with their generalizations over $\mathbb{F}_{q}$ .

3. Preliminaries

3.1. Notation

We first want to fix some notation: let $q$ be a prime power and let $n,k,t\in\mathbb{N}$ be positive integers, such that $k,t<n$ . We will denote by $\mathbf{1}_{n}$ the $n\times n$ identity matrix.

For an $m\times n$ matrix $A$ and a set $S\subseteq\{1,...,n\}$ of size $k$ , we denote by $A_{S}$ the $m\times k$ matrix consisting of the columns of $A$ indexed by $S$ .

For a set $S\subseteq\{1,...,n\}$ of size $k$ , we denote by $\mathbb{F}_{q}^{n}(S)$ the vectors in $\mathbb{F}_{q}^{n}$ having support in $S$ . The projection of $x\in\mathbb{F}_{q}^{n}(S)$ to $\mathbb{F}_{q}^{k}$ is then canonical and denoted by $\pi_{S}(x)$ .

On the other hand we denote by $\sigma_{S}$ the canonical embedding of a vector $x\in\mathbb{F}_{q}^{k}$ into $\mathbb{F}_{q}^{n}(S)$ , where $S\subseteq\{1,...,n\}$ is again of size $k$ .
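The projection $\pi_{S}$ and the embedding $\sigma_{S}$ can be illustrated with a few lines of Python. This is only a sketch: the function names `project` and `embed` are hypothetical, vectors are plain lists of integers, and we assume that "canonical" means the positions of $S$ are taken in increasing order.

```python
def project(x, S):
    """pi_S: map a length-n vector with support in S to the vector of its entries on S.

    Positions are 1-indexed, as in the text; S is a set of positions.
    """
    # The projection is only defined on vectors that vanish outside S.
    assert all(v == 0 for i, v in enumerate(x, start=1) if i not in S)
    return [x[i - 1] for i in sorted(S)]


def embed(v, S, n):
    """sigma_S: place the entries of v on the positions in S of a zero vector of length n."""
    positions = sorted(S)
    assert len(v) == len(positions)
    x = [0] * n
    for i, value in zip(positions, v):
        x[i - 1] = value
    return x


# Example over F_3 with n = 5 and S = {2, 5}.
S = {2, 5}
x = embed([2, 1], S, 5)   # [0, 2, 0, 0, 1]
v = project(x, S)         # [2, 1]
```

Under these assumptions the two maps are inverse to each other on $\mathbb{F}_{q}^{n}(S)$.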

For an $[n,k]$ linear code $\mathcal{C}$ over $\mathbb{F}_{q}$ we denote by $H$ the parity check matrix of size $(n-k)\times n$ and by $G$ the $k\times n$ generator matrix. We denote the Hamming weight of a vector $x\in\mathbb{F}_{q}^{n}$ by $w(x)$. The corrupted codeword $c\in\mathbb{F}_{q}^{n}$ is given by $c=mG+e$, where $m\in\mathbb{F}_{q}^{k}$ is the message and $e\in\mathbb{F}_{q}^{n}$ is the error vector. The syndrome of $c$ is then defined as $s=Hc^{\top}$ and coincides with the syndrome of the error vector, since $Hc^{\top}=H(mG+e)^{\top}=HG^{\top}m^{\top}+He^{\top}=He^{\top}$.

3.2. Ball-collision algorithm over the binary field

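In what follows we describe the ball-collision algorithm over the binary field proposed in [8] by Bernstein, Lange and Peters.

Remark 1.

Note that if $H$ is already in standard form, then $I=\{1,...,k\}$ and $U=\mathbf{1}_{n-k}$. In this case $H$ and $s$ can be written as

$$(UH=)\;H=\begin{bmatrix}A_{1}&\mathbf{1}_{\ell_{1}+\ell_{2}}&0\\ A_{2}&0&\mathbf{1}_{\ell_{3}}\end{bmatrix},\qquad(Us=)\;s=\begin{bmatrix}s_{1}\\ s_{2}\end{bmatrix}.$$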

3.3. Concepts

There are a few concepts for computing the complexity of the ball-collision algorithm introduced in [8] that we will use and present beforehand. In general the complexity of an ISD attack consists of the cost of one iteration times the expected number of iterations. The cost in the following refers to operations, i.e. additions or multiplications, over the given field.

The success probability over the binary is usually given by having chosen the correct weight distribution of the error vector. For example let the error vector be of length $n$ having weight $t$ , now we assume that the error vector has weight $p$ in the information set, i.e. in $k$ bits and the rest is redundant, then the success probability is given by

$$\binom{k}{p}\binom{n}{t}^{-1}.$$

This will not change over $\mathbb{F}_{q}$ , since the algorithm runs through all elements in the finite field having support in those chosen sets.

The concept of intermediate sums is important whenever one wants to compute something for all vectors in a certain space. For example we are given a $k\times n$ matrix $A$ and want to compute $Ax^{\top}$ for all $x\in\mathbb{F}_{2}^{n}$ , of weight $t$ . This would usually cost $k$ times $t-1$ additions and $t$ multiplications, for each $x\in\mathbb{F}_{2}^{n}$ . But if we first compute $Ax^{\top}$ , where $x$ has weight one, this only outputs the corresponding column of $A$ and has no cost. From there we can compute the sums of two columns of $A$ , there are $\binom{n}{2}$ many of these sums and each one costs $k$ additions. From there we can compute all sums of three columns of $A$ , which are $\binom{n}{3}$ many and using the sums of two columns we have already computed, means we only need to add one more column costing $k$ additions. Proceeding in this way, until one reaches the weight $t$ , to compute $Ax^{\top}$ for all $x\in\mathbb{F}_{2}^{n}$ , of weight $t$ costs $k\cdot(L(n,t)-n)$ additions, where

$$L(n,t)=\sum_{i=1}^{t}\binom{n}{i}.$$

This changes slightly over a general finite field. As a first step one computes $Ax^{\top}$ for all $x\in\mathbb{F}_{q}^{n}$ of weight $1$. This step is no longer free, since it means computing $A\lambda$ for all $\lambda\in\mathbb{F}_{q}^{\star}$, costing $(q-1)kn$ multiplications. From there one computes the sums of two multiples of the columns; there are $\binom{n}{2}(q-1)^{2}$ of them and each sum costs $k$ additions. Proceeding in the same manner, the cost turns out to be $(q-1)kn$ multiplications and $(\bar{L}(n,t)-n(q-1))k$ additions, where

$$\bar{L}(n,t)=\sum_{i=1}^{t}\binom{n}{i}(q-1)^{i}.$$
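The operation counts for intermediate sums can be evaluated directly. The following Python sketch assumes the definitions $L(n,t)=\sum_{i=1}^{t}\binom{n}{i}$ and $\bar{L}(n,t)=\sum_{i=1}^{t}\binom{n}{i}(q-1)^{i}$ and simply evaluates the counts stated in the text; the function names are hypothetical.

```python
from math import comb


def L_binary(n, t):
    """L(n, t): number of nonzero binary vectors of length n and weight at most t."""
    return sum(comb(n, i) for i in range(1, t + 1))


def L_bar(n, t, q):
    """L-bar(n, t): number of nonzero vectors over F_q of length n and weight at most t."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(1, t + 1))


def intermediate_sum_cost(k, n, t, q):
    """Return (multiplications, additions) for a k x n matrix A over F_q,
    using the counts (q-1)kn and (L-bar(n,t) - n(q-1))k from the text."""
    multiplications = (q - 1) * k * n
    additions = (L_bar(n, t, q) - n * (q - 1)) * k
    return multiplications, additions


# Example: a 2 x 4 matrix over F_3, all weights up to t = 2.
print(L_bar(4, 2, 3))                    # 32
print(intermediate_sum_cost(2, 4, 2, 3)) # (16, 48)
```

For $q=2$ the sum $\bar{L}(n,t)$ reduces to $L(n,t)$, and the addition count agrees with the binary count $k\cdot(L(n,t)-n)$.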

The next concept, called early abort, is important whenever a computation is done while checking the weight of the result. For example, one wants to compute $x+y$, where $x,y\in\mathbb{F}_{2}^{n}$, which usually costs $n$ additions, but we only proceed in the algorithm if $w(x+y)=t$. Hence we compute the sum and check the weight simultaneously, and if the weight of the partial solution exceeds $t$ one does not need to continue. Over the binary field one expects a randomly chosen bit to be $1$ with probability $\frac{1}{2}$; hence after $2t$ bits we should reach the wanted weight $t$, and after $2(t+1)$ bits we should exceed it. On average we therefore expect to compute only $2(t+1)$ bits of the solution before we can abort. Over $\mathbb{F}_{q}$, we expect a randomly chosen entry to be nonzero with probability $\frac{q-1}{q}$, therefore we need to compute $\frac{q}{q-1}(t+1)$ entries before we can abort.
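A minimal Python sketch of early abort follows. It assumes that $q$ is prime, so that addition in $\mathbb{F}_{q}$ is integer addition modulo $q$; the function names are hypothetical and not taken from [8].

```python
def add_with_early_abort(x, y, t, q):
    """Add x and y over F_q (q prime) entry by entry and stop once the weight exceeds t.

    Returns (weight_is_t, entries_computed).
    """
    weight = 0
    for computed, (a, b) in enumerate(zip(x, y), start=1):
        if (a + b) % q != 0:
            weight += 1
            if weight > t:
                # The sum can no longer have weight t, so the remaining entries are skipped.
                return False, computed
    return weight == t, len(x)


def expected_entries_before_abort(t, q):
    """Heuristic from the text: q/(q-1) * (t+1) entries are computed on average."""
    return q / (q - 1) * (t + 1)


# Over F_3: the weight exceeds t = 2 at the third entry, so only 3 of 5 entries are computed.
print(add_with_early_abort([1, 1, 1, 1, 0], [1, 1, 1, 1, 0], 2, 3))  # (False, 3)
print(expected_entries_before_abort(2, 3))                            # 4.5
```

The saving comes from the skipped entries: a candidate that fails the weight check is discarded after roughly $\frac{q}{q-1}(t+1)$ additions instead of $n$.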

An important step in the ball-collision algorithm is to check for a collision, i.e. one continues only if $Ax^{\top}=By^{\top}$, where again $A,B\in\mathbb{F}_{2}^{k\times n}$ and $x,y$ lie in some sets $S$ and $T$ respectively. There are $|S|\cdot|T|$ choices for $(x,y)$. Assuming that the values $Ax^{\top}$ and $By^{\top}$ are distributed uniformly over $\mathbb{F}_{2}^{k}$, the space in which they are compared, one expects on average $|S|\cdot|T|\,2^{-k}$ collisions. Similarly, over $\mathbb{F}_{q}$ the expected number of collisions is $|S|\cdot|T|\,q^{-k}$.
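The text does not prescribe a data structure for the collision check. One common choice is a hash table, sketched below in Python under that assumption; `find_collisions` and `expected_collisions` are hypothetical names, and `length` denotes the length of the vectors that are compared.

```python
from collections import defaultdict


def find_collisions(S, T):
    """S and T are lists of (value, label) pairs, where value is a tuple over F_q.

    Returns all pairs (label_s, label_t) whose values coincide.
    """
    buckets = defaultdict(list)
    for value, label in S:
        buckets[tuple(value)].append(label)
    return [(ls, lt) for value, lt in T for ls in buckets.get(tuple(value), [])]


def expected_collisions(size_S, size_T, q, length):
    """Average number of collisions if the compared values are uniform in F_q^length."""
    return size_S * size_T / q ** length


S = [((1, 0), "a"), ((2, 2), "b")]
T = [((2, 2), "c"), ((0, 1), "d"), ((1, 0), "e")]
print(find_collisions(S, T))         # [('b', 'c'), ('a', 'e')]
print(expected_collisions(9, 9, 3, 2))  # 9.0
```

With a hash table the check costs time roughly linear in $|S|+|T|$ plus the number of collisions, rather than $|S|\cdot|T|$ comparisons.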

4. Generalization of the Ball-Collision Algorithm

In this section we generalize the ball-collision algorithm over the binary field [8] to a general finite field.

The algorithm requires a parity check matrix $H$ . Notice that if the generator matrix $G$ is published, the easiest way to get $H$ is to choose an information set $I$ and to compute $\tilde{G}:=G_{I}^{-1}G$ .
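As an illustration, the following Python sketch derives a parity check matrix from a generator matrix that is already in systematic form. It rests on assumptions not stated in the text: the information set is $I=\{1,\dots,k\}$, so that $\tilde{G}=[\mathbf{1}_{k}\mid A]$, and $q=p$ is prime, so that arithmetic is modulo $p$. It uses the standard fact that $H=[-A^{\top}\mid\mathbf{1}_{n-k}]$ then satisfies $\tilde{G}H^{\top}=0$. The function names are hypothetical.

```python
def parity_check_from_systematic(G, p):
    """Given G = [1_k | A] over F_p as a list of rows, return H = [-A^T | 1_{n-k}]."""
    k, n = len(G), len(G[0])
    r = n - k
    H = []
    for j in range(r):
        row = [(-G[i][k + j]) % p for i in range(k)]   # j-th column of A, negated
        row += [1 if c == j else 0 for c in range(r)]  # j-th row of the identity
        H.append(row)
    return H


def syndrome(H, c, p):
    """Compute H c^T over F_p."""
    return [sum(h * x for h, x in zip(row, c)) % p for row in H]


# Example over F_3 with n = 4 and k = 2.
G = [[1, 0, 1, 2],
     [0, 1, 2, 2]]
H = parity_check_from_systematic(G, 3)   # [[2, 1, 1, 0], [1, 1, 0, 1]]

# The corrupted codeword c = mG + e has the same syndrome as the error e.
c = [1, 2, 0, 0]   # m = (1, 2), e = (0, 0, 1, 0)
e = [0, 0, 1, 0]
print(syndrome(H, c, 3), syndrome(H, e, 3))   # [1, 0] [1, 0]
```

For a general information set one would first permute the columns, and for a non-prime $q$ the modular arithmetic would have to be replaced by proper field arithmetic.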

Again, as in the binary case, the idea of the algorithm is to solve $UHe^{\top}=Us$ instead of $He^{\top}=s$ , where an invertible $U$ is chosen such that $UH=\left[\begin{array}[]{cc}A&\mathbf{1}_{n-k}\end{array}\right]$ and $Us=\left[\begin{array}[]{c}s_{1}\\ s_{2}\end{array}\right]$ with $s_{1}\in\mathbb{F}_{q}^{\ell_{1}+\ell_{2}},\,s_{2}\in\mathbb{F}_{q}^{\ell_{3}}$ . We are therefore looking for a vector $e\in\mathbb{F}_{q}^{n}$ fulfilling

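$$UHe^{\top}=\begin{bmatrix}A_{1}&\mathbf{1}_{\ell_{1}+\ell_{2}}&0\\ A_{2}&0&\mathbf{1}_{\ell_{3}}\end{bmatrix}\begin{bmatrix}e_{1}\\ e_{2}\\ e_{3}\end{bmatrix}=\begin{bmatrix}s_{1}\\ s_{2}\end{bmatrix},$$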

with $e_{1}\in\mathbb{F}_{q}^{k},\,e_{2}\in\mathbb{F}_{q}^{\ell_{1}+\ell_{2}},\,e_{3}\in\mathbb{F}_{q}^{\ell_{3}}$ . This leads to the following system of equations:

$$\begin{aligned}A_{1}e_{1}+e_{2}&=s_{1},\\ A_{2}e_{1}+e_{3}&=s_{2}.\end{aligned}$$

The algorithm solves the above by finding

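$$\begin{aligned}e_{1}&=\pi_{I}(x_{1}+x_{2}),\\ e_{2}&=\pi_{Y_{1}\cup Y_{2}}(y_{1}+y_{2}),\\ e_{3}&=-A_{2}(\pi_{I}(x_{1}+x_{2}))+s_{2},\end{aligned}$$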

such that

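$$A_{1}(\pi_{I}(x_{1}))+\pi_{Y_{1}\cup Y_{2}}(y_{1})=s_{1}-A_{1}(\pi_{I}(x_{2}))-\pi_{Y_{1}\cup Y_{2}}(y_{2}).$$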

This last condition is fulfilled by the collision between $S$ and $T$ in Step 15.

Observe that for $q=2$ the above algorithm is equivalent to the one proposed over the binary. We hence did not change it in its substantial form.

We now want to prove that the ball-collision algorithm over $\mathbb{F}_{q}$ works, i.e. that it returns any vector $e$ of the desired form, if it exists. For this we follow the idea of [8].

Theorem 2.

The ball-collision algorithm over $\mathbb{F}_{q}$ finds any vector $e$ that fulfills $UHe^{\top}=Us$ and is of the desired form - of weight $t$ , with $p_{1},p_{2},q_{1},q_{2}$ and $t-p_{1}-p_{2}-q_{1}-q_{2}$ nonzero entries in $X_{1},X_{2},Y_{1},Y_{2}$ and $Y_{3}$ respectively.

Proof.

First, we prove that the output $e$ is of the desired form:

• $x_{1}$ is of weight $p_{1}$ and in $\mathbb{F}_{q}^{n}(X_{1})$,

• $x_{2}$ is of weight $p_{2}$ and in $\mathbb{F}_{q}^{n}(X_{2})$,

• $y_{1}$ is of weight $q_{1}$ and in $\mathbb{F}_{q}^{n}(Y_{1})$,

• $y_{2}$ is of weight $q_{2}$ and in $\mathbb{F}_{q}^{n}(Y_{2})$,

• $w(-A_{2}(\pi_{I}(x_{1}+x_{2}))+s_{2})=w(A_{2}(\pi_{I}(x_{1}+x_{2}))-s_{2})=t-p_{1}-p_{2}-q_{1}-q_{2}$ and it lies in $\mathbb{F}_{q}^{n}(Y_{3})$.

As the above subspaces do not intersect, $w(e)$ can be calculated by adding up the weights of each of them. Hence $w(e)=t$ and each of the subspaces has the desired weight distribution by definition.

It remains to prove that $UHe^{\top}=Us$ . Let us write each of the subspaces $\mathbb{F}_{q}^{n}(I),\mathbb{F}_{q}^{n}(Y_{1}\cup Y_{2})$ and $\mathbb{F}_{q}^{n}(Y_{3})$ separately.

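$$\begin{aligned}UHe^{\top}&=\begin{bmatrix}A_{1}(\pi_{I}(x_{1}+x_{2}))+\pi_{Y_{1}\cup Y_{2}}(y_{1}+y_{2})\\ A_{2}(\pi_{I}(x_{1}+x_{2}))-A_{2}(\pi_{I}(x_{1}+x_{2}))+s_{2}\end{bmatrix}\\ &=\begin{bmatrix}A_{1}(\pi_{I}(x_{1}+x_{2}))+\pi_{Y_{1}\cup Y_{2}}(y_{1}+y_{2})\\ s_{2}\end{bmatrix}.\end{aligned}$$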

And we know that $A_{1}(\pi_{I}(x_{1}+x_{2}))+\pi_{Y_{1}\cup Y_{2}}(y_{1}+y_{2})=s_{1}$ by the collision of $S$ and $T$ in Step 15.

We now want to prove that the algorithm returns each of the above vectors such that $He^{\top}=s$ under the assumption, that we worked with a correct partitioning into $X_{1},X_{2},Y_{1},Y_{2},Y_{3}$ . We do that by checking whether the algorithm considers all possible combinations and does not exclude any possible solution.

$U$ is invertible and hence does not exclude any solution when multiplied to $H$ and $s$ . In Step 11, where we build the sets $S$ and $T$ , we go over all the possible sets $V_{1},V_{2},W_{1}$ and $W_{2}$ , which contain all possible vectors of the desired weight distribution. There are only two steps in the algorithm, where we exclude certain vectors:

(1) When we only keep the collisions between $S$ and $T$ in Step 15. But this is justified, as $A_{1}e_{1}+e_{2}=s_{1}$, i.e.

$$A_{1}(\pi_{I}(x_{1}))+\pi_{Y_{1}\cup Y_{2}}(y_{1})=-A_{1}(\pi_{I}(x_{2}))+s_{1}-\pi_{Y_{1}\cup Y_{2}}(y_{2})$$

needs to be satisfied.

(2) When we check whether $w(-A_{2}(\pi_{I}(x_{1}+x_{2}))+s_{2})=t-p_{1}-p_{2}-q_{1}-q_{2}$. This is also justified, as $e_{3}\in\mathbb{F}_{q}^{\ell_{3}}$ needs this weight for the weight of $e$ to be $t$.

Hence we consider all possible error vectors that are of the given weight distribution and satisfy $UHe^{\top}=Us$ . ∎

5. Complexity Analysis

In this section we want to analyze the complexity of the extended ball-collision algorithm over $\mathbb{F}_{q}$ . Since the cost will be given in operations over $\mathbb{F}_{q}$ , we will denote by $\mathcal{M}$ the multiplications needed and by $\mathcal{A}$ the amount of additions. Note that one addition over $\mathbb{F}_{q}$ costs $\log_{2}(q)$ bit operations and one multiplication over $\mathbb{F}_{q}$ costs $\log_{2}(q)\log_{2}(\log_{2}(q))\log_{2}(\log_{2}(\log_{2}(q)))$ bit operations.

Success Probability of one Iteration

We follow the idea of [8], as the success probability does not depend on the base field: we have the same success probability over $\mathbb{F}_{q}$ as over $\mathbb{F}_{2}$, since it only depends on choosing the correct partition of the subspaces. The success probability of one iteration equals the probability that there are $p_{i}$ error positions in $X_{i}$ and $q_{i}$ error positions in $Y_{i}$, for $i\in\{1,2\}$, and the remaining ones in $Y_{3}$. If this distribution is assumed correctly, then the algorithm will find the error vector $e$, as it goes over all possible combinations of vectors in each of the mentioned subspaces. Hence the iteration succeeds with a probability of

$$\binom{n}{t}^{-1}\binom{\ell_{3}}{t-p_{1}-p_{2}-q_{1}-q_{2}}\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}.$$
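The success probability of one iteration is a product of binomial coefficients divided by $\binom{n}{t}$, and can be evaluated exactly. The Python sketch below assumes that the five sets partition all positions, i.e. $k_{1}+k_{2}+\ell_{1}+\ell_{2}+\ell_{3}=n$; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def success_probability(n, t, k1, k2, l1, l2, l3, p1, p2, q1, q2):
    """Probability that a weight-t error has p1, p2, q1, q2 nonzero entries in
    X1, X2, Y1, Y2 and the remaining t - p1 - p2 - q1 - q2 in Y3."""
    if k1 + k2 + l1 + l2 + l3 != n:
        raise ValueError("the five sets must partition the n positions")
    rest = t - p1 - p2 - q1 - q2
    if rest < 0:
        return Fraction(0)
    favourable = (comb(k1, p1) * comb(k2, p2) * comb(l1, q1)
                  * comb(l2, q2) * comb(l3, rest))
    return Fraction(favourable, comb(n, t))


# Toy example: n = 5 with one position per set and an error of weight 2.
print(success_probability(5, 2, 1, 1, 1, 1, 1, 1, 1, 0, 0))   # 1/10
```

As a sanity check, summing this quantity over all weight distributions $(p_{1},p_{2},q_{1},q_{2})$ gives $1$, by Vandermonde's identity.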

Cost of one Iteration

In Step 4 of the algorithm, one uses Gaussian elimination to find an invertible matrix $U$ bringing $H$ into systematic form. Since we also need to compute $Us$, we directly perform Gaussian elimination on the matrix $\begin{pmatrix}H\mid&s\end{pmatrix}$, where the vector $s$ is adjoined to $H$ as a column. A crude estimate of the cost of this step is $(n-k)(n+1)(n-k-1)(\mathcal{A}+\mathcal{M})$.

To build the set $S$ we use the concept of intermediate sums over $\mathbb{F}_{q}$ described before. To compute $A_{1}(\pi_{I}(x_{1}))$ for all $x_{1}\in\mathbb{F}_{q}^{n}(V_{1})$ we need $(q-1)(\ell_{1}+\ell_{2})k_{1}$ multiplications and $(\bar{L}(k_{1},p_{1})-k_{1}(q-1))(\ell_{1}+\ell_{2})$ additions. To a fixed $A_{1}(\pi_{I}(x_{1}))$ we then add $\pi_{Y_{1}\cup Y_{2}}(y_{1})$, again using intermediate sums; this costs $\bar{L}(\ell_{1},q_{1})$ additions for each of the $x_{1}\in\mathbb{F}_{q}^{n}(V_{1})$, of which there are $\binom{k_{1}}{p_{1}}(q-1)^{p_{1}}$. This results in a total cost of

$$\begin{aligned}&(q-1)(\ell_{1}+\ell_{2})k_{1}\mathcal{M}+(\bar{L}(k_{1},p_{1})-k_{1}(q-1))(\ell_{1}+\ell_{2})\mathcal{A}\\ &+\binom{k_{1}}{p_{1}}\bar{L}(\ell_{1},q_{1})(q-1)^{p_{1}}\mathcal{A}.\end{aligned}$$

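To build the set $T$ we proceed similarly. The only difference is that $s_{1}$ needs to be added in the first step of the intermediate sums over $\mathbb{F}_{q}$, which adds a cost of $(\ell_{1}+\ell_{2})(q-1)k_{2}$ additions. The total cost of this step is hence given by

$$\begin{aligned}&(q-1)(\ell_{1}+\ell_{2})k_{2}(\mathcal{M}+\mathcal{A})+(\bar{L}(k_{2},p_{2})-k_{2}(q-1))(\ell_{1}+\ell_{2})\mathcal{A}\\ &+\binom{k_{2}}{p_{2}}\bar{L}(\ell_{2},q_{2})(q-1)^{p_{2}}\mathcal{A}.\end{aligned}$$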

In Step 15, when checking for collisions between $S$ and $T$, we want to calculate the number of collisions we can expect on average. The elements in $S$ and $T$ are all of length $\ell_{1}+\ell_{2}$ and hence there is a total of $q^{\ell_{1}+\ell_{2}}$ possible elements. $S$ has $\binom{k_{1}}{p_{1}}\binom{\ell_{1}}{q_{1}}(q-1)^{p_{1}+q_{1}}$ elements and $T$ has $\binom{k_{2}}{p_{2}}\binom{\ell_{2}}{q_{2}}(q-1)^{p_{2}+q_{2}}$ elements. The expected number of collisions is therefore

$$\frac{\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}(q-1)^{p_{1}+p_{2}+q_{1}+q_{2}}}{q^{\ell_{1}+\ell_{2}}}.$$
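The expected number of collisions, $|S|\cdot|T|/q^{\ell_{1}+\ell_{2}}$ with the sizes of $S$ and $T$ given in the text, can be evaluated as follows. This is a small sketch with a hypothetical function name; exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from math import comb


def ball_collision_expected_collisions(q, k1, k2, l1, l2, p1, p2, q1, q2):
    """Expected number of collisions between S and T, whose elements have length l1 + l2."""
    size_S = comb(k1, p1) * comb(l1, q1) * (q - 1) ** (p1 + q1)
    size_T = comb(k2, p2) * comb(l2, q2) * (q - 1) ** (p2 + q2)
    return Fraction(size_S * size_T, q ** (l1 + l2))


# Toy parameters over F_3: |S| = |T| = 8 and 3^2 = 9 possible values.
print(ball_collision_expected_collisions(3, 2, 2, 1, 1, 1, 1, 1, 1))   # 64/9
```

Increasing $\ell_{1}+\ell_{2}$ reduces the number of collisions that have to be checked, at the price of a smaller set $Y_{3}$.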

For each collision, we check whether $w(-A_{2}(\pi_{I}(x_{1}+x_{2}))+s_{2})=t-p_{1}-p_{2}-q_{1}-q_{2}$ is satisfied. For this we use the method of early abort: computing one entry of the result costs $(p_{1}+p_{2}+1)$ additions and $(p_{1}+p_{2})$ multiplications, hence this step costs on average

$$\frac{q}{q-1}(t-p_{1}-p_{2}-q_{1}-q_{2}+1)((p_{1}+p_{2}+1)\mathcal{A}+(p_{1}+p_{2})\mathcal{M}).$$

Hence the total cost of one iteration is given by

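$$\begin{aligned}&(n-k)(n+1)(n-k-1)(\mathcal{A}+\mathcal{M})\\ &+(\ell_{1}+\ell_{2})\big[(q-1)((k_{1}+k_{2})\mathcal{M}+k_{2}\mathcal{A})\\ &\qquad+(\bar{L}(k_{1},p_{1})-k_{1}(q-1))\mathcal{A}+(\bar{L}(k_{2},p_{2})-k_{2}(q-1))\mathcal{A}\big]\\ &+\binom{k_{1}}{p_{1}}\bar{L}(\ell_{1},q_{1})(q-1)^{p_{1}}\mathcal{A}+\binom{k_{2}}{p_{2}}\bar{L}(\ell_{2},q_{2})(q-1)^{p_{2}}\mathcal{A}\\ &+\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}(q-1)^{p_{1}+p_{2}+q_{1}+q_{2}}q^{-(\ell_{1}+\ell_{2})}\\ &\qquad\cdot\frac{q}{q-1}(t-p_{1}-p_{2}-q_{1}-q_{2}+1)((p_{1}+p_{2}+1)\mathcal{A}+(p_{1}+p_{2})\mathcal{M}).\end{aligned}$$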

Overall cost

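Dividing the cost of one iteration by the success probability of one iteration, the overall cost of the ball-collision algorithm over $\mathbb{F}_{q}$ amounts to

$$\begin{aligned}&\binom{n}{t}\left(\binom{\ell_{3}}{t-p_{1}-p_{2}-q_{1}-q_{2}}\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}\right)^{-1}\\ &\cdot\Big[(n-k)(n+1)(n-k-1)(\mathcal{A}+\mathcal{M})\\ &\quad+(\ell_{1}+\ell_{2})\big[(q-1)((k_{1}+k_{2})\mathcal{M}+k_{2}\mathcal{A})\\ &\qquad+(\bar{L}(k_{1},p_{1})-k_{1}(q-1))\mathcal{A}+(\bar{L}(k_{2},p_{2})-k_{2}(q-1))\mathcal{A}\big]\\ &\quad+\binom{k_{1}}{p_{1}}\bar{L}(\ell_{1},q_{1})(q-1)^{p_{1}}\mathcal{A}+\binom{k_{2}}{p_{2}}\bar{L}(\ell_{2},q_{2})(q-1)^{p_{2}}\mathcal{A}\\ &\quad+\binom{k_{1}}{p_{1}}\binom{k_{2}}{p_{2}}\binom{\ell_{1}}{q_{1}}\binom{\ell_{2}}{q_{2}}(q-1)^{p_{1}+p_{2}+q_{1}+q_{2}}q^{-(\ell_{1}+\ell_{2})}\\ &\qquad\cdot\frac{q}{q-1}(t-p_{1}-p_{2}-q_{1}-q_{2}+1)((p_{1}+p_{2}+1)\mathcal{A}+(p_{1}+p_{2})\mathcal{M})\Big].\end{aligned}$$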
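The overall cost can be evaluated numerically for concrete parameters. The Python sketch below adds up the four contributions described in this section (Gaussian elimination, building $S$ and $T$, and the early-abort check for each expected collision) and divides by the success probability. It makes some assumptions for illustration: $k=k_{1}+k_{2}$, $\ell_{3}=n-k-\ell_{1}-\ell_{2}$, and the weights `A` and `M` (cost of one addition and one multiplication) default to $1$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def L_bar(n, t, q):
    """Number of nonzero vectors over F_q of length n and weight at most t."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(1, t + 1))


def iteration_cost(q, n, t, k1, k2, l1, l2, p1, p2, q1, q2, A=1, M=1):
    """Cost of one iteration, with additions weighted by A and multiplications by M."""
    k, l = k1 + k2, l1 + l2
    gauss = (n - k) * (n + 1) * (n - k - 1) * (A + M)
    sums = l * ((q - 1) * ((k1 + k2) * M + k2 * A)
                + (L_bar(k1, p1, q) - k1 * (q - 1)) * A
                + (L_bar(k2, p2, q) - k2 * (q - 1)) * A)
    balls = (comb(k1, p1) * L_bar(l1, q1, q) * (q - 1) ** p1 * A
             + comb(k2, p2) * L_bar(l2, q2, q) * (q - 1) ** p2 * A)
    collisions = Fraction(comb(k1, p1) * comb(k2, p2) * comb(l1, q1) * comb(l2, q2)
                          * (q - 1) ** (p1 + p2 + q1 + q2), q ** l)
    check = (Fraction(q, q - 1) * (t - p1 - p2 - q1 - q2 + 1)
             * ((p1 + p2 + 1) * A + (p1 + p2) * M))
    return gauss + sums + balls + collisions * check


def overall_cost(q, n, t, k1, k2, l1, l2, p1, p2, q1, q2, A=1, M=1):
    """Cost of one iteration divided by the success probability of one iteration."""
    l3 = n - k1 - k2 - l1 - l2
    rest = t - p1 - p2 - q1 - q2
    favourable = (comb(l3, rest) * comb(k1, p1) * comb(k2, p2)
                  * comb(l1, q1) * comb(l2, q2))
    return (Fraction(comb(n, t), favourable)
            * iteration_cost(q, n, t, k1, k2, l1, l2, p1, p2, q1, q2, A, M))


# Toy parameters over F_3 (far too small to be meaningful for cryptography).
print(iteration_cost(3, 8, 3, 2, 2, 1, 1, 1, 1, 0, 0))   # 800/3
print(overall_cost(3, 8, 3, 2, 2, 1, 1, 1, 1, 0, 0))     # 5600/3
```

To obtain bit operations one could pass the bit costs of an addition and a multiplication over $\mathbb{F}_{q}$ as `A` and `M`.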

5.1. Asymptotic Complexity

In this subsection we want to find the asymptotic complexity of the ball-collision algorithm over $\mathbb{F}_{q}$ .

Fix real numbers $0<T<1/2$ and $R$, with

$$-T\log_{q}(T)-(1-T)\log_{q}(1-T)\leq 1-R<1.$$

We consider codes of large length $n$ and fix functions $k,t:\mathbb{N}\to\mathbb{N}$ which satisfy $\lim_{n\to\infty}t(n)/n=T$ and $\lim_{n\to\infty}k(n)/n=R$.

We fix real numbers $P,Q,L$ with $0\leq P\leq R/2,0\leq Q\leq L$ and

$$0\leq T-2P-2Q\leq 1-R-2L.$$

We fix the parameters $p_{1},p_{2},q_{1},q_{2},\ell_{1},\ell_{2},k_{1},k_{2}$ of the ball-collision algorithm over $\mathbb{F}_{q}$ such that

i) $\lim_{n\to\infty}\frac{p_{i}}{n}=P,$

ii) $\lim_{n\to\infty}\frac{q_{i}}{n}=Q,$

iii) $\lim_{n\to\infty}\frac{k_{i}}{n}=R/2,$

iv) $\lim_{n\to\infty}\frac{\ell_{i}}{n}=L,$

for $i\in\{1,2\}$ . We use the convention that $x\log_{q}(x)=0$ , for $x=0$ . In what follows we will use the following asymptotic formula for binomial coefficients:

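$$\lim_{n\to\infty}\frac{1}{n}\log_{q}\binom{(\alpha+o(1))n}{(\beta+o(1))n}=\alpha\log_{q}(\alpha)-\beta\log_{q}(\beta)-(\alpha-\beta)\log_{q}(\alpha-\beta).$$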

With this formula we get the following:

i) $\lim_{n\to\infty}\frac{1}{n}\log_{q}\binom{n}{t}=-T\log_{q}(T)-(1-T)\log_{q}(1-T),$

ii) $\lim_{n\to\infty}\frac{1}{n}\log_{q}\binom{k_{i}}{p_{i}}=R/2\log_{q}(R/2)-P\log_{q}(P)-(R/2-P)\log_{q}(R/2-P),$

iii) $\lim_{n\to\infty}\frac{1}{n}\log_{q}\binom{\ell_{i}}{q_{i}}=L\log_{q}(L)-Q\log_{q}(Q)-(L-Q)\log_{q}(L-Q),$

iv) $\lim_{n\to\infty}\frac{1}{n}\log_{q}\binom{n-k-\ell_{1}-\ell_{2}}{t-p_{1}-p_{2}-q_{1}-q_{2}}=(1-R-2L)\log_{q}(1-R-2L)-(T-2P-2Q)\log_{q}(T-2P-2Q)-(1-R-2L-T+2P+2Q)\log_{q}(1-R-2L-T+2P+2Q).$
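The asymptotic formula for binomial coefficients, $\lim_{n\to\infty}\frac{1}{n}\log_{q}\binom{\alpha n}{\beta n}=\alpha\log_{q}(\alpha)-\beta\log_{q}(\beta)-(\alpha-\beta)\log_{q}(\alpha-\beta)$, can be checked numerically. The Python sketch below compares the limit with the value at a large finite $n$, using the log-gamma function to avoid huge integers; the function names are hypothetical.

```python
from math import lgamma, log


def xlogx(x, q):
    """x * log_q(x), with the convention that the value is 0 for x = 0."""
    return 0.0 if x == 0 else x * log(x, q)


def binomial_exponent(alpha, beta, q):
    """Limit of (1/n) log_q binom(alpha n, beta n) as n tends to infinity."""
    return xlogx(alpha, q) - xlogx(beta, q) - xlogx(alpha - beta, q)


def finite_exponent(n, alpha, beta, q):
    """(1/n) log_q binom(round(alpha n), round(beta n)) for a finite n."""
    a, b = round(alpha * n), round(beta * n)
    log_binom = lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)  # natural logarithm
    return log_binom / (n * log(q))


# For alpha = 1, beta = 1/2, q = 2 the limit is the binary entropy of 1/2, i.e. 1.
limit = binomial_exponent(1, 0.5, 2)
approx = finite_exponent(10**6, 1, 0.5, 2)
print(limit, approx)
```

The finite value approaches the limit from below, with a gap of order $\log(n)/n$.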

Success probability

We will denote by $S(P,Q,L)$ the asymptotic exponent of the success probability:

[TABLE]

Cost of one iteration

We will denote by $C(P,Q,L)$ the asymptotic exponent of the cost of one iteration.

[TABLE]

Overall cost

The overall asymptotic cost exponent of the ball-collision algorithm over $\mathbb{F}_{q}$ is given by the difference of $C(P,Q,L)$ and $S(P,Q,L)$ :

[TABLE]

The asymptotic complexity is then given by $q^{D(P,Q,L)n+o(n)}$ .

Asymptotically, we assume that the code attains the Gilbert-Varshamov bound, i.e. the code rate $R=k/n$ and the distance $D=d/n$ relate via:

[TABLE]

In order to compute the asymptotic complexity of half-distance decoding (i.e. $T=D/2$ ) for a fixed rate $R$ , we performed a numerical optimization of the parameters $P,Q$ and $L$ such that the overall cost $D(P,Q,L)$ is minimized subject to the following constraints:

[TABLE]

Let $F(q,R)$ be the exponent of the optimized asymptotic complexity. The asymptotic complexity of half-distance decoding at rate $R$ over $\mathbb{F}_{q}$ is then given by $q^{F(q,R)n+o(n)}$ .

In Table 1, the values refer to the exponent of the worst-case complexity of distinct algorithms, i.e. $F(q,R_{w})$ where $R_{w}={\rm argmax}_{0<R<1}\left(F(q,R)\right)$. It compares Peters’ generalization of Stern’s algorithm to $\mathbb{F}_{q}$, Hirose’s generalization of Stern’s algorithm using May-Ozerov’s nearest neighbor algorithm (MO) to $\mathbb{F}_{q}$, the generalization by Gueye et al. of the BJMM algorithm using MO to $\mathbb{F}_{q}$, and the generalization of the ball-collision algorithm to $\mathbb{F}_{q}$.

We can observe that the ball-collision algorithm over $\mathbb{F}_{q}$ outperforms Peters’ generalization of Stern’s algorithm to $\mathbb{F}_{q}$ and Hirose’s ISD algorithm over $\mathbb{F}_{q}$ for all $q\geq 3$. As in the binary case, the ball-collision algorithm does not outperform the generalization by Gueye et al. of the BJMM algorithm using MO to $\mathbb{F}_{q}$.

Acknowledgments

The fourth author is thankful to the Swiss National Science Foundation grant number 169510.

Bibliography

[1] Abdulrahman Al Jabri. A statistical decoding algorithm for general linear block codes. In IMA International Conference on Cryptography and Coding, pages 1–8. Springer, 2001.
[2] Alexei E. Ashikhmin and Alexander Barg. Minimal vectors in linear codes. IEEE Transactions on Information Theory, 44(5):2010–2017, 1998.
[3] Marco Baldi, Marco Bianchi, Franco Chiaraluce, Joachim Rosenthal, and Davide Schipani. Enhanced Public Key Security for the McEliece Cryptosystem. Journal of Cryptology, pages 1–27, 2016.
[4] Gustavo Banegas, Paulo SLM Barreto, Brice Odilon Boidje, Pierre-Louis Cayrel, Gilbert Ndollane Dione, Kris Gaj, Cheikh Thiécoumba Gueye, Richard Haeussler, Jean Belo Klamti, Ousmane N’diaye, et al. DAGS: Key encapsulation using dyadic GS codes. Journal of Mathematical Cryptology, 2018.
[5] Alexander Barg, Evgueni Krouk, and Henk CA van Tilborg. On the complexity of minimum distance decoding of long linear codes. IEEE Transactions on Information Theory, 45(5):1392–1405, 1999.
[6] Anja Becker, Antoine Joux, Alexander May, and Alexander Meurer. Decoding random binary linear codes in $2^{n/20}$: How 1 + 1 = 0 improves information set decoding. In Annual International Conference on the Theory and Applications of Cryptographic Techniques, pages 520–536. Springer, 2012.
[7] Elwyn Berlekamp, Robert McEliece, and Henk van Tilborg. On the inherent intractability of certain coding problems (corresp.). IEEE Transactions on Information Theory, 24(3):384–386, 1978.
[8] Daniel J Bernstein, Tanja Lange, and Christiane Peters. Smaller decoding exponents: ball-collision decoding. In Annual Cryptology Conference, pages 743–760. Springer, 2011.
