A Reduced-Complexity Projection Algorithm for ADMM-based LP Decoding

Florian Gensheimer; Tobias Dietz; Kira Kraft; Stefan Ruzika; Norbert Wehn

arXiv:1901.03240·cs.IT·January 11, 2019


TL;DR

This paper introduces a low-complexity projection algorithm for ADMM-based LP decoding of LDPC codes, significantly reducing computational effort by eliminating sorting and leveraging recursive parity polytope structure.

Contribution

It presents a novel projection algorithm that avoids sorting and reduces arithmetic operations by up to 37%, enhancing efficiency for LP decoding.

Findings

1. Requires up to 37% fewer arithmetic operations than state-of-the-art projections

2. Eliminates the need for sorting in projection computations

3. Leverages the recursive structure of the parity polytope

Abstract

The Alternating Direction Method of Multipliers has recently been adapted for Linear Programming Decoding of Low-Density Parity-Check codes. The computation of the projection onto the parity polytope is the core of this algorithm and usually involves a sorting operation, which is the main effort of the projection. In this paper, we present an algorithm with low complexity to compute this projection. The algorithm relies on new findings in the recursive structure of the parity polytope and iteratively fixes selected components. It requires up to 37% fewer arithmetic operations compared to state-of-the-art projections. Additionally, it does not involve a sorting operation, which is needed in all exact state-of-the-art projection algorithms. These two benefits make it appealing for efficient hardware and software implementations.

Tables

Table I: Maximum gain

Input domain	Low-complexity operations	All arithmetic operations
$[-1,1)^{d}$	-13%	-14%
$[-3,3)^{d}$	-31%	-32%
$[-5,5)^{d}$	-36%	-37%
$[-10,10)^{d}$	-37%	-37%

Equations

$$x^{\mathrm{ML}}=\operatorname*{arg\,min}_{x\in C}\sum_{i=1}^{n}\lambda_{i}x_{i}$$

$$\min\ \lambda^{\top}x\quad\text{s.t.}\quad T_{j}x=z_{j}\ \ \forall j\in J,\quad z_{j}\in\mathcal{P}_{d_{j}}\ \ \forall j\in J$$

$$L_{\rho}(x,z,u)=\lambda^{\top}x+\frac{\rho}{2}\sum_{j\in J}\lVert T_{j}x-z_{j}+u_{j}\rVert_{2}^{2}-\frac{\rho}{2}\sum_{j\in J}\lVert u_{j}\rVert_{2}^{2}$$

$$x^{k+1}:=\operatorname*{arg\,min}_{x}\ \lambda^{\top}x+\frac{\rho}{2}\sum_{j\in J}\lVert T_{j}x-z_{j}^{k}+u_{j}^{k}\rVert_{2}^{2}$$

$$z_{j}^{k+1}:=\Pi_{\mathcal{P}_{d_{j}}}(T_{j}x^{k+1}+u_{j}^{k})\quad\forall j\in J$$

$$u_{j}^{k+1}:=u_{j}^{k}+T_{j}x^{k+1}-z_{j}^{k+1}\quad\forall j\in J$$

$$\mathcal{P}_{d_{j}}:=\text{conv}\Bigl\{x\in\{0,1\}^{d_{j}}:\sum_{i=1}^{d_{j}}x_{i}\text{ is even}\Bigr\}$$

$$x_{i}=\frac{1}{d_{i}}\Bigl(\sum_{j\in N_{i}}\bigl((z_{j}^{k})_{i}-(u_{j}^{k})_{i}\bigr)-\frac{\lambda_{i}}{\rho}\Bigr)\quad\forall i\in I$$

$$\mathcal{P}_{d,\text{even}}:=\mathcal{P}_{d}:=\text{conv}\Bigl\{x\in\{0,1\}^{d}:\sum_{i=1}^{d}x_{i}\text{ is even}\Bigr\}$$

$$\mathcal{P}_{d,\text{odd}}:=\text{conv}\Bigl\{x\in\{0,1\}^{d}:\sum_{i=1}^{d}x_{i}\text{ is odd}\Bigr\}$$

$$0\leq x_{i}\leq 1\quad\forall i=1,\dots,n$$

$$0\leq w_{j,S}\leq 1\quad\forall S\in E_{j}$$

$$\sum_{S\in E_{j}}w_{j,S}=1$$

$$x_{i}=\sum_{S\in E_{j},\,S\ni i}w_{j,S}\quad\forall i\in N_{j}$$

$$x_{i}=\sum_{S\in E_{1},\,S\ni i}w_{1,S}\geq 0\quad\text{since } w_{1,S}\geq 0$$

$$x_{i}=\sum_{S\in E_{1},\,S\ni i}w_{1,S}\leq\sum_{S\in E_{1}}w_{1,S}=1$$

$$\dot{\mathcal{Q}}_{1}=\mathcal{P}_{d,\text{even}}=\text{conv}\Bigl\{x\in\{0,1\}^{d}:\sum_{i=1}^{d}x_{i}\text{ is even}\Bigr\}$$

$$0\leq x_{i}\leq 1,\qquad\sum_{i\in V}x_{i}-\sum_{i\in N_{j}\setminus V}x_{i}\leq|V|-1\quad\forall V\subseteq N_{j}\text{ with }|V|\text{ odd}$$

$$0\leq w_{1,S}\leq 1\quad\forall S\in E_{1},\qquad\sum_{S\in E_{1}}w_{1,S}=1,\qquad x_{i}=\sum_{S\in E_{1},\,S\ni i}w_{1,S}\quad\forall i=1,\dots,d$$

$$\dot{\widetilde{\mathcal{Q}}}_{1}=\mathcal{P}_{d,\text{odd}}=\text{conv}\Bigl\{x\in\{0,1\}^{d}:\sum_{i=1}^{d}x_{i}\text{ is odd}\Bigr\}$$

$$\sum_{i\in V}x_{i}-\sum_{i\in\{1,\dots,d\}\setminus V}x_{i}\leq|V|-1$$

$$\sum_{i\in V}(1-x_{i})+\sum_{i\in\{1,\dots,d\}\setminus V}x_{i}\geq 1$$

$$\mathcal{P}_{3,\text{even}}:=\text{conv}\{x\in\{0,1\}^{3}:x_{1}+x_{2}+x_{3}\text{ is even}\}$$

$$(z_{1},z_{2},z_{3})^{\top}:=\Pi_{\mathcal{P}_{3,\text{even}}}\bigl((\tfrac{1}{2},1,\tfrac{11}{4})^{\top}\bigr)$$

$$\Pi_{[0,1]^{3}}\bigl((\tfrac{1}{2},1,\tfrac{11}{4})^{\top}\bigr)=(\tfrac{1}{2},1,1)^{\top}$$

$$F:=\{x\in[0,1]^{3}:x_{1}+x_{2}+x_{3}=2\}$$


Full text

A Reduced-Complexity Projection Algorithm for ADMM-based LP Decoding

Florian Gensheimer, Tobias Dietz, Kira Kraft, Stefan Ruzika, and Norbert Wehn

This work was supported by the DFG under project-ID WE 2442/9-3 and RU 1524/2-3. This paper will be presented in parts at the 10th International Symposium on Turbo Codes and Iterative Information Processing. F. Gensheimer is with the Mathematical Institute, University of Koblenz-Landau, 56070 Koblenz, Germany (email: [email protected]). T. Dietz and S. Ruzika are with the Department of Mathematics, Technische Universität Kaiserslautern, 67663 Kaiserslautern, Germany (email: [email protected]; [email protected]). K. Kraft and N. Wehn are with the Department of Electrical and Computer Engineering, Technische Universität Kaiserslautern, 67663 Kaiserslautern, Germany (email: [email protected]; [email protected]).


Index Terms:

ADMM, LP decoding, parity polytope projection

I Introduction

Linear Programming (LP) decoding is a relatively new decoding approach that was established in 2003 by Feldman et al. [1]. The LP decoding problem is a relaxation of the Maximum-Likelihood (ML) decoding problem onto a special polytope. Using redundant parity checks [2], the error-correcting performance of LP decoding is close to the ML decoding performance. Therefore, LP decoding has become an interesting area of research for nearly all relevant code classes.

The major advantage of LP decoding is the reduced complexity compared to ML decoding. While ML decoding is NP-hard in general [3], the relaxation onto the special polytope reduces the problem to a linear program, which can be solved in polynomial time. Recently, the Alternating Direction Method of Multipliers (ADMM) [4], an iterative method from convex optimization, was proposed for solving the LP decoding problem. In this context, the main effort of the ADMM is the projection onto the so-called parity polytope (see Section IV). The projection complexity grows with the number of ones in each row of the code's underlying parity-check matrix; thus, ADMM-based LP decoding is mainly performed for Low-Density Parity-Check (LDPC) codes. This projection is the key of the ADMM algorithm and requires a sorting operation in all exact state-of-the-art implementations. However, sorting can become a major problem, especially for efficient hardware implementations, where it can heavily impact latency, area, and power consumption.

In this paper, we extend the theory of parity polytopes and reveal their recursive structure. These findings allow us to present a new efficient projection algorithm that does not require sorting operations. The proposed algorithm iteratively fixes selected components of the projection to recursively reduce the problem to a smaller instance. We show that at least one component of the input can be fixed in every step. Therefore, the number of recursions is bounded and the problem size is strictly decreasing in every iteration. Our approach requires up to 37% fewer arithmetic operations than state-of-the-art projection algorithms, which directly translates to a reduction in computational complexity. In addition, the sorting operation is circumvented completely.

The outline of the paper is as follows: Section II describes the related work in the area of the ADMM-based LP decoding. In Section III, preliminaries and the ADMM method for LP decoding are recapitulated. In Section IV, we define the considered projection problem, recall some essential properties of the (even) parity polytope and derive the analogous results for its odd counterpart. Section V shows the geometrical idea of our new projection algorithm in an example. In Section VI, we prove the main theorem about this projection. It states that in every iteration, there exists at least one component that can be fixed for the rest of the projection. In Section VII, we show that the projection can be formulated as a recursive problem and utilize this fact in our efficient projection algorithm. The whole algorithm is then summarized in Section VIII. Section IX presents numerical results and highlights the benefit of our new projection algorithm. Finally, the paper is concluded in Section X.

II Related Work

The first ADMM method for LP decoding was presented by Barman et al. in [5], where a two-slice representation is used in the projection in order to describe the vectors of the parity polytope. The projection method in [5] needs two sorting operations. A more efficient projection method was presented by Zhang and Siegel in [6], which is based on the cut-search algorithm [2] of the same authors. Its main effort is a sorting operation on a vector, which in the worst case is as large as the dimension $d$ of the parity polytope. The projections by Wasson and Draper [7] and Zhang et al. [8] reduce the problem to the projection onto a simplex and use the corresponding algorithms of Duchi et al. from [9]. The main effort in [7] is again a sorting operation. In [8], the two main subroutines are partial sorting and a modification of the randomized median finding algorithm from [10]. An iterative method, which does not require sorting operations, is presented by Wei and Banihashemi in [11]. However, it only outputs an approximate projection. A lookup table is used by Jiao et al. in the projection algorithm in [12], where the authors use the symmetric structure of the parity polytope in order to reduce the size of the table. In [13], Jiao et al. further decrease the size by using a non-uniform quantization scheme, which is found by minimizing the mean square error of a sample set.

Apart from the projection onto the parity polytope, many other investigations and improvements of ADMM-based LP decoding are made in the literature. In [14], Liu and Draper improve the error correction rate by introducing penalty terms for the objective function that reward binary decision variables. The behavior of ADMM decoding on trapping sets is studied by the same authors in [15]. In [16], Jiao et al. improve the error-correcting performance of penalized ADMM decoding for irregular LDPC codes by using different penalty parameters for variables with different variable degrees. In [17], Wei et al. reduce the runtime by avoiding projections whenever the change in the input of the projection is sufficiently small. New piecewise penalty terms are introduced by Wang et al. in [18]. In [19], Jiao et al. compare two improving techniques of ADMM in the context of LP decoding, namely over-relaxation [4] and accelerated ADMM [20]. A two-step scheme based on ADMM decoding is presented by Jiao and Mu in [21]. In order to reduce the error floor, the code structure is changed by eliminating codewords with low weight, and a postprocessing step is added after the ADMM LP decoder.

In [22], Wasson et al. propose a hardware architecture for the ADMM LP decoder based on the projection method presented in [7]. The hardware complexity of ADMM LP decoding is also investigated by Debbabi et al. in [23]. A multicore implementation is presented by the same authors in [24]. The schedule of the computations in the ADMM LP decoder is changed by Debbabi et al. [25] and Jiao et al. [26]. These schedules are combined into a new mixed schedule for ADMM LP decoding of LDPC convolutional codes by Thameur et al. in [27]. In [28], Xu et al. propose turbo equalization together with ADMM decoding for communication over the partial response channel.

III ADMM-based LP Decoding

In this paper, we consider binary linear block codes $C\subseteq\{0,1\}^{n}$ with block length $n$ and a parity-check matrix $H\in\{0,1\}^{m\times n}$, where $J=\{1,\dots,m\}$ denotes the set of check nodes and $d_{j}$ denotes the degree of check $j\in J$, that is, the number of ones in the corresponding row of the parity-check matrix. The set of variable nodes is given by $I=\{1,\dots,n\}$, and $N_{i}\subseteq J$ describes the set of check nodes that include variable node $i\in I$. The set $N_{j}=\{i\in\{1,\dots,n\}:H_{ji}=1\}\subseteq I$ denotes the set of variable nodes that participate in check $j\in\{1,\dots,m\}$. We consider binary-input memoryless channels. In [29], it is shown that, in this case, maximum-likelihood (ML) decoding can be rewritten as the minimization of a linear function. This means that

$$x^{\mathrm{ML}}=\operatorname*{arg\,min}_{x\in C}\sum_{i=1}^{n}\lambda_{i}x_{i},$$

where $\lambda_{i}=\ln\frac{Pr\left(\widetilde{x}_{i}|x_{i}=0\right)}{Pr\left(\widetilde{x}_{i}|x_{i}=1\right)}$ are the so-called log-likelihood ratios (LLR) and $\widetilde{x}$ is the received vector. The linear programming relaxation of this problem is called linear programming (LP) decoding [29]. For ADMM-based LP decoding, the following LP formulation with auxiliary variables $z_{j}\in\mathbb{R}^{d_{j}}$ is used:

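$$\min\ \lambda^{\top}x\qquad(1)$$

$$\text{s.t.}\quad T_{j}x=z_{j}\quad\forall j\in J\qquad(2)$$

$$z_{j}\in\mathcal{P}_{d_{j}}\quad\forall j\in J.\qquad(3)$$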

The matrix $T_{j}\in\{0,1\}^{d_{j}\times n}$ from [6] selects the variable nodes $i\in N_{j}$ for all $j\in J$. The ADMM is an iterative method from convex optimization that combines the strong convergence of the method of multipliers with the decomposability of the dual ascent method [4]. Mathematically, the ADMM is a gradient method that solves a special dual problem of (1)-(3), which depends on the augmented Lagrangian. For the LP decoding problem, the augmented Lagrangian with scaled dual variables $u$ is given by

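$$L_{\rho}(x,z,u)=\lambda^{\top}x+\frac{\rho}{2}\sum_{j\in J}\lVert T_{j}x-z_{j}+u_{j}\rVert_{2}^{2}-\frac{\rho}{2}\sum_{j\in J}\lVert u_{j}\rVert_{2}^{2}.$$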

In iteration $k$ , the variables are updated as follows:

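$$x^{k+1}:=\operatorname*{arg\,min}_{x}\ \lambda^{\top}x+\frac{\rho}{2}\sum_{j\in J}\lVert T_{j}x-z_{j}^{k}+u_{j}^{k}\rVert_{2}^{2},\qquad(4)$$

$$z_{j}^{k+1}:=\Pi_{\mathcal{P}_{d_{j}}}(T_{j}x^{k+1}+u_{j}^{k})\quad\forall j\in J,$$

$$u_{j}^{k+1}:=u_{j}^{k}+T_{j}x^{k+1}-z_{j}^{k+1}\quad\forall j\in J.$$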

The mapping $\Pi_{\mathcal{P}_{d_{j}}}(\cdot)$ is defined as the projection onto the parity polytope

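$$\mathcal{P}_{d_{j}}:=\text{conv}\Bigl\{x\in\{0,1\}^{d_{j}}:\sum_{i=1}^{d_{j}}x_{i}\text{ is even}\Bigr\}.$$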

The minimum in the $x$ -update (4) can be computed analytically with the formula

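$$x_{i}=\frac{1}{d_{i}}\Bigl(\sum_{j\in N_{i}}\bigl((z_{j}^{k})_{i}-(u_{j}^{k})_{i}\bigr)-\frac{\lambda_{i}}{\rho}\Bigr)\quad\forall i\in I,$$

where $d_{i}=|N_{i}|$ denotes the degree of variable node $i$. This follows from setting the derivative of the objective in (4) with respect to $x_{i}$ to zero.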

The penalty terms used in [14] and [18] only change this $x$-update. The main computational effort lies in the $z$-updates, which consist of a projection onto the parity polytope for every parity row $j\in J$.
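The three updates can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `x_update` and `admm_step`, the storage of $z_{j}$ and $u_{j}$ as lists of arrays ordered by increasing variable index, and the `project` callback are assumptions made for this example. The $x$-update is the stationarity condition of the augmented Lagrangian stated above.

```python
import numpy as np

def x_update(H, z, u, llr, rho):
    """Closed-form x-update: stationary point of the augmented Lagrangian in x.

    H    : (m, n) 0/1 parity-check matrix, every column containing at least one 1
    z, u : lists of length m; z[j] and u[j] have length d_j and follow the
           variable order of row j (increasing column index)  [assumed layout]
    llr  : length-n vector of log-likelihood ratios (lambda)
    """
    m, n = H.shape
    acc = np.zeros(n)
    for j in range(m):
        idx = np.flatnonzero(H[j])   # N_j: variables taking part in check j
        acc[idx] += z[j] - u[j]      # accumulates T_j^T (z_j - u_j)
    deg = H.sum(axis=0)              # variable degrees |N_i|
    return (acc - llr / rho) / deg

def admm_step(H, z, u, llr, rho, project):
    """One ADMM iteration; `project` is any callable that maps a vector to its
    projection onto the parity polytope of matching dimension."""
    x = x_update(H, z, u, llr, rho)
    z_new, u_new = [], []
    for j in range(H.shape[0]):
        idx = np.flatnonzero(H[j])
        zj = project(x[idx] + u[j])        # z-update
        z_new.append(zj)
        u_new.append(u[j] + x[idx] - zj)   # scaled dual update
    return x, z_new, u_new
```

Passing the projection as a callback keeps the sketch independent of the particular projection algorithm, which is the part of the iteration the paper is concerned with.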

IV Even and Odd Parity Polytopes

In the following, we recall basic properties of $\mathcal{P}_{d}$ which we call the even parity polytope

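$$\mathcal{P}_{d,\text{even}}:=\mathcal{P}_{d}:=\text{conv}\Bigl\{x\in\{0,1\}^{d}:\sum_{i=1}^{d}x_{i}\text{ is even}\Bigr\}$$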

to distinguish it from the odd parity polytope

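$$\mathcal{P}_{d,\text{odd}}:=\text{conv}\Bigl\{x\in\{0,1\}^{d}:\sum_{i=1}^{d}x_{i}\text{ is odd}\Bigr\}.$$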

Projections on the even parity polytope and on the odd parity polytope both play a crucial role in the projection algorithm presented later.

In [29], Feldman presents the linear programming relaxation

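$$0\leq x_{i}\leq 1\quad\forall i=1,\dots,n\qquad(5)$$

$$0\leq w_{j,S}\leq 1\quad\forall S\in E_{j}$$

$$\sum_{S\in E_{j}}w_{j,S}=1\qquad(6)$$

$$x_{i}=\sum_{S\in E_{j},\,S\ni i}w_{j,S}\quad\forall i\in N_{j}\qquad(7)$$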

for the local parity polytope $C_{j}=\{x\in\{0,1\}^{n}:H_{j}x\equiv 0\mod 2\}$, where $E_{j}=\{S\subseteq N_{j}:\lvert S\rvert\text{ even}\}$ is the set of even-sized subsets of $N_{j}$. The feasible set of this relaxation is called $\mathcal{Q}_{j}$ in [29] and $\dot{\mathcal{Q}_{j}}=\{x\in\mathbb{R}^{n}:\exists\mkern 2.0muw:(x,w)^{\top}\in\mathcal{Q}_{j}\}$ is the corresponding polyhedron without the auxiliary variables $w_{j,S}$.

$\mathcal{P}_{d,\text{even}}$ can be interpreted as the convex hull of the local parity polytope with parity row $H_{1}=(1\dots 1)$ . This means that all bit variables $x_{i}$ participate in this row. Hence, the sum in (7) is not empty and it follows that

$$x_{i}=\sum_{S\in E_{1},\,S\ni i}\underbrace{w_{1,S}}_{\geq 0}\geq 0$$

and

$$x_{i}=\sum_{S\in E_{1},\,S\ni i}w_{1,S}\leq\sum_{S\in E_{1}}w_{1,S}=1$$

for all $i=1,\dots,d$ . Thus, the constraints (5) are redundant and can be removed, so it holds that

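$$\dot{\mathcal{Q}}_{1}=\mathcal{P}_{d,\text{even}}=\text{conv}\Bigl\{x\in\{0,1\}^{d}:\sum_{i=1}^{d}x_{i}\text{ is even}\Bigr\},\qquad(8)$$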

where the variables $w_{j,S}$ can be interpreted as the coefficients in the convex combinations of the incidence vectors to the even-sized subsets $S\subseteq\{1,\dots,d\}$ . In Theorem 5.15 in [29], Feldman shows that $\dot{\mathcal{Q}_{j}}$ can be described by the polyhedron

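$$0\leq x_{i}\leq 1\quad\forall i\in N_{j},$$

$$\sum_{i\in V}x_{i}-\sum_{i\in N_{j}\setminus V}x_{i}\leq|V|-1\quad\forall V\subseteq N_{j}\text{ with }|V|\text{ odd}.$$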

Together with (8), this shows that $\mathcal{P}_{d,\text{even}}$ is completely characterized by the box constraints and the forbidden-set inequalities, i. e. $\mathcal{P}_{d,\text{even}}$ is given by

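$$0\leq x_{i}\leq 1\quad\forall i=1,\dots,d,$$

$$\sum_{i\in V}x_{i}-\sum_{i\in\{1,\dots,d\}\setminus V}x_{i}\leq|V|-1\quad\forall V\subseteq\{1,\dots,d\}\text{ with }|V|\text{ odd}.$$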
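For small $d$, the description of the parity polytopes by box constraints and forbidden-set inequalities can be checked by brute force. The following sketch enumerates all index sets $V$ of the relevant parity (odd $|V|$ for the even polytope, even $|V|$ for the odd polytope) and reports the violated inequalities. The function names and the tolerance parameter are assumptions for this illustration; the enumeration is exponential in $d$ and only meant for experiments.

```python
from itertools import combinations

def violated_forbidden_sets(x, parity="even", tol=1e-12):
    """Return every index set V whose forbidden-set inequality is violated by x.

    parity="even": sets V of odd size  (even parity polytope)
    parity="odd" : sets V of even size (odd parity polytope, includes V = {})
    """
    d = len(x)
    start = 1 if parity == "even" else 0
    violated = []
    for size in range(start, d + 1, 2):
        for V in combinations(range(d), size):
            lhs = sum(x[i] for i in V) - sum(x[i] for i in range(d) if i not in V)
            if lhs > size - 1 + tol:
                violated.append(V)
    return violated

def in_parity_polytope(x, parity="even", tol=1e-12):
    """Membership test: box constraints plus all forbidden-set inequalities."""
    in_box = all(-tol <= xi <= 1 + tol for xi in x)
    return in_box and not violated_forbidden_sets(x, parity, tol)
```

For example, `violated_forbidden_sets((0.5, 1, 1))` returns the single set `(0, 1, 2)`, which corresponds to the inequality $x_{1}+x_{2}+x_{3}\leq 2$.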

In [30], it is shown that if, for $x\in[0,1]^{d}$, one of the forbidden-set inequalities of $\mathcal{P}_{d,\text{even}}$ defines a cut, i. e. it is violated, then all other forbidden-set inequalities are fulfilled with strict inequality. In particular, this means that at most one forbidden-set inequality of $\mathcal{P}_{d,\text{even}}$ is violated. In [2], Zhang and Siegel present their so-called cut-search algorithm, which computes this potentially violated forbidden-set inequality. It consists of two steps:

1. $\theta_{i}=\text{sgn}\left(x_{i}-0.5\right)\quad\forall\mkern 2.0mui=1,\dots,d$

2. If $|\{i:\theta_{i}=1\}|$ is even, then determine $i^{*}=\operatorname*{arg\,min}_{i}|0.5-x_{i}|$ and set $\theta_{i^{*}}=-\theta_{i^{*}}$.

The inequality is then given by $\theta^{\top}w\leq|V|-1=:p$, where $|V|=|\{i:\theta_{i}=1\}|$. If $x\notin[0,1]^{d}$, then the cut-search algorithm for $\Pi_{[0,1]^{d}}(x)$ can be used as above with $x$ in the formulas, because $x_{i}\geq\frac{1}{2}$ holds if and only if $\Pi_{[0,1]}(x_{i})\geq\frac{1}{2}$. For ADMM-based LP decoding, Zhang and Siegel show in [6] that if $\Pi_{[0,1]^{d}}(x)\notin\mathcal{P}_{d,\text{even}}$, then the projection of $x$ onto $\mathcal{P}_{d,\text{even}}$ lies on the face defined by the unique forbidden-set inequality that is violated by $\Pi_{[0,1]^{d}}(x)$:

Lemma 1 ([6]).

Let $x\in\mathbb{R}^{d}$, let $z=\Pi_{[0,1]^{d}}(x)$. If $V\subseteq\{1,\dots,d\}$ with $|V|$ odd is a cutting set of $z$, i. e. $\theta_{V}^{\top}z>|V|-1$, then $\Pi_{\mathcal{P}_{d,\text{even}}}(x)$ must be on the face of $\mathcal{P}_{d,\text{even}}$ defined by $V$, i. e. $\Pi_{\mathcal{P}_{d,\text{even}}}(x)\in F_{V}:=\{w\in[0,1]^{d}:\theta_{V}^{\top}w=|V|-1\}$.

The vector $\theta_{V}$ denotes the forbidden-set inequality corresponding to $V$ . Next, we show that these properties for $\mathcal{P}_{d,\text{even}}$ are analogously valid for $\mathcal{P}_{d,\text{odd}}$ , where we can use forbidden-set inequalities with even instead of odd subsets:

Theorem 2.

Let $d\in\mathbb{N}_{>0}$ . Then it holds:

i)

$\mathcal{P}_{d,\text{odd}}=\{x\in[0,1]^{d}:\sum_{i\in V}x_{i}-\sum_{i\in\{1,\dots,d\}\setminus V}x_{i}\leq|V|-1\ \forall\mkern 2.0muV\subseteq\{1,\dots,d\}\text{ with }|V|\text{ even}\}$

ii)

If for $x\in[0,1]^{d}$, one of the forbidden-set inequalities from i) is a cut, then all other forbidden-set inequalities of i) are fulfilled with strict inequality.

iii)

The cut from ii) can be found as follows:

1. $\theta_{i}=\text{sgn}(x_{i}-0.5)\quad\forall\mkern 2.0mui=1,\dots,d$

2. If $|\{i:\theta_{i}=1\}|$ is odd, then determine $i^{*}=\operatorname*{arg\,min}_{i}|0.5-x_{i}|$ and set $\theta_{i^{*}}=-\theta_{i^{*}}$.

iv)

Let $x\in\mathbb{R}^{d}$, let $z=\Pi_{[0,1]^{d}}(x)$. If $V\subseteq\{1,\dots,d\}$ with $|V|$ even is a cutting set of $z$, then $\Pi_{\mathcal{P}_{d,\text{odd}}}(x)$ must be on the face of $\mathcal{P}_{d,\text{odd}}$ defined by $V$, i. e. $\Pi_{\mathcal{P}_{d,\text{odd}}}(x)\in F_{V}:=\{w\in[0,1]^{d}:\theta_{V}^{\top}w=|V|-1\}$.

Proof:

i)

For $d=1$ , the parity polytope has the form $\mathcal{P}_{1,\text{odd}}=\text{conv}\{1\}=\{1\}$ and the right-hand side of i) is given by $\{x_{1}\in[0,1]:-x_{1}\leq-1\}=\{1\}=\mathcal{P}_{1,\text{odd}}$ , because $V=\emptyset$ defines the only forbidden-set inequality in this case.

Next, let us consider the case $d\geq 2$. Analogously to the formulation (5)-(7), we consider the polyhedron $\mathcal{\widetilde{Q}}_{1}$ defined by

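$$0\leq w_{1,S}\leq 1\quad\forall S\in E_{1}$$

$$\sum_{S\in E_{1}}w_{1,S}=1$$

$$x_{i}=\sum_{S\in E_{1},\,S\ni i}w_{1,S}\quad\forall i=1,\dots,d,$$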

where $E_{1}=\{S\subseteq\{1,\dots,d\}:|S|\text{ odd}\}$ is the set of odd-sized subsets of $\{1,\dots,d\}$ . We denote the restriction of $\mathcal{\widetilde{Q}}_{1}$ to the variables $x_{i}$ by $\dot{\mathcal{\widetilde{Q}}_{1}}:=\{x\in\mathbb{R}^{d}:\exists\mkern 2.0muw:(x,w)^{\top}\in\mathcal{\widetilde{Q}}_{1}\}$ . Since the values $w_{1,S}$ can be interpreted as the coefficients of a convex combination of the incidence vectors to the sets $S$ , it follows that

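$$\dot{\mathcal{\widetilde{Q}}_{1}}=\mathcal{P}_{d,\text{odd}}=\text{conv}\Bigl\{x\in\{0,1\}^{d}:\sum_{i=1}^{d}x_{i}\text{ is odd}\Bigr\}.$$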

We can prove almost exactly as in Theorem 5.15 in [29] that $\mathcal{P}_{d,\text{odd}}=\dot{\mathcal{\widetilde{Q}}_{1}}$ can be described by the polyhedron

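$$0\leq x_{i}\leq 1\quad\forall i=1,\dots,d,$$

$$\sum_{i\in V}x_{i}-\sum_{i\in\{1,\dots,d\}\setminus V}x_{i}\leq|V|-1\quad\forall V\subseteq\{1,\dots,d\}\text{ with }|V|\text{ even}.$$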

In the proof, we only need to replace “odd” by “even” and vice versa. Additionally, it is used in the proof of Theorem 5.15 that $\dot{\mathcal{Q}_{j}}$ is full-dimensional, which is proven in Theorem 2 (c) in [31]. In our proof, we use instead that $\dot{\mathcal{\widetilde{Q}}_{1}}$ is full-dimensional, which is also shown in Theorem 2 (c) of [31].

ii)

The proof of Theorem 1 in [30] can be adapted by replacing every “odd” by “even” in the proof. The only statement that needs to be verified is that the indicator vectors of two distinct odd subsets have an $\ell_{1}$-distance of at least 2. This also holds for two distinct even subsets.

iii)

The forbidden-set inequalities

$$\sum_{i\in V}x_{i}-\sum_{i\in\{1,\dots,d\}\setminus V}x_{i}\leq|V|-1$$

for all $V\subseteq\{1,\dots,d\}$ with $|V|$ even can be rewritten as

$$\sum_{i\in V}(1-x_{i})+\sum_{i\in\{1,\dots,d\}\setminus V}x_{i}\geq 1$$

for all $V\subseteq\{1,\dots,d\}$ with $|V|$ even. By ii), at most one of these inequalities is violated. If one of these inequalities is violated, it must be the one where the left-hand side is minimal, since the right-hand side is always $1$. For finding the even-sized set corresponding to this inequality, we define $V=\{i\in\{1,\dots,d\}:x_{i}>0.5\}$. If $|V|$ is even, then we are done. If $|V|$ is odd, then we must flip the membership of the index that increases the left-hand side by the smallest margin. This means that we must find the index $i^{*}$ where $|x_{i}-0.5|$ is minimized and include it in $V$ if it was not contained in $V$ before, or remove it otherwise. The cut-search algorithm computes the coefficient vector corresponding to this $V$. Hence, this cut-search algorithm is correct.

iv)

The proof in [6] can be used word for word; only statement ii) is needed.

∎

As for the polytope $\mathcal{P}_{d,\text{even}}$ , we can use $x\notin[0,1]^{d}$ as an input in Theorem 2 iii), when we want to apply this cut-search algorithm to $\Pi_{[0,1]^{d}}(x)$ .
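The two cut-search variants differ only in the required parity of $|V|$, so they can share one routine. The sketch below is an illustration under stated assumptions: the function name, the `parity` argument, and the tie-breaking at $x_{i}=0.5$ (treated as $i\notin V$, following the definition $V=\{i:x_{i}>0.5\}$ used in the proof) are choices made here. The input is clipped to the unit hypercube first, so arbitrary $x\in\mathbb{R}^{d}$ is accepted.

```python
import numpy as np

def cut_search(x, parity="even"):
    """Return (theta, p, is_cut) for the only forbidden-set inequality
    theta^T w <= p that the clipped point can violate.

    parity="even": even parity polytope, |V| must be odd
    parity="odd" : odd parity polytope,  |V| must be even
    """
    z = np.clip(np.asarray(x, dtype=float), 0.0, 1.0)
    theta = np.where(z > 0.5, 1, -1)            # step 1: sign pattern
    size_V = int(np.sum(theta == 1))
    wanted = 1 if parity == "even" else 0       # required |V| mod 2
    if size_V % 2 != wanted:                    # step 2: repair the parity
        i_star = int(np.argmin(np.abs(z - 0.5)))
        theta[i_star] = -theta[i_star]
        size_V += int(theta[i_star])            # +1 if added to V, -1 if removed
    p = size_V - 1
    return theta, p, bool(theta @ z > p)
```

Applied to $(\frac{1}{2},1,\frac{11}{4})^{\top}$ with `parity="even"`, the routine returns $\theta=(1,1,1)^{\top}$ and $p=2$ and reports a cut.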

V Geometrical Idea

In this section, we want to explain the geometrical idea behind our new projection algorithm in an example. We want to project the point $\hat{x}=(\frac{1}{2},1,\frac{11}{4})^{\top}$ onto the parity polytope

$$\mathcal{P}_{3,\text{even}}:=\text{conv}\{x\in\{0,1\}^{3}:x_{1}+x_{2}+x_{3}\text{ is even}\},$$

i. e. we want to compute

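$$(z_{1},z_{2},z_{3})^{\top}:=\Pi_{\mathcal{P}_{3,\text{even}}}\bigl((\tfrac{1}{2},1,\tfrac{11}{4})^{\top}\bigr).$$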

As a first step, we apply the cut-search algorithm of Zhang and Siegel [2] to

$$\Pi_{[0,1]^{3}}\bigl((\tfrac{1}{2},1,\tfrac{11}{4})^{\top}\bigr)=(\tfrac{1}{2},1,1)^{\top},$$

which outputs the forbidden-set inequality $x_{1}+x_{2}+x_{3}\leq 2$ . Since $\frac{1}{2}+1+1=\frac{5}{2}>2$ , it holds that this inequality is, indeed, violated by $(\frac{1}{2},1,1)^{\top}$ . Hence, it follows that $(\frac{1}{2},1,1)^{\top}$ is not in $\mathcal{P}_{3,\text{even}}$ and therefore not the projection of $\hat{x}$ onto $\mathcal{P}_{3,\text{even}}$ . Hence, it follows from Lemma 1, that the projection of $(\frac{1}{2},1,\frac{11}{4})^{\top}$ onto $\mathcal{P}_{3,\text{even}}$ lies on the face

$$F:=\{x\in[0,1]^{3}:x_{1}+x_{2}+x_{3}=2\}.$$

The projection onto such a face is a difficult problem, because the face is an intersection of two sets, namely the unit hypercube $[0,1]^{3}$ and the hyperplane $\{x\in\mathbb{R}^{3}:x_{1}+x_{2}+x_{3}=2\}$. However, our projection exploits the fact that the projection onto the hypercube $[0,1]^{3}$ and the projection onto a hyperplane can both be computed easily. The projection onto $[0,1]^{3}$ can be computed component-wise by mapping values greater than $1$ to $1$, negative values to $0$, and keeping all other values unchanged. The projection onto the hyperplane can be obtained by subtracting a certain multiple of the normal vector of the hyperplane, which is $(1,1,1)^{\top}$ in the example.

In our projection, we start with the projection of $\hat{x}$ onto the hyperplane $x_{1}+x_{2}+x_{3}=2$ , which leads to the point

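$$v:=\hat{x}-\frac{\hat{x}_{1}+\hat{x}_{2}+\hat{x}_{3}-2}{3}\,(1,1,1)^{\top}=(-\tfrac{1}{4},\tfrac{1}{4},2)^{\top}.$$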

The situation is illustrated in Figure 1.

In this figure, the yellow area illustrates the hyperplane $x_{1}+x_{2}+x_{3}=2$ and the red triangle is the face $F$. The blue volume is the part of the unit hypercube that fulfills the forbidden-set inequality $x_{1}+x_{2}+x_{3}\leq 2$, and the blue point $z$ is the projection of $\hat{x}$ onto the face, which is still unknown at this moment. If the violet point $v=(-\frac{1}{4},\frac{1}{4},2)^{\top}$ were contained in the unit hypercube $[0,1]^{3}$, it would also be the projection of $\hat{x}$ onto the face $F$. However, this is not the case in this example.

First Attempt

As a next idea, we project $v$ onto $[0,1]^{3}$ , which leads to the point

$$\Pi_{[0,1]^{3}}(v)=(0,\tfrac{1}{4},1)^{\top},$$

the orange point in Figure 1. This is not the projection onto the face, since $0+\frac{1}{4}+1\neq 2$. Even projecting this new point onto the face, which leads to the red point in the figure, does not yield the desired projection. Hence, projecting all components of $v$ onto $[0,1]$ is not a good idea in general. However, we claim that one component which is not yet contained in $[0,1]$ can be fixed to $0$ or $1$ for the rest of the projection.

Second Attempt

As a second attempt, we try to fix the first component $v_{1}$. Since $v_{1}<0$, we fix $v_{1}$ to $0$ and obtain the point $(0,\frac{1}{4},2)^{\top}$. However, the only point of the plane $x_{1}=0$ that lies on the face $F$ is the point $(0,1,1)^{\top}$, which is not the wanted projection.

Third Attempt

Since $v_{2}$ is already contained in $[0,1]$ , the next attempt is to fix the third component $v_{3}$ to $1$ , i. e. we move from

$$v=(-\tfrac{1}{4},\tfrac{1}{4},2)^{\top}\quad\text{to}\quad(-\tfrac{1}{4},\tfrac{1}{4},1)^{\top}.$$

As can be seen in Figure 1, the green line is the intersection of the plane $v_{3}=1$ with the face $F$ . Since $z$ is contained in this green line, fixing $v_{3}$ to $1$ was correct.

In the second attempt, the point $(0,\frac{1}{4},2)^{\top}$ is not contained in the feasible halfspace of the forbidden-set inequality $x_{1}+x_{2}+x_{3}\leq 2$ , in contrast to the point $(-\frac{1}{4},\frac{1}{4},1)^{\top}$ . It is shown later that this is the criterion for choosing the correct component.
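The halfspace criterion can be verified with exact rational arithmetic for the two candidate fixes of the example. The helper names below are hypothetical and only serve this illustration.

```python
from fractions import Fraction as Fr

theta = [1, 1, 1]                    # forbidden-set inequality x1 + x2 + x3 <= 2
p = 2
v = [Fr(-1, 4), Fr(1, 4), Fr(2)]     # projection of x_hat onto the hyperplane

def fix_component(point, i):
    """Clamp component i to [0, 1] and leave the other components unchanged."""
    w = list(point)
    w[i] = min(max(w[i], Fr(0)), Fr(1))
    return w

def in_halfspace(w):
    """Check theta^T w <= p."""
    return sum(t * wi for t, wi in zip(theta, w)) <= p

second_attempt = fix_component(v, 0)   # (0, 1/4, 2): left-hand side is 9/4
third_attempt = fix_component(v, 2)    # (-1/4, 1/4, 1): left-hand side is 1

print(in_halfspace(second_attempt))    # False: fixing v_1 leaves the halfspace
print(in_halfspace(third_attempt))     # True: fixing v_3 stays feasible
```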

Since we concluded that $z_{3}=1$ , we reduce the original problem of projecting $\hat{x}$ onto $\mathcal{P}_{3,\text{even}}$ to the subproblem of projecting $(\hat{x}_{1},\hat{x}_{2})^{\top}$ onto

$$\mathcal{P}_{2,\text{odd}}=\text{conv}\{x\in\{0,1\}^{2}:x_{1}+x_{2}\text{ is odd}\},$$

which is exactly the green line $\{x\in F:x_{1}+x_{2}=1\}$ in Figure 1. This subproblem has one dimension less, and the “type” of the parity polytope changed from even to odd. We can now apply the same approach again. The cut-search algorithm applied to $(\frac{1}{2},1)^{\top}$ leads to the forbidden-set inequality $x_{1}+x_{2}\leq 1$ , which is the same as inserting $x_{3}=1$ to the original forbidden-set inequality $x_{1}+x_{2}+x_{3}\leq 2$ . Later it is proven that this fast update procedure is always correct, such that the cut-search algorithm does not have to be applied again from scratch.

Next, we project $(\frac{1}{2},1)^{\top}$ onto the hyperplane $x_{1}+x_{2}=1$ . This situation is illustrated in Figure 2.

The resulting point on the hyperplane is

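$$\widetilde{v}=(\tfrac{1}{2},1)^{\top}-\frac{\tfrac{1}{2}+1-1}{2}\,(1,1)^{\top}=(\tfrac{1}{4},\tfrac{3}{4})^{\top}.$$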

Since $\widetilde{v}\in[0,1]^{2}$ , it holds that

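$$\Pi_{\mathcal{P}_{2,\text{odd}}}\bigl((\tfrac{1}{2},1)^{\top}\bigr)=\widetilde{v}=(\tfrac{1}{4},\tfrac{3}{4})^{\top}.$$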

Hence, the solution of the original projection problem is

$$z=\Pi_{\mathcal{P}_{3,\text{even}}}(\hat{x})=(\tfrac{1}{4},\tfrac{3}{4},1)^{\top}.$$
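The steps of this example generalize to a short sort-free routine: run the cut search, project onto the hyperplane, fix every component whose clamping moves into the feasible halfspace, and repeat on the remaining components. The sketch below is a NumPy illustration in the spirit of the example, not the authors' final algorithm. The function name, the tie-breaking at $0.5$, and the choice to fix all eligible components at once are assumptions. Fixing a component to $1$ lowers the right-hand side by one, which mirrors inserting $x_{3}=1$ into $x_{1}+x_{2}+x_{3}\leq 2$ above.

```python
import numpy as np

def project_even_parity_polytope(x):
    """Sort-free projection onto the even parity polytope (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    z = np.clip(x, 0.0, 1.0)

    # Cut search: theta must contain an odd number of +1 entries.
    theta = np.where(z > 0.5, 1.0, -1.0)
    if np.sum(theta == 1) % 2 == 0:
        i_star = np.argmin(np.abs(z - 0.5))
        theta[i_star] = -theta[i_star]
    p = int(np.sum(theta == 1)) - 1

    # No cut: the box projection already lies in the polytope.
    if theta @ z <= p:
        return z

    active = np.ones(x.size, dtype=bool)   # components not fixed yet
    v = x.copy()
    while True:
        # Project the active part of x onto the hyperplane of the reduced problem.
        lam = (theta[active] @ x[active] - p) / active.sum()
        v[active] = x[active] - lam * theta[active]

        # Components whose clamping moves into the feasible halfspace.
        fix_one = active & (v > 1) & (theta == 1)
        fix_zero = active & (v < 0) & (theta == -1)
        if not (fix_one.any() or fix_zero.any()):
            return v
        v[fix_one] = 1.0
        v[fix_zero] = 0.0
        p -= int(fix_one.sum())            # each fixed 1 lowers the right-hand side
        active &= ~(fix_one | fix_zero)
```

For the input $(\frac{1}{2},1,\frac{11}{4})^{\top}$ the routine performs the two hyperplane projections of the example and returns $(\frac{1}{4},\frac{3}{4},1)^{\top}$. Up to floating-point rounding, the result can be validated for small $d$ by checking the variational inequality $(x-z)^{\top}(y-z)\leq 0$ against every even-weight vertex $y$.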

VI Fixing Components of the Projection

In this section, we generalize the previous example and present the main theorem of this paper, that states how components in our projection can be fixed. Its proof is inspired by the geometric idea.

The goal of the projection is to compute $z=\Pi_{\mathcal{P}_{d,\text{even}}}(x)$ , i. e. the projection of some given point $x\in\mathbb{R}^{d}$ onto the even parity polytope $\mathcal{P}_{d,\text{even}}$ . As in other projection algorithms (see e. g. [6]), we start with the application of the cut-search algorithm of Zhang and Siegel [2] to $x\in\mathbb{R}^{d}$ to obtain some forbidden-set inequality $\theta^{\top}w\leq p$ . If $\theta^{\top}\Pi_{[0,1]^{d}}(x)\leq p$ , then we know that all other forbidden-set inequalities are also fulfilled, because $\theta^{\top}w\leq p$ is, as the output of the cut-search algorithm, the only forbidden-set inequality of $\mathcal{P}_{d,\text{even}}$ , that is potentially violated. Hence, in this case it holds that $\Pi_{[0,1]^{d}}(x)\in\mathcal{P}_{d,\text{even}}$ and that $\Pi_{\mathcal{P}_{d,\text{even}}}(x)=\Pi_{[0,1]^{d}}(x)$ . If $\theta^{\top}\Pi_{[0,1]^{d}}(x)>p$ , we know from Lemma 1 that $z$ lies on the face $\{w\in[0,1]^{d}:\theta^{\top}w=p\}$ . In this case, we compute the orthogonal projection of $x$ onto the hyperplane $\theta^{\top}w=p$ . The following lemma states, that the orthogonal projection onto the hyperplane moves the point $x$ in the correct direction, namely in the direction of the desired projection $z=\Pi_{\mathcal{P}_{d,\text{even}}}(x)$ .

Lemma 3.

Let $U:=\{w\in\mathbb{R}^{d}:\theta^{\top}w=p\}$ be a hyperplane and $\emptyset\neq F:=U\cap[0,1]^{d}$ its intersection with the unit hypercube. Let $x\in\mathbb{R}^{d}$ . Then it holds that $\Pi_{F}(x)\in\operatorname*{arg\,min}_{y\in F}\left\lVert\Pi_{U}(x)-y\right\rVert_{2}$ . This means, that the projection of $x$ onto $F$ is the point on $F$ , which has the smallest distance to the projection of $x$ onto the hyperplane.

Proof:

Let $z=\Pi_{F}(x)$ be the projection of $x$ onto $F$ and let $v=\Pi_{U}(x)$ be the projection of $x$ onto the hyperplane $U$ . Let $y\in F$ . The projection $v$ can be written as $x-\lambda\theta$ for some $\lambda\in\mathbb{R}$ , because $\theta$ is a normal vector of the hyperplane $\theta^{\top}w=p$ . Hence, it follows that

$$(x-v)^{\top}(v-y)=\lambda\theta^{\top}(v-y)=\lambda(\theta^{\top}v-\theta^{\top}y)=\lambda(p-p)=0.$$

We can conclude that

$$\lVert x-y\rVert_{2}^{2}=\lVert x-v\rVert_{2}^{2}+\lVert v-y\rVert_{2}^{2}.$$

By replacing $y$ with $z$ , one obtains that

$$\lVert x-z\rVert_{2}^{2}=\lVert x-v\rVert_{2}^{2}+\lVert v-z\rVert_{2}^{2}.$$

Hence, it follows that

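$$\lVert v-z\rVert_{2}^{2}=\lVert x-z\rVert_{2}^{2}-\lVert x-v\rVert_{2}^{2}\leq\lVert x-y\rVert_{2}^{2}-\lVert x-v\rVert_{2}^{2}=\lVert v-y\rVert_{2}^{2}\quad\forall y\in F.$$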

∎

To perform the projection onto the hyperplane, we need to subtract a multiple of the normal vector of the hyperplane $\theta^{\top}w=p$ from $x$ , i. e. we want to find the step length $\lambda$ such that $x-\lambda\theta$ lies on the hyperplane. Therefore, the equation

[TABLE]

needs to be fulfilled. Hence, the projection of $x$ onto $\theta^{\top}w=p$ is given by

[TABLE]
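As an illustration, the projection onto the hyperplane can be sketched in a few lines. The function name `project_onto_hyperplane` and the sample values are assumptions made for this example; the formula $v=x-\frac{\theta^{\top}x-p}{d}\theta$ is the one used in the text, which relies on $\theta^{\top}\theta=d$ for $\theta\in\{\pm 1\}^{d}$ .

```python
def project_onto_hyperplane(x, theta, p):
    """Orthogonal projection of x onto {w : theta^T w = p}.

    Assumes that theta has entries in {-1, +1}, so theta^T theta = len(x).
    """
    d = len(x)
    # The step length lam solves theta^T (x - lam * theta) = p.
    lam = (sum(t * xi for t, xi in zip(theta, x)) - p) / d
    return [xi - lam * t for xi, t in zip(x, theta)]


# Hypothetical sample data: V = {1, 2, 3}, hence p = |V| - 1 = 2.
x = [3.0, 0.8, 0.9]
theta = [1, 1, 1]
p = 2
v = project_onto_hyperplane(x, theta, p)
```

In this example, the step length is $0.9$ and $v$ is approximately $(2.1,-0.1,0)^{\top}$ , which lies on the hyperplane but outside of $[0,1]^{3}$ .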

If $v$ lies in $[0,1]^{d}$ , then $v\in\{w\in[0,1]^{d}:\theta^{\top}w=p\}$ holds. By Lemma 1, this means that $v$ is the wanted projection onto $\mathcal{P}_{d,\text{even}}$ in this case. If $v\notin[0,1]^{d}$ , we claim that we can fix at least one component $z_{i}$ with $v_{i}\notin[0,1]$ to $0$ or $1$ . More precisely, we claim that we can fix those components $z_{i}$ for which the projection of $v_{i}$ onto $[0,1]$ would move the point $v$ into the feasible halfspace $\theta^{\top}w\leq p$ of the violated forbidden-set inequality. If $v_{i}>1$ , this would mean that we move into the direction $-e_{i}$ , i. e. $\theta^{\top}(v-e_{i})\leq p$ shall be fulfilled, where $e_{i}$ denotes the $i$ -th unit vector. With $\theta^{\top}v=p$ and $\theta\in\{\pm 1\}^{d}$ , we can conclude that

[TABLE]

For the case $v_{i}<0$ , projecting onto $[0,1]$ means to move into the direction $e_{i}$ , i. e. $\theta^{\top}(v+e_{i})\leq p$ shall be fulfilled. In the same way, we get that

[TABLE]

This means that, if $v_{i}>1$ and $\theta_{i}=1$ , we claim that $z_{i}=1$ . If $v_{i}<0$ and $\theta_{i}=-1$ , we claim that $z_{i}=0$ . This claim is formalized and proven in the following main theorem of the paper:

Theorem 4.

Let $x\in\mathbb{R}^{d}$ with $d\geq 2$ . Let

[TABLE]

with $V\subseteq\{1,\dots,d\}$ ( $|V|$ even or odd) be a forbidden-set inequality. Let $v=\Pi_{\{w\in\mathbb{R}^{d}:\theta^{\top}w=|V|-1\}}(x)$ be the projection of $x$ onto the hyperplane $\theta^{\top}w=|V|-1$ and let $z=\Pi_{F}(x)$ be the projection of $x$ onto the face $F:=\{w\in[0,1]^{d}:\theta^{\top}w=|V|-1\}$ . Let $i\in\{1,\dots,d\}$ . Then it holds:

1. If $v_{i}>1$ and $\theta_{i}=1$ , then $z_{i}=1$ .

2. If $v_{i}<0$ and $\theta_{i}=-1$ , then $z_{i}=0$ .

Proof:

to 1.: It is proven by contradiction. Assume that $z_{i}<1$ :

The idea of this proof is to construct another point $z+\lambda y$ on the face $F$ , which has a shorter distance to $v$ than $z$ . This results in a contradiction to Lemma 3. In order to construct $z+\lambda y$ , we start in $z$ and move along the hyperplane perpendicular to the face $F^{\prime}:=\{w\in F:w_{i}=1\}$ until we intersect it. The intersection point is then the wanted point in $F$ with the shorter distance to $v$ . The situation is illustrated in Figure 3.

Finding direction of improvement $y$

For finding the improving direction $y$ , we increase the component $z_{i}$ along the hyperplane $\theta^{\top}w=|V|-1$ . This means that the direction $y$ is the orthogonal projection of $e_{i}$ onto $\theta^{\top}w=0$ . Since $\theta$ is a normal vector of $\theta^{\top}w=0$ , we get that

[TABLE]

For $j\in\{1,\dots,d\}\setminus\{i\}$ , this leads to

[TABLE]

For $j=i$ , it leads to

[TABLE]

By multiplying with the denominator $d$ , we get that

[TABLE]

Finding the intersection point

Next, we determine the step length $\lambda$ , such that $z+\lambda y\in F^{\prime}$ . Hence, we set

[TABLE]

It follows that $\lambda=\frac{1-z_{i}}{d-1}$ . Next, we show that $z+\lambda y$ lies in $F$ . Since $z\in F$ and $\theta^{\top}y=0$ , it follows by

[TABLE]

that $z+\lambda y$ is contained in the hyperplane. Therefore, it is left to show that $0\leq z_{j}+\lambda y_{j}\leq 1$ for all $j\in\{1,\dots,d\}\setminus\{i\}$ . Let $j\in\{1,\dots,d\}\setminus\{i\}$ .

Case 1: $\theta_{j}=1$

Since $z_{j}\leq 1$ , $z_{i}<1$ and $d>1$ , it follows that $(z+\lambda y)_{j}=z_{j}+\frac{1-z_{i}}{d-1}\cdot(-\theta_{j})=z_{j}-\frac{1-z_{i}}{d-1}\leq 1$ . In the following, we use that $z$ lies on the hyperplane, i. e.

[TABLE]

Since

[TABLE]

and since $\theta_{i}=\theta_{j}=1$ , it follows that

[TABLE]

Together with $d\geq 2$ and $0\leq z_{j}\leq 1$ , we can conclude that

[TABLE]

By using some algebra, it follows that

[TABLE]

Hence, it follows that $(z+\lambda y)_{j}\geq 0$ .

Case 2: $\theta_{j}=-1$

This case follows the same idea as the first case. With $z_{j}\geq 0$ , $z_{i}<1$ and $d>1$ , we get that

[TABLE]

Since $\theta_{i}=1$ and $\theta_{j}=-1$ , it follows with (16) that

[TABLE]

With $d\geq 2$ and $0\leq z_{j}\leq 1$ , we can conclude that

[TABLE]

By using some algebra, $\theta_{j}=-1$ and $\theta_{i}=1$ , we get that

[TABLE]

Hence, it follows that $(z+\lambda y)_{j}\leq 1$ .

Distances to $v$

We have shown that $z+\lambda y\in F^{\prime}\subset F$ . Next, we show that $z+\lambda y$ is a point on the face, that is closer to $v$ than $z$ .

In the following, we use that

[TABLE]

Calculating the distance between $v$ and $z$ , we obtain

[TABLE]

The last two terms can be transformed to

[TABLE]

Hence, it follows with (20) that

[TABLE]

However, since $z+\lambda y\in F$ , the inequality $\left\lVert v-z\right\rVert_{2}^{2}>\left\lVert v-(z+\lambda y)\right\rVert_{2}^{2}$ contradicts Lemma 3, according to which $z$ , the projection of $x$ onto the face $F$ , is a point of $F$ with the smallest distance to $v$ . Hence, the assumption $z_{i}<1$ was wrong and it must hold that $z_{i}=1$ .

to 2.: The proof is similar to the first part. To obtain a contradiction, we assume that $z_{i}>0$ .

We want to decrease the component $z_{i}$ along the hyperplane $\theta^{\top}w=|V|-1$ . Hence, we choose the direction vector $y$ as the orthogonal projection of $-e_{i}$ onto $\theta^{\top}w=0$ . We get that

[TABLE]

For $j\in\{1,\dots,d\}\setminus\{i\}$ , this leads to

[TABLE]

For $j=i$ , we obtain

[TABLE]

By multiplying with $d$ , we obtain the direction vector

[TABLE]

For determining the step length $\lambda$ , $z+\lambda y$ shall be the intersection point with the face $F^{\prime}=\{w\in F:w_{i}=0\}$ . Hence, $z+\lambda y$ shall fulfill the equation

[TABLE]

Hence, we get that $\lambda=\frac{z_{i}}{d-1}$ .

Again, we want to show that $z+\lambda y\in F^{\prime}\subset F$ . As in (14), it follows that $\theta^{\top}(z+\lambda y)=|V|-1$ . It remains to show that $0\leq z_{j}+\lambda y_{j}\leq 1$ for all $j\in\{1,\dots,d\}\setminus\{i\}$ . For this purpose, we distinguish two cases:

Case 1: $\theta_{j}=1$

We get that

[TABLE]

With $\theta_{j}=1=-\theta_{i}$ and (16) we can conclude that

[TABLE]

and that

[TABLE]

By using some algebra, we get that

[TABLE]

Hence, it holds that $z_{j}+\lambda y_{j}\geq 0$ .

Case 2: $\theta_{j}=-1$

It holds that

[TABLE]

In this case, we have $\theta_{i}=\theta_{j}=-1$ . Together with (16), it follows that

[TABLE]

and

[TABLE]

We get that

[TABLE]

Hence, it follows that $(z+\lambda y)_{j}\leq 1$ and $z+\lambda y\in F^{\prime}\subset F$ .

For the distance comparison, we use

[TABLE]

As in part 1 of the proof, it holds that

[TABLE]

For the last two terms, we get that

[TABLE]

Hence, it follows that $\left\lVert v-z\right\rVert_{2}^{2}>\left\lVert v-(z+\lambda y)\right\rVert_{2}^{2}$ . Again, Lemma 3, $z+\lambda y\in F$ and $\left\lVert v-z\right\rVert_{2}^{2}>\left\lVert v-(z+\lambda y)\right\rVert_{2}^{2}$ contradict the fact that $z$ is the projection of $x$ onto the face $F$ . Hence, the assumption $z_{i}>0$ was wrong and it follows that $z_{i}=0$ . ∎

The next theorem shows that the conditions of the last theorem are fulfilled for at least one component $v_{i}$ if $v\notin[0,1]^{d}$ :

Theorem 5.

Let $x\in\mathbb{R}^{d}$ and let

[TABLE]

with $V\subseteq\{1,\dots,d\}$ ( $|V|$ odd or even) a forbidden-set inequality. Let $v=\Pi_{\{w\in\mathbb{R}^{d}:\theta^{\top}w=|V|-1\}}(x)$ be the projection of $x$ onto the hyperplane $\theta^{\top}w=|V|-1$ with $v\notin[0,1]^{d}$ . Then there exists at least one $i\in\{1,\dots,d\}$ such that

[TABLE]

or

[TABLE]

Proof:

The proof is again by contradiction:

Let us assume that for all $v_{i}\notin[0,1]$ , it holds that

[TABLE]

or

[TABLE]

In particular, it holds that $v_{i}\leq 1$ for all $i=1,\dots,d$ with $\theta_{i}=1$ and that $v_{i}\geq 0$ for all $i=1,\dots,d$ with $\theta_{i}=-1$ . Since $v\notin[0,1]^{d}$ , there exists at least one component $v_{j}$ with $j\in\{1,\dots,d\}$ that fulfills (23) or (24). If $v_{j}<0$ and $\theta_{j}=1$ , it follows that

[TABLE]

If $v_{j}>1$ and $\theta_{j}=-1$ , it holds that

[TABLE]

Both cases lead to a contradiction. Hence, the assumption was wrong and the claim follows. ∎
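The fixing rules of Theorem 4 and the existence statement of Theorem 5 can be illustrated with a small randomized experiment. This is only a sketch under assumptions: the function names, the sampling range $[-5,5)$ and the dimension $d=6$ are chosen for the example and do not stem from the paper.

```python
import random


def project_onto_hyperplane(x, theta, p):
    """Orthogonal projection of x onto theta^T w = p, theta in {-1, +1}^d."""
    lam = (sum(t * xi for t, xi in zip(theta, x)) - p) / len(x)
    return [xi - lam * t for xi, t in zip(x, theta)]


def fixable_components(v, theta):
    """Map index -> value of z_i for all components covered by Theorem 4."""
    fixed = {}
    for i, (vi, ti) in enumerate(zip(v, theta)):
        if vi > 1 and ti == 1:
            fixed[i] = 1.0   # rule 1: v_i > 1 and theta_i = 1
        elif vi < 0 and ti == -1:
            fixed[i] = 0.0   # rule 2: v_i < 0 and theta_i = -1
    return fixed


def check_theorem5(trials=1000, d=6, seed=0):
    """Return False if some v outside [0,1]^d has no fixable component."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(d)]
        theta = [rng.choice((-1, 1)) for _ in range(d)]
        p = theta.count(1) - 1          # right-hand side |V| - 1
        v = project_onto_hyperplane(x, theta, p)
        outside = any(vi < 0 or vi > 1 for vi in v)
        if outside and not fixable_components(v, theta):
            return False
    return True
```

For instance, `fixable_components([2.1, -0.1, 0.0], [1, 1, 1])` fixes only the first component to $1$ ; the second component lies outside of $[0,1]$ but satisfies neither rule.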

VII Recursive Structure of the Projection

Up to now, we established the following procedure for computing the projection $z=\Pi_{\mathcal{P}_{d,\text{even}}}(x)$ : First, we compute the potentially violated forbidden-set inequality $\theta^{\top}w\leq p$ of $\Pi_{[0,1]^{d}}(x)$ and check whether $\theta^{\top}\Pi_{[0,1]^{d}}(x)\leq p$ . If this is true, then $z=\Pi_{[0,1]^{d}}(x)$ . Otherwise, we know from Lemma 1, that $z\in\{w\in[0,1]^{d}:\theta^{\top}w=p\}$ and compute the projection $v=x-\frac{\theta^{\top}x-p}{d}\theta$ of $x$ onto the hyperplane. If $v\in[0,1]^{d}$ , then $z=v$ . Otherwise, Theorem 5 tells us that we can compute at least one component of $z$ by using Theorem 4. Next, we show that the remaining components of $z$ are the solution of a smaller-dimensional projection problem onto $\mathcal{P}_{\widetilde{d},\text{even}}$ or $\mathcal{P}_{\widetilde{d},\text{odd}}$ , where $\widetilde{d}<d$ . First, we consider the case that only one component of $z$ was fixed with Theorem 4. For this purpose, we show that the points in the parity polytopes exhibit the following recursive structure:

Theorem 6.

Let $d\geq 2$ and $i\in\{1,\dots,d\}$ . Then it holds:

[TABLE]

Proof:

For 1.) : Since

[TABLE]

is a convex hull of finitely many points, it follows that it is a polyhedron. Since $\mathcal{P}_{d,\text{even}}\subseteq[0,1]^{d}$ , it follows that $x_{i}\leq 1$ is a valid inequality and $\{x\in\mathcal{P}_{d,\text{even}}:x_{i}=1\}$ is a face of $\mathcal{P}_{d,\text{even}}$ . Hence, all extreme points of $\{x\in\mathcal{P}_{d,\text{even}}:x_{i}=1\}$ are also extreme points of $\mathcal{P}_{d,\text{even}}$ . Together with the fact that all extreme points of $\mathcal{P}_{d,\text{even}}$ are binary vectors, it follows that $\{x\in\mathcal{P}_{d,\text{even}}:x_{i}=1\}$ does also only have binary vectors as extreme points. Since $\{x\in\mathcal{P}_{d,\text{even}}:x_{i}=1\}\subseteq[0,1]^{d}$ , it even follows that its binary solutions are exactly its extreme points. For any binary vector $x\in\{0,1\}^{d}$ , it holds that

[TABLE]

Since $\{x\in\mathcal{P}_{d,\text{even}}:x_{i}=1\}\subseteq[0,1]^{d}$ is a polytope, it follows from Minkowski’s Theorem that $\{x\in\mathcal{P}_{d,\text{even}}:x_{i}=1\}$ is the convex hull of its extreme points, i. e.

[TABLE]

In $(*)$ , we use that $\sum_{l=1}^{L}\lambda_{l}\cdot 1=1$ . The proofs for 2) to 4) are completely analogous to 1). In 2) and 4), we need to replace $1$ by $0$ in the corresponding $i$ -th components and use $\sum_{l=1}^{L}\lambda_{l}\cdot 0=0$ instead of $\sum_{l=1}^{L}\lambda_{l}=1$ . The equivalences in (25) - (27) are replaced by:

2) $\begin{aligned} &x\in\mathcal{P}_{d,\text{even}}\ \text{and}\ x_{i}=0\\ \Leftrightarrow\quad&x_{i}=0\ \text{and}\ \sum_{j=1}^{d}x_{j}\text{ is even}\\ \Leftrightarrow\quad&x_{i}=0\ \text{and}\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{d}x_{j}\text{ is even.}\end{aligned}$

3) $\begin{aligned} &x\in\mathcal{P}_{d,\text{odd}}\ \text{and}\ x_{i}=1\\ \Leftrightarrow\quad&x_{i}=1\ \text{and}\ \sum_{j=1}^{d}x_{j}\text{ is odd}\\ \Leftrightarrow\quad&x_{i}=1\ \text{and}\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{d}x_{j}\text{ is even.}\end{aligned}$

4) $\begin{aligned} &x\in\mathcal{P}_{d,\text{odd}}\ \text{and}\ x_{i}=0\\ \Leftrightarrow\quad&x_{i}=0\ \text{and}\ \sum_{j=1}^{d}x_{j}\text{ is odd}\\ \Leftrightarrow\quad&x_{i}=0\ \text{and}\sum_{\begin{subarray}{c}j=1\\ j\neq i\end{subarray}}^{d}x_{j}\text{ is odd.}\end{aligned}$

∎
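On the level of binary vectors, the four equivalences used in the proof can be verified by enumeration for small $d$ . The sketch below only checks the extreme points, not the polytopes themselves; the function names are hypothetical, and indices are 0-based in the code.

```python
from itertools import product


def parity_vertices(d, parity):
    """Binary vectors of length d whose coordinate sum is even (parity=0)
    or odd (parity=1)."""
    return {b for b in product((0, 1), repeat=d) if sum(b) % 2 == parity}


def face_vertices_without_i(d, parity, i, value):
    """Vertices of the parity polytope with x_i = value, coordinate i removed."""
    return {b[:i] + b[i + 1:]
            for b in parity_vertices(d, parity) if b[i] == value}


def check_recursive_structure(max_d=6):
    """Fixing x_i = 1 switches the parity of the rest, x_i = 0 keeps it."""
    for d in range(2, max_d + 1):
        for i in range(d):
            for parity in (0, 1):
                if face_vertices_without_i(d, parity, i, 1) != parity_vertices(d - 1, 1 - parity):
                    return False
                if face_vertices_without_i(d, parity, i, 0) != parity_vertices(d - 1, parity):
                    return False
    return True
```

Taking convex hulls on both sides then gives the corresponding statement for the faces of the parity polytopes, as argued in the proof.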

Next, we show that after fixing one component $z_{i}$ of $z$ , the remaining entries of $z$ are again the solution of a projection problem, namely the projection of the remaining components of $x$ onto a smaller-dimensional parity polytope.

Theorem 7.

1. Let $z=\Pi_{\mathcal{P}_{d,\text{even}}}(x)$ with $z_{i}=1$ for some $i\in\{1,\dots,d\}$ . Then it holds:

[TABLE]

2. Let $z=\Pi_{\mathcal{P}_{d,\text{even}}}(x)$ with $z_{i}=0$ for some $i\in\{1,\dots,d\}$ . Then it holds:

[TABLE]

3. Let $z=\Pi_{\mathcal{P}_{d,\text{odd}}}(x)$ with $z_{i}=1$ for some $i\in\{1,\dots,d\}$ . Then it holds:

[TABLE]

4. Let $z=\Pi_{\mathcal{P}_{d,\text{odd}}}(x)$ with $z_{i}=0$ for some $i\in\{1,\dots,d\}$ . Then it holds:

[TABLE]

Proof:

It follows directly from Theorem 6 and

[TABLE]

∎

If more than one component of $z$ was fixed with Theorem 4, then Theorem 7 can be applied several times inductively so that the remaining components of $z$ are again the solution of a projection problem onto a parity polytope.
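The inductive application can be sketched as a small bookkeeping routine. It assumes the reading of Theorems 6 and 7 suggested by the surrounding text: every component fixed to $1$ switches the type of the remaining parity polytope, while a component fixed to $0$ keeps it. The function `reduce_problem` and its return format are hypothetical.

```python
def reduce_problem(x, fixed, parity="even"):
    """Describe the subproblem that remains after fixing some components.

    x      : list of floats, the point to be projected
    fixed  : dict mapping a 0-based index to its fixed value 0 or 1
    parity : type of the parity polytope of the current problem

    Returns (free indices, remaining components of x, parity of the
    smaller-dimensional polytope).
    """
    ones = sum(1 for value in fixed.values() if value == 1)
    if ones % 2 == 1:
        # An odd number of ones switches between even and odd.
        parity = "odd" if parity == "even" else "even"
    free = [j for j in range(len(x)) if j not in fixed]
    return free, [x[j] for j in free], parity
```

For example, fixing the first component of $x=(3,0.8,0.9)^{\top}$ to $1$ leaves the projection of $(0.8,0.9)^{\top}$ onto $\mathcal{P}_{2,\text{odd}}$ .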

Hence, the idea of our algorithm is to repeat our mentioned steps (see the beginning of this section) recursively in order to solve the corresponding smaller-dimensional projection problem onto some parity polytope $\mathcal{P}_{\widetilde{d},\text{even}}$ or $\mathcal{P}_{\widetilde{d},\text{odd}}$ with $\widetilde{d}<d$ . Since we showed in Theorem 2 that the required properties for the parity polytope $\mathcal{P}_{\widetilde{d},\text{even}}$ from the literature are also true in the analogous way for $\mathcal{P}_{\widetilde{d},\text{odd}}$ , it is not relevant in our problem onto which type of parity polytope we need to project. In the following, we will just talk about (csa) when we mean the cut-search algorithm of Zhang and Siegel [2] for $\mathcal{P}_{\widetilde{d},\text{even}}$ or the analogous cut-search algorithm from Theorem 2 iii) for $\mathcal{P}_{\widetilde{d},\text{odd}}$ , respectively.

The first step in our proposed projection algorithm was the application of (csa) to $\Pi_{[0,1]^{d}}(x)$ . Let us assume we are now in the recursive part of the algorithm with $\widetilde{d}<d$ , i. e. we already fixed some components of $z$ and want to project the remaining components $\widetilde{x}\in\mathbb{R}^{\widetilde{d}}$ of $x$ to the corresponding (even or odd) smaller-dimensional parity polytope. Next, we will show that the (csa) applied to $\Pi_{[0,1]^{\widetilde{d}}}(\widetilde{x})$ can be computed very efficiently. We do not need to apply (csa) again from scratch. Instead, we can use the output $\theta^{\top}w\leq p$ with $\theta\in\{\pm 1\}^{d}$ of (csa) applied to $\Pi_{[0,1]^{d}}(x)$ , which was computed in the previous step. The output of (csa) for $\Pi_{[0,1]^{\widetilde{d}}}(\widetilde{x})$ is then given by the components of $\theta$ for which $z_{i}$ is not yet fixed. The right-hand side $\widetilde{p}$ is the right-hand side of the previous step minus the number of components of $z$ that were fixed to $1$ in the previous step. The next theorem shows this statement for the case that one component of $z$ was fixed in the previous step.

Theorem 8.

Let $x=(\widetilde{x}_{1},\dots,\widetilde{x}_{i-1},y,\widetilde{x}_{i},\dots,\widetilde{x}_{d-1})^{\top}\in\mathbb{R}^{d}$ with $d\geq 2$ .

1. Let $(\theta,p)\in\mathbb{R}^{d+1}$ be the forbidden-set inequality returned by (csa) applied to $x$ and $\mathcal{P}_{d,\text{even}}$ . Let $\theta_{i}=1$ . Then $((\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top},p-1)$ is the output of (csa) applied to $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{odd}}$ .

2. Let $(\theta,p)\in\mathbb{R}^{d+1}$ be the forbidden-set inequality returned by (csa) applied to $x$ and $\mathcal{P}_{d,\text{odd}}$ . Let $\theta_{i}=1$ . Then $((\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top},p-1)$ is the output of (csa) applied to $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{even}}$ .

3. Let $(\theta,p)\in\mathbb{R}^{d+1}$ be the forbidden-set inequality returned by (csa) applied to $x$ and $\mathcal{P}_{d,\text{even}}$ . Let $\theta_{i}=-1$ . Then $((\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top},p)$ is the output of (csa) applied to $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{even}}$ .

4. Let $(\theta,p)\in\mathbb{R}^{d+1}$ be the forbidden-set inequality returned by (csa) applied to $x$ and $\mathcal{P}_{d,\text{odd}}$ . Let $\theta_{i}=-1$ . Then $((\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top},p)$ is the output of (csa) applied to $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{odd}}$ .

Proof:

We use that the right-hand side of any forbidden-set inequality $\theta^{\top}w\leq p$ is given by $p=|\{i\in\{1,\dots,d\}:\theta_{i}=1\}|-1$ .

to 1): We consider two cases:

Case 1: $|\{j\in\{1,\dots,d\}:x_{j}>0.5\}|$ is odd

In this case, the (csa) for $x$ and $\mathcal{P}_{d,\text{even}}$ stops after step 1 and it holds that $\theta_{j}=1$ if and only if ${x}_{j}>0.5$ for all $j=1,\dots,d$ . Since $\theta_{i}=1$ , it holds that $y>0.5$ and that

[TABLE]

is even. This means that the (csa) for $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{odd}}$ also stops after step 1 and computes

$((\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top},p-1)$ .

Case 2: $|\{j\in\{1,\dots,d\}:x_{j}>0.5\}|$ is even

In this case, the (csa) applied to $x$ and $\mathcal{P}_{d,\text{even}}$ would need to do step 2, i. e. it would compute $i^{*}$ and set $\theta_{i^{*}}:=-\theta_{i^{*}}$ . We consider two cases:

Case 2a: $i^{*}=i$

Because of $\theta_{i}=1$ and $i^{*}=i$ , it holds in this case that $y\leq 0.5$ . Then, the (csa) for $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{odd}}$ would compute the vector $(\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top}$ in step 1 and stop because

[TABLE]

is even. The right-hand side is $p-1=|\{j\in\{1,\dots,d\}\setminus\{i\}:x_{j}>0.5\}|-1$ , because no component of $\theta$ is flipped and $p-1$ , the right-hand side after step 1, is not increased to $p$ in step 2, as it is done in the (csa) applied to $x$ and $\mathcal{P}_{d,\text{even}}$ .

Case 2b: $i^{*}\neq i$

Because of $\theta_{i}=1$ and $i^{*}\neq i$ , it holds that $y>0.5$ . Hence, it follows that

[TABLE]

is odd. Since $i^{*}\neq i$ , the same entry of $\theta$ is flipped by (csa) for $x$ and $\widetilde{x}$ . Hence, the (csa) applied to $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{odd}}$ would output the coefficient vector $(\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top}$ . From (28) and the flipping of $\theta_{i^{*}}$ in the (csa) for $x$ and $\widetilde{x}$ , it follows that the right-hand side for $\widetilde{x}$ is one less than the one for $x$ after step 1 and step 2 of (csa). Hence, the right-hand side $p-1$ is output by the (csa) applied to $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{odd}}$ .

to 2): the proof of 1) can be used, where every “odd” is replaced by “even” and vice versa.

to 3): We consider two cases:

Case 1: $|\{j\in\{1,\dots,d\}:x_{j}>0.5\}|$ is odd

In this case, the (csa) applied to $x$ and $\mathcal{P}_{d,\text{even}}$ stops after step 1 and $x_{j}>0.5$ is equivalent to $\theta_{j}=1$ for all $j=1,\dots,d$ . Hence, it follows from $\theta_{i}=-1$ that $y\leq 0.5$ and that

[TABLE]

is odd. Hence, the (csa) applied to $\widetilde{x}$ also stops after step 1 and outputs $((\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top},p)$ .

Case 2: $|\{j\in\{1,\dots,d\}:x_{j}>0.5\}|$ is even

In this case, (csa) applied to $x$ and $\mathcal{P}_{d,\text{even}}$ is not finished after step 1 and has to flip $\theta_{i^{*}}$ . There can occur two cases:

Case 2a: $i^{*}=i$ .

Since $\theta_{i}=-1$ , it must hold that $y>0.5$ and that

[TABLE]

is odd. Hence, the (csa) applied to $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{even}}$ stops after step 1 and returns the coefficient vector $(\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top}$ . After step 1 of (csa), the right-hand side to $x$ is one unit larger than the one of $\widetilde{x}$ . However, since the (csa) applied to $x$ flips $\theta_{i}$ from $1$ to $-1$ , the computed right-hand sides to $x$ and $\widetilde{x}$ are the same after the termination of (csa). Hence the (csa) applied to $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{even}}$ returns $((\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top},p)$ .

Case 2b: $i^{*}\neq i$

In this case, $\theta_{i}$ is always $-1$ during the whole (csa) applied to $x$ and $\mathcal{P}_{d,\text{even}}$ . Hence, it holds that $y\leq 0.5$ and that

[TABLE]

is even. Hence, the (csa) for $x$ and $\widetilde{x}$ (with $\mathcal{P}_{d,\text{even}}$ and $\mathcal{P}_{d-1,\text{even}}$ ) will compute the same right-hand side in step 1 and flip the same entry of $\theta$ . Hence, the (csa) applied to $\widetilde{x}$ and $\mathcal{P}_{d-1,\text{even}}$ outputs

$((\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top},p)$ .

to 4): the proof of 3) can be used by replacing every “even” by “odd” and vice versa. ∎

If more than one component of $z$ was fixed in the last iteration, then the last theorem can be applied several times inductively. The next step after computing the potentially violated forbidden-set inequality $\widetilde{\theta}^{\top}\widetilde{w}\leq\widetilde{p}$ of $\Pi_{[0,1]^{\widetilde{d}}}(\widetilde{x})$ would be the check whether $\widetilde{\theta}^{\top}\Pi_{[0,1]^{\widetilde{d}}}(\widetilde{x})\leq\widetilde{p}$ . If the check is fulfilled, then $\Pi_{[0,1]^{\widetilde{d}}}(\widetilde{x})$ would be the projection of $\widetilde{x}$ onto $\mathcal{P}_{\widetilde{d},\text{even}}$ or $\mathcal{P}_{\widetilde{d},\text{odd}}$ , respectively. However, we show that this check is never fulfilled and therefore redundant in the recursive part of our projection algorithm. If the check is true, then the corresponding very first check in the beginning of the algorithm would have also been true and the algorithm would have terminated in the beginning with $\Pi_{[0,1]^{d}}(x)$ . This is shown in the next theorem:

Theorem 9.

The check $\theta^{\top}\Pi_{[0,1]^{d}}(x)\leq p$ is only necessary in the beginning of the proposed projection algorithm, i. e. it is redundant in the recursive calls of the algorithm.

Proof:

Let $x\in\mathbb{R}^{d}$ and $(\theta,p)$ the input of a recursive call of the described projection algorithm. Since it is a recursive call, at least one component of the wanted projection $z$ was fixed before in the algorithm by using Theorem 4. With Theorem 8, it follows that $\theta^{\top}w\leq p$ is the output of the corresponding cut-search algorithm applied to $\Pi_{[0,1]^{d}}(x)$ and $\mathcal{P}_{d,\text{even}}$ (or $\mathcal{P}_{d,\text{odd}}$ ). If we undo the last fixing of a component $z_{j}$ in the algorithm, we obtain a larger vector $\begin{pmatrix}x\\ \widetilde{x}_{d+1}\end{pmatrix}\in\mathbb{R}^{d+1}$ . (Without loss of generality, the last component of $z$ was fixed). There are two possible cases when we fix components of $z$ according to Theorem 4:

Case 1: $\theta_{d+1}=1$ and $v_{d+1}>1$

From the way components of $z$ are fixed in Theorem 4 and from Theorem 8, it follows that

[TABLE]

is the result of the cut-search algorithm applied to $\Pi_{[0,1]^{d+1}}\begin{pmatrix}x\\ \widetilde{x}_{d+1}\end{pmatrix}$ with $\mathcal{P}_{d+1,\text{odd}}$ (or $\mathcal{P}_{d+1,\text{even}}$ ).

If $\theta^{\top}\Pi_{[0,1]^{d}}(x)\leq p$ , then it holds that

[TABLE]

Case 2: $\theta_{d+1}=-1$ and $v_{d+1}<0$

From Theorem 8 and the way components of $z$ are fixed in Theorem 4, it follows that

[TABLE]

is the result of the cut-search algorithm applied to $\Pi_{[0,1]^{d+1}}\begin{pmatrix}x\\ \widetilde{x}_{d+1}\end{pmatrix}$ with $\mathcal{P}_{d+1,\text{even}}$ (or $\mathcal{P}_{d+1,\text{odd}}$ ).

If $\theta^{\top}\Pi_{[0,1]^{d}}(x)\leq p$ , then it holds that

[TABLE]

Summarizing both cases, we can conclude that if the check in Theorem 9 is fulfilled during a recursive call of the proposed projection algorithm, then the corresponding vector $\begin{pmatrix}x\\ \widetilde{x}_{d+1}\end{pmatrix}$ fulfills the corresponding higher-dimensional check (29) or (30). Inductively, it follows that if the check in Theorem 9 is fulfilled, then the very first check of the algorithm, that tests whether the projection onto the unit hypercube lies in the parity polytope, was also fulfilled. But in this case, the algorithm terminates with the projection onto the unit hypercube without starting the recursion. Hence, if our projection algorithm is currently in a recursive call, then the check from Theorem 9 is never fulfilled, i. e. we can omit the check during the recursive part of the proposed algorithm. ∎

After the projection onto the hyperplane, there are three possibilities: A component $v_{i}$ could lie outside of $[0,1]$ and fulfill one of the two fixing conditions from Theorem 4, such that we can compute $z_{i}$ , or $v_{i}$ lies outside of $[0,1]$ and cannot be fixed, or $v_{i}\in[0,1]$ . If all components of $v$ lie in $[0,1]$ , the algorithm terminates and $z=v$ . Next, we investigate how the components of $v$ can switch between these three cases in the subsequent recursive steps.

For this purpose, let us assume first that the projection $v$ of $x$ onto the current hyperplane $\theta^{\top}w=p$ is not in $[0,1]^{d}$ (otherwise, $z=v$ and we are finished) and that we could only fix one component of $z$ , i. e. there exists exactly one $i\in\{1,\dots,d\}$ with

[TABLE]

or

[TABLE]

Let $\widetilde{x}=(x_{1},\dots,x_{i-1},x_{i+1},\dots,x_{d})^{\top}$ be the remaining components of $x$ in the next recursive step and let $\widetilde{\theta}=(\theta_{1},\dots,\theta_{i-1},\theta_{i+1},\dots,\theta_{d})^{\top}$ and

[TABLE]

be the violated forbidden-set inequality in the next recursive step. Let $(\widetilde{v}_{1},\dots,\widetilde{v}_{i-1},\widetilde{v}_{i+1},\dots,\widetilde{v}_{d})^{\top}=\widetilde{x}-\frac{\widetilde{\theta}^{\top}\widetilde{x}-\widetilde{p}}{d-1}\widetilde{\theta}$ be the projection of $\widetilde{x}\in\mathbb{R}^{\widetilde{d}}=\mathbb{R}^{d-1}$ onto the hyperplane $\widetilde{\theta}^{\top}\widetilde{w}=\widetilde{p}$ in the next recursive step. For simplifying the notation, we assume without loss of generality, that $j<i$ . Then, the following holds for any component $\widetilde{v}_{j}$ :

Theorem 10.

i) If $\theta_{j}=1$ , then $\widetilde{v}_{j}>v_{j}$ .

ii) If $\theta_{j}=-1$ , then $\widetilde{v}_{j}<v_{j}$ .

Proof:

We make a case distinction:

In both cases, we use that

[TABLE]

Case 1: $v_{i}>1$ , $\theta_{i}=1$ :

[TABLE]

If $\theta_{j}=1$ , then it holds that $\widetilde{v}_{j}=v_{j}+\underbrace{\frac{1}{d-1}}_{>0}\underbrace{(v_{i}-1)}_{>0}>v_{j}$ .

If $\theta_{j}=-1$ , then it holds that $\widetilde{v}_{j}=v_{j}-\underbrace{\frac{1}{d-1}}_{>0}\underbrace{(v_{i}-1)}_{>0}<v_{j}$ .

Case 2: $v_{i}<0$ , $\theta_{i}=-1$ :

[TABLE]

If $\theta_{j}=1$ , then it holds that $\widetilde{v}_{j}=v_{j}-\underbrace{\frac{1}{d-1}}_{>0}\underbrace{v_{i}}_{<0}>v_{j}$ .

If $\theta_{j}=-1$ , then it holds that $\widetilde{v}_{j}=v_{j}+\underbrace{\frac{1}{d-1}}_{>0}\underbrace{v_{i}}_{<0}<v_{j}$ . ∎

If more than one component was fixed, the analogous result follows from applying the last theorem inductively. Hence, it follows that components $v_{j}$ of the projection onto the hyperplane are strictly monotonically increasing in the case of $\theta_{j}=1$ and strictly monotonically decreasing in the case of $\theta_{j}=-1$ . Since we can fix components of $z$ in the cases $(v_{j}>1,\theta_{j}=1)$ and $(v_{j}<0,\theta_{j}=-1)$ , this means that the $v_{j}$ 's move into the direction where Theorem 4 is applicable. Additionally, this means if $v_{j}\in[0,1]$ , then there are two possibilities:

1. $v_{j}$ stays in $[0,1]$ in every following recursion.

2. $v_{j}$ is fixed in the first recursion, where it is not in $[0,1]$ anymore.
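The monotonicity from Theorem 10 can be observed on a small example. The sketch below uses hypothetical helper names and sample values; the update of the right-hand side ( $p-1$ if $\theta_{i}=1$ , otherwise $p$ ) is the one from Theorem 8.

```python
def hyperplane_point(x, theta, p):
    """Orthogonal projection of x onto theta^T w = p, theta in {-1, +1}^d."""
    lam = (sum(t * xi for t, xi in zip(theta, x)) - p) / len(x)
    return [xi - lam * t for xi, t in zip(x, theta)]


def drop_fixed(x, theta, p, i):
    """Remove component i after z_i was fixed; p decreases iff theta_i = +1."""
    p_next = p - 1 if theta[i] == 1 else p
    return x[:i] + x[i + 1:], theta[:i] + theta[i + 1:], p_next


# Hypothetical sample data with exactly one fixable component (index 0).
x = [3.0, 0.8, 0.9, 0.2]
theta = [1, 1, 1, -1]
p = theta.count(1) - 1
v = hyperplane_point(x, theta, p)     # v[0] > 1 and theta[0] = 1, so z_0 = 1
x_t, theta_t, p_t = drop_fixed(x, theta, p, 0)
v_t = hyperplane_point(x_t, theta_t, p_t)
# Pair each remaining theta_j with the change of the corresponding component.
moves = [(theta_t[j], v_t[j] - v[j + 1]) for j in range(len(v_t))]
```

Here $v\approx(2.375,0.175,0.275,0.825)^{\top}$ and $\widetilde{v}\approx(0.633,0.733,0.367)^{\top}$ : the components with $\theta_{j}=1$ increase and the component with $\theta_{j}=-1$ decreases, each by $\frac{v_{i}-1}{d-1}\approx 0.458$ in absolute value.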

VIII Projection Algorithm

Our projection method is summarized in Algorithm 1.

In lines 1-5, we apply the cut-search algorithm to $x$ , which leads to the same result as applying it to $\Pi_{[0,1]^{d}}(x)$ . In lines 6-8, it is checked whether $\Pi_{[0,1]^{d}}(x)$ lies in the parity polytope. If this is true, then $\Pi_{[0,1]^{d}}(x)$ is the projection of $x$ onto $\mathcal{P}_{d,\text{even}}$ and therefore returned. If $\Pi_{[0,1]^{d}}(x)\notin\mathcal{P}_{d,\text{even}}$ , we enter the while loop, the recursive part of the projection. Before entering the loop, we initialize $l$ by $d$ . This variable tracks the number of components of $z$ that are not yet computed. For the implementation, we swap the computed components of $z$ to the beginning of the vector and continue with the remaining vector. The variable $f$ tracks the index of the first uncomputed component of $z$ , whereas $fold$ stores the value of $f$ from the beginning of the current while loop iteration. The index vector $Q$ tracks all swaps that were made. Lines 12-17 describe one of the two stopping criteria. If the current dimension $l$ of the recursive subproblem is $1$ , then there are the following two possibilities for the corresponding parity polytope: The first case is

[TABLE]

In this case, $\mathcal{P}_{1,\text{even}}$ can be described by the box constraints $0\leq x\leq 1$ and the only forbidden-set inequality $x\leq 0$ . This means that $\theta_{f}=1$ and $p=0$ in this case. The second possibility is the odd parity polytope

[TABLE]

In this case, $\mathcal{P}_{1,\text{odd}}$ can be described by $0\leq x\leq 1$ and the only forbidden-set inequality $-x\leq-1$ . This means that $\theta_{f}=-1$ and $p=-1$ in this case. Hence, we can make the check $\theta_{f}=1$ to distinguish both cases. If this stopping criterion is not fulfilled, we continue and compute the projection of the remaining components of $x$ onto the hyperplane defined by the current forbidden-set inequality in line 18. In the lines 19-34, we go through all components of $v_{f,\dots,d}$ and check to which components of $z$ Theorem 4 can be applied, i. e. which components of $z$ can be fixed in this iteration. For the swaps in $x$ and $\theta$ in the lines 23-24 and 31-32, it is sufficient to update $x_{i}$ and $\theta_{i}$ , because $x_{f}$ and $\theta_{f}$ are not needed anymore. Theorem 5 says that at least one component can be fixed in the case of $v_{f,\dots,d}\notin[0,1]^{l}$ . Hence, if no component was fixed, i. e. the check in line 35 is true, then we are in the case $v_{f,\dots,d}\in[0,1]^{l}$ and we can stop with $z_{f,\dots,d}=v_{f,\dots,d}$ . Otherwise, we update the current problem size $l$ , update $fold$ and consider the corresponding smaller-dimensional projection problem in the next while loop iteration. In the algorithm, we do not check whether $l=0$ , which could happen theoretically. However, the next theorem shows that this situation cannot occur:

Theorem 11.

In Algorithm 1, the variable $l$ cannot become zero.

Proof:

Assume the claim is wrong, i. e. $l$ is set to zero during an iteration of the while loop in Algorithm 1. Let $v\in\mathbb{R}^{\widetilde{d}}$ be the current projection of $x\in\mathbb{R}^{\widetilde{d}}$ onto the current hyperplane $\theta^{\top}w=p=|\{i:\theta_{i}=1\}|-1$ in the iteration of the while loop, where $l$ is set to zero. Since $l$ is set to zero in line 38, this means that all remaining components of the projection of $x$ onto the face $\{w\in[0,1]^{\widetilde{d}}:\theta^{\top}w=p\}$ were fixed with Theorem 4. This means that for all $i=1,\dots,\widetilde{d}$ , it must hold that

[TABLE]

or

[TABLE]

Since $v$ lies on the hyperplane $\theta^{\top}v=p$ , it follows that

[TABLE]

which is a contradiction. Hence, the assumption was wrong and the claim follows. ∎

Since we fix, by Theorem 5, at least one component of $z$ in every iteration of the while loop in Algorithm 1, it follows that the worst-case complexity of Algorithm 1 is $\mathcal{O}(d)+d\cdot\mathcal{O}(d)=\mathcal{O}(d^{2})$ . Figure 4 displays the actual average number of iterations in the case $\Pi_{[0,1]^{d}}(x)\notin\mathcal{P}_{d,\text{even}}$ that we measured in our numerical results for different ranges of input vectors. In the figure, one can see that the actual number is close to $\log_{2}(d)$ . Hence, we make the conjecture that half of the components are fixed on average. A mathematical intuition behind this conjecture is that when we compute the projection $v$ onto the hyperplane $\theta^{\top}x=p$ , then every component $v_{j}$ can be in $[0,1]$ or not. If $v_{j}\in[0,1]$ , Theorem 10 states that $v_{j}$ stays in $[0,1]$ for the whole algorithm or is fixed in the first iteration where it leaves $[0,1]$ . Staying in $[0,1]$ would be a good situation, because if $v\in[0,1]^{d}$ , then the algorithm terminates and $v$ is the wanted projection onto the parity polytope (see line 35 of Algorithm 1). In the more difficult situation $v_{j}\notin[0,1]$ , there are four possibilities: $(v_{j}<0,\theta_{j}=1)$ , $(v_{j}<0,\theta_{j}=-1)$ , $(v_{j}>1,\theta_{j}=1)$ and $(v_{j}>1,\theta_{j}=-1)$ . In two of four, i. e. in half of the cases, we can fix the component $z_{j}$ of the projection by using Theorem 4. Under the conjecture that all four cases occur with equal probability, we would obtain an average complexity of $\mathcal{O}(d)+\mathcal{O}(\frac{d}{2})+\mathcal{O}(\frac{d}{4})+\dots=\mathcal{O}(2d)=\mathcal{O}(d)$ , i. e. we would have a quadratic worst-case and a linear average-case complexity, similar to the algorithm in [8].
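To make the overall procedure concrete, the following is a simplified sketch of the projection as described in this and the previous section. It is not a transcription of Algorithm 1: instead of the in-place swaps with $f$ , $fold$ and $Q$ , it keeps a list `free` of unfixed indices, and the choice of $i^{*}$ in the cut search (the component closest to $0.5$ ) is assumed from Zhang and Siegel [2]. The function names are hypothetical.

```python
def cut_search_even(u):
    """Sketch of (csa) for the even parity polytope, u in [0,1]^d."""
    theta = [1 if uj > 0.5 else -1 for uj in u]
    if theta.count(1) % 2 == 0:
        # Assumed flipping rule: entry closest to 0.5.
        i_star = min(range(len(u)), key=lambda j: abs(u[j] - 0.5))
        theta[i_star] = -theta[i_star]
    return theta, theta.count(1) - 1


def project_even_parity(x):
    """Projection of x onto the even parity polytope without sorting."""
    u = [min(1.0, max(0.0, xi)) for xi in x]      # projection onto the box
    theta, p = cut_search_even(u)
    if sum(t * ui for t, ui in zip(theta, u)) <= p:
        return u                                  # simple case
    z = [0.0] * len(x)
    free = list(range(len(x)))                    # indices not yet fixed
    while True:
        if len(free) == 1:
            # One-dimensional polytope: {0} if theta = 1, {1} if theta = -1.
            z[free[0]] = 0.0 if theta[free[0]] == 1 else 1.0
            return z
        # Project the remaining components of x onto the current hyperplane.
        lam = (sum(theta[j] * x[j] for j in free) - p) / len(free)
        v = {j: x[j] - lam * theta[j] for j in free}
        ones = [j for j in free if v[j] > 1 and theta[j] == 1]
        zeros = [j for j in free if v[j] < 0 and theta[j] == -1]
        if not ones and not zeros:
            # By Theorem 5, v lies in the box; it is the projection.
            for j in free:
                z[j] = v[j]
            return z
        for j in ones:
            z[j] = 1.0
        for j in zeros:
            z[j] = 0.0
        p -= len(ones)                            # update from Theorem 8
        done = set(ones) | set(zeros)
        free = [j for j in free if j not in done]
```

For $x=(3,0.8,0.9)^{\top}$ , the first iteration fixes $z_{1}=1$ and the second iteration terminates with $z\approx(1,0.45,0.55)^{\top}$ . Each pass through the loop costs a number of operations linear in the number of unfixed components, which matches the complexity discussion above.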

IX Numerical Results

In this section, we compare our new projection algorithm with the projection algorithms from the literature, namely with the algorithms of Zhang and Siegel [6], Wasson and Draper [7], Zhang et al. [8], and Wei and Banihashemi [11]. To estimate the potential complexity of efficient hard- and software implementations, we count the number of arithmetic operations. For this purpose, we count the number of divisions, the number of multiplications and the number of all other low-complexity operations, e.g., comparisons, additions, subtractions or negations. We avoided unnecessary divisions by replacing the comparisons $\frac{\delta}{\zeta}>t_{i}$ in Algorithm 3 of [6] by $\delta>\zeta t_{i}$ and by storing $is_{i}$ instead of $s_{i}$ for the computation of $\rho$ in Algorithm 2 of [7]. Comparisons with [math], $\max(x,0)$, the floor operation and assignments to $1$, [math], $-1$ or other assignments without arithmetic operations were not counted as low-complexity operations due to their negligible effort in hardware. Since the coefficient vector $\theta\in\{\pm 1\}^{d}$ (or the vector $f\in\{0,1\}^{d}$ in [7]) can be stored as a boolean array, comparisons of the form $\theta_{i}=1$ or $\theta_{i}=-1$ are also negligible and were not counted. Operations of the form $a+\theta_{i}\cdot b$, as they appear in several algorithms, were therefore counted as one low-complexity operation, because one can check whether $\theta_{i}=1$ and then perform an addition or subtraction to compute $a+b$ or $a-b$, respectively. The sorting steps from [6] and [7] were implemented with the quicksort algorithm with the last element as pivot element. For simplicity, we also chose the last element as pivot element in the partial sorting and the projection onto the simplex in [8].
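The counting conventions above can be sketched as a small instrumented helper. This is only an illustration of the bookkeeping, not the code used for the measurements; the class `OpCounter` and its method names are hypothetical, and the division-free comparison assumes $\zeta>0$ so that multiplying both sides by $\zeta$ preserves the inequality.

```python
from dataclasses import dataclass


@dataclass
class OpCounter:
    """Tallies operations in the three classes used in the text."""
    div: int = 0   # divisions
    mul: int = 0   # multiplications
    low: int = 0   # comparisons, additions, subtractions, negations

    def ratio_greater(self, delta, zeta, t):
        """Decide delta / zeta > t without dividing (assumes zeta > 0)."""
        self.mul += 1   # zeta * t
        self.low += 1   # the comparison itself
        return delta > zeta * t

    def signed_add(self, a, b, theta_is_plus):
        """Compute a + theta * b for theta in {+1, -1} stored as a boolean."""
        self.low += 1   # one addition or subtraction; the boolean test is free
        return a + b if theta_is_plus else a - b


if __name__ == "__main__":
    counter = OpCounter()
    counter.ratio_greater(3.0, 2.0, 1.0)
    counter.signed_add(1.0, 2.0, theta_is_plus=False)
    print(counter)
```

Storing the sign as a boolean is what allows $a+\theta_{i}\cdot b$ to be tallied as a single low-complexity operation rather than a multiplication plus an addition.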

We simulated projections onto $\mathcal{P}_{d,\text{even}}$ for input vectors from $[-1,1)^{d}$, $[-3,3)^{d}$, $[-5,5)^{d}$, and $[-10,10)^{d}$. For $[-1,1)^{d}$, the dimensions $d=2,\dots,20$ were simulated. For the other three ranges, the dimensions $d=2,\dots,50$ were investigated. For every $d$, one million randomly generated vectors were projected. The results are shown in Figures 5 to 8. Our proposed algorithm is denoted by “Fix”. Figures 5a to 8a show the average number of low-complexity operations and Figures 5b to 8b the average number of multiplications and divisions for different check degrees. After the simplifications mentioned above, our algorithm and the algorithm from [11] do not need any multiplications. Hence, the corresponding two plots are omitted from Figures 5b to 8b. The algorithms from [6] and [7] both need exactly one division in the difficult case $\Pi_{[0,1]^{d}}(x)\notin\mathcal{P}_{d,\text{even}}$ and zero in the simple case $\Pi_{[0,1]^{d}}(x)\in\mathcal{P}_{d,\text{even}}$. The number of divisions of [8] is very similar and forms basically the same lines in Figures 5b to 8b. In these figures, the number of multiplications and divisions is monotonically decreasing for all five algorithms after a certain value of $d$, although the problem size increases with growing $d$. The same behavior can be observed for the low-complexity operations of [11]. It can be explained by Figure 9, which shows the probabilities, measured through our simulations, that the difficult case $\Pi_{[0,1]^{d}}(x)\notin\mathcal{P}_{d,\text{even}}$ occurs for a random input from our considered ranges. For all four ranges, the probabilities decrease after a certain check degree. Our proposed algorithm and the two algorithms from [6, 7] check whether $\Pi_{[0,1]^{d}}(x)\in\mathcal{P}_{d,\text{even}}$. In [8], this check is done in the main case of their algorithm. The approximate algorithm from [11] stops after one iteration in that case. Hence, the number of multiplications and divisions (and of low-complexity operations for [11]) decreases for large $d$. Additionally, this means that for $[-1,1)^{d}$ and $d>20$, the projection is very simple, because the hard case $\Pi_{[0,1]^{d}}(x)\notin\mathcal{P}_{d,\text{even}}$ occurs only very rarely. Hence, we only considered the degrees $d\leq 20$ for this range.
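The probability of the hard case can be estimated with a short Monte Carlo experiment. The sketch below is an independent illustration, not the simulation code behind Figure 9. It assumes the well-known membership test for the even parity polytope: for $w\in[0,1]^{d}$, choose $\theta_{i}=1$ where $w_{i}>\frac{1}{2}$ and $\theta_{i}=-1$ otherwise, flip the entry closest to $\frac{1}{2}$ if the number of $+1$ entries is even, and check $\theta^{\top}w\leq p$ with $p=|\{i:\theta_{i}=1\}|-1$. The function names, the trial count, the seed, and the tolerance are assumptions.

```python
import random


def in_even_parity_polytope(w, tol=1e-12):
    """Test whether w in [0,1]^d lies in the even parity polytope."""
    theta = [1 if wi > 0.5 else -1 for wi in w]
    if sum(t == 1 for t in theta) % 2 == 0:
        # Make the number of +1 entries odd by flipping the entry nearest 1/2.
        j = min(range(len(w)), key=lambda i: abs(w[i] - 0.5))
        theta[j] = -theta[j]
    p = sum(t == 1 for t in theta) - 1
    return sum(t * wi for t, wi in zip(theta, w)) <= p + tol


def hard_case_probability(d, r, trials=10000, seed=0):
    """Estimate the probability that the box projection of x is outside the polytope.

    x is drawn uniformly from [-r, r)^d and then clipped to [0,1]^d.
    """
    rng = random.Random(seed)
    hard = 0
    for _ in range(trials):
        x = [-r + 2.0 * r * rng.random() for _ in range(d)]
        w = [min(1.0, max(0.0, xi)) for xi in x]
        if not in_even_parity_polytope(w):
            hard += 1
    return hard / trials


if __name__ == "__main__":
    for d in (2, 6, 10, 20):
        print(d, hard_case_probability(d, r=1.0))
```

Running the estimate over a range of $d$ for each input interval gives a rough picture of how often the full projection has to be computed, which is the quantity the operation counts above depend on.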

As can be seen in Figures 5 to 8, our implementation needs fewer low-complexity operations than all other implementations for every dimension $d$ and every range from which the random input is drawn. The figures also show that low-complexity operations form the large majority of operations in all considered algorithms. Regarding high-complexity operations, our algorithm needs slightly more divisions than the implementations from [6], [7] and [8]. However, this number is smaller than the sum of divisions and multiplications of every other algorithm in all considered cases. Table I shows the maximum gain in arithmetic operations for different input intervals. The values were obtained by comparing our proposed algorithm with the best result from the other four considered algorithms. Overall, we need up to $37\%$ fewer arithmetic operations, resulting in lower implementation complexity and making LP decoding more attractive for efficient hardware implementation.

X Conclusion

In this paper, we presented a new reduced-complexity projection algorithm for ADMM-based LP decoding. By establishing a theory of the odd parity polytope analogous to that of the even parity polytope, the projection can be regarded as a recursive problem in which the projections alternate between the even and the odd parity polytope. As some components of the input are fixed in every iteration, the problem size decreases steadily. Compared to other exact state-of-the-art projections, the proposed algorithm needs up to 37% fewer arithmetic operations and additionally requires no sorting operation. These properties make it a very good choice for future hardware implementations.

Acknowledgement

We gratefully acknowledge financial support by the DFG (project-ID: WE 2442/9-3 and RU 1524/2-3).

Bibliography

[1] J. Feldman, M. J. Wainwright, and D. R. Karger, “Using linear programming to decode binary linear codes,” IEEE Transactions on Information Theory, vol. 51, no. 3, pp. 954–972, March 2005.
[2] X. Zhang and P. H. Siegel, “Adaptive cut generation algorithm for improved linear programming decoding of binary linear codes,” IEEE Transactions on Information Theory, vol. 58, no. 10, pp. 6581–6594, October 2012.
[3] E. Berlekamp, R. McEliece, and H. van Tilborg, “On the inherent intractability of certain coding problems (corresp.),” IEEE Transactions on Information Theory, vol. 24, no. 3, pp. 384–386, May 1978.
[4] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122, January 2011. [Online]. Available: http://dx.doi.org/10.1561/2200000016
[5] S. Barman, X. Liu, S. C. Draper, and B. Recht, “Decomposition methods for large scale LP decoding,” IEEE Transactions on Information Theory, vol. 59, no. 12, pp. 7870–7886, December 2013.
[6] X. Zhang and P. H. Siegel, “Efficient iterative LP decoding of LDPC codes with alternating direction method of multipliers,” in 2013 IEEE International Symposium on Information Theory, July 2013, pp. 1501–1505.
[7] M. Wasson and S. C. Draper, “Hardware based projection onto the parity polytope and probability simplex,” in 2015 49th Asilomar Conference on Signals, Systems and Computers, November 2015, pp. 1015–1020.
[8] G. Zhang, R. Heusdens, and W. B. Kleijn, “Large scale LP decoding with low complexity,” IEEE Communications Letters, vol. 17, no. 11, pp. 2152–2155, November 2013.
