A Novel Convex Relaxation for Non-Binary Discrete Tomography

Jan Kuske; Paul Swoboda; Stefania Petra

arXiv:1703.03769·math.OC·December 27, 2018·SSVM

TL;DR

This paper introduces a new convex relaxation method for non-binary discrete tomography that jointly solves reconstruction and labeling, leading to more accurate and tighter solutions than existing approaches.

Contribution

A novel joint convex relaxation formulation for non-binary discrete tomography that improves solution tightness and integrates reconstruction and labeling tasks.

Findings

01. Tighter convex relaxation achieved compared to previous methods.

02. Experimental results show superior reconstruction quality.

03. The approach outperforms existing relaxations both mathematically and empirically.

Abstract

We present a novel convex relaxation and a corresponding inference algorithm for the non-binary discrete tomography problem, that is, reconstructing discrete-valued images from few linear measurements. In contrast to state of the art approaches that split the problem into a continuous reconstruction problem for the linear measurement constraints and a discrete labeling problem to enforce discrete-valued reconstructions, we propose a joint formulation that addresses both problems simultaneously, resulting in a tighter convex relaxation. For this purpose a constrained graphical model is set up and evaluated using a novel relaxation optimized by dual decomposition. We evaluate our approach experimentally and show superior solutions both mathematically (tighter relaxation) and experimentally in comparison to previously proposed relaxations.

Tables

Table 1: Number of instances where the duality gap is $<1$ (optimality).

	STD relax	STD BB	CTG CB	CTG relax	CTG BB
duality gap $<1$ (per solver)	$53$	$243$	$178$	$154$	$182$
duality gap $<1$ (CTG CB and CTG relax combined)			$205$		$182$

Table 2: Comparison of bounds and primal solutions obtained by (STD) or (CTG).

	#Instances
(CTG) $>$ (STD) (our relaxation yields strictly better lower bound)	$350$
our heuristic (only) found optimal integral solution	$12$
our heuristic found optimal integral solution	$238$

Equations

\min_{x\in\mathcal{X}_{\mathsf{V}}}E(x):=\sum_{u\in\mathsf{V}}\theta_{u}(x_{u})+\sum_{uv\in\mathsf{E}}\theta_{uv}(x_{u},x_{v})\quad\text{s.t.}\quad Ax=b\,. \qquad (1)

\min_{(x_{1},\ldots,x_{n})\in\mathcal{X}_{U}}\sum_{u\in U}\theta_{u}(x_{u})+\sum_{i\in[n-1]}\theta_{u_{i},u_{i+1}}(x_{u_{i}},x_{u_{i+1}})\quad\text{s.t.}\quad\sum_{u\in U}x_{u}=b\,. \qquad (2)

\mu_{i:j}(x_{i:j})=\sum_{x_{j+1:l}}\mu_{i:j:l}(x_{i:j},x_{j+1:l}) \qquad (3)

\mu_{j+1:l}(x_{j+1:l})=\sum_{x_{i:j}}\mu_{i:j:l}(x_{i:j},x_{j+1:l}) \qquad (4)

\mu_{i:l}(x_{i:l})=\sum_{x_{i:j},x_{j+1:l}\sim x_{i:l}}\mu_{i:j:l}(x_{i:j},x_{j+1:l}) \qquad (5)

\theta_{i:j:l}^{\phi}(x_{i:j},x_{j+1:l})

\phi_{i:j:l}^{\uparrow}(x_{i:l})=\min_{s_{i:j}+s_{j+1:l}=s_{i:l}-x_{u_{j}}-x_{u_{j+1}}}\theta_{u_{j},u_{j+1}}(x_{u_{j}},x_{u_{j+1}})+\phi_{i:j}^{\leftarrow}(x_{i:j})+\phi_{j+1:l}^{\rightarrow}(x_{j+1:l})\,, \qquad (9)

\max_{\lambda_{1},\ldots,\lambda_{m}}\sum_{i=1}^{m}\min_{x\in\mathcal{X}_{U_{i}}}E_{i}(x\mid\lambda_{i})\quad\text{s.t.}\quad\sum_{i}\lambda_{i,u}\equiv 0\quad\forall u\in\mathsf{V}\,. \qquad \text{(CTG)}

\begin{array}[]{rl}\min_{\mu\geq 0}&\sum_{u\in\operatorname{\mathsf{V}}}\langle\theta_{u},\mu_{u}\rangle+\sum_{uv\in\operatorname{\mathsf{E}}}\langle\theta_{uv},\mu_{uv}\rangle\\ \text{s.t.}&\sum_{x_{u}\in\operatorname{\mathcal{X}}_{u}}\mu_{u}(x_{u})=1\quad\forall u\in\operatorname{\mathsf{V}}\\ &\begin{array}[]{c}\sum_{x_{u}\in\operatorname{\mathcal{X}}_{u}}\mu_{uv}(x_{u},x_{v})=\mu_{v}(x_{v})\quad\forall x_{v}\in\operatorname{\mathcal{X}}_{v}\\ \sum_{x_{v}\in\operatorname{\mathcal{X}}_{v}}\mu_{uv}(x_{u},x_{v})=\mu_{u}(x_{u})\quad\forall x_{u}\in\operatorname{\mathcal{X}}_{u}\end{array}\quad\forall uv\in\operatorname{\mathsf{E}}\\ &\sum_{u\in\operatorname{\mathsf{V}}}A_{iu}\cdot\left(\sum_{x_{u}\in\operatorname{\mathcal{X}}_{u}}x_{u}\cdot\mu_{u}(x_{u})\right)=b_{i}\quad i=1,\ldots,m\,.\end{array}

Full text

† MIG, Inst. Appl. Mathematics, Heidelberg University

‡ Institute of Science and Technology (IST), Austria

Jan Kuske†, Paul Swoboda‡ and Stefania Petra†

Acknowledgments: We gratefully acknowledge support by the DFG (German Science Foundation), Grant GRK 1653. This work is partially funded by the European Research Council under the European Union's 7th Framework Programme (FP7/2007-2013)/ERC grant agreement no. 616160. The authors would like to thank Vladimir Kolmogorov for helpful discussions.

1 Introduction

We study the discrete tomography problem, that is, reconstructing a discrete-valued image from a small number of linear measurements (tomographic projections); see Figure 1 for an illustration. The main difficulty in reconstructing the original image is that there are usually far too few measurements, making the problem ill-posed. Hence, it is common to search for a discrete-valued image that (i) satisfies the measurements and (ii) minimizes an appropriate energy function.

More generally, the discrete tomography problem can be regarded as reconstructing a discrete-valued synthesis/analysis-sparse signal from few measurements which is observed by deterministic sensors $A$ . This is, in turn, a special instance of the compressed sensing problem [7], for which it has been shown that discreteness constraints on the possible values of the reconstructed function can significantly reduce the number of required measurements [10].

However, the discreteness constraint leads to great computational challenges. Simple outer convex relaxations coming from continuous scenarios are doomed to fail, as they will not output discrete solutions unless the signal sparsity satisfies favourable relations and $A$ is well-conditioned on the class of sparse signals, e.g. a random matrix. In fact, in most practical scenarios the projection matrices fall short of the assumptions that underlie rigorous compressed sensing theory (e.g. the restricted isometry property [7]), and standard algorithms from the continuous $\ell_{1}$ -setting can no longer be applied. Rounding continuous solutions, on the other hand, renders them infeasible for the measurement constraints.

Therefore, algorithms exploiting the combinatorial structure of the discrete tomography problem are necessary to successfully exploit discreteness as prior knowledge and to reduce the number of required measurements.

Related work.

Several algorithms have been proposed to solve the discrete tomography problem. Among them are (i) linear programming-based algorithms [9, 17], (ii) belief propagation [8], (iii) network flow techniques [3], (iv) convex-concave programming [19, 14], (v) evolutionary algorithms [2] and other heuristic algorithms [4, 6, 11, 12]. Not all approaches are applicable to the general discrete tomography problem we treat here: algorithms [9, 17, 8, 6, 11, 12] only support binary labels, while [4, 3] solve only the feasibility problem and do not permit any energy. Algorithms [19, 14] are applicable to the setting we propose but are purely primal algorithms and do not output dual lower bounds (which we do). Hence, one cannot judge the proximity of solutions computed by [19, 14] to globally optimal ones, prohibiting their use in branch and bound. Where convex relaxations were considered [9, 19, 17, 3], they were less tight than the one we propose, leading to inferior dual bounds.

Contribution.

We propose the (to our knowledge) first LP-based algorithm for the non-binary discrete tomography problem. In particular, we

• recast the discrete tomography problem as a Maximum-A-Posteriori inference problem in a graphical model with additional linear constraints coming from the tomographic projections in Section 2,

• construct higher order factors in the graphical model such that a feasible solution to the higher order factors coincides with solutions feasible for the tomographic projections,

• present an efficient exact algorithm for solving the special case when exactly one ray constraint is present and the energy factorizes as a chain (such a problem will be called a one-dimensional discrete tomography problem) in Section 3, and

• decompose the whole problem into such subproblems and solve this decomposition with bundle methods in Section 4.

Our approach leads to significantly tighter bounds compared to generalising previously proposed relaxations to the non-binary case, see Proposition 1 in Section 4 and experiments in Section 5. Code and datasets are available on GitHub (https://github.com/pawelswoboda/LP_MP).

Notation.

Let $[a,b]=\{a,\ldots,b\}$ be the set of natural numbers between $a$ and $b$ . For $x\in\operatorname{\mathbb{R}}$ we denote by $\lfloor x\rfloor$ and $\lceil x\rceil$ the floor and ceiling functions.

2 Problem Statement

The discrete tomography problem we study consists in finding a discrete labeling $x\in\{0,1,\ldots,k-1\}^{n}$ such that (i) tomographic projection constraints given by $Ax=b$ with $A\in\{0,1\}^{m\times n},b\in\operatorname{\mathbb{N}}^{m}$ are fulfilled and (ii) $x$ minimizes some energy $E:\{0,1,\ldots,k-1\}^{n}\rightarrow\operatorname{\mathbb{R}}$ . We assume that $E$ factorizes according to a pairwise graphical model: given a graph $\operatorname{\mathsf{G}}=(\operatorname{\mathsf{V}},\operatorname{\mathsf{E}})$ , together with a label space $\operatorname{\mathcal{X}}_{\operatorname{\mathsf{V}}}:=\prod_{u\in\operatorname{\mathsf{V}}}\operatorname{\mathcal{X}}_{u}$ , $\operatorname{\mathcal{X}}_{u}:=\{0,1\ldots,k-1\}$ $\forall u\in\operatorname{\mathsf{V}}$ , the energy is a sum of unary potentials $\theta_{u}:\operatorname{\mathcal{X}}_{u}\rightarrow\operatorname{\mathbb{R}}$ $\forall u\in\operatorname{\mathsf{V}}$ and pairwise ones $\theta_{uv}:\operatorname{\mathcal{X}}_{u}\times\operatorname{\mathcal{X}}_{v}\rightarrow\operatorname{\mathbb{R}}$ $\forall uv\in\operatorname{\mathsf{E}}$ . The full problem hence reads

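$$\min_{x\in\operatorname{\mathcal{X}}_{\operatorname{\mathsf{V}}}}E(x):=\sum_{u\in\operatorname{\mathsf{V}}}\theta_{u}(x_{u})+\sum_{uv\in\operatorname{\mathsf{E}}}\theta_{uv}(x_{u},x_{v})\quad\text{s.t.}\quad Ax=b\,. \qquad (1)$$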

For the discrete tomography problem we usually choose $\operatorname{\mathsf{G}}$ to be a grid graph corresponding to the pixels of the image to be reconstructed, zero unary potentials $\theta_{u}\equiv 0$ $\forall u\in\operatorname{\mathsf{V}}$ , as no local information about the image values is known, and pairwise potentials $\theta_{uv}=g(x_{u}-x_{v})$ that penalize intensity transitions, e.g. $g(\cdot)=\lvert\cdot\rvert$ (TV) or $g(\cdot)=\min(1,\lvert\cdot\rvert)$ (Potts). Such a choice of pairwise potentials assigns small energy to labelings $x$ with a regular spatial structure.
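To make the problem statement concrete, the following minimal sketch evaluates the pairwise energy and the projection constraints on a tiny image. The helper names (`grid_edges`, `energy`, `row_col_projection`, `project`) and the $2\times 3$ example image are illustrative assumptions and not taken from the authors' code; the matrix $A$ here contains only row and column sums.

```python
def grid_edges(h, w):
    """4-neighbourhood edges (u, v) of an h x w grid; pixels are indexed row-major."""
    edges = []
    for r in range(h):
        for c in range(w):
            u = r * w + c
            if c + 1 < w:
                edges.append((u, u + 1))   # horizontal neighbour
            if r + 1 < h:
                edges.append((u, u + w))   # vertical neighbour
    return edges


def energy(x, edges, g):
    """E(x) = sum over edges uv of g(x_u - x_v); unary potentials are zero."""
    return sum(g(x[u] - x[v]) for u, v in edges)


def row_col_projection(h, w):
    """Hypothetical 0/1 matrix A (list of rows): one ray per image row, then one per column."""
    A = [[0] * (h * w) for _ in range(h + w)]
    for r in range(h):
        for c in range(w):
            A[r][r * w + c] = 1        # ray along image row r
            A[h + c][r * w + c] = 1    # ray along image column c
    return A


def project(A, x):
    """Measurement vector b = A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def tv(d):
    return abs(d)


def potts(d):
    return min(1, abs(d))


# Example 2 x 3 image with labels in {0, 1, 2}, flattened row-major.
x = [0, 0, 2,
     0, 1, 2]
edges = grid_edges(2, 3)
b = project(row_col_projection(2, 3), x)
print(energy(x, edges, tv), energy(x, edges, potts), b)  # 5 4 [2, 3, 0, 1, 4]
```

Any labeling with the same row and column sums is feasible for this $b$; the energy then selects the spatially most regular one among them.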

3 One-Dimensional Non-Binary Discrete Tomography

A natural decomposition of the discrete tomography problem (1) consists of (i) considering a subproblem for each ray constraint separately and (ii) joining them together via Lagrangian variables. We will study the first aspect below and the second one in the next section. In particular, let $U=\{u_{1},\ldots,u_{n}\}\subseteq\operatorname{\mathsf{V}}$ be the variables from a single ray constraint $x_{u_{1}}+\ldots+x_{u_{n}}=b$ corresponding to a row of the projection matrix $A$ in (1). Assume that pairwise potentials form a chain, i.e. they are $\theta_{u_{i}u_{i+1}}$ , $i=1,\ldots,n-1$ . The one-dimensional discrete tomography problem is

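$$\min_{(x_{1},\ldots,x_{n})\in\operatorname{\mathcal{X}}_{U}}\sum_{u\in U}\theta_{u}(x_{u})+\sum_{i\in[n-1]}\theta_{u_{i},u_{i+1}}(x_{u_{i}},x_{u_{i+1}})\quad\text{s.t.}\quad\sum_{u\in U}x_{u}=b\,. \qquad (2)$$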

We present an exact linear programming relaxation and an efficient message-passing routine to solve (2) below.

3.1 Linear Programming Model

The one-dimensional discrete tomography subproblem (2) could naively be solved by dynamic programming by going over all variables $u_{1},\ldots,u_{n}$ sequentially. This however would entail quadratic space complexity in the number of nodes in $U$ , as the state space for variable $u_{i}$ would need to include costs for all possible labels $x_{u_{i}}$ and all values of the intermediate sum $\sum_{j=1}^{i-1}x_{u_{j}}$ . The latter sum can have $1+\sum_{j=1}^{i-1}(\lvert\operatorname{\mathcal{X}}\rvert-1)$ possible values. To achieve a better space complexity, we will recursively (i) equipartition variables $u_{1},\ldots,u_{n}$ , (ii) define LP-subproblems in terms of so-called counting factors which are exact on each subpartition and (iii) join them together to eventually obtain an exact LP-relaxation for (2). Our approach is inspired by [16].
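For reference, the naive dynamic program mentioned above can be sketched as follows. This is only an illustration of the baseline with quadratic storage, not the counting-factor approach of this section; the function name `solve_chain_naive`, the restriction to zero unary potentials and the dictionary-based state table are assumptions made for brevity.

```python
import math


def solve_chain_naive(n, k, b, pairwise):
    """Naive dynamic program for (2) with zero unary potentials.

    State after i variables: (partial sum, label of variable i).
    Returns (optimal energy, labeling), or (inf, None) if the sum b is unreachable.
    """
    cost = {(a, a): 0.0 for a in range(k) if a <= b}
    back = []  # one back-pointer table per variable: this is the quadratic storage
    for _ in range(1, n):
        new_cost, new_back = {}, {}
        for (s, a), c in cost.items():
            for a2 in range(k):
                s2 = s + a2
                if s2 > b:
                    continue  # labels are non-negative, so the sum cannot shrink again
                c2 = c + pairwise(a, a2)
                if c2 < new_cost.get((s2, a2), math.inf):
                    new_cost[(s2, a2)] = c2
                    new_back[(s2, a2)] = (s, a)
        back.append(new_back)
        cost = new_cost
    finals = [(c, state) for state, c in cost.items() if state[0] == b]
    if not finals:
        return math.inf, None
    best, state = min(finals)
    labels = [state[1]]
    for table in reversed(back):   # follow back-pointers from the last variable
        state = table[state]
        labels.append(state[1])
    return best, labels[::-1]


value, labels = solve_chain_naive(4, 3, 2, lambda a, c: abs(a - c))
```

With four variables, three labels, TV potentials and $b=2$, no constant labeling is feasible, so the optimal energy is $1$.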

Partition of variables.

Given the nodes $u_{1},\ldots,u_{n}$ , we choose an equipartition $\Pi_{1}=\{u_{1},\ldots,u_{\lfloor\nicefrac{n}{2}\rfloor}\}$ and $\Pi_{2}=\{u_{\lfloor\nicefrac{n}{2}\rfloor+1},\ldots,u_{n}\}$ . We recursively equipartition $\Pi_{1}$ into $\Pi_{1,1}$ and $\Pi_{1,2}$ and do likewise for $\Pi_{2}$ . For $u_{1},\ldots,u_{8}$ we obtain a recursive partitioning as in Figure 2.
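The recursive equipartition can be sketched in a few lines. The function name `partition_tree` and the stopping rule (intervals of at most two nodes are not split further) are assumptions for illustration; the text does not state where the recursion ends.

```python
def partition_tree(i, l, min_len=2):
    """Recursively halve the interval [i, l] (1-based, inclusive).

    Returns a list of splits (i, j, l), each meaning that [i, l] is divided
    into [i, j] and [j+1, l], where [i, j] holds floor(n/2) of the n nodes.
    """
    n = l - i + 1
    if n <= min_len:
        return []          # assumed stopping rule: short intervals are leaves
    j = i + n // 2 - 1
    return ([(i, j, l)]
            + partition_tree(i, j, min_len)
            + partition_tree(j + 1, l, min_len))


splits = partition_tree(1, 8)
print(splits)  # [(1, 4, 8), (1, 2, 4), (5, 6, 8)]
```

For eight nodes this yields the root split into $[1,4]$ and $[5,8]$ and four leaf intervals $[1,2],[3,4],[5,6],[7,8]$.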

Counting factors.

Given an interval, a counting factor holds the states of its left and right end and the value of the intermediate sum.

Definition 1 (Counting label space)

The counting label space for interval $[i,j]$ is $\operatorname{\mathcal{X}}_{i:j}:=\operatorname{\mathcal{X}}_{u_{i}}\times\operatorname{\mathcal{S}}_{i:j}\times\operatorname{\mathcal{X}}_{u_{j}}$ with $\operatorname{\mathcal{S}}_{i:j}=\{0,1,\ldots,1+\sum_{l=i+1}^{j-1}(\lvert\operatorname{\mathcal{X}}_{u_{l}}\rvert-1)\}$ holding all possible intermediate sums. A counting label $x_{i:j}$ consists of the three components $(x_{u_{i}},s_{i:j},x_{u_{j}})$ : its left endpoint label $x_{u_{i}}$ , intermediate sum $s_{i:j}:=x_{u_{i+1}}+\ldots+x_{u_{j-1}}$ and right endpoint label $x_{u_{j}}$ .

See again Figure 2 for the exemplary case $U=\{u_{1},\ldots,u_{8}\}$ .

For interval $[i,j]$ there are $\lvert\operatorname{\mathcal{X}}_{i:j}\rvert=\lvert\operatorname{\mathcal{X}}_{u_{i}}\rvert\cdot\lvert\operatorname{\mathcal{X}}_{u_{j}}\rvert\cdot\lvert\operatorname{\mathcal{S}}_{i:j}\rvert$ distinct counting labels. We associate to each counting factor counting marginals $\mu_{i:j}$ satisfying $\{\mu_{i:j}\in\operatorname{\mathbb{R}}_{+}^{\lvert\operatorname{\mathcal{X}}_{i:j}\rvert}:\sum_{x_{i:j}\in\operatorname{\mathcal{X}}_{i:j}}\mu_{i:j}(x_{i:j})=1\}$ .

Assuming a uniform label space $\lvert\operatorname{\mathcal{X}}_{u}\rvert=k$ $\forall u\in\operatorname{\mathsf{V}}$ , the total space complexity of all counting factors is $O(k^{3}\cdot n\cdot\log(n))$ , hence subquadratic in the number of nodes in $U$ .

Joining counting factors.

Assume the partitioning of variables has produced two adjacent subsets $\Pi=\{u_{i},\ldots,u_{j}\}$ and $\Pi^{\prime}=\{u_{j+1},\ldots,u_{l}\}$ , which were constructed from their common subset $\Pi\cup\Pi^{\prime}\subseteq U$ . The associated three counting factors with marginals $\mu_{i:j},\mu_{j+1:l}$ and $\mu_{i:l}$ introduced above shall be consistent with respect to each other.

Definition 2 (Label consistency)

Labels $x_{i:l}\in\operatorname{\mathcal{X}}_{i:l}$ , $x_{i:j}\in\operatorname{\mathcal{X}}_{i:j}$ and $x_{j+1:l}\in\operatorname{\mathcal{X}}_{j+1:l}$ are consistent with each other, denoted by $x_{i:j},x_{j+1:l}\sim x_{i:l}$ , iff (i) the left endpoint labels of $x_{i:j}$ and $x_{i:l}$ match, (ii) the right endpoint labels of $x_{j+1:l}$ and $x_{i:l}$ match and (iii) the intermediate sums match: $s_{i:l}=s_{i:j}+x_{u_{j}}+x_{u_{j+1}}+s_{j+1:l}$ .
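The consistency condition of Definition 2 can be checked mechanically. In the sketch below, a counting label is represented as a tuple (left endpoint, intermediate sum, right endpoint); the function names and the example labeling are hypothetical.

```python
def counting_label(x, i, j):
    """Counting label (x_{u_i}, s_{i:j}, x_{u_j}) of interval [i, j] (1-based, i < j)."""
    return (x[i - 1], sum(x[i:j - 1]), x[j - 1])


def consistent(x_ij, x_jl, x_il):
    """Check the three conditions of Definition 2."""
    left_ij, s_ij, right_ij = x_ij
    left_jl, s_jl, right_jl = x_jl
    left_il, s_il, right_il = x_il
    return (left_ij == left_il                                  # (i) left endpoints
            and right_jl == right_il                            # (ii) right endpoints
            and s_il == s_ij + right_ij + left_jl + s_jl)       # (iii) sums


x = [1, 0, 2, 1, 0, 2, 2, 1]          # labels of u_1, ..., u_8
x_14 = counting_label(x, 1, 4)         # (1, 2, 1)
x_58 = counting_label(x, 5, 8)         # (0, 4, 1)
x_18 = counting_label(x, 1, 8)         # (1, 7, 1)
print(consistent(x_14, x_58, x_18))    # True
```

Counting labels derived from one common labeling are always consistent; changing only the intermediate sum of the parent label breaks condition (iii).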

We enforce this by introducing a higher order marginal $\mu_{i:j:l}\in\operatorname{\mathbb{R}}_{+}^{\operatorname{\mathcal{X}}_{i:j}\times\operatorname{\mathcal{X}}_{j+1:l}}$ to bind together $\mu_{i:j},\mu_{j+1:l}$ and $\mu_{i:l}$ .

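$$\mu_{i:j}(x_{i:j})=\sum_{x_{j+1:l}}\mu_{i:j:l}(x_{i:j},x_{j+1:l})\quad\forall x_{i:j}\in\operatorname{\mathcal{X}}_{i:j} \qquad (3)$$

$$\mu_{j+1:l}(x_{j+1:l})=\sum_{x_{i:j}}\mu_{i:j:l}(x_{i:j},x_{j+1:l})\quad\forall x_{j+1:l}\in\operatorname{\mathcal{X}}_{j+1:l} \qquad (4)$$

$$\mu_{i:l}(x_{i:l})=\sum_{x_{i:j},x_{j+1:l}\sim x_{i:l}}\mu_{i:j:l}(x_{i:j},x_{j+1:l})\quad\forall x_{i:l}\in\operatorname{\mathcal{X}}_{i:l} \qquad (5)$$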

The recursive arrangement of counting factors is illustrated in Figure 3.

Remark 1

The constraints between $\mu_{i:j:l}$ and $\mu_{i:j}$ and $\mu_{j+1:l}$ are analogous to the marginalization constraints between pairwise and unary marginals in the local polytope relaxation for pairwise graphical models [18]. The constraints between $\mu_{i:j:l}$ and $\mu_{i:l}$ however are different. Hence, specialized efficient solvers for inference in graphical models cannot be applied.

Costs.

Above we have described the polytope for the one-dimensional discrete tomography problem (2). The LP-objective consists of vectors $\theta_{i:j}$ for each counting marginal and $\theta_{i:j:l}$ for each higher order marginal. Accounting for the pairwise costs in (2) we set $\theta_{i:j}(x_{i:j}):=\begin{cases}\theta_{u_{i}u_{j}}(x_{u_{i}},x_{u_{j}}),&i+1=j\\ 0,&\text{otherwise}\end{cases}$ for the counting factors and for the higher order factors we set $\theta_{i:j:l}(x_{i:j},x_{j+1:l}):=\theta_{u_{j}u_{j+1}}(x_{u_{j}},x_{u_{j+1}})$ . For the projection constraint in (2) we set costs of the top counting marginal as $\theta_{1:n}(x_{1:n}):=\begin{cases}0,&x_{u_{1}}+s_{1:n}+x_{u_{n}}=b\\ \infty,&\text{otherwise}\end{cases}$ .

3.2 Message Passing Algorithm

Above we have introduced a linear program formulation for the one-dimensional discrete tomography problem (2). While it is possible to solve it with a standard LP-solver, doing so would be slow. As the counting factors and the higher order marginals connecting them form a tree, it is possible to devise a message passing algorithm optimizing (2) exactly. First, this implies that the linear programming relaxation for (2) is exact, as message passing amounts to optimizing the Lagrangian dual of this same relaxation. Second, marginals do not need to be held explicitly, holding messages is enough. The size of all messages equals the size of all counting factors, hence giving again subquadratic space complexity.

Message passing for (2) is detailed in Algorithm 1. It proceeds by first computing up messages from adjacent fine subsets to coarser subsets (i.e. going up the tree in Figure 3) and afterwards computing down messages from coarse subsets to their equipartition (i.e. going down the tree in Figure 3). Messages reparametrize costs of counting and higher order factors.

Reparametrization

Let indices $i<j<l$ be given, where $[i:j]$ , $[j+1:l]$ and $[i:l]$ are subsets generated by the recursive partitioning. Let messages $\phi_{i:j:l}^{\leftarrow},\phi_{i:j:l}^{\rightarrow},\phi_{i:j:l}^{\uparrow}$ correspond to constraints (3), (4) and (5) respectively. Messages $\phi$ act on (reparametrize) costs $\theta$ as

[TABLE]

Fast message computation

Naively computing one up message would result in time complexity $O(\ell^{5}\cdot n^{2})$ (with $\ell$ the number of labels per node), which would make the algorithm prohibitively slow. We describe a fast computation technique for the up messages, which uses the structure of the corresponding linear constraints (5) and relies on the latent factorization of $\theta_{i:j:l}^{\phi}$ . Specifically, when we fix the endpoints $x_{u_{i}},x_{u_{j}}$ of interval $[i,j]$ and $x_{u_{j+1}},x_{u_{l}}$ of $[j+1,l]$ , the up message computation becomes

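$$\phi_{i:j:l}^{\uparrow}(x_{i:l})=\min_{s_{i:j}+s_{j+1:l}=s_{i:l}-x_{u_{j}}-x_{u_{j+1}}}\theta_{u_{j},u_{j+1}}(x_{u_{j}},x_{u_{j+1}})+\phi_{i:j}^{\leftarrow}(x_{i:j})+\phi_{j+1:l}^{\rightarrow}(x_{j+1:l})\,. \qquad (9)$$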

Problem (9) is an instance of the min-sum convolution problem: Given $a,b\in\operatorname{\mathbb{R}}^{n}$ , compute $c\in\operatorname{\mathbb{R}}^{2n-1}$ , where $c_{i}=\min_{j\leq i}(a_{j}+b_{i-j})$ . This can be seen by replacing $\phi^{\leftarrow}$ by $a$ , $\phi^{\rightarrow}$ by $b$ and noting that $\theta_{u_{j},u_{j+1}}$ is a constant, as $x_{u_{j}}$ and $x_{u_{j+1}}$ were fixed. For the min-sum convolution problem efficient algorithms [5] were proposed with expected running time $O(n\log(n))$ under the assumption that sorting $a$ and $b$ results in permutations occurring with uniform probability. Problem (9) can be efficiently computed by performing $O(\ell^{4})$ min-sum convolutions (one convolution for every choice of endpoints).
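For illustration, a direct quadratic-time min-sum convolution looks as follows. This is the naive baseline, not the faster expected $O(n\log(n))$ algorithm of [5]; the function name and the 0-based indexing are choices made for this sketch.

```python
import math


def min_sum_convolution(a, b):
    """Return c with c[i] = min over j of a[j] + b[i - j] (0-based indices)."""
    c = [math.inf] * (len(a) + len(b) - 1)
    for j, a_j in enumerate(a):
        for t, b_t in enumerate(b):
            # the pair (j, t) contributes to output index j + t
            c[j + t] = min(c[j + t], a_j + b_t)
    return c


print(min_sum_convolution([0, 3, 1], [2, 0, 5]))  # [2, 0, 3, 1, 6]
```

In the setting of (9), `a` and `b` would hold the message values indexed by the intermediate sums $s_{i:j}$ and $s_{j+1:l}$ for one fixed choice of the four endpoint labels, and the constant pairwise term is added afterwards.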

Remark 2 (Comparison to [16])

While our approach for solving (2) is inspired by [16], it is notably different: (i) our model includes pairwise potentials forming a chain, while [16] assumes that pairwise potentials do not occur between neighboring subsets. This necessitates storing left and right endpoints in counting factors. (ii) [16] optimizes a different objective: they solve the sum-product version of (2) (i.e. they exchange min by $+$ and $+$ by $\cdot$ in (2)). This allows [16] to use fast Fourier transforms for message computations, instead of the harder min-sum convolution problems.

4 Discrete Tomography Graphical Model

The discrete tomography problem (1) consists of $m=\#\text{rows}(A)$ distinct one-dimensional subproblems (2). We connect all subproblems (2) via Lagrangian variables into one large problem. This procedure is called dual decomposition, see [15] for an introduction. Specifically, in our discrete tomography problems, subproblems only share variables $v\in\operatorname{\mathsf{V}}$ , but not edges $e\in\operatorname{\mathsf{E}}$ (shared edges can be handled analogously). Then for each node $u\in\operatorname{\mathsf{V}}$ which participates in the $i$ -th subproblem, we introduce the Lagrangian variable $\lambda_{i,u}\in\operatorname{\mathbb{R}}^{\lvert\operatorname{\mathcal{X}}_{u}\rvert}$ . The $i$ -th subproblem then consists of solving (2) with the subset of variables $U_{i}$ , where the unary potentials are the Lagrangian variables $\theta_{u}=\lambda_{i,u}$ . We denote its energy by $E_{i}(\cdot|\lambda_{i})$ . The overall problem is

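$$\max_{\lambda_{1},\ldots,\lambda_{m}}\sum_{i=1}^{m}\min_{x\in\operatorname{\mathcal{X}}_{U_{i}}}E_{i}(x\mid\lambda_{i})\quad\text{s.t.}\quad\sum_{i}\lambda_{i,u}\equiv 0\quad\forall u\in\operatorname{\mathsf{V}}\,. \qquad \text{(CTG)}$$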

An exemplary $4\times 4$ model with eight subproblems coming from two projection directions can be seen in Figure 4.

Optimization of relaxation (CTG).

To maximize (CTG) we use the bundle solver ConicBundle (https://www-user.tu-chemnitz.de/~helmberg/ConicBundle/) to find optimal Lagrangian variables $\lambda$ and Algorithm 1 to solve the one-dimensional subproblems. The bundle method only gives us a dual lower bound on the value of the optimal reconstruction.
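To show the mechanics of the dual decomposition on a toy instance, the sketch below maximizes (CTG) with plain projected subgradient ascent and brute-force subproblem solvers. Both are stand-ins chosen for brevity: the paper uses ConicBundle and Algorithm 1 instead. All names (`solve_sub`, `dual_ascent`), the step size $1/t$ and the $2\times 2$ example are assumptions, and every subproblem is assumed to be feasible.

```python
from itertools import product


def tv(a, b):
    return abs(a - b)


def solve_sub(nodes, b, lam, k):
    """Brute-force minimiser of E_i(x | lambda_i) on one chain with sum constraint b."""
    best_val, best_x = float("inf"), None
    for x in product(range(k), repeat=len(nodes)):
        if sum(x) != b:
            continue
        val = sum(lam[u][a] for u, a in zip(nodes, x))             # Lagrangian unaries
        val += sum(tv(x[t], x[t + 1]) for t in range(len(x) - 1))  # chain pairwise terms
        if val < best_val:
            best_val, best_x = val, x
    return best_val, best_x


def dual_ascent(rays, bs, k, iters=100):
    """Projected subgradient ascent on (CTG); returns the best dual value found."""
    lam = [{u: [0.0] * k for u in nodes} for nodes in rays]
    owners = {}                       # node -> indices of subproblems containing it
    for i, nodes in enumerate(rays):
        for u in nodes:
            owners.setdefault(u, []).append(i)
    best = float("-inf")
    for t in range(1, iters + 1):
        sols = [solve_sub(nodes, b, lam[i], k)
                for i, (nodes, b) in enumerate(zip(rays, bs))]
        best = max(best, sum(val for val, _ in sols))
        step = 1.0 / t
        for u, own in owners.items():
            for a in range(k):
                # Supergradient entry: did subproblem i choose label a at node u?
                ind = [1.0 if sols[i][1][rays[i].index(u)] == a else 0.0 for i in own]
                mean = sum(ind) / len(own)
                for i, g in zip(own, ind):
                    lam[i][u][a] += step * (g - mean)   # keeps sum_i lambda_{i,u} = 0
    return best


# 2 x 2 image [[0, 2], [0, 2]], pixels 0..3 row-major, labels {0, 1, 2}.
rays = [[0, 1], [2, 3], [0, 2], [1, 3]]   # two row rays, two column rays
bs = [2, 2, 0, 4]
lower_bound = dual_ascent(rays, bs, k=3)
```

On this instance the only feasible labeling is $(0,2,0,2)$ with TV energy $4$. At $\lambda=0$ the subproblems disagree and the dual value is $0$; after the first update the bound already reaches $4$.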

Primal solution.

To obtain a feasible reconstruction, we solve a reduced problem by excluding labels with high cost: Given dual variables $\lambda_{i}$ , let $x^{*}$ be the optimal solution to the $i$ -th subproblem on variables $U_{i}\subseteq\operatorname{\mathsf{V}}$ . For each label $x_{u}\in\operatorname{\mathcal{X}}_{u},u\in U_{i}$ , $x_{u}\neq x_{u}^{*}$ we compute the energy $E_{i}(x^{\prime*}|\lambda_{i})$ of the minimal reconstruction $x^{\prime*}\in\operatorname*{arg\,min}_{\{x^{\prime}\in\operatorname{\mathcal{X}}_{U_{i}}:x^{\prime}_{u}=x_{u}\}}E_{i}(x^{\prime}|\lambda_{i})$ for subproblem $i$ when the label at $u$ is fixed to $x_{u}$ (this value can be read off from the reparametrization output by Algorithm 1). We consider the label $x_{u}$ only if the gap $E_{i}(x^{\prime*}|\lambda_{i})-E_{i}(x^{*}|\lambda_{i})$ is smaller than some given threshold. We construct the discrete tomography problem on this reduced set of possible labelings and solve it with CPLEX [1].
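The label pruning step can be sketched as below for a single subproblem. The input format (a dictionary of min-marginal energies per node) and the name `prune_labels` are hypothetical; how the kept label sets of different subproblems sharing a node are combined is not specified in the text and is left out here.

```python
def prune_labels(min_marginals, threshold):
    """Keep label a at node u only if its gap to the subproblem optimum is below threshold.

    min_marginals[u][a] is the minimal subproblem energy with x_u fixed to a,
    so min(min_marginals[u]) equals the unconstrained subproblem optimum.
    """
    kept = {}
    for u, costs in min_marginals.items():
        best = min(costs)
        kept[u] = [a for a, c in enumerate(costs) if c - best < threshold]
    return kept


example = {"u1": [0.0, 0.4, 3.0], "u2": [2.0, 0.0, 0.1]}
print(prune_labels(example, threshold=0.5))  # {'u1': [0, 1], 'u2': [1, 2]}
```

A larger threshold keeps more labels and makes the reduced problem harder to solve; a smaller one risks removing the labels needed for a feasible reconstruction.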

Comparison to previously used relaxation.

It can be shown that the algorithms in [9, 19, 17, 3] use the following relaxation.

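$$\begin{array}[]{rl}\min_{\mu\geq 0}&\sum_{u\in\operatorname{\mathsf{V}}}\langle\theta_{u},\mu_{u}\rangle+\sum_{uv\in\operatorname{\mathsf{E}}}\langle\theta_{uv},\mu_{uv}\rangle\\ \text{s.t.}&\sum_{x_{u}\in\operatorname{\mathcal{X}}_{u}}\mu_{u}(x_{u})=1\quad\forall u\in\operatorname{\mathsf{V}}\\ &\begin{array}[]{c}\sum_{x_{u}\in\operatorname{\mathcal{X}}_{u}}\mu_{uv}(x_{u},x_{v})=\mu_{v}(x_{v})\quad\forall x_{v}\in\operatorname{\mathcal{X}}_{v}\\ \sum_{x_{v}\in\operatorname{\mathcal{X}}_{v}}\mu_{uv}(x_{u},x_{v})=\mu_{u}(x_{u})\quad\forall x_{u}\in\operatorname{\mathcal{X}}_{u}\end{array}\quad\forall uv\in\operatorname{\mathsf{E}}\\ &\sum_{u\in\operatorname{\mathsf{V}}}A_{iu}\cdot\left(\sum_{x_{u}\in\operatorname{\mathcal{X}}_{u}}x_{u}\cdot\mu_{u}(x_{u})\right)=b_{i}\quad i=1,\ldots,m\,.\end{array} \qquad \text{(STD)}$$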

This relaxation (STD) is the straightforward generalization of the local polytope relaxation [18] to the discrete tomography problem. The only difference is the linear constraints in the last line of (STD). When specialized to the one-dimensional discrete tomography problem (2), the difference between (STD) and our approach is: for (STD) the tomographic projections are directly enforced through the unary marginals $\mu_{u},u\in\operatorname{\mathsf{V}}$ instead of enforcing them through the counting factors and higher order ones as we did in Section 3. This more simplistic relaxation (STD) is however less tight.

Proposition 1

Relaxation (STD) is less tight than (CTG).

Proof

Relaxation (STD) is equivalent to applying it to each tomographic projection separately and then joining every subproblem by Lagrangian variables as we did with our approach above (CTG), see [15, Section 1.6]. Hence, it is enough to show that (STD) is not tight in the one-dimensional case (2). We give a counter-example. Assume $\operatorname{\mathcal{X}}_{u}=\{0,1\}$ $\forall u\in U$ and we are given Potts pairwise potentials $\theta_{uv}(x_{u},x_{v})=\begin{cases}0,&x_{u}=x_{v}\\ 1,&x_{u}\neq x_{v}\end{cases}$ and zero unary potentials $\theta_{u}\equiv 0$ . Set unary marginals $\mu_{u}(1)=\frac{b}{\lvert U\rvert}$ and $\mu_{u}(0)=1-\mu_{u}(1)$ $\forall u\in U$ and pairwise marginals as $\mu_{uv}(x_{u},x_{v})=\begin{cases}\mu_{u}(x_{u}),&x_{u}=x_{v}\\ 0,&x_{u}\neq x_{v}\end{cases}$ . Such marginals are feasible for (STD), yet give cost $0$ , since all pairwise mass sits on equal label pairs. On the other hand, for e.g. $b=1$ and $\lvert U\rvert>1$ any integral solution must contain at least one label transition, which the Potts potential penalizes with cost $1$ .∎
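The counter-example can be verified numerically. The sketch below builds the fractional marginals from the proof with exact rational arithmetic, checks the (STD) constraints for a chain, and compares the resulting cost with the brute-force integral optimum; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def potts(a, c):
    return 0 if a == c else 1


def std_fractional_cost(n, b):
    """Build the marginals from the proof, check (STD) feasibility, return their cost."""
    mu_u = {1: Fraction(b, n), 0: 1 - Fraction(b, n)}   # identical for every node
    mu_uv = {(a, c): (mu_u[a] if a == c else Fraction(0))
             for a in (0, 1) for c in (0, 1)}           # identical for every edge
    assert sum(mu_u.values()) == 1                      # normalisation
    for a in (0, 1):                                    # marginalisation, both directions
        assert sum(mu_uv[(a, c)] for c in (0, 1)) == mu_u[a]
        assert sum(mu_uv[(c, a)] for c in (0, 1)) == mu_u[a]
    assert n * sum(a * mu_u[a] for a in (0, 1)) == b    # ray constraint
    # n - 1 chain edges, each with the same pairwise marginal
    return (n - 1) * sum(potts(a, c) * m for (a, c), m in mu_uv.items())


def integral_optimum(n, b):
    """Brute-force optimum of (2) over binary labelings with Potts potentials."""
    return min(sum(potts(x[i], x[i + 1]) for i in range(n - 1))
               for x in product((0, 1), repeat=n) if sum(x) == b)


print(std_fractional_cost(4, 1), integral_optimum(4, 1))  # 0 1
```

The gap between the two values is exactly the looseness of (STD) on this instance.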

5 Experiments

Test images.

We used $200$ randomly generated $32\times 32$ images with three distinct intensity values $\{0,1,2\}$ , examples of which can be seen in Figure 5. Matrices $A$ for the tomographic projections were constructed as in [13]. For each test image we consider two tomographic problems: (i) measuring along horizontal and vertical directions or (ii) measuring along horizontal, vertical and two diagonal directions (left upper to right lower and left lower to right upper corner). This gives 400 test problems in total. Potentials for energy $E$ in (1) are: unary potentials are zero, while pairwise ones are $\theta_{uv}=\lvert x_{u}-x_{v}\rvert$ (that corresponds to TV). Due to integrality of all costs, optimality is ascertained through a duality gap $<1$ .
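As a rough illustration of the two measurement setups, the sketch below enumerates rays along horizontal, vertical and both diagonal directions, where each ray simply sums the pixels on one line. This simple 0/1 construction and the name `projection_rays` are assumptions; the construction of $A$ in [13] may differ in detail.

```python
def projection_rays(h, w, diagonals=False):
    """Return rays as lists of row-major pixel indices of an h x w image."""
    rays = [[r * w + c for c in range(w)] for r in range(h)]     # horizontal rays
    rays += [[r * w + c for r in range(h)] for c in range(w)]    # vertical rays
    if diagonals:
        for d in range(-(h - 1), w):        # left upper to right lower: c - r = d
            rays.append([r * w + (r + d) for r in range(h) if 0 <= r + d < w])
        for d in range(h + w - 1):          # left lower to right upper: r + c = d
            rays.append([r * w + (d - r) for r in range(h) if 0 <= d - r < w])
    return rays


def measurements(rays, x):
    """Right-hand side b: sum of the labels along each ray."""
    return [sum(x[u] for u in ray) for ray in rays]


image = [0, 1, 2,
         0, 1, 2,
         0, 0, 2]
two_dirs = projection_rays(3, 3)
four_dirs = projection_rays(3, 3, diagonals=True)
print(len(two_dirs), len(four_dirs), measurements(two_dirs, image))
```

Each ray corresponds to one row of $A$ and hence to one one-dimensional subproblem (2).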

Algorithms.

We identify our solvers by a prefix {CTG|STD} depending on whether (CTG) or (STD) is solved and by a suffix {CB|relax|BB} depending on whether ConicBundle, CPLEX [1] or CPLEX with branch and bound enabled was utilized. This gives in total 5 solvers: CTG_CB, CTG_relax, CTG_BB, STD_relax and STD_BB. We set a timelimit of 1 hour for all algorithms.

Unfortunately, CPLEX cannot solve problems larger than $32\times 32$ . When solving the relaxation (CTG), it already consumes multiple GB of memory for $32\times 32$ images. Solving (STD) on the other hand leads to low memory consumption, but CPLEX takes too much time for larger problems ( $>1$ hour). Hence, to have a baseline, we stick to $32\times 32$ images.

Results.

We have proved in Proposition 1 that relaxation (STD) is less tight than our relaxation (CTG). In fact, the first line in Table 2 shows that (CTG) yields a strictly better lower bound on $350$ instances. Furthermore, our tighter relaxation also helps in giving optimality certificates. In Table 1 we confirm this numerically: STD_relax can provide optimality certificates $53$ times, while CTG_CB and CTG_relax can do so in total $205$ times. Interestingly, when using the branch and bound capabilities of CPLEX, the picture changes and STD_BB outperforms CTG_BB. This is probably due to the fact that CPLEX can solve the underlying relaxation (STD) much faster than (CTG). We conjecture that the picture will change if the more efficient implementation CTG_CB is used as a bounds provider inside a branch and bound solver. This is however outside the scope of our work.

In Figure 6 we give a detailed plot on how much our relaxation (CTG) improved upon (STD).

Our relaxation also helps in reconstructing the signal. Out of the 238 instances where our heuristic found an optimal integral solution (third line in Table 2), there were 12 cases where only our heuristic did so (second line in Table 2).

6 Conclusion

We have proposed a novel convex relaxation and an accompanying algorithm for the non-binary discrete tomography problem. We have shown theoretically and empirically that our novel relaxation is tighter than the traditionally used relaxation. Solving our new relaxation helps in decoding tomographic reconstructions.

Bibliography

[1] IBM ILOG CPLEX Optimizer. http://www-01.ibm.com/software/integration/optimization/cplex-optimizer/.
[2] K. J. Batenburg. An evolutionary algorithm for discrete tomography. Discr. Appl. Math., 151(1):36–54, 2005.
[3] K. J. Batenburg. A network flow algorithm for reconstructing binary images from continuous x-rays. JMIV, 30(3):231–248, 2008.
[4] K. J. Batenburg and J. Sijbers. DART: A practical reconstruction algorithm for discrete tomography. IEEE TIP, 20(9):2542–2553, 2011.
[5] M. Bussieck, H. Hassler, G. J. Woeginger, and U. T. Zimmermann. Fast algorithms for the maximum convolution problem. Oper. Res. Let., 15:1–5, 1994.
[6] B. M. Carvalho, G. T. Herman, S. Matej, C. Salzberg, and E. Vardi. Binary tomography for triplane cardiography. IPMI, 1999.
[7] S. Foucart and H. Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhäuser Basel, 2013.
[8] E. Gouillart, F. Krzakala, M. Mézard, and L. Zdeborová. Belief-propagation reconstruction for discrete tomography. Inverse Problems, 29(3):035003, 2013.