Accelerating optimization-based computed tomography via sparse matrix   approximations

Richard C. Barnard; Rick Archibald

arXiv:1705.07497·math.NA·May 23, 2017

Accelerating optimization-based computed tomography via sparse matrix approximations

Richard C. Barnard, Rick Archibald

PDF

Open Access

TL;DR

This paper introduces two methods to accelerate computed tomography reconstruction by approximating the Radon transform with sparse matrices, significantly reducing computation time while maintaining reconstruction quality.

Contribution

It proposes novel sparse matrix approximation techniques for Radon transform evaluations applicable to various iterative algorithms in CT reconstruction.

Findings

01

Significant reduction in computational time for iterative CT algorithms

02

Maintained high-quality reconstructions with the proposed approximations

03

Applicable to both general iterative and error-forgetting algorithms

Abstract

Variational formulations of reconstruction in computed tomography have the notable drawback of requiring repeated evaluations of both the forward Radon transform and either its adjoint or an approximate inverse transform which are relatively expensive. We look at two methods for reducing the effect of this resulting computational bottleneck via approximating the transform evaluation with sparse matrix multiplications. The first method is applicable for general iterative optimization algorithms. The second is applicable in error-forgetting algorithms such as split Bregman. We demonstrate these approximations significantly reduce the needed computational time needed for the iterative algorithms needed to solve the reconstruction problem while still providing good reconstructions.

Tables5

Table 1. Table 1 . Total time needed to complete 200 Split Bregman iterations and total time needed during conjugate gradient steps only

Method	$N_{x} = 128$		$N_{x} = 256$		$N_{x} = 512$
	Total	CG	Total	CG	Total	CG
NUFFT, $M_{s p} = 2$	33.2305	27.9056	142.0550	118.9043	581.6318	479.4620
Convolution Fusing, $M_{s p} = 2$	9.6462	7.7387	43.5623	34.7016	199.0465	154.3572
Sparse Surrogate, $M_{s p} = 2$	3.6270	1.8375	18.1970	9.1952	93.2822	49.6176
NUFFT, $M_{s p} = 6$	115.6706	98.5060	462.2991	393.7125	1810.3551	1530.83
Convolution Fusing, $M_{s p} = 6$	37.3927	34.1586	160.1846	145.3728	666.03458	598.0480
Sparse Surrogate, $M_{s p} = 6$	4.8572	1.7213	24.7311	9.0292	116.5725	48.3647
NUFFT, $M_{s p} = 12$	371.3576	317.8252	1470.7238	1258.3341	6334.8399	5403.2094
Convolution Fusing, $M_{s p} = 12$	130.4410	122.7075	658.4656	619.9470	3122.7715	2951.3699
Sparse Surrogate, $M_{s p} = 12$	9.8901	1.7619	39.4063	7.9888	204.9899	47.2547

Table 2. Table 2 . Relative L 1 superscript 𝐿 1 L^{1} errors after 200 Split Bregman iterations, 5 CG steps per iteration

Method	$N_{x} = 128$	$N_{x} = 256$	$N_{x} = 512$
NUFFT, $M_{s p} = 2$	0.090699645	0.056697226	0.042466068
Convolution Fusing, $M_{s p} = 2$	0.090704816	0.056725578	0.042493176
Sparse Surrogate, $M_{s p} = 2$	0.305754799	0.199389404	0.133857662
NUFFT, $M_{s p} = 6$	0.090241338	0.056122227	0.042219283
Convolution Fusing, $M_{s p} = 6$	0.090235638	0.056130045	0.042173041
Sparse Surrogate, $M_{s p} = 6$	0.304973205	0.199030832	0.133646147
NUFFT, $M_{s p} = 12$	0.090241343	0.056122246	0.042219412
Convolution Fusing, $M_{s p} = 12$	0.090235639	0.056130049	0.042173149
Sparse Surrogate, $M_{s p} = 12$	0.304972812	0.199030794	0.133646149

Table 3. Table 3 . Time and iterations needed for Bregman updates to fall below listed tolerances, when N x = 512 subscript 𝑁 𝑥 512 N_{x}=512 and N x = 512 subscript 𝑁 𝑥 512 N_{x}=512 and N x π 4 subscript 𝑁 𝑥 𝜋 4 N_{x}\frac{\pi}{4} angular measurements. ∗ * denotes failure of convergence to listed tolerance, along with time needed to reach maximum iteration.

Method		Bregman update size
		$10^{- 2}$	$10^{- 4}$	$10^{- 6}$	$10^{- 8}$
Convolution Fusing	Iterations needed	676	5984	(*)6000	(*)
	Time Needed	2368.153445	22560.0831	22714.6989	(*)
	Rel. $L^{1}$ error	0.035400827	0.00064873	0.00064175	(*)
Sparse Surrogate	Iterations needed	65	726	1835	3365
	Time Needed	44.807082	489.120469	1199.41347	2175.546164
	Rel. $L^{1}$ error	0.271823575	0.0189007	8.47635E-05	$1.6814 \times 10^{- 5}$

Table 4. Table 4 . Time and iterations needed for Bregman updates to fall below listed tolerances, when N x = 512 subscript 𝑁 𝑥 512 N_{x}=512 , with N x π 8 subscript 𝑁 𝑥 𝜋 8 N_{x}\frac{\pi}{8} angular measurements. ( ∗ ) (*) denotes failure of convergence to listed tolerance, along with time needed to reach maximum iteration.

Method		Bregman update size
		$10^{- 2}$	$10^{- 4}$	$10^{- 6}$	$10^{- 8}$
Convolution Fusing	Iterations needed	60	589	$(*)$ 6000	$(*)$
	Time Needed	221.73419	2071.667173	18517.54548	$(*)$
	Rel. $L^{1}$ error	0.259376623	0.00648995	$4.23422 \times 10^{- 5}$	$(*)$
Sparse Surrogate	Iterations needed	19	270	870	1765
	Time Needed	12.094469	168.529048	538.209406	1114.983433
	Rel. $L^{1}$ error	0.409244885	0.034075116	0.000182022	$1.7571 \times 10^{- 6}$

Table 5. Table 5 . Time and iterations needed for Bregman updates to fall below listed tolerances, when N x = 512 subscript 𝑁 𝑥 512 N_{x}=512 , with N x π 16 subscript 𝑁 𝑥 𝜋 16 N_{x}\frac{\pi}{16} angular measurements. ( ∗ ) (*) denotes failure of convergence to listed tolerance, along with time needed to reach maximum iteration.

Method		Bregman update size
		$10^{- 2}$	$10^{- 4}$	$10^{- 6}$	$10^{- 8}$
Convolution Fusing	Iterations needed	7	159	1116	4815
	Time Needed	20.335822	456.510441	2926.561617	11480.16517
	Rel. $L^{1}$ error	0.589578325	0.019544919	0.000134007	$2.96197 \times 10^{- 6}$
Sparse Surrogate	Iterations needed	3	86	471	1195
	Time Needed	1.821839	49.449029	270.114974	682.877529
	Rel. $L^{1}$ error	0.792041713	0.070293249	0.001610545	$3.48151 \times 10^{- 5}$

Equations37

μ min \frac{α}{2} ∥ M R μ - f ∥_{2}^{2} + \frac{1}{p} ∥ K μ ∥_{p}^{p}

μ min \frac{α}{2} ∥ M R μ - f ∥_{2}^{2} + \frac{1}{p} ∥ K μ ∥_{p}^{p}

\big{(}\alpha\mathcal{R}^{*}M^{*}M\mathcal{R}+K^{*}K\big{)}\mu=\mathcal{R}^{*}M^{*}f.

\big{(}\alpha\mathcal{R}^{*}M^{*}M\mathcal{R}+K^{*}K\big{)}\mu=\mathcal{R}^{*}M^{*}f.

u arg min \frac{∥ u - u ~ ∥ ^{2}}{2 τ} + \frac{α}{2} ∥ M R u - g ∥_{2}^{2}

u arg min \frac{∥ u - u ~ ∥ ^{2}}{2 τ} + \frac{α}{2} ∥ M R u - g ∥_{2}^{2}

\big{(}I+\tau\alpha\mathcal{R}^{*}M^{*}M\mathcal{R}\big{)}u=\tilde{u}+\tau\alpha\mathcal{R}^{*}M^{*}g.

\big{(}I+\tau\alpha\mathcal{R}^{*}M^{*}M\mathcal{R}\big{)}u=\tilde{u}+\tau\alpha\mathcal{R}^{*}M^{*}g.

\displaystyle\big{(}\alpha\mathcal{R}^{*}M^{*}M\mathcal{R}

\displaystyle\big{(}\alpha\mathcal{R}^{*}M^{*}M\mathcal{R}

μ min \frac{α}{2} ∥ F^{*} μ - \hat{f} ∥_{2}^{2} + \frac{1}{p} ∥ K μ ∥_{p}^{p} .

μ min \frac{α}{2} ∥ F^{*} μ - \hat{f} ∥_{2}^{2} + \frac{1}{p} ∥ K μ ∥_{p}^{p} .

(\alpha\mathcal{F}_{\text{NU}}\mathcal{F}_{\text{NU}}^{*}-\lambda K^{*}K\big{)}\mu=\alpha\mathcal{F}_{\text{NU}}\hat{f}+\lambda K^{*}v.

(\alpha\mathcal{F}_{\text{NU}}\mathcal{F}_{\text{NU}}^{*}-\lambda K^{*}K\big{)}\mu=\alpha\mathcal{F}_{\text{NU}}\hat{f}+\lambda K^{*}v.

F (k) = \frac{π}{τ} e^{k^{2} τ} F_{D} (f * g_{τ})_{k}

F (k) = \frac{π}{τ} e^{k^{2} τ} F_{D} (f * g_{τ})_{k}

g_{τ} (x) = ℓ = - \infty \sum + \infty e^{- (x - 2 ℓ π)^{2} /4 τ} .

g_{τ} (x) = ℓ = - \infty \sum + \infty e^{- (x - 2 ℓ π)^{2} /4 τ} .

A F_{D} B B^{*} F_{D}^{*} A

A F_{D} B B^{*} F_{D}^{*} A

B f (π m / M) = j \sum f_{j} ℓ = - \infty \sum \infty e^{(- x_{j} - π m / M - 2 π ℓ)^{2} /4 τ} .

B f (π m / M) = j \sum f_{j} ℓ = - \infty \sum \infty e^{(- x_{j} - π m / M - 2 π ℓ)^{2} /4 τ} .

j \in I_{m} \sum f (x_{j}) e^{- (x_{j} - π m / M - 2 π ℓ)^{2} /4 τ} .

j \in I_{m} \sum f (x_{j}) e^{- (x_{j} - π m / M - 2 π ℓ)^{2} /4 τ} .

\displaystyle BB^{*}f(x_{j})=\sum_{m\in J_{j}}\sum_{k\in I_{m}}\Big{(}E_{1}(j,m)E_{1}(k,m)

\displaystyle BB^{*}f(x_{j})=\sum_{m\in J_{j}}\sum_{k\in I_{m}}\Big{(}E_{1}(j,m)E_{1}(k,m)

\displaystyle e^{-(\pi(m_{j}-m)/2N)^{2}/\tau}e^{-(\pi(m_{k}-m)/2N)^{2}/\tau}\Big{)}f_{k},

\big{(}\alpha A\mathcal{F}_{D}(B^{*}B)\mathcal{F}_{D}^{*}A-\lambda K^{*}K\big{)}\mu=\alpha\mathcal{F}_{\text{NU}}^{*}\hat{f}+\lambda K^{*}v,

\big{(}\alpha A\mathcal{F}_{D}(B^{*}B)\mathcal{F}_{D}^{*}A-\lambda K^{*}K\big{)}\mu=\alpha\mathcal{F}_{\text{NU}}^{*}\hat{f}+\lambda K^{*}v,

u_{i, j} = {10 i = N /2, j = N /2 e l se .

u_{i, j} = {10 i = N /2, j = N /2 e l se .

T^{k,\ell}_{m,n}=\begin{cases}\big{[}\big{(}A\mathcal{F}_{D}(B^{*}B)\mathcal{F}_{D}^{*}A\big{)}(\delta_{i,N/2}\delta_{j,N/2})\big{]}_{m-k,n-\ell}&|(m-k,n-\ell)-(N/2,N/2)|\leq r\\ 0&\text{otherwise}\end{cases}.

T^{k,\ell}_{m,n}=\begin{cases}\big{[}\big{(}A\mathcal{F}_{D}(B^{*}B)\mathcal{F}_{D}^{*}A\big{)}(\delta_{i,N/2}\delta_{j,N/2})\big{]}_{m-k,n-\ell}&|(m-k,n-\ell)-(N/2,N/2)|\leq r\\ 0&\text{otherwise}\end{cases}.

T := k, ℓ = - N /2 \sum N /2 T^{k, ℓ}

T := k, ℓ = - N /2 \sum N /2 T^{k, ℓ}

(α T - λ K^{*} K) μ = α F_{NU}^{*} \hat{f} + λ K^{*} v .

(α T - λ K^{*} K) μ = α F_{NU}^{*} \hat{f} + λ K^{*} v .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsMedical Imaging Techniques and Applications · Sparse and Compressive Sensing Techniques · Advanced X-ray and CT Imaging

Full text

Accelerating optimization-based computed tomography via sparse matrix approximations

Richard C Barnard and Rick Archibald

Computer Science and Mathematics Division, Oak Ridge National Laboratory, One Bethel Valley Road, P.O. Box 2008, MS-6211, Oak Ridge, TN 37831-6211.

[email protected]

Abstract.

Variational formulations of reconstruction in computed tomography have the notable drawback of requiring repeated evaluations of both the forward Radon transform and either its adjoint or an approximate inverse transform which are relatively expensive. We look at two methods for reducing the effect of this resulting computational bottleneck via approximating the transform evaluation with sparse matrix multiplications. The first method is applicable for general iterative optimization algorithms. The second is applicable in error-forgetting algorithms such as split Bregman. We demonstrate these approximations significantly reduce the needed computational time needed for the iterative algorithms needed to solve the reconstruction problem while still providing good reconstructions.

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

1. Introduction

In the present work, we focus on computed tomographic reconstruction methods based around solving a minimization problem of the general form

[TABLE]

for $p=1,2$ and some linear operator $K$ . For simplicity, we focus on the case of parallel beam tomography. Here, $\mathcal{R}$ denotes the Radon transform and $M$ is a linear operator, similar to a “row selector”, which restricts to those angles along which measurements are taken in constructing $f$ , which is the given data.

Standard methods for tomographic reconstruction typically use filtered back projections [10]. In [2, 1], the computational costs of such methods were reduced via efficient computations of the Radon transform and its inverse. However, in the case of situations where the angular measurements are under resolved, filtered back projections lead to high frequency artifacts in the reconstructions. Solving Equation 1, in the specific case where $p=1$ and $K=\nabla$ , provides improved reconstructions without these artifacts, as demonstrated in [3, 9]. Such methods have the drawback of having an increased computational cost. One source of this increase is in the need for repeatedly computing the Radon transform and its adjoint. After describing in some detail this computational bottleneck in section 2, we describe two methods for reducing this cost in section 3. The first relies on the fixed geometries of the tomographic reconstruction problem to replace relatively expensive transforms with sparse matrix multiplications. The second method focuses on taking advantage of the error-forgetting nature of algorithms such as the split Bregman [7] algorithm to reduce the computational costs even further. After describing these methods, we present in section 4 the ability of these two methods to reduce the computational costs of solving the total variation-penalized reconstruction problem.

2. A computational bottleneck in optimization-based reconstruction

We now turn to specifying the computational bottleneck we are interested in addressing. First, if $p=2,$ the solution to (1) solves the linear operator equation

[TABLE]

If $p=1,$ typically an iterative convex optimization algorithm is needed. Such algorithms often involve solving a series of smooth optimization problems. For instance, the primal-dual method of Chambolle and Pock [5] at each iterate will require us to solve smooth problems of the form

[TABLE]

which means solving

[TABLE]

Similarly, the split Bregman method of Goldstein and Osher [7] (used specifically for neutron tomography in [3]) requires repeatedly solving a linear system of the form

[TABLE]

In contrast to the cases considered in [7] and [5], the solutions to (2),(3), and (4) do not have closed form solutions. This means one must resort to iterative linear solvers. Due to the expense of evaluating $\mathcal{R}^{*}M^{*}M\mathcal{R}$ , a computational bottleneck arises in the iterative linear solvers. In the case of the primal-dual and the split Bregman methods on large-scale problems, this operator may need to be evaluated hundreds, or even thousands of times, necessitating an efficient implementation of its evaluation.

Let $M_{\theta_{i}}$ denote the restriction associated with a single direction $\theta_{i}$ . Via the Fourier slice theorem, for each angle $\theta_{i}$ , the one dimensional Fourier transform of $M_{\theta_{i}}\mathcal{R}\mu$ may be associated with a two dimensional Fourier transform of $\mu$ at points on a slice in the Fourier domain [4]. Repeating for all angles used in constructing $M$ in Equation 1, we may rewrite Equation 1 as

[TABLE]

However, in the discrete setting, $\mathcal{F}_{D}^{*}\mu-\hat{f}$ is evaluated on the slices corresponding to the discrete angles for which measurements have been taken, where $\mathcal{F}_{D}$ denotes a standard FFT. This results in $\mathcal{F}_{D}^{*}\mu$ being needed on a non-Cartesian grid; we denote by $\mathcal{F}_{\text{NU}}$ the associated discrete Fourier transform on nonuniform grids. This means that the split Bregman method, for instance, requires solving the linear system

[TABLE]

Similar statements hold for the analogues of Equations 2 and 3. Therefore, this modified minimization problem still poses the same computational difficulties as mentioned above, however now the computational bottleneck lies in the evaluation of $\mathcal{F}_{\text{NU}}\mathcal{F}_{\text{NU}}^{*}$ .

3. Overcoming the computational bottleneck

In this section we look at two methods for reducing the computational bottleneck outlined in section 2. The first method involves replacing $\mathcal{F}_{\text{NU}}^{*}\mathcal{F}_{\text{NU}}$ with sparse matrix multiplications and standard fast Fourier transforms. This method is applicable to speeding up the conjugate gradient methods needed in solving Equation 5 via any of the already mentioned iterative optimization methods. The second method is specifically applicable to the split Bregman method. Relying on the error-forgetting properties of the algorithm, we see dramatic reductions on the computational times needed without a loss in the overall performance of the algorithm. In each case, we replace evaluating much of the effort of evaluating $\mathcal{F}_{\text{NU}}\mathcal{F}_{\text{NU}}^{*}$ with sparse matrix multiplications. From an application perspective, these matrices depend primarily on the resolution of the instruments, the frequency of angular measurements, and parameters which typically take only a few discrete values. This means for a particular tomographic reconstruction workflow, one may precompute the relevant matrices offline, enabling significant speedup in situations where multiple reconstructions are needed.

3.1. Implementation of convolution fusing

In [8], nonuniform fast Fourier transforms (NUFFT) were developed, providing a computationally efficient method of computing these operators. We briefly review this construction. The one-dimensional, type 1 NUFFT of $f(x)$ , is computed by taking

[TABLE]

where $\tau$ is a parameter determining the accuracy desired, and $g_{\tau}$ is a periodic heat kernel of the form

[TABLE]

For notational convenience, we let $A$ be the linear operator defined by $A(f)(k):=\sqrt{\frac{\pi}{\tau}}e^{k^{2}\tau}f_{k}$ and let $Bf:=g_{\tau}*f(x).$

We may then rewrite $\mathcal{F}_{\text{NU}}\mathcal{F}_{\text{NU}}^{*}$ as

[TABLE]

(noting that $A$ is self-adjoint). As noted in [8], the main computational task in computing the NUFFT is evaluating

[TABLE]

In order to do this efficiently, a parameter $M_{sp}$ is chosen in order to achieve a desired level of accuracy in the NUFFT. For $m\in[0,2N-1]$ , we approximate $Bf$ by truncated summations, taking the nearest $2M_{sp}$ points around $x_{j}$ . Outside of this window, the Gaussians decay sufficiently sharply enough to be ignored.

Defining $m_{j}=\sup\{\tilde{m}:\pi\tilde{m}/M\leq x_{j}\}$ and $I_{m}:=\{j:|m_{j}-m|\leq M_{sp}\}$ , $Bf(\pi m/N)$ is approximately

[TABLE]

It is noted in [8], drawing from [6], that if $M_{sp}=6,~{}\tau=6/N^{2}$ , the NUFFT is accurate to 6 digits of accuracy and if $M_{sp}=12,~{}\tau=12/N^{2}$ , the NUFFT is accurate to 12 digits of accuracy.

We now turn to $B^{*}B$ , which takes the form

[TABLE]

where $J_{j}=\{m:|m_{j}-m|\leq M_{sp}\}.$ This operator is sparse; only grid points within $M_{sp}$ of $x_{j}$ are used, along with appropriate weights taken from the Gaussian sources. In higher dimensions, this sparsity structure holds; for instance in two dimensions, the only grid points used in computing $B^{*}Bf(x_{i,j})$ are those points such that $|m-i|<M_{sp}$ and $|j-n|<M_{sp}.$ The structure of $B^{*}B$ is determined by the radial resolution of the data $f$ , the set of angular measurements used, and the value $M_{sp}$ . The first depends on the accuracy of the instrumentation, which typically is fixed. In situations where the set of angular measurements is either fixed, or from one of several canonical sampling regimes, the $B^{*}B$ may be precomputed as a sparse matrix. The resulting sparse matrix fuses the convolutions into a single operation. Thus, the NUFFT’s needed in solving Equation 6 may be replaced by 2 FFT’s, a single sparse matrix-vector multiplication, and 2 diagonal matrix-vector multiplications.

3.2. Acceleration via surrogate matrices

Taking advantage of fusing the convolutions as outline above in section 3.1 means, at each iterate in the split Bregman algorithm, we now solve

[TABLE]

where now $B^{*}B$ has been replaced with a sparse matrix multiplication. However, the (split) Bregman algorithm has an error forgetting property [7]; that is, one only needs in general to solve Equation 9 approximately in order to still have a convergent algorithm. The benefit for doing exact Bregman updates lies in the ability of Bregman to converge in two exact updates, for sufficiently small initial Bregman distance, as discussed in [12]. However, we note that solving Equation 9 exactly (to machine precision) can be quite expensive. For large-scale imaging problems, the expense of exactly solving Equation 9 may offset the near immediate convergence of Bregman updates for good (with respect to the Bregman distance) initialization. Indeed, in [3], inexact Bregman updates were sufficient for a similar formulation of the tomographic reconstruction problem.

In order to further reduce the computational costs associated with solving Equation 9, we look to leverage the error-forgetting nature of the split Bregman algorithm by precomputing a sparse matrix $T$ which approximates $A\mathcal{F}_{D}(B^{*}B)\mathcal{F}_{D}^{*}A$ . We first consider the cases where $u$ is the simple $N\times N$ image given by

[TABLE]

The result of $\mathcal{F}^{*}A$ being applied to this $u$ is constant function. As described above in section 3.1, $B^{*}B\mathcal{F}^{*}_{D}A(\delta_{i,N/2}\delta_{j,N/2})$ will then be the sum of Gaussians.

Therefore, $\big{(}A\mathcal{F}_{D}(B^{*}B)\mathcal{F}_{D}^{*}A\big{)}(u)$ will be a (possibly rescaled) Gaussian centered at $(N/2,N/2)$ . This is shown in Figure 1(a) for $N=256$ where we show the central $128\times 128$ region when $M$ corresponds to a Shannon sampling frequency in angle. We will use the peakedness seen in that image to construct a $T$ . First, for a chosen truncation radius $r>0,$ define for some $k,\ell\in[-N/2,N/2]$ the matrix $T^{k,\ell}$ by

[TABLE]

We then define

[TABLE]

as the sum of these translated stencil matrices. This $T$ is used to compute split Bregman updates; instead of Equation 9, we solve (possibly approximately)

[TABLE]

It is important to recall that for an an image which is a single non-zero pixel near the image’s edge, e.g. $\tilde{u}_{i,j}=\delta_{i,k}\delta_{j,\ell}$ for some $k<r$ , $\big{(}A\mathcal{F}_{D}(B^{*}B)\mathcal{F}_{D}^{*}A\big{)}\tilde{u}$ will have nontrivial entries near the opposite edge, due to the periodicity of the operators. Our choice of $T$ suppresses these values on the opposite edge. The motivation for this choice is to induce a banded structure in $T$ , as seen in Figures 1(b) and 1(c); in this case $r=3,$ resulting in a banded matrix with 7 diagonal and off-diagonal entries per row. However, this is reasonable in the case of computed tomography where the sample to be measured is fully contained in the interior of the image. The choice of suppressing values on the opposite edge corresponds to an assumption that the object of interest in the reconstructed image is at least $r$ pixels from each edge of the image and that near the edge, the true $\mu$ can be assumed to be [math].

4. Numerical Implementation

In order to test the efficiency of using the sparse linear matrix representation, we solve Equation 1 for $K,p$ corresponding to anisotropic TV regularization via a split-Bregman algorithm (as discussed for the general problem in [7] and specifically for computed tomography in [11, 3]). Throughout, we use the Shepp-Logan phantom as the reference data and $\alpha=\lambda=1$ . All computations reported below were performed in MATLAB on a workstation using Intel 2.20 GHz Xeon Processors with 55 MB caches.

First, in order to compare the cost and accuracy of using the NUFFT as opposed to either the convolution fusing method or the use of sparse surrogates, we look at the time required to compute a fixed number of inexact Bregman iterations. We compute 200 inexact Bregman iterations; each of these iterations involves approximating the solution to Equation 4 via 5 conjugate gradient steps. Therefore, 1000 evaluations of $\alpha\mathcal{R}^{*}M^{*}M\mathcal{R}$ are needed. We perform this for various image sizes as well as various values of $M_{sp}$ used in computing the NUFFT and its corresponding $B$ . For each value of $N_{x},$ we choose $M$ that corresponds to using $N_{x}\frac{\pi}{4}$ evenly spaced angular measurements. We summarize the results in Tables 1 and 2 as well as Figure 2. First, in Table 1, we report the time required for performing 200 inexact Bregman iterations as well the portion of that time required for performing the conjugate gradient computations. As expected, increasing $M_{sp}$ increases the time required when using the NUFFT and convolution fusing-based optimizations. The sparsity structure of $T$ in the surrogate-based method is dominated by the choice of $r$ (here set to be $1$ ), resulting in the time needed being unchanged for different $M_{sp}$ . Additionally, the time required per evaluation of $\alpha\mathcal{R}^{*}M^{*}M\mathcal{R}$ using the NUFFT is, for $M_{sp}=2$ , roughly 3 or 4 times longer than using convolution fusing and 10-20 times longer than using sparse surrogates. At higher values of $M_{sp},$ the speedup using convolution fusing seems to hold; meanwhile the possible speedup by using sparse surrogates becomes one of several orders.

However, the speedup using sparse surrogates is tempered by a significant loss of accuracy in the reconstruction. This can be seen in Table 2, where we report the relative $L^{1}$ error of the reconstruction using these 200 split Bregman iterations. The sparse surrogate-based reconstruction results in a relative error that is between 2 and 5 times higher than reconstructions using the other methods. One perhaps surprising result from this test is that the choice of $M_{sp}$ appears to not significantly affect the quality of the reconstruction after a fixed number of iterations. Finally, in Figure 2, we display the relative errors and the $L^{1}$ norm of the updates over the iterations. Using either the NUFFT or convolution fusing results in nearly the same iterative reconstruction. Additionally, despite the higher error in the sparse surrogate method, the convergence rates are unchanged. Indeed, the sparse surrogate method leads, after a handful of iterations, to a iteration scheme which is largely monotonically reducing the error of the reconstruction, unlike the other two methods. These observations suggest that for a fixed number of iterations, convolution fusing provides a significant speedup over the NUFFT while not impacting the performance of the image reconstruction procedure. Meanwhile, the use of sparse surrogates decreases the accuracy at any given iterate. This also holds if one increases $r$ : for $r=3,$ $200$ Bregman updates require $136.0654$ seconds results in a relative $L^{1}$ error of $0.129994$ error. As expected, then, increasing $r$ results in more expensive (due to decreasing the sparsity of the resulting $T$ ) albeit more accurate iterations. However, due to the shorter time of each iteration in the sparse surrogate method, there is a possibility that the overall performance of solving (1) can be improved: we may replace relatively accurate iterations with more iterations which, individually, are significantly quicker to perform.

To test this, we look to solve (5) for $512\times 512$ images and $K,p$ again corresponding to anisotropic total variation regularization. We consider $M$ corresponding to taking $512\frac{\pi}{4d}$ angular measurements for $d=1,2,4$ and $T$ corresponding to $r=3.$ That is, we consider reconstruction problems where the angular measurements are under-resolved to differing degrees. In [3], the highly under-resolved case was of interest, as angular measurements are relatively expensive in some applications. We will compare the results of using split Bregman to solve (5) with both convolution fusing and with sparse surrogates. That is, we will compare using (9) with using (11) to compute the Bregman update; again, we use at most 5 conjugate gradient steps to compute this update. As a stopping criterion, we terminate either after a maximum of 6000 Bregman updates or if the $L^{1}$ norm of a Bregman update is less than $10^{-8}$ of the first Bregman update. In Tables 3, 4 and 5 we list the results of using each method to compute the Bregman update for $d=1,2,4$ , respectively. In each we show the time and number of iterations needed for Bregman updates to fall below a certain threshold, as well as the relative $L^{1}$ error of the reconstruction at that iteration. Additionally we show in Figures 3, 4 and 5 the $L^{1}$ error and size of the Bregman updates as a function of computational time for $d=1,2,4$ , respectively.

We see that for any fixed time budget, the sparse surrogate-based method provides improved accuracy in very few iterations. In the case of $d=1,$ the time needed to get $10^{-5}$ order relative $L^{1}$ error via the sparse surrogate method is approximately $10\%$ of the time needed to reach $10^{-4}$ $L^{1}$ error via convolution fusing. Similar observations hold for $d=2.$ However, for $d=4,$ the sparse surrogate-based method, while taking vastly less time than the convolution fusing-based method, results in a final result which is more inaccurate. This is due to prematurely terminating the reconstruction method, as the optimization routine detects a convergence due to the relatively small size of the Bregman updates. Nonetheless, the time needed to reach $3.4739\times 10^{-5}$ error via convolution fusing is $5104$ seconds, versus the $683$ seconds needed for sparse surrogates.

5. Conclusions

Approximating the transform $\mathcal{F}^{*}\mathcal{F}$ with sparse matrices reduces the computational bottleneck that arises in solving variational tomographic reconstruction problems. Convolution fusing replaces the most computationally expensive portion of the evaluation of this operator with a sparse matrix multiplication, reducing the computation time without any additional loss of accuracy. In cases where an error-forgetting algorithm is to be used, sparse surrogates further reduce the cost of the computation by replacing the entire operator with a sparse matrix; this comes at the cost of a loss of accuracy. However, the reduced time needed per iteration offsets the need for additional iterations, especially in the case of only moderately unresolved angular measurements. In the case of low angular resolution reconstructions where the spares surrogates result in prematurely terminating algorithms, it may prove useful as a warm-start strategy. That is, as we expect sparse surrogate-based iterations may prematurely converge, we use the result as an initial guess for fused convolution-based iterations which are more expensive.

Bibliography12

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] Fredrik Andersson. Fast inversion of the radon transform using log-polar coordinates and partial back-projections. SIAM Journal on Applied Mathematics , 65(3):818–837, 2005.
2[2] Fredrik Andersson, Marcus Carlsson, and Viktor V. Nikitin. Fast algorithms and efficient gpu implementations for the radon transform and the back-projection operator represented as convolution operators. SIAM Journal on Imaging Sciences , 9(2):637–664, 2016.
3[3] Richard Barnard, T. Toops, E. Nafziger, C. Finney, D. Splitter, Hassina Bilheux, and Rick Archibald. Total variation-based neutron computed tomography. Submitted, 2017.
4[4] Achi Brandt, Jordan Mann, Matvei Brodski, and Meirav Galun. A fast and accurate multilevel inversion of the radon transform. SIAM Journal on Applied Mathematics , 60(2):437–462, 2000.
5[5] Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision , 40(1):120–145, 2011.
6[6] A. Dutt and V. Rokhlin. Fast fourier transforms for nonequispaced data. SIAM Journal on Scientific Computing , 14(6):1368–1393, 1993.
7[7] Tom Goldstein and Stanley Osher. The split bregman method for l 1-regularized problems. SIAM Journal on Imaging Sciences , 2(2):323–343, 2009.
8[8] Leslie Greengard and June-Yub Lee. Accelerating the nonuniform fast fourier transform. SIAM Review , 46(3):443–454, 2004.