Accelerating optimization-based computed tomography via sparse matrix approximations
Richard C. Barnard, Rick Archibald

TL;DR
This paper introduces two methods to accelerate computed tomography reconstruction by approximating the Radon transform with sparse matrices, significantly reducing computation time while maintaining reconstruction quality.
Contribution
It proposes novel sparse matrix approximation techniques for Radon transform evaluations applicable to various iterative algorithms in CT reconstruction.
Findings
Significant reduction in computational time for iterative CT algorithms
Maintained high-quality reconstructions with the proposed approximations
Applicable to both general iterative and error-forgetting algorithms
Abstract
Variational formulations of reconstruction in computed tomography have the notable drawback of requiring repeated evaluations of both the forward Radon transform and either its adjoint or an approximate inverse transform which are relatively expensive. We look at two methods for reducing the effect of this resulting computational bottleneck via approximating the transform evaluation with sparse matrix multiplications. The first method is applicable for general iterative optimization algorithms. The second is applicable in error-forgetting algorithms such as split Bregman. We demonstrate these approximations significantly reduce the needed computational time needed for the iterative algorithms needed to solve the reconstruction problem while still providing good reconstructions.
| Method | ||||||
|---|---|---|---|---|---|---|
| Total | CG | Total | CG | Total | CG | |
| NUFFT, | 33.2305 | 27.9056 | 142.0550 | 118.9043 | 581.6318 | 479.4620 |
| Convolution Fusing, | 9.6462 | 7.7387 | 43.5623 | 34.7016 | 199.0465 | 154.3572 |
| Sparse Surrogate, | 3.6270 | 1.8375 | 18.1970 | 9.1952 | 93.2822 | 49.6176 |
| NUFFT, | 115.6706 | 98.5060 | 462.2991 | 393.7125 | 1810.3551 | 1530.83 |
| Convolution Fusing, | 37.3927 | 34.1586 | 160.1846 | 145.3728 | 666.03458 | 598.0480 |
| Sparse Surrogate, | 4.8572 | 1.7213 | 24.7311 | 9.0292 | 116.5725 | 48.3647 |
| NUFFT, | 371.3576 | 317.8252 | 1470.7238 | 1258.3341 | 6334.8399 | 5403.2094 |
| Convolution Fusing, | 130.4410 | 122.7075 | 658.4656 | 619.9470 | 3122.7715 | 2951.3699 |
| Sparse Surrogate, | 9.8901 | 1.7619 | 39.4063 | 7.9888 | 204.9899 | 47.2547 |
| Method | |||
|---|---|---|---|
| NUFFT, | 0.090699645 | 0.056697226 | 0.042466068 |
| Convolution Fusing, | 0.090704816 | 0.056725578 | 0.042493176 |
| Sparse Surrogate, | 0.305754799 | 0.199389404 | 0.133857662 |
| NUFFT, | 0.090241338 | 0.056122227 | 0.042219283 |
| Convolution Fusing, | 0.090235638 | 0.056130045 | 0.042173041 |
| Sparse Surrogate, | 0.304973205 | 0.199030832 | 0.133646147 |
| NUFFT, | 0.090241343 | 0.056122246 | 0.042219412 |
| Convolution Fusing, | 0.090235639 | 0.056130049 | 0.042173149 |
| Sparse Surrogate, | 0.304972812 | 0.199030794 | 0.133646149 |
| Method | Bregman update size | ||||
|---|---|---|---|---|---|
| Convolution Fusing | Iterations needed | 676 | 5984 | (*)6000 | (*) |
| Time Needed | 2368.153445 | 22560.0831 | 22714.6989 | (*) | |
| Rel. error | 0.035400827 | 0.00064873 | 0.00064175 | (*) | |
| Sparse Surrogate | Iterations needed | 65 | 726 | 1835 | 3365 |
| Time Needed | 44.807082 | 489.120469 | 1199.41347 | 2175.546164 | |
| Rel. error | 0.271823575 | 0.0189007 | 8.47635E-05 | ||
| Method | Bregman update size | ||||
|---|---|---|---|---|---|
| Convolution Fusing | Iterations needed | 60 | 589 | 6000 | |
| Time Needed | 221.73419 | 2071.667173 | 18517.54548 | ||
| Rel. error | 0.259376623 | 0.00648995 | |||
| Sparse Surrogate | Iterations needed | 19 | 270 | 870 | 1765 |
| Time Needed | 12.094469 | 168.529048 | 538.209406 | 1114.983433 | |
| Rel. error | 0.409244885 | 0.034075116 | 0.000182022 | ||
| Method | Bregman update size | ||||
|---|---|---|---|---|---|
| Convolution Fusing | Iterations needed | 7 | 159 | 1116 | 4815 |
| Time Needed | 20.335822 | 456.510441 | 2926.561617 | 11480.16517 | |
| Rel. error | 0.589578325 | 0.019544919 | 0.000134007 | ||
| Sparse Surrogate | Iterations needed | 3 | 86 | 471 | 1195 |
| Time Needed | 1.821839 | 49.449029 | 270.114974 | 682.877529 | |
| Rel. error | 0.792041713 | 0.070293249 | 0.001610545 | ||
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMedical Imaging Techniques and Applications · Sparse and Compressive Sensing Techniques · Advanced X-ray and CT Imaging
Accelerating optimization-based computed tomography via sparse matrix approximations
Richard C Barnard and Rick Archibald
Computer Science and Mathematics Division, Oak Ridge National Laboratory, One Bethel Valley Road, P.O. Box 2008, MS-6211, Oak Ridge, TN 37831-6211.
Abstract.
Variational formulations of reconstruction in computed tomography have the notable drawback of requiring repeated evaluations of both the forward Radon transform and either its adjoint or an approximate inverse transform which are relatively expensive. We look at two methods for reducing the effect of this resulting computational bottleneck via approximating the transform evaluation with sparse matrix multiplications. The first method is applicable for general iterative optimization algorithms. The second is applicable in error-forgetting algorithms such as split Bregman. We demonstrate these approximations significantly reduce the needed computational time needed for the iterative algorithms needed to solve the reconstruction problem while still providing good reconstructions.
This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
1. Introduction
In the present work, we focus on computed tomographic reconstruction methods based around solving a minimization problem of the general form
[TABLE]
for and some linear operator . For simplicity, we focus on the case of parallel beam tomography. Here, denotes the Radon transform and is a linear operator, similar to a “row selector”, which restricts to those angles along which measurements are taken in constructing , which is the given data.
Standard methods for tomographic reconstruction typically use filtered back projections [10]. In [2, 1], the computational costs of such methods were reduced via efficient computations of the Radon transform and its inverse. However, in the case of situations where the angular measurements are under resolved, filtered back projections lead to high frequency artifacts in the reconstructions. Solving Equation 1, in the specific case where and , provides improved reconstructions without these artifacts, as demonstrated in [3, 9]. Such methods have the drawback of having an increased computational cost. One source of this increase is in the need for repeatedly computing the Radon transform and its adjoint. After describing in some detail this computational bottleneck in section 2, we describe two methods for reducing this cost in section 3. The first relies on the fixed geometries of the tomographic reconstruction problem to replace relatively expensive transforms with sparse matrix multiplications. The second method focuses on taking advantage of the error-forgetting nature of algorithms such as the split Bregman [7] algorithm to reduce the computational costs even further. After describing these methods, we present in section 4 the ability of these two methods to reduce the computational costs of solving the total variation-penalized reconstruction problem.
2. A computational bottleneck in optimization-based reconstruction
We now turn to specifying the computational bottleneck we are interested in addressing. First, if the solution to (1) solves the linear operator equation
[TABLE]
If typically an iterative convex optimization algorithm is needed. Such algorithms often involve solving a series of smooth optimization problems. For instance, the primal-dual method of Chambolle and Pock [5] at each iterate will require us to solve smooth problems of the form
[TABLE]
which means solving
[TABLE]
Similarly, the split Bregman method of Goldstein and Osher [7] (used specifically for neutron tomography in [3]) requires repeatedly solving a linear system of the form
[TABLE]
In contrast to the cases considered in [7] and [5], the solutions to (2),(3), and (4) do not have closed form solutions. This means one must resort to iterative linear solvers. Due to the expense of evaluating , a computational bottleneck arises in the iterative linear solvers. In the case of the primal-dual and the split Bregman methods on large-scale problems, this operator may need to be evaluated hundreds, or even thousands of times, necessitating an efficient implementation of its evaluation.
Let denote the restriction associated with a single direction . Via the Fourier slice theorem, for each angle , the one dimensional Fourier transform of may be associated with a two dimensional Fourier transform of at points on a slice in the Fourier domain [4]. Repeating for all angles used in constructing in Equation 1, we may rewrite Equation 1 as
[TABLE]
However, in the discrete setting, is evaluated on the slices corresponding to the discrete angles for which measurements have been taken, where denotes a standard FFT. This results in being needed on a non-Cartesian grid; we denote by the associated discrete Fourier transform on nonuniform grids. This means that the split Bregman method, for instance, requires solving the linear system
[TABLE]
Similar statements hold for the analogues of Equations 2 and 3. Therefore, this modified minimization problem still poses the same computational difficulties as mentioned above, however now the computational bottleneck lies in the evaluation of .
3. Overcoming the computational bottleneck
In this section we look at two methods for reducing the computational bottleneck outlined in section 2. The first method involves replacing with sparse matrix multiplications and standard fast Fourier transforms. This method is applicable to speeding up the conjugate gradient methods needed in solving Equation 5 via any of the already mentioned iterative optimization methods. The second method is specifically applicable to the split Bregman method. Relying on the error-forgetting properties of the algorithm, we see dramatic reductions on the computational times needed without a loss in the overall performance of the algorithm. In each case, we replace evaluating much of the effort of evaluating with sparse matrix multiplications. From an application perspective, these matrices depend primarily on the resolution of the instruments, the frequency of angular measurements, and parameters which typically take only a few discrete values. This means for a particular tomographic reconstruction workflow, one may precompute the relevant matrices offline, enabling significant speedup in situations where multiple reconstructions are needed.
3.1. Implementation of convolution fusing
In [8], nonuniform fast Fourier transforms (NUFFT) were developed, providing a computationally efficient method of computing these operators. We briefly review this construction. The one-dimensional, type 1 NUFFT of , is computed by taking
[TABLE]
where is a parameter determining the accuracy desired, and is a periodic heat kernel of the form
[TABLE]
For notational convenience, we let be the linear operator defined by and let
We may then rewrite as
[TABLE]
(noting that is self-adjoint). As noted in [8], the main computational task in computing the NUFFT is evaluating
[TABLE]
In order to do this efficiently, a parameter is chosen in order to achieve a desired level of accuracy in the NUFFT. For , we approximate by truncated summations, taking the nearest points around . Outside of this window, the Gaussians decay sufficiently sharply enough to be ignored.
Defining and , is approximately
[TABLE]
It is noted in [8], drawing from [6], that if , the NUFFT is accurate to 6 digits of accuracy and if , the NUFFT is accurate to 12 digits of accuracy.
We now turn to , which takes the form
[TABLE]
where This operator is sparse; only grid points within of are used, along with appropriate weights taken from the Gaussian sources. In higher dimensions, this sparsity structure holds; for instance in two dimensions, the only grid points used in computing are those points such that and The structure of is determined by the radial resolution of the data , the set of angular measurements used, and the value . The first depends on the accuracy of the instrumentation, which typically is fixed. In situations where the set of angular measurements is either fixed, or from one of several canonical sampling regimes, the may be precomputed as a sparse matrix. The resulting sparse matrix fuses the convolutions into a single operation. Thus, the NUFFT’s needed in solving Equation 6 may be replaced by 2 FFT’s, a single sparse matrix-vector multiplication, and 2 diagonal matrix-vector multiplications.
3.2. Acceleration via surrogate matrices
Taking advantage of fusing the convolutions as outline above in section 3.1 means, at each iterate in the split Bregman algorithm, we now solve
[TABLE]
where now has been replaced with a sparse matrix multiplication. However, the (split) Bregman algorithm has an error forgetting property [7]; that is, one only needs in general to solve Equation 9 approximately in order to still have a convergent algorithm. The benefit for doing exact Bregman updates lies in the ability of Bregman to converge in two exact updates, for sufficiently small initial Bregman distance, as discussed in [12]. However, we note that solving Equation 9 exactly (to machine precision) can be quite expensive. For large-scale imaging problems, the expense of exactly solving Equation 9 may offset the near immediate convergence of Bregman updates for good (with respect to the Bregman distance) initialization. Indeed, in [3], inexact Bregman updates were sufficient for a similar formulation of the tomographic reconstruction problem.
In order to further reduce the computational costs associated with solving Equation 9, we look to leverage the error-forgetting nature of the split Bregman algorithm by precomputing a sparse matrix which approximates . We first consider the cases where is the simple image given by
[TABLE]
The result of being applied to this is constant function. As described above in section 3.1, will then be the sum of Gaussians.
Therefore, \big{(}A\mathcal{F}_{D}(B^{*}B)\mathcal{F}_{D}^{*}A\big{)}(u) will be a (possibly rescaled) Gaussian centered at . This is shown in Figure 1(a) for where we show the central region when corresponds to a Shannon sampling frequency in angle. We will use the peakedness seen in that image to construct a . First, for a chosen truncation radius define for some the matrix by
[TABLE]
We then define
[TABLE]
as the sum of these translated stencil matrices. This is used to compute split Bregman updates; instead of Equation 9, we solve (possibly approximately)
[TABLE]
It is important to recall that for an an image which is a single non-zero pixel near the image’s edge, e.g. for some , \big{(}A\mathcal{F}_{D}(B^{*}B)\mathcal{F}_{D}^{*}A\big{)}\tilde{u} will have nontrivial entries near the opposite edge, due to the periodicity of the operators. Our choice of suppresses these values on the opposite edge. The motivation for this choice is to induce a banded structure in , as seen in Figures 1(b) and 1(c); in this case resulting in a banded matrix with 7 diagonal and off-diagonal entries per row. However, this is reasonable in the case of computed tomography where the sample to be measured is fully contained in the interior of the image. The choice of suppressing values on the opposite edge corresponds to an assumption that the object of interest in the reconstructed image is at least pixels from each edge of the image and that near the edge, the true can be assumed to be [math].
4. Numerical Implementation
In order to test the efficiency of using the sparse linear matrix representation, we solve Equation 1 for corresponding to anisotropic TV regularization via a split-Bregman algorithm (as discussed for the general problem in [7] and specifically for computed tomography in [11, 3]). Throughout, we use the Shepp-Logan phantom as the reference data and . All computations reported below were performed in MATLAB on a workstation using Intel 2.20 GHz Xeon Processors with 55 MB caches.
First, in order to compare the cost and accuracy of using the NUFFT as opposed to either the convolution fusing method or the use of sparse surrogates, we look at the time required to compute a fixed number of inexact Bregman iterations. We compute 200 inexact Bregman iterations; each of these iterations involves approximating the solution to Equation 4 via 5 conjugate gradient steps. Therefore, 1000 evaluations of are needed. We perform this for various image sizes as well as various values of used in computing the NUFFT and its corresponding . For each value of we choose that corresponds to using evenly spaced angular measurements. We summarize the results in Tables 1 and 2 as well as Figure 2. First, in Table 1, we report the time required for performing 200 inexact Bregman iterations as well the portion of that time required for performing the conjugate gradient computations. As expected, increasing increases the time required when using the NUFFT and convolution fusing-based optimizations. The sparsity structure of in the surrogate-based method is dominated by the choice of (here set to be ), resulting in the time needed being unchanged for different . Additionally, the time required per evaluation of using the NUFFT is, for , roughly 3 or 4 times longer than using convolution fusing and 10-20 times longer than using sparse surrogates. At higher values of the speedup using convolution fusing seems to hold; meanwhile the possible speedup by using sparse surrogates becomes one of several orders.
However, the speedup using sparse surrogates is tempered by a significant loss of accuracy in the reconstruction. This can be seen in Table 2, where we report the relative error of the reconstruction using these 200 split Bregman iterations. The sparse surrogate-based reconstruction results in a relative error that is between 2 and 5 times higher than reconstructions using the other methods. One perhaps surprising result from this test is that the choice of appears to not significantly affect the quality of the reconstruction after a fixed number of iterations. Finally, in Figure 2, we display the relative errors and the norm of the updates over the iterations. Using either the NUFFT or convolution fusing results in nearly the same iterative reconstruction. Additionally, despite the higher error in the sparse surrogate method, the convergence rates are unchanged. Indeed, the sparse surrogate method leads, after a handful of iterations, to a iteration scheme which is largely monotonically reducing the error of the reconstruction, unlike the other two methods. These observations suggest that for a fixed number of iterations, convolution fusing provides a significant speedup over the NUFFT while not impacting the performance of the image reconstruction procedure. Meanwhile, the use of sparse surrogates decreases the accuracy at any given iterate. This also holds if one increases : for Bregman updates require seconds results in a relative error of error. As expected, then, increasing results in more expensive (due to decreasing the sparsity of the resulting ) albeit more accurate iterations. However, due to the shorter time of each iteration in the sparse surrogate method, there is a possibility that the overall performance of solving (1) can be improved: we may replace relatively accurate iterations with more iterations which, individually, are significantly quicker to perform.
To test this, we look to solve (5) for images and again corresponding to anisotropic total variation regularization. We consider corresponding to taking angular measurements for and corresponding to That is, we consider reconstruction problems where the angular measurements are under-resolved to differing degrees. In [3], the highly under-resolved case was of interest, as angular measurements are relatively expensive in some applications. We will compare the results of using split Bregman to solve (5) with both convolution fusing and with sparse surrogates. That is, we will compare using (9) with using (11) to compute the Bregman update; again, we use at most 5 conjugate gradient steps to compute this update. As a stopping criterion, we terminate either after a maximum of 6000 Bregman updates or if the norm of a Bregman update is less than of the first Bregman update. In Tables 3, 4 and 5 we list the results of using each method to compute the Bregman update for , respectively. In each we show the time and number of iterations needed for Bregman updates to fall below a certain threshold, as well as the relative error of the reconstruction at that iteration. Additionally we show in Figures 3, 4 and 5 the error and size of the Bregman updates as a function of computational time for , respectively.
We see that for any fixed time budget, the sparse surrogate-based method provides improved accuracy in very few iterations. In the case of the time needed to get order relative error via the sparse surrogate method is approximately of the time needed to reach error via convolution fusing. Similar observations hold for However, for the sparse surrogate-based method, while taking vastly less time than the convolution fusing-based method, results in a final result which is more inaccurate. This is due to prematurely terminating the reconstruction method, as the optimization routine detects a convergence due to the relatively small size of the Bregman updates. Nonetheless, the time needed to reach error via convolution fusing is seconds, versus the seconds needed for sparse surrogates.
5. Conclusions
Approximating the transform with sparse matrices reduces the computational bottleneck that arises in solving variational tomographic reconstruction problems. Convolution fusing replaces the most computationally expensive portion of the evaluation of this operator with a sparse matrix multiplication, reducing the computation time without any additional loss of accuracy. In cases where an error-forgetting algorithm is to be used, sparse surrogates further reduce the cost of the computation by replacing the entire operator with a sparse matrix; this comes at the cost of a loss of accuracy. However, the reduced time needed per iteration offsets the need for additional iterations, especially in the case of only moderately unresolved angular measurements. In the case of low angular resolution reconstructions where the spares surrogates result in prematurely terminating algorithms, it may prove useful as a warm-start strategy. That is, as we expect sparse surrogate-based iterations may prematurely converge, we use the result as an initial guess for fused convolution-based iterations which are more expensive.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Fredrik Andersson. Fast inversion of the radon transform using log-polar coordinates and partial back-projections. SIAM Journal on Applied Mathematics , 65(3):818–837, 2005.
- 2[2] Fredrik Andersson, Marcus Carlsson, and Viktor V. Nikitin. Fast algorithms and efficient gpu implementations for the radon transform and the back-projection operator represented as convolution operators. SIAM Journal on Imaging Sciences , 9(2):637–664, 2016.
- 3[3] Richard Barnard, T. Toops, E. Nafziger, C. Finney, D. Splitter, Hassina Bilheux, and Rick Archibald. Total variation-based neutron computed tomography. Submitted, 2017.
- 4[4] Achi Brandt, Jordan Mann, Matvei Brodski, and Meirav Galun. A fast and accurate multilevel inversion of the radon transform. SIAM Journal on Applied Mathematics , 60(2):437–462, 2000.
- 5[5] Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision , 40(1):120–145, 2011.
- 6[6] A. Dutt and V. Rokhlin. Fast fourier transforms for nonequispaced data. SIAM Journal on Scientific Computing , 14(6):1368–1393, 1993.
- 7[7] Tom Goldstein and Stanley Osher. The split bregman method for l 1-regularized problems. SIAM Journal on Imaging Sciences , 2(2):323–343, 2009.
- 8[8] Leslie Greengard and June-Yub Lee. Accelerating the nonuniform fast fourier transform. SIAM Review , 46(3):443–454, 2004.
