Fixing Nonconvergence of Algebraic Iterative Reconstruction with an Unmatched Backprojector
Yiqiu Dong, Per Christian Hansen, Michiel E. Hochstenbach, Nicolai, Andre Brogaard Riis

TL;DR
This paper addresses convergence issues in algebraic iterative reconstruction methods with unmatched projector/backprojector pairs by proposing a shifted algorithm that guarantees convergence under certain conditions, with practical implementation guidance.
Contribution
It introduces a shifted algorithm for unmatched algebraic reconstruction that ensures convergence and provides eigenvalue estimation techniques for parameter selection.
Findings
The shifted algorithm guarantees convergence under specific conditions.
Perturbation bounds for the fixed point are established.
Numerical tests demonstrate improved convergence in computed tomography.
Abstract
We consider algebraic iterative reconstruction methods with applications in image reconstruction. In particular, we are concerned with methods based on an unmatched projector/backprojector pair; i.e., the backprojector is not the exact adjoint or transpose of the forward projector. Such situations are common in large-scale computed tomography, and we consider the common situation where the method does not converge due to the nonsymmetry of the iteration matrix. We propose a modified algorithm that incorporates a small shift parameter, and we give the conditions that guarantee convergence of this method to a fixed point of a slightly perturbed problem. We also give perturbation bounds for this fixed point. Moreover, we discuss how to use Krylov subspace methods to efficiently estimate the leftmost eigenvalue of a certain matrix to select a proper shift parameter. The modified algorithm…
| 25 trials with opTomo that utilizes the GPU | ||
| Mean MVM | Mean (st. dev.) | |
| fov10 | 1660 | () |
| fov15 | 1960 | () |
| fov20 | 1260 | () |
| ks | 1041 | () |
| eigs | 1278 | () |
| 25 trials with and explicitly stored | ||
| Mean MVM | Mean (st. dev.) | |
| fov10 | 1660 | () |
| fov15 | 1960 | () |
| fov20 | 1260 | () |
| ks | 1039 | () |
| eigs | 1261 | () |
| jd | 1404 | () |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMedical Imaging Techniques and Applications · Advanced MRI Techniques and Applications · Numerical methods in inverse problems
\headers
Fixing NonconvergenceDong, Hansen, Hochstenbach, and Riis
Fixing Nonconvergence of Algebraic Iterative Reconstruction with
an Unmatched Backprojector††thanks: Submitted to the editors DATE. \fundingThis work has been partially funded by Advanced grant 291405 from the European Research Council. The third author has been supported by an NWO Vidi research grant.
Yiqiu Dong Department of Applied Mathematics and Computer Science, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark (, http://www.compute.dtu.dk/~yido/, , http://www.compute.dtu.dk/~pcha/, , http://www.compute.dtu.dk/~nabr/). [email protected]
Per Christian Hansen22footnotemark: 2
Michiel E. Hochstenbach Department of Mathematics and Computer Science, TU Eindhoven, 5600 MB Eindhoven, The Netherlands (http://www.win.tue.nl/~hochsten/)
Nicolai André Brogaard Riis22footnotemark: 2
Abstract
We consider algebraic iterative reconstruction methods with applications in image reconstruction. In particular, we are concerned with methods based on an unmatched projector/backprojector pair; i.e., the backprojector is not the exact adjoint or transpose of the forward projector. Such situations are common in large-scale computed tomography, and we consider the common situation where the method does not converge due to the nonsymmetry of the iteration matrix. We propose a modified algorithm that incorporates a small shift parameter, and we give the conditions that guarantee convergence of this method to a fixed point of a slightly perturbed problem. We also give perturbation bounds for this fixed point. Moreover, we discuss how to use Krylov subspace methods to efficiently estimate the leftmost eigenvalue of a certain matrix to select a proper shift parameter. The modified algorithm is illustrated with test problems from computed tomography.
keywords:
Unmatched transpose, algebraic iterative reconstruction, perturbation theory, leftmost eigenvalue estimation, computed tomography
{AMS}
65F10, 65F15, 65F22, 15A18, 15A60
1 Introduction
Algebraic iterative reconstruction techniques [10] — such as the methods by Kaczmarz and Cimmino — play an important role in solving inverse problems. In particular, they are popular in computed tomography (CT) due to their great flexibility with respect to the measurement geometry of the X-ray scanner and their ability to handle very underdetermined problems. This is in contrast to filtered backprojection and similar algorithms [16] that rely on specific geometries and a large amount of data. Algebraic iterative methods are also used successfully in other image reconstruction problems such as image deblurring.
The fundamental mechanism behind these methods is known as semi-convergence [16]. During the initial iterations, the iterates approach the exact (and unattainable) solution to the noise-free problem, while in later stages the iterates converge to the undesired noisy solution. The methods produce filtered, or regularized, solutions and the number of iterations plays the role of a regularization parameter [7].
The algebraic iterative methods, in their basic form as well as their block versions, lend themselves very well to implementations that utilize GPUs and other hardware accelerators, and where the coefficient matrix is never stored; rather, the matrix-vector multiplications are computed on the fly. This has led to an implementation paradigm that is routinely used in software packages for computed tomography (such as ASTRA [22] and TIGRE [1]), namely, to focus on the computational speed of the matrix-vector multiplication. However, this introduces a convergence issue that has largely been ignored.
To set the stage, we formulate the noise-free problem as
[TABLE]
where the vectors and represent the exact image and the noise-free data, while represents the forward model — known as the (forward) projection in CT where both and are common. The multiplication with , the transpose of , is known in CT as the backprojection. These two operations form the computational core of any algebraic iterative method and therefore — to optimize for computational speed — the software developers often choose different discretization schemes, and different model approximations, for these operations [24]. Consequently, in such software the backprojection corresponds to multiplication with a matrix that is not the exact transpose of . We refer to this situation as having an unmatched projector/backprojector pair, and we call the unmatched transpose.
It appears that very little attention has been given to iterations based on such unmatched pairs; see [25] for an early reference and [2], [13] for two more recent ones. The latter paper has introduced a methodology for analyzing such iterations and formulated conditions for convergence, and it has been shown that an unmatched projector/backprojector pair deteriorates the best possible solution at the point of semi-convergence.
In the common situation of an unmatched projector/backprojector pair where the convergence criterion from [2] is not satisfied, the iterations will fail to converge for noise-free data (although some kind of semi-convergence may be observed experimentally for noisy data). In this paper we show that a small and cost-efficient modification of the basic algorithm can guarantee convergence to a solution of a slightly perturbed problem. This ensures that the fast implementations of the forward projections and backprojections can still be used without sacrificing convergence. Moreover, to provide a theoretical foundation we extend the convergence and perturbation analysis from [2] to the modified algorithm.
Our paper has been organized as follows. Section 2 sets the stage by summarizing convergence results for a generic iterative algorithm (proposed in [2]) that allows an unmatched transpose. Section 3 introduces the modified algorithm, gives the associated convergence conditions, and discusses the perturbation theory for the underlying problem. The modified algorithm is based on the introduction of a shift parameter, and in Section 4 we discuss how to efficiently estimate the leftmost eigenvalue of a certain matrix, which defines this shift. Finally, in Section 5 we give numerical examples that illustrate the new method for solving inverse problems, followed by some conclusions in Section 6.
We use the following notations: denotes the vector and matrix 2-norm, and are the range and null space of , respectively, and we split a complex eigenvalue into its real and imaginary parts denoted by and , respectively. For a vector , we use for its conjugate transpose.
In Section 2 we use notations and concepts associated with oblique projections and oblique pseudoinverses to obtain compact expression that would otherwise be quite lengthy. We refer to [9] for details and geometric interpretations of these quantities in relation to inverse problems. Given two complementary subspaces and of that intersect trivially, and matrices , , and such that
[TABLE]
Then the matrix denotes the oblique projector onto along which satisfies
[TABLE]
and the projection matrix can be written as
[TABLE]
where † denotes the Moore–Penrose pseudoinverse. Moreover, if then the matrix denotes the oblique pseudoinverse of along , given by
[TABLE]
The case requires and to be complementary, while the case requires and to be complementary. If then is the orthogonal projector on and is the ordinary pseudoinverse.
2 The BA Iteration
When we consider noisy data , where the vector represents the perturbation, then it is common to compute a (weighted) least squares solution. In the simplest case with unit weights we can compute the solution by means of the Landweber iteration (or gradient descent method) with initial guess :
[TABLE]
where is a relaxation parameter satisfying .
To analyze the behavior of this and similar algebraic iterative methods with an unmatched transpose, we follow [2] and consider the BA Iteration defined by
[TABLE]
where different choices of the matrix give unmatched-transpose versions of known iterative methods. For example, can be an unmatched transpose for Landweber’s method, or can be an unmatched approximation to for Cimmino’s method; see [10] for an overview of methods.
The convergence of the BA Iteration is governed by the (complex) eigenvalues of the matrix : (5) converges if and only if the relaxation parameter and all nonzero satisfy
[TABLE]
see [2, Prop. 3.2] for details. Note that this specializes to the standard condition when .
We will now investigate when the BA Iteration (5) has a unique fixed point. From the definition of the BA Iteration (5) with it follows that any fixed point must satisfy . Moreover, it is also clear from (5) that all iterates ; in particular this holds for . Therefore, we can write any fixed point in the form for some vector . (This vector may not be unique, as one can add an arbitrary component , but this is irrelevant for what is to follow.) Inserting we obtain an equation for :
[TABLE]
Here, is an operator from to itself. Recall that and , where both and are possible in CT applications.
From (7) it follows there is a unique fixed point if and only if one of the following eight equivalent conditions holds.
Proposition 2.1**.**
Consider the two matrices and with ranks and with the singular value decompositions (SVDs) and . The following statements are equivalent:
- (i)
* is nonsingular (meaning that and imply that );* 2. (ii)
For every , the equation has a unique solution ; 3. (iii)
; 4. (iv)
; 5. (v)
; 6. (vi)
; 7. (vii)
* is nonsingular on and is nonsingular on ;* 8. (viii)
* and ; *
Proof 2.2**.**
*The equivalences follow relatively straightforwardly from the (dimensions of) nullspaces and ranges of , , , and . For (iv) we have with equality if and only if (i) holds. For (v) one has with equality if and only if (i) holds, where the ranks in (vi) are the dimensions of the subspaces of (v). A nonzero vector in cannot be mapped to zero by the consecutive action of , which is stated in (vii) and (viii). *
We assume from now on that there is a unique solution, and the next theorem provides specific expressions for this fixed point. We will see that, even in the absence of noise, the fixed point of (5) is not the exact solution . One way to understand this — following the discussion in [2] — is by the fact that the unmatched normal equations may be viewed as an oblique projection of , instead of the common orthogonal projection underlying the normal equations .
Theorem 2.3**.**
Assume that and satisfy the criteria in Proposition 2.1. Then the fixed point of the BA Iteration (5) with starting vector can be expressed in three ways:
[TABLE]
and for noise-free data the fixed point is given by
[TABLE]
*If and and have full rank then and . *
Proof 2.4**.**
By writing the fixed point as with it follows from (7) that , and the first two expressions in (8) are obtained by recognizing that and ; cf. (3). We now introduce the SVD
[TABLE]
where . Inserting this in (7) we get , and by multiplying from the left with we obtain . The solution of minimum norm is given by
[TABLE]
and hence
[TABLE]
By recognizing as the oblique projector onto along , cf. (2), and as the oblique pseudoinverse of along , cf. (3), we obtain the third expression in (8).
For the special case the fixed point satisfies , and hence we can write it as for some vector . According to , we obtain
[TABLE]
*and we recognize P_{\,\mathcal{R}(BA),\,\mathcal{N}(BA)}=B\,A\bigl{(}((B\,A)^{T})^{T}B\,A\bigr{)}^{\dagger}((B\,A)^{T})^{T} as the oblique projector onto along . *
The sensitivity of the fixed point to perturbations of the right-hand side can be characterized as follows.
Corollary 2.5**.**
Let and denote the fixed point of the BA Iteration when applied to the noise-free data and the noisy data , respectively. Then
[TABLE]
If and and have full rank then
[TABLE]
*When then the oblique pseudoinverse is the ordinary pseudoinverse and we obtain the traditional least-squares perturbation bound. *
Proof 2.6**.**
*The first bound is a direct consequence of (8), and the second bound follows from the fact that when . *
The conclusions to be drawn from the analysis in this section is that the conditions for the existence of a fixed point of the BA Iteration are rather strict, and that the fixed point is potentially very sensitive to data errors since the matrix is ill-conditioned in inverse problems. Moreover, it is very difficult to check the existence conditions in a practical application, and it appears that they are very often violated in the available software systems. This motivates the development of a modified iterative method that is always guaranteed to have a fixed point, which we introduce and analyze in the rest of this paper.
3 A Modified Algorithm
In our numerical studies with ASTRA and other software packages for CT, we have found that very often the convergence condition in (6) is violated in that has one or more eigenvalues with negative real part. As a consequence the iteration has no fixed point and the typical situation is that the iterates , after some iterations, start to diverge. We illustrate this in Section 5.
3.1 The Shifted BA Iteration
To remedy this non-convergence issue, we propose a modified version of the BA Iteration that has guaranteed convergence and whose fixed point approximates the exact solution . In addition, the modified method should exhibit semi-convergence properties similar to the BA Iteration. We refer to the modified algorithm as the Shifted BA Iteration, and it guarantees convergence of the iterations for appropriate choices of the two parameters and :
[TABLE]
This scheme is motivated by the Tikhonov problem,
[TABLE]
for which a gradient descent step takes the form
[TABLE]
Hence, if then with a properly chosen the iteration (10) converges to a Tikhonov solution . Below we study the convergence properties of (10) with .
The matrix that governs the iterations for the Shifted BA Iteration (10) is with as the identity matrix, whose eigenvalues are (where are the eigenvalues of ). Our key idea is that by a proper choice of the additional positive parameter we can ensure that all these eigenvalues have a nonnegative real part — thus ensuring convergence. The shift needs to be just large enough that for those . At the same time, if is small then the fixed point will be an approximation to the exact solution . Hence, in contrast to the BA Iteration the shifted version has a unique fixed point that is always attained for both noisy and noise-free data. Our new approach can therefore be viewed as a modification where both a regularization term and semi-convergence of the iterations are used for noisy data.
We note that the Shifted BA Iteration is mathematically equivalent to applying the BA Iteration to the augmented matrices and vector
[TABLE]
A similar idea was used in [3, §3.2], for the case , to perform convergence analysis for the case . According to [2, Props. 3.1 and 3.2], with the augmented matrices and vector defined in (12) we obtain the following convergence criterion for the Shifted BA Iteration.
Theorem 3.1**.**
Let denote those eigenvalues of that are different from . Then the Shifted BA Iteration (10) converges to a fixed point if and only if and satisfy
[TABLE]
Proof 3.2**.**
Replacing the matrices and in the BA Iteration with the augmented ones from (12), we define and . Then is the iteration matrix for the Shifted BA Iteration (10), i.e.,
[TABLE]
and any fixed point of (10) satisfies the equation
[TABLE]
Let be the orthogonal projector onto . In view of the presence of the projection , we consider the new coordinate system given by the orthogonal matrix , where and are matrices with orthonormal columns spanning and , respectively. We examine in the new coordinates where takes the form
[TABLE]
For the first block row, it holds that
[TABLE]
This shows the eigenvalue of is associated with the eigenspace . We see that because of the projector operator, this eigenvalue, which corresponds to eigenvalues of equal to , is irrelevant for convergence. From the second block row, it suffices to consider as operator , where it holds that . Combining these facts, we conclude that it is enough to consider the eigenvalues of , where is not equal to .
*Then, applying the results in [2, Props. 3.1 and 3.2] to the augmented matrices and vector (12), we obtain the sufficient and necessary condition of convergence with respect to and . *
When we have convergence, a fixed point of the Shifted BA Iteration satisfies
[TABLE]
and, similar to Theorem 2.3, we can characterize this fixed point as follows.
Theorem 3.3**.**
Assume that is nonsingular. The fixed point of the Shifted BA Iteration (10) applied to , with starting vector and , satisfies
[TABLE]
For noise-free data the fixed point satisfies
[TABLE]
Proof 3.4**.**
The first relation follows immediately from (16). To obtain the second relation we write and note that . Hence must satisfy
[TABLE]
and therefore has the general form with an arbitrary component in the null space of as well as :
[TABLE]
We note that is nonsingular due to our assumption that is nonsingular, cf. [11, Thm. 1.3.22]. Thus
[TABLE]
and it follows immediately that . The first results for follows immediately from (17). To show the second result let have the eigendecomposition ; then from which it follows that . The third result follows from the relation
[TABLE]
Note that the results in (17) can also be derived by applying the augmented matrices and vector defined in (12) to (8).
To summarize, we formulated the convergence conditions for the Shifted BA Iteration in terms of the shift and the relaxation parameter , and we gave explicit expressions for the fixed point of this iterative method.
3.2 First-Order Perturbation Analysis
Recall that the fixed point of the Shifted BA Iteration is the Tikhonov solution in (11) when . Following [2] it is natural to give a general perturbation analysis of the Tikhonov problem when different perturbations are introduced in the matrices and in the corresponding normal equations . A special instance of this analysis is when is an unmatched transpose of , and where our analysis lets us bound the difference between the fixed point and the exact solution .
We introduce the perturbed quantities
[TABLE]
with , and . Moreover we define as the solution to the Regularized Unmatched Normal Equations
[TABLE]
We want to compare to the exact solution . To do this, we introduce the Tikhonov solution to the unperturbed problem (11).
We then split the error into a perturbation error and a regularization error as follows:
[TABLE]
This approach allows us to study the effect of the matrix and right-hand side perturbations in isolation from the effect that Tikhonov regularization has on a noise-free system.
Theorem 3.5**.**
With the definitions in (18) and (19) we have the following first-order error bounds obtained by omitting higher-order terms:
[TABLE]
Proof 3.6**.**
Let and consider the perturbed system
[TABLE]
Moreover note that from (18) we have
[TABLE]
and
[TABLE]
where we have introduced
[TABLE]
Inserting these equations in (20) we obtain
[TABLE]
and rearranging we get
[TABLE]
Now using that and neglecting higher-order terms we get
[TABLE]
which can also be obtained by using the augmented form in the proof of [2, Prop. 2.1], i.e., replacing , , and with
[TABLE]
This leads to the bound
[TABLE]
where we use that, with being the th singular value of ,
[TABLE]
and
[TABLE]
For the relative error we get
[TABLE]
where we used that
[TABLE]
To complete the analysis we need to bound the regularization error associated with the noise-free system. To obtain a useful bound we need to incorporate the fact that we solve a discretized inverse problem. This is done in the following theorem from [5] (see also [7, Thm. 4.5.1]).
Theorem 3.7**.**
Introduce SVD of as and assume that the noise-free right-hand side is given by the model
[TABLE]
in which is a model parameter that controls the decay of these coefficients. Then
[TABLE]
In practice the Tikhonov regularization parameter is always less than [7]. This theorem then says that the noise-free problem must satisfy the discrete Picard condition for Tikhonov regularization to produce a useful result — which is, of course, the case for the imaging problems we have in mind.
To summarize our results, the shift parameter plays the following roles. The regularization error decreases as decreases, and if the noise-free data satisfies the discrete Picard condition (as we expect) then a small nonzero has little influence on the regularization error. On the other hand, as decreases then the perturbation error increases. We want to use an just large enough to ensure convergence.
4 Eigenvalue Estimation
The motivation behind the Shifted BA Iteration is to introduce a small positive shift parameter , just large enough to ensure that all the shifted eigenvalues have a positive real part, i.e., . To turn this principle into an efficient working algorithm, we therefore need to be able to estimate the leftmost eigenvalue of , the eigenvalue with the minimal real part. If then we just use the BA Iteration, otherwise we use the Shifted BA Iteration with slightly larger than . (In view of Theorem 3.1, we might theoretically even take exactly equal to this quantity, but this is not important in practice, since the approximation to the smallest real part of the leftmost eigenvalue will usually be an upper bound.) It is important to note that we only have actions with and at our disposal, and no actions with , , or exact shift-and-invert transformations, are possible.
Various approaches have been investigated by Meerbergen and coauthors for the rightmost eigenvalue of a matrix (or, equivalently, the leftmost of ). Several of these are “matrix-free”, which means that only matrix-vector products are necessary. An approach based on Chebyshev polynomials has been proposed in [14]. In [15], the search space is expanded by an approximation to , where is the current Ritz vector. However, these methods usually take a considerable number of matrix-vector multiplications to expand the search space by one vector.
Since the leftmost eigenvalue is an eigenvalue located at the exterior of the spectrum, Krylov based methods are often well suited. Stewart’s Krylov–Schur method [20] is one of the most popular methods to compute such eigenvalues. This method is essentially equivalent to implicitly restarted Arnoldi [19], as for instance implemented in Matlab’s eigs, but has a particularly elegant and easy-to-understand implementation. In our experiments, a custom-made implementation of the Krylov–Schur method proved to be, on average, a factor 1.2–1.3 faster than eigs in terms of matrix-vector multiplications. We give pseudocode for the Krylov–Schur method in Algorithm 1.
Algorithm 1: Krylov–Schur for the leftmost eigenvalue
Input: Minimal and maximal subspace dimensions , starting vector , tole-
rance tol, functions to perform matrix-vector multiplications with and .
Output: A pair that approximates the leftmost eigenpair ,
with .
1: Form the Krylov decomposition
2: for
3: M Extract Schur pairs from with ,
MM sorted on nondecreasing real part
4: M if
5: MMM Accept with leftmost Schur vector , stop
6: M end
7: M Truncate decomposition to dimension by selecting leftmost Schur vectors
8: M Expand the Krylov decomposition to dimension
9: end
This algorithm uses the Krylov decomposition from [20] which is a generalization of the Arnoldi decomposition and which may have complex factors. In Line 1, the first Krylov decomposition has , the th standard basis vector. In subsequent Krylov decompositions this vector is changed, as indicated below. In Line 4, we exploit the fact that
[TABLE]
As described in [20], the restart in Lines 7–8 is performed as follows. Suppose is the Schur decomposition with the most relevant Schur vectors (corresponding to the leftmost Ritz values in ) sorted in the beginning of . Then the method is restarted with instead of ; instead of ; and instead of . We present numerical experiments with this method in Section 5.2.
An alternative approach that uses inexact shift-and-invert operators by carrying out inner iterations is Jacobi–Davidson [18]. This inexact inner-outer type of method may be worthwhile when the leftmost eigenvalue is not well separated from neighboring eigenvalues. In our applications this does not seem the case, and Jacobi–Davidson performs usually worse than Krylov–Schur. In addition, it is not obvious to generate a preconditioner for a shifted version of , which would be very helpful for Jacobi–Davidson.
In our experiments with examples from computed tomography, we find that the matrix is often close to normal (in fact, even close to symmetric; cf. Section 5.2). For such problems, another alternative approach to approximate the leftmost eigenvalue is the following. Let
[TABLE]
be the field of values (or numerical range) of . Then we can expect the quantity
[TABLE]
to be close to the leftmost eigenvalue of . This would in principle be relatively easy to approximate, since it is equal to , which results in computing an exterior eigenvalue of a symmetric eigenproblem. This would mean that we can use a symmetric version of Krylov–Schur, which saves roughly half of the reorthogonalization costs.
Unfortunately, in our applications we do not have the action with or , and the described approach is not an option. However, as an alternative, we can still approximate by
[TABLE]
where is the matrix in the Krylov decomposition obtained with after several iterations. The computation of this quantity only requires the known , and therefore bypasses the need of the transposes of and . Although there are usually no error bounds for this type of approximation, it may in practice be of very good quality. This algorithm is summarized below.
Algorithm 2: A field of values approximation for the leftmost eigenvalue
Input: Minimal and maximal subspace dimensions , starting vector ,
maximum iterations maxit, functions to perform actions with and .
Output: , the leftmost point of a projected field of values, which approximates the
leftmost point of .
1: Form the Krylov decomposition
2: for
3: M Extract Schur pairs from with ,
MM sorted on nondecreasing real part
4: M Truncate decomposition to dimension by selecting leftmost Schur vectors
5: M Expand the Krylov decomposition to dimension
6: end
7: Accept = real part of leftmost eigenvalue of
Note that in a type of method as in Algorithm 2, there is typically no error estimate available; there only is a user-chosen parameter maxit, which often can be modest (see Section 5). A main advantage of Algorithm 2 over Algorithm 1 is that it may be possible to stop the iterations improving the Krylov decomposition earlier, before the eigenpair of Algorithm 1 has converged to the desired tolerance. We test both approaches in the next section.
5 Numerical Examples
We present numerical examples with two different test problems, in order to demonstrate the performance of our new algorithm. All computations are carried out in MATLAB.
5.1 Small Illustrative Test Problem
The first test problem is quite small, with , such that we can explicitly compute the desired eigenvalues and other quantities that allow us to analyze the algorithms’ performance related to the above theory. The matrix is full, and it is generated by means of the function regutm from Regularization Tools [8] by which we can generate random test matrices with specified singular values, while the singular vectors have the characteristic spectral behavior of inverse problems [6]. We generate a well-conditioned matrix and a more ill-conditioned matrix with the following distribution of singular values:
[TABLE]
We then generate a corresponding unmatched transpose by adding random Gaussian elements to ; all elements are from where the variance is chosen such that . Both and have full rank.
The exact solution is the one from the shaw test problem [8]; it is smooth with two humps. Then we generate the exact and noisy right-hand sides by
[TABLE]
where the random elements of are Gaussian; all elements are from where the variance scales the noise as desired.
We applied both the BA Iteration and the Shifted BA Iteration with to these problems. For the BA Iteration we use the relaxation parameter and for the Shifted BA Iteration we use equal to the upper bound in (13) with the factor 2 replaced by 1.9. For both systems we used eig to compute the eigenvalues; the spectral radius is by construction, and the leftmost eigenvalues are
[TABLE]
Note that for the leftmost eigenvalue is a complex conjugate pair with a negative real part. Hence we expect the BA Iteration to exhibit non-convergence for the problems with . To ensure convergence of the Shifted BA Iteration we choose
[TABLE]
The convergence histories for the noisy data () are shown in Figure 1; the similar plots for the noise-free data () are almost identical. We make the following observations:
- •
The left plot confirms that the BA Iteration converges only for the well-conditioned matrix for which all eigenvalues have a positive real part, and that it converges to .
- •
The right plot confirms that the Shifted BA Iteration converges for both matrices, and that it converges to . For the well-conditioned system the convergence is much faster compared to the ill-conditioned system. Also, the Shifted BA Iteration converges faster than the BA Iteration for the well-conditioned system.
Having confirmed the convergence of the methods, it is also relevant to study how well the methods are able to approximate the exact solution . To illustrate this, Figure 2 shows plots of the reconstruction errors versus iteration number . We make several observations:
- •
For the well-conditioned matrix and noise-free data () the BA Iteration converges to the exact solution as predicted by Theorem 2.3 with square and full-rank and .
- •
For the ill-conditioned matrix the BA Iteration always diverges.
- •
For noise-free data () the Shifted BA Iteration converges to a slightly perturbed solution with
[TABLE]
- •
For noisy data () the BA Iteration for , as well as the Shifted BA Iteration for both and , converge to a fixed point that is quite far from the exact solution. Exactly the same behavior occurs for iterations that use a matching transpose. The main point is that for noisy data the iterations exhibit semi-convergence, where the iterates produce a good approximation to during the initial iterations.
In conclusion, these experiments verify the benefit of using the Shifted BA Iteration, namely, guaranteed convergence while retaining the semi-convergence that all algebraic iterative methods rely on for noisy data.
5.2 Test Problem From the ASTRA Toolbox
The second test problem comes from X-ray computed tomography (CT) using a parallel-beam geometry with 90 projections in the angular range –, and a detector with 80 pixels and of length is equal to the image size. The exact solution represents a discretization of the Shepp–Logan phantom. Hence the problem size is and , and the problem is underdetermined. For such underdetermined CT problems the algebraic iterative methods can give much better results than the “standard” methods based on filtered backprojection [17]. As before, the exact and noisy data are generated according to (21).
The matrices and that represent the forward and backprojections come from the ASTRA toolbox [22] used in conjunction with “spot operators” [23]. Specifically, we use the ASTRA function opTomo to compute the matrix-vector multiplications with these matrices. For the forward projection the GPU-version of ASTRA uses the interpolation model, also known as Joseph’s method [12], while the backprojection uses the line model with linear interpolation between detector pixels (as done, e.g., in MATLAB’s iradon). The matrices are not stored; if we store them then the sparsity of and is 1.32% and 2.35%, respectively. Measures of nonsymmetry and nonnormality of are
[TABLE]
The spectral radius is .
For this test problem we use the algorithms from Section 4 to estimate the leftmost eigenvalue of , and we compare the performance of the following strategies:
- •
eigs from MATLAB with options maxit = 1500, tol = , SIGMA = ’sr’;
- •
ks is the Krylov–Schur method (Algorithm 1) with options maxit = 1500, absolute tolerance tol = , mindim = 30, maxdim = 60, target = ’-inf’ (for the leftmost eigenvalue);
- •
jd is the Jacobi–Davidson method with options maxit = 1500, tol = , mindim = 30, maxdim = 60, target = ’-inf’;
- •
fovN is the field of values based method (Algorithm 2) with options maxit = N, mindim = 30, maxdim = 60.
Table 1 shows results for two cases:
All matrix-vector multiplications are performed on the GPU, using the ASTRA function opTomo. These multiplications are performed in single precision. The Jacobi–Davidson method jd did not converge, and this may be due to the single-precision computations on the GPU. 2. 2.
The two matrices and are explicitly computed and stored as sparse matrices. Like all the other computations, these computations are performed on the CPU in double precision.
We see that for these CT problems the Krylov–Schur method uses the least amount of computations, corresponding to the work in performing about 500 iterations of the Shifted BA Iteration method. When solving several CT problems with the same geometry, and hence the same matrices, this is acceptable. Even when an unmatched pair is used only once, this may be an acceptable overhead to ensure convergence and trust in the computed solution.
Figure 3 reports the work involved in the eigenvalue estimation with the Krylov-Schur method, as measured by matrix-vector multiplications (MVMs), for different numbers of image pixels . The number of rows is . We see that for larger problems the number of MVMs seems to be proportional to .
Figure 4 shows the convergence histories when applying the two iterative algorithms to this problem, with and chosen as before. We observe the same behavior as before: The BA Iteration diverges, while the Shifted BA Iteration converges to a fixed point. Moreover, the Shifted BA Iteration exhibits semi-convergence as expected, and the minimum reconstruction error — at the point of semi-convergence — does not deteriorate when using the shifted method.
6 Conclusions
We have considered algebraic iterative reconstruction methods with an unmatched backprojector, i.e., the backprojector is not the exact adjoint or transpose of the forward projector. In particular we are concerned with the common situation where the iterative method does not converge, due to the nonsymmetry of the iteration matrix. We propose a modified algorithm that uses a small shift parameter, we define the conditions that guarantee convergence to a fixed point of a slightly perturbed problem, and we give perturbation bounds for this fixed point. We also discuss how to efficiently estimate the leftmost eigenvalue of a certain matrix, which is needed to computed the shift parameter in the modified algorithm. Numerical experiments with artificial test problems as well as a test problem from computed tomography illustrate the use of the new algorithm. Our MATLAB code is available on request.
Acknowledgements
We thank Willem Jan Palenstijn (CWI) for sharing his insight into the ASTRA package. We acknowledge the inspiration from Tommy Elfving and Martin S. Andersen who, independently, suggested the shift as a remedy for the nonconvergence. Finally, we thank two anonymous referees for many valuable comments that helped to improve the paper.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] A. Biguri, M. Dosanjh, S. Hancock, and M. Soleimani, TIGRE: a MATLAB-GPU toolbox for CBCT image reconstruction , Biomed. Phys. Eng. Express, 2 (2016), 055010; the software available from https://github.com/CERN/TIGRE .
- 2[2] T. Elfving and P. C. Hansen Unmatched projector/backprojector pairs: perturbation and convergence analysis , SIAM J. Sci. Comput., 40 (2018), pp. A 573–A 591.
- 3[3] T. Elfving, P. C. Hansen, and T. Nikazad, Semiconvergence and relaxation parameters for projected SIRT algorithms , SIAM J. Sci. Comput., 34 (2012), pp. A 2000–A 2017.
- 4[4] W. Hackbusch, Iterative Solution of Large Sparse Systems of Equations , Springer, New York, 2016.
- 5[5] P. C. Hansen, The discrete Picard condition for discrete ill-posed problems , BIT, 5 (1990), pp. 658–672.
- 6[6] P. C. Hansen, Test matrices for regularization methods , SIAM J. Sci. Comput., 16 (1995), pp. 506–512.
- 7[7] P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems , SIAM, Philadelphia (1998).
- 8[8] P. C. Hansen, Regularization Tools Version 4.0 for Matlab 7.3 , Numer. Algorithms, 46 (2007), pp. 189–194.
