On a fast Arnoldi method for BML matrices
Bernhard Beckermann (LPP), Clara Mertens, Raf Vandebril

TL;DR
This paper introduces a fast Arnoldi method tailored for BML matrices, leveraging GMRES residuals and low-rank structures to efficiently generate Krylov bases and analyze Hessenberg matrices.
Contribution
It presents a novel short recurrence Arnoldi algorithm for BML matrices using GMRES residuals and explores the low-rank structure of the Hessenberg matrix.
Findings
Efficient Krylov basis generation for BML matrices
Short recurrence relation based on GMRES residuals
Low-rank structure of the Hessenberg matrix
Abstract
Matrices whose adjoint is a low rank perturbation of a rational function of the matrix naturally arise when trying to extend the well known Faber-Manteuffel theorem, which provides necessary and sufficient conditions for the existence of a short Arnoldi recurrence. We show that an orthonormal Krylov basis for this class of matrices can be generated by a short recurrence relation based on GMRES residual vectors. These residual vectors are computed by means of an updating formula. Furthermore, the underlying Hessenberg matrix has an accompanying low rank structure, which we will investigate closely.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMatrix Theory and Algorithms · Electromagnetic Scattering and Analysis · Theoretical and Computational Physics
On a fast Arnoldi method for -matrices
Bernhard Beckermann
Clara Mertens
and Raf Vandebril
Abstract
Matrices whose adjoint is a low rank perturbation of a rational function of the matrix naturally arise when trying to extend the well known Faber-Manteuffel theorem [8, 7], which provides necessary and sufficient conditions for the existence of a short Arnoldi recurrence. We show that an orthonormal Krylov basis for this class of matrices can be generated by a short recurrence relation based on GMRES residual vectors. These residual vectors are computed by means of an updating formula. Furthermore, the underlying Hessenberg matrix has an accompanying low rank structure, which we will investigate closely.
1 Introduction
In this article we will discuss a new variant of the Arnoldi method applied to a class of sparse matrices which allows to compute the first Arnoldi vectors in complexity . We will refer to this class of matrices as BML-matrices, following the fundamental work of Barth & Manteuffel [2] and Liesen [13] on matrices whose adjoint is a low rank perturbation of a rational function of . More specifically, we assume that
[TABLE]
with polynomials of degree and , respectively, and
[TABLE]
matrices of full column rank. Moreover, it is assumed that the roots of are simple. By taking , we see that Hermitian matrices and unitary matrices are -matrices, and the same is true for low rank perturbations of such matrices. Furthermore, if in (1.1), the matrix is normal [13]. In what follows we suppose that , since these quantities (as well as the sparsity pattern of ) are hidden in the constant of the above-claimed complexity result.
After steps of the Arnoldi process with initial vector one obtains the expression
[TABLE]
with , , and satisfying and . The matrix is upper Hessenberg.
A fast variant of the Arnoldi process will exploit additional structure of the upper Hessenberg matrix . For example, to compute the successive vectors for Hermitian we get a tridiagonal and the Arnoldi process reduces to the Lanczos method. For matrices satisfying (1.1), turns out to be a rank structured matrix. In order to specify this statement, the following definition is introduced.
Definition 1.1**.**
We say that a matrix is upper-separable with , if for all it holds that (in Matlab notation)
[TABLE]
In other words, any submatrix of including elements on and above the th diagonal of is of rank at most . The [math]th diagonal corresponds to the main diagonal, while the th diagonal refers to the th superdiagonal if and to the th subdiagonal if .
Before proceeding, we give some comments on Definition 1.1 whose formulation is inspired by some related well-established definitions. For example, matrices with both and being –upper-separable (and –upper-separable, respectively) are referred to as semiseparable matrices (and quasi separable, respectively) [18, §1.1, §9.3.1]. As a simple example, a tridiagonal matrix is quasi separable, and its inverse is known to be semiseparable. Matrices being –upper-separable with their adjoint being –upper-separable are usually called semiseparable, while matrices being –upper-separable with their adjoint being –upper-separable are called quasi separable [18, §8.2.2 and §8.2.3].
The above statement on the rank structure of can now be made exact. Assume the Arnoldi process breaks down after iterations, i.e., in (1.2). We will refer to as the ‘Arnoldi termination index’ in the rest of the article. In Corollary 3.2 it is shown that for a -matrix , the underlying Hessenberg matrix is upper-separable, where and are functions of . However, for a general matrix , there is no reason for the underlying Hessenberg matrix to be upper-separable. The upper-separable structure of the underlying Hessenberg matrix gives rise to the design of a multiple recurrence relation [2, 14], signifying that each new Arnoldi vector can be written as
[TABLE]
The smaller the quantities in (1.1), the shorter the recurrence relation becomes. In [5] the same recurrence relations are derived for the class of so called -well-free matrices. We refer to Remark 3.7 for some more details.
In this article we investigate a different version of the recurrence relation (1.3) by rewriting it in terms of GMRES residual vectors, aiming to overcome some of the numerical problems which relation (1.3) entails, such as the possibility of a breakdown [14]. It will be shown how these GMRES residual vectors can be computed progressively by means of an updating formula. This partially extends the discussion on a progressive GMRES method for nearly Hermitian matrices as presented by Beckermann & Reichel [4].
The article is organized as follows. Section 2 describes the structure of the unitary factor in the -decomposition of a Hessenberg matrix in terms of orthogonal polynomials (the results in this section are valid for a general matrix ). It is well known that this unitary factor can be represented as a product of Givens rotations, see e.g., the isometric Arnoldi process introduced by Gragg [10] and its extension to the class of shifted unitary matrices by Jagels & Reichel [12]. We will describe these Givens rotations by means of orthogonal polynomials. Furthermore, links between orthogonal polynomials and the GMRES algorithm are discussed, leading to an updating formula to compute the GMRES residuals progressively. In section 3 the upper-separable structure of the Hessenberg matrix related to the Arnoldi process applied to a -matrix is investigated. It is shown how this upper-separable structure can be generated by the GMRES residual vectors, allowing to construct a short recurrence relation and an accompanying algorithm. In section 4 we compare our findings with those presented in [2]. Section 5 discusses some computational reductions that can be made in case the matrix is nearly unitary or nearly shifted unitary. Finally, section 6 discusses the numerical performance and stability of the algorithm.
Throughout this article we will make use of the following notation. Vectors are written in bold face lower case letters, e.g., , and . The vector denotes the th column of an identity matrix of applicable order. The standard inner product is denoted as , with the Hermitian conjugate. We write for the induced Euclidean norm as well as the subordinate spectral matrix norm. Matrices are denoted by upper case letters , and denotes the identity matrix of order . We will frequently express formula (1.2) in the form
[TABLE]
where
[TABLE]
and , revealing the nested structure of the Arnoldi matrices and .
2 Towards a progressive GMRES method
Many Krylov space methods and in particular the Arnoldi method can be described in polynomial language which reveals some particular properties, and makes the link to the rich theory of orthogonal polynomials. For example, from (1.2) one deduces by recurrence on the degree that, for any polynomial
[TABLE]
illustrating that Krylov spaces are intimately related to polynomials. See e.g., [16, §6.6.2] for the case of Hermitian . However, polynomial language can be used also for general matrices [3, §1.3]. In §2.1 we will see that the Arnoldi vectors correspond to a (finite) family of polynomials which are orthogonal with respect to
[TABLE]
a scalar product on the set of polynomials of degree , with the termination index of the Arnoldi method. The scalar product (2.2) induces a norm . This implies in particular the well known fact [16, Proposition 6.7] that Arnoldi vectors are normalized FOM residuals. In §2.2 we will use polynomial language to give an explicit expression for the -factor in a -decomposition of an upper Hessenberg matrix. Such a formula for a unitary upper Hessenberg matrix is not known in literature, though of course there is a close link with Gragg’s explicit formula in terms of Givens rotations [16, 10, §6.5.3]. This formula will enable us to deduce in Corollary 2.4 a decay property of the entries of far from the main diagonal, and reveals immediately a rank structure for . In §2.3 we recall that GMRES residuals can be expressed in terms of orthogonal polynomials: we are faced with a well-studied extremal problem for general orthogonal polynomials. This allows us in §2.4 to establish a well known link between (normalized) FOM and GMRES residuals [16, §6.5.5], allowing for a recursive computation of (normalized) GMRES residuals. Such a progressive GMRES implementation has been discussed before [16, §6.5.3] and [4]. The implementation presented in this article is inspired by the work of Beckermann & Reichel [4]. Both implementations make use of the decomposition of into a product of Givens rotations, but differ in how to find the angles of these rotations. We will consider in §2.4 not only FOM and GMRES for systems of linear equations but more generally for shifted systems for some parameters . All findings of this section hold for general matrices .
2.1 Orthogonal polynomials linked to the Arnoldi process
Given an upper Hessenberg matrix with positive real entries on the subdiagonal. We define polynomials recursively through the formula
[TABLE]
Direct computation yields the following well known link with the characteristic polynomials of the principal submatrices:
[TABLE]
showing that is of degree , with positive leading coefficient. In what follows we will write (2.3) in the form
[TABLE]
where is the principal minor of and . One deduces from (2.3) by recurrence on that . Assume we apply the Arnoldi process to a matrix , that is the Arnoldi termination index and the underlying Hessenberg matrix. Using (2.1), this implies
[TABLE]
and thus indeed the is the th orthonormal polynomial with respect to the scalar product (2.2). Also, from (2.4) we see that the th FOM iterate for the shifted system exists if and only if , and that in this case the th FOM residual is given by , compare with [16, Proposition 6.7].
2.2 The -factorization of a Hessenberg matrix
We will derive an explicit formula for the unitary factor in the -decomposition of the upper Hessenberg matrix in terms of the orthonormal polynomials . To our knowledge, such a result is new. It could also be potentially useful for studying the convergence of the -method with shifts.
Let be the unitary factor in the -decomposition of , i.e.,
[TABLE]
an upper triangular matrix with positive real entries on its main diagonal. It is well known (see, e.g., [16, Subsection 6.5.3]) that can be obtained as a product of Givens rotations, which are applied to to annihilate the first subdiagonal. We follow [4], imposing the matrices to have determinant 1.
Definition 2.1**.**
Let , and define for ,
[TABLE]
with and , such that (2.7) holds.
Proposition 2.2**.**
Let , and
[TABLE]
Then the unitary factor is given by
[TABLE]
Moreover,
[TABLE]
Proof.
We leave it to the reader to check that the candidate for indeed has orthonormal rows. It remains to check the subdiagonal and diagonal entries of . For we find that
[TABLE]
where the second equality is because of the fact that only the first entries of are nonzero and the third equality because of (2.5). Similarly, for ,
[TABLE]
Finally, for ,
[TABLE]
according to (2.5). To prove (2.8), observe that
[TABLE]
if and only if by multiplying on the left with \left[\begin{array}[]{rr}\overline{c_{k}(\delta)}&s_{k}(\delta)\\ -s_{k}(\delta)&c_{k}(\delta)\end{array}\right] we transform
[TABLE]
into
[TABLE]
the latter being true for and as in (2.8). ∎
Notice that, according to the nested structure of the Hessenberg matrices, is also upper triangular, but its last diagonal entry given by
[TABLE]
is not necessarily a positive real number. Hence, for obtaining the unitary factor in the unique -decomposition of we should rescale the last row of as given in Proposition 2.2 by a phase of modulus .
Let us consider the special case of and unitary as a running example.
Example 2.3**.**
Suppose that and is unitary. Then and thus has orthonormal columns, showing that and . From Proposition 2.2 we get explicit formulas for the entries of , in particular, for unitary ,
[TABLE]
Furthermore, . Therefore,
[TABLE]
with . Also, . Hence, the orthonormal vectors can be constructed using two short recurrence relations:
Note that and . The above double recurrence relation is known as the ‘Isometric Arnoldi algorithm’ designed by Gragg [10].
As a consequence of Proposition 2.2, according to Definition 1.1, we can derive some statements on the rank structure of the unitary Hessenberg matrix , and on a decay property of its entries.
Corollary 2.4**.**
* is -upper-separable111This implies that is quasi separable, but in general not semiseparable.. Moreover, for the submatrix of formed with the first rows and the last columns we have that*
[TABLE]
Proof.
The first statement follows by observing that
[TABLE]
with strictly lower triangular, i.e., with zero entries on the main diagonal. This is a direct consequence of Proposition 2.2. In particular, we deduce that is of rank . From Proposition 2.2 it follows that
[TABLE]
giving the claimed result. ∎
We end this subsection by observing that Corollary 2.4 immediately implies a rank property as well as a decay of entries for resolvents of .
Corollary 2.5**.**
Suppose that is invertible. Then is -upper-separable.
Moreover, for the submatrix of formed with the first rows and the last columns we have that
[TABLE]
Proof.
Let us write with upper triangular and invertible . Then
[TABLE]
with lower triangular of norm . Replacing by (2.12) yields
[TABLE]
with strictly lower triangular; proving the first statement. This implies that
[TABLE]
and thus . Notice that is obtained by multiplying the first row of with the last columns of . As is lower triangular, , a row vector formed with the last entries of the first row of and the lower-right minor of . Therefore, applying Corollary 2.4 to yields
[TABLE]
which together with and proves the second statement. ∎
2.3 The GMRES residual and orthogonal polynomials
We will give some more details on the link between the GMRES residual vectors and orthogonal polynomials. More specifically, we will write the GMRES residual vector as a linear combination of Arnoldi vectors. In the particular case of unitary , an even nicer relation arises for the GMRES residual vectors.
In what follows we denote by the th GMRES residual for the shifted system , with starting vector , and denote by its normalized version.
Proposition 2.6**.**
The th GMRES residual for the shifted system with starting vector , can be written as
[TABLE]
For its normalized version we have
[TABLE]
Proof.
Since we choose as starting vector , we find the initial GMRES residual . Then we have
[TABLE]
Note that can be written as with a polynomial of degree at most and . Then , , with
[TABLE]
where denotes the set of polynomials of degree at most , and we use the norm induced by (2.2). Note that if , then by orthonormality. Therefore, the Cauchy-Schwarz inequality yields
[TABLE]
where the minimum is attained for . Combining (2.18) and (2.17) we conclude that (2.14) holds. In particular, , which together with (2.6) implies (2.15). ∎
Remark 2.7**.**
As , we see that the decay rates in Corollary 2.4 and Corollary 2.5 are linked to the convergence of the GMRES algorithm. If the GMRES algorithm converges faster, the decay pattern becomes more pronounced.
Remark 2.8**.**
By taking norms in (2.14), we see that for the relative GMRES residual
[TABLE]
More generally, for the decay rates in Corollary 2.4 and Corollary 2.5 for , we have
[TABLE]
Let be a simply connected and compact -spectral set for ; that is, for all polynomials (and hence ) with . Also, let be a map, mapping conformally onto the exterior of the closed unit disk. Then it can be shown that the right-hand side of (2.19) can be bounded above by times a modest constant, see, e.g., [16, Chapter 6.11.2] for the case where is an ellipse.
Example 2.9**.**
As in Example 2.3, assume the matrix is unitary. Given a polynomial of degree , its reversed polynomial is defined as
[TABLE]
The orthogonal polynomials can be expressed as
[TABLE]
Note that is the leading coefficient of . As is unitary, . Hence, (2.20) can be rewritten as
[TABLE]
Therefore,
[TABLE]
Because of (2.4), . Also, . Combining this with (2.17) and (2.14), (2.21) yields
[TABLE]
Therefore, the th normalized GMRES residual of a unitary matrix can be expressed as
[TABLE]
2.4 A progressive GMRES residual formula
By (2.15) we have that the th normalized GMRES residual satisfies
[TABLE]
the last equality following from (2.8). This demonstrates the existence of an updating formula to compute the residual vectors progressively. This formula can also be derived by means of the -factorization of , see Proposition 2.2 and [16, Subsection 6.5.3]. The next result shows that it is not necessary to compute such a factorization for obtaining and , if one is willing to compute the additional scalar product (2.25).
Proposition 2.10**.**
Define . Then
[TABLE]
and
[TABLE]
Proof.
From Proposition 2.2 and (2.15) it follows that
[TABLE]
Therefore,
[TABLE]
establishing (2.25). We now claim that
[TABLE]
which is a direct consequence of (2.16):
[TABLE]
where .
It remains to prove (2.26). Taking inner products with in all terms of (2.24) and making use of (2.25) and (2.32), results in
[TABLE]
Together with , this yields (2.26). ∎
Example 2.11**.**
Let us return to the particular case of and unitary as discussed in Examples 2.3 and 2.9. Inserting (2.6) and (2.23) in (2.24) and identifying the underlying polynomials gives
[TABLE]
Since is times a polynomial of degree and , we get from (2.25) that
[TABLE]
and , where we applied (2.22) and (2.9). Notice that this simplification for unitary is in accordance with (2.26). Taking the star operation in (2.34) gives the second relation
[TABLE]
We should mention that, in case of unitary , the scalar product (2.2) can be written as a scalar product in terms of a (discrete) measure supported on the unit circle. Thus we have the whole theory of orthogonal polynomials on the unit circle, and in particular relations (2.34) and (2.35) are known as the Szegő recurrence relations, a coupled two-term recurrence for orthonormal polynomials on the unit circle [11, Formula 1.2-1.7].
3 The Arnoldi process for -matrices
We will establish a fast variant of the Arnoldi process which is applicable to the class of -matrices as described by formula (1.1). In §3.1 we will use the explicit representation of in terms of orthogonal polynomials as stated in (2.13) to demonstrate that the underlying Hessenberg matrix has an upper-separable structure. The assumption on simple poles makes it possible to easily express generators for the upper-separable structure in polynomial language. In particular, the GMRES residual vectors, which can be expressed in terms of orthogonal polynomials by (2.15), are showing up. In §3.2 we will see that the orthonormal basis vectors (up to a correction term incorporating the low rank perturbation in (1.1)) can be written as a linear combination of previously computed basis vectors and GMRES residual vectors. In §3.3 the results from §3.2 and §2.4, enabling to compute the necessary GMRES residual vectors progressively, will be combined into a new Arnoldi iteration for -matrices.
3.1 Structure formula for -matrices
A rank structure revealing formula for -matrices will be derived in terms of the orthogonal polynomials defined in §2.1. This formula (3.3) will be the key for the design of short recurrence relations in the following subsection. We first show that the -structure (1.1) is inherited by the underlying Hessenberg matrix.
Proposition 3.1**.**
Let be an invertible matrix satisfying relation (1.1), and denote by the Arnoldi termination index. Then is also a -matrix with the same polynomials and indices , and . More precisely,
[TABLE]
where and .
Proof.
By definition of we have , i.e., the columns of span an invariant subspace of . By recurrence on the degree one shows that , or , for any polynomial . Moreover, if is of full rank, then so is . In particular, since we assume to be invertible, so is the matrix . Hence, . This implies that
[TABLE]
as claimed in (3.1). ∎
A combination with Corollary 2.5 gives us the following result.
Corollary 3.2**.**
Let be a matrix satisfying relation (1.1). Denote by the Arnoldi termination index, and . Then is upper-separable. More precisely,
[TABLE]
where denote the poles of the rational function .
Proof.
The fact that is upper-separable is already known from [14]. Below we give an alternative, more constructive, proof leading to explicit generators for the low rank part of . By assumption, the rational function in (1.1) has simple poles and thus has the partial fraction decomposition
[TABLE]
for some constants , and a polynomial of degree ( if ). Replacing by , taking adjoints and using (3.1) and (2.13) leads to
[TABLE]
with a strictly lower triangular matrix . Finally, according to the upper Hessenberg structure of , the matrix has zero entries on and above the th diagonal, establishing the upper-separable structure and formula (3.2). ∎
Remark 3.3**.**
The assumption on simple poles is not necessary for to have an upper-separable structure. However, once we drop this constraint, it is not clear whether or not there exists a link between the generators of the low-rank structure and the GMRES residual vectors. We refer to [14] for more information on this topic.
Note that for also has an upper-separable structure as it is a leading principal minor (submatrix) of . However, does not satisfy the same matrix equation as .
3.2 Short recurrence relations for -matrices
We will now derive a short recurrence relation for -matrices. To do so, the vector
[TABLE]
is introduced, which is equal to up to a correction term induced by the low rank perturbation in (1.1). In [4] it is shown that for nearly Hermitian , i.e., and in (1.1), the Arnoldi vectors satisfy a three term recurrence relation. More specifically,
[TABLE]
where , and are entries of the underlying Hessenberg matrix. We refer to [4] for a detailed discussion. For a general -matrix, Proposition 3.4 states that is a linear combination of GMRES residual vectors and Arnoldi vectors, including . As we will discuss below, this results in a short recurrence relation which reduces to (3.5) in the specific case of nearly Hermitian matrices and which can be used to compute the Arnoldi vectors in an efficient way.
Proposition 3.4**.**
Let be a matrix satisfying relation (1.1), and denote by the Arnoldi termination index, and . Then for , the vector as defined by (3.4) can be written as a linear combination of for and of for . More precisely,
[TABLE]
for some constants , entries of the underlying Hessenberg matrix , and .
Proof.
Notice that by construction,
[TABLE]
lies in the Krylov space spanned by the columns of . As a result we have
[TABLE]
Define
[TABLE]
Then . Combining the above yields
[TABLE]
The vector consists of the first components of the th column of . As the entries of and in (3.3) are zero on and above the th diagonal, it follows that
[TABLE]
which by (2.15) can be rewritten as
[TABLE]
with . As a result, can be written as , a linear combination of (with coefficients being entries of the underlying Hessenberg matrix), plus a linear combination of . ∎
Remark 3.5**.**
Proposition 3.4 remains true if we replace by or/and by for . This is the direct result of the observation that both and are elements of if and equal to zero if . However, choosing causes to lose orthogonality between and and to loose the link with the entries of the underlying Hessenberg matrix in (3.6).
Next, let us show how the recurrence relation of Proposition 3.4 reduces to the well-known Szegő recurrence if the matrix under consideration is unitary.
Example 3.6**.**
Let us return to the particular case of and unitary as discussed in Examples 2.3, 2.9, and 2.11. Inserting (2.6) and (2.23) in the second Szegő relation (2.35) leads to
[TABLE]
which is exactly the variant with of Proposition 3.4 discussed in Remark 3.5.
Remark 3.7**.**
In [5] a class of matrices, called -well-free matrices is investigated, and it is established that these matrices satisfy the recurrence relation (1.3) with . Briefly described, these matrices form a subset of upper-separable Hessenberg matrices which satisfy an additional constraint preventing a breakdown in the recurrence relation (1.3). Intuitively, this additional “well-free” constraint signifies that there are no rank deficiencies encountered in the low rank part of the upper-separable Hessenberg matrix. The problem of breakdown is discussed in [2] and [14], where it is overcome by making use of a set of several multiple recurrence relations instead of the single recurrence relation (1.3), and the use of an algorithm based on (1.3) which provides a set of column vectors to generate the low rank structure of the underlying Hessenberg matrix, respectively. We will, however, not need any well-free constraint to prevent a breakdown in the recurrence relation stated in Proposition 3.4, as our approach does not impose any limitations on the matrix structure beyond (1.1).
3.3 The algorithm
Algorithm 1, which we will name Fast Arnoldi throughout the text, describes a fast variant of the Arnoldi algorithm for -matrices based on Proposition 3.4. We will give a short description of each of the components of the algorithm and print the corresponding piece of pseudocode.
The idea is to make alternate use of the recurrence relations
[TABLE]
The recurrence relation (3.7) is used to compute the next orthonormal basis vector of the Krylov subspace as a linear combination of previously computed orthonormal vectors as well as GMRES residual vectors, while the recurrence relation (3.8) is used to update the GMRES residual vectors once a new orthonormal vector is retrieved. As (3.7) is only valid for , the first orthonormal vectors are computed by means of the classical Arnoldi iteration.
Each time the relation (3.7) is employed, the vector is formed, causing products between vectors and matrices to be computed. The total complexity to compute is .
; ;
;
;
The coefficients and in (3.7) are the solution of the least squares problem
[TABLE]
Note that for all , allowing to solve the above least squares problem without knowing in advance. To shorten notation in subsequent discussions, we define .
Each of the coefficients are entries of the th column of the corresponding Hessenberg matrix and are computed as , which has a computational complexity of .
for do
; ;
end for
Next, a -decomposition of the matrix is computed, after which the coefficients are retrieved by back-substitution. The complexity of this operation is .
,
such that ;
for do
;
;
end for
; ;
Then for each , , recurrence relation (3.8) is used to update the GMRES residual vectors, which are all equal to the starting vector at the beginning of the iteration (). Each time the relation (3.8) is employed, a matrix vector product needs to be computed. This leads to a total complexity of .
for do
;
;
;
;
end for
Note that it is numerically more stable if we do not divide by the square root
in the computation of and , but instead normalize after each iteration (this however, leads to additional scalar products). If we assume the matrix under consideration is sparse; allowing a computational complexity of to compute a matrix vector product; the total complexity to compute the first orthonormal Arnoldi vectors can be estimated as .
Remark 3.8**.**
If the rational function in (1.1) has only one pole, i.e., in (3.6) then the order in which the coefficients are determined can be reversed. More precisely, we can first compute as and then orthonormalize the resulting difference against to obtain . This might be of influence on the numerical performance.
4 Connection with the Barth-Manteuffel multiple recurrence relation
The aim of this section is to show how our work is related to that of Barth & Manteuffel in their article on ‘Multiple recursion conjugate gradient algorithms’ [2]. They introduce an economical conjugate gradient algorithm for the class of -matrices, by making use of short recurrence relations. We will give a short summary of their findings and discuss both the differences and similarities with our approach.
To prevent the possibility of a breakdown in their so-called ‘single recurrence relation’ Barth & Manteuffel rewrote it as a set of recurrence relations, that are stated in (4.1)-(4.3).
[TABLE]
Unfortunately, they use different letters, shift indices, and construct an orthogonal, but not orthonormal basis of the Krylov space. The new normalization comes from the fact that they consider a Hessenberg matrix which has ones on its first subdiagonal. Their basis vectors satisfy . Moreover, they use the integers instead of . Also, as seen in (4.1)-(4.3), two other families of vectors with double indices are used. For simplicity and consistency we will abbreviate them as
[TABLE]
As the reader will see below, to compare our approach with [2], we will not explicitly make use of (4.1)-(4.3), but instead make use of a mathematical equivalent of (4.1)-(4.3) which is adapted to our notation and scalings. The original pseudocode used by Barth & Manteuffel is stated in Algorithm 2.
In [2, Eqn. (4.16)] the authors provide an explicit formula for the entries of the upper Hessenberg matrix :
[TABLE]
for all , in which the reader recognizes generators and for the low-rank part of . Note that, in contrast to [2], we start numbering with instead of . To be able to use the recurrence relations (4.1)-(4.3) in practice, the generators and need to be known in advance. Therefore, Algorithm 2 is based on a rewritten form of the recurrence relations (4.1)-(4.3) which enables to compute the orthogonal basis vectors and the generators and simultaneously.
We define and [2, Eqn. (4.22) and Eqn. (4.23)] as
[TABLE]
for , which is the equivalent of (4.2)-(4.3). With a similar reasoning as in [2, Eqn. (5.3)] it can be proven that
[TABLE]
which is mathematically equivalent to (4.1).
Suppose now that the generators as defined in (4.4) are known. Then one can use (4.6) to compute out of and , then use (4.5) to compute out of and , and so on. Hence, it remains to find the generators. Two of them can be computed with an explicit formula [2, Eqn. (4.15) and (4.13)], namely
[TABLE]
The vector is obtained [2, Eqn. (4.12)] as the ‘remainder’ in the polynomial division of (2.6) by the denominator (1.1):
[TABLE]
where we observe that form the canonical basis of , and thus . We may rewrite the remainder in terms of the Lagrange polynomials of the roots of , leading to
[TABLE]
Substituting the expression for into (4.5) allows to conclude that
[TABLE]
Recalling that gives
[TABLE]
which makes a partial link between (4.6) and Proposition 3.4. In particular, if the matrix is unitary and no low-rank perturbation is involved, is a multiple of and the Barth-Manteuffel multiple recurrence relation turns out to be equivalent to the Szegő recurrence relations. The quantities are computed recursively [2, Eqn. (5.11)] by computing all entries of , and by taking remainders after division by in the relation , the polynomial translation of the Arnoldi relation . We refer to lines and of Algorithm 2.
It remains to show how to compute (after having computed ) and relate the term in (4.6) to the term of our short recurrence of Proposition 3.4. In fact, at this place the authors of [2] require an additional delay in the recurrence (4.6) by replacing by , which is possible according to (4.4). According to [2, Eqn. (5.2)] and (4.4), the computation of is done by solving the system
[TABLE]
We refer to line of Algorithm 2. However, there is a possible problem with this system which is not discussed in [2]. As noticed after [2, Eqn. (5.2)], it is consistent, but one may not insure that the matrix is invertible, i.e., we might have several solutions, each of them being a generator suitable for , but not necessarily for . This is a general problem with computing generators for in a recursive manner: there is no guarantee that is equal to , and thus whether the minimal number of generators for is equal to . In addition, at stage of a recursive computation, it might happen that the minimal number of generators for is strictly lower, i.e., , i.e., we should have a depending on . Finally, the matrix of coefficients is just obtained by picking the last columns of which might also lower the rank. However, going through the proof of (4.4) we can derive an explicit formula for . From [2, Eqn. (4.14)] it can be deduced that
[TABLE]
Combining (4.12) and (4.8), we obtain
[TABLE]
Note that (4.13) could be used to compute without introducing the additional delay in (4.6). Inserting (4.13) into (4.5) gives the explicit formula
[TABLE]
In the following remark we intend to compare our approach with that of Barth & Manteuffel [2].
Remark 4.1**.**
- (a)
Both approaches heavily use the fact that a certain upper right part of the Hessenberg matrix is of rank at most . In other words, one is able to express as a linear combination of correction vectors plus a linear combination of the last or columns of , a kind of corrected “short” recurrence. Notice however that our recurrence is “shorter” if . 2. (b)
In our approach, correction vectors are explicitly given (the term ) and can be updated explicitly. They are not necessarily linearly independent. The other correction vectors are identified as GMRES residuals for shifted systems, allowing for easy updating.
In contrast, Barth & Manteuffel compute explicitly the four sequences of generators of the low-rank structure of , given in (4.4). Notice, however, that at the th step of the algorithm one can only deduce generators for and not for . As mentioned above, in order not to be obliged to correct generators found earlier, there should be an additional assumption on not mentioned by the authors. However, there is a variant of the Barth & Manteuffel approach: instead of using (4.11) requiring a delay in the recurrence relation (4.6), one can use (4.13), which is not mentioned in **[2]**, to compute just after having computed . 3. (c)
In our approach, for finding the coefficients of the GMRES correction vectors, we suggest to solve a least square problem, the matrix of coefficients having as columns these (normalized) GMRES correction vectors. Notice that has full column rank (since has), but might be ill-conditioned. Thus standard techniques (SVD dropping small singular values, or decomposition with column pivoting and threshold) can be applied, where the residual error in solving this least-square problem leads to a loss of orthogonality for of the same order.
*In contrast, Barth & Manteuffel suggest one of the missing generators by solving system (4.11). The computation of the other generator is quite involved and requires the knowledge of the whole *th column of the Hessenberg matrix (which is not necessarily computed using our approach). 4. (d)
If GMRES is converging fast, we believe that the normalization (4.10) is not appropriate since
[TABLE]
Also the are very small due to the above-mentioned decay property of the entries of our Hessenberg matrix.
5 Some special matrices
In this section we give special attention to some classes of matrices where we can slightly reduce the computational complexity of Algorithm 1. The first class consists of matrices which satisfy an equation of the form
[TABLE]
for some . This includes the class of normal matrices of which all but eigenvalues are collinear. We will address this kind of matrices as nearly Hermitian matrices. If and is real, this corresponds to the class of nearly symmetric matrices as discussed in [4]. However, one can easily check that all results derived in [4] are also valid for a matrix of the form (5.1).
The second class consists of matrices which satisfy an equation of the form
[TABLE]
for some . This includes the class of normal matrices of which all but eigenvalues are concyclic. If , we speak of nearly unitary matrices, if we speak of nearly shifted unitary matrices.
The class of matrices satisfying (3.5) or (5.2) include all examples of -matrices known to us which are of practical interest. By this we mean, matrices that are suitably large with respect to the quantities and . More information on matrices satisfying equation (1.1) can be found in [13].
5.1 Nearly Hermitian matrices
Assume the matrix is nearly Hermitian. Then (3.6) reduces to
[TABLE]
Define the vectors recursively as
[TABLE]
for and . By recurrence on it follows that . Therefore, in combination with (2.31),
[TABLE]
Then (2.25) yields
[TABLE]
the latter equality because of (5.3) and (5.5). Expression (5.6) can now be used to compute instead of (2.25), reducing the computational complexity222Expression (5.6) was also proved alternatively By Beckermann and Reichel [4, Proposition 4.2]..
5.2 Nearly unitary matrices
Assume the matrix is nearly unitary. Then (3.6) reduces to
[TABLE]
where and such that is of unit length. Again we make use of the vector as defined in (5.4). Then because of (5.5), (2.25) yields
[TABLE]
Expression (5.8) can now be used to compute instead of (2.25).
5.3 Nearly shifted unitary matrices
Assume the matrix is nearly shifted unitary. Then (3.6) reduces to
[TABLE]
where and , are entries of the corresponding Hessenberg matrix. From (2.24) we deduce that
[TABLE]
Hence, due to (2.24) and (5.9),
[TABLE]
Finally, we know that
[TABLE]
with recursively defined as
[TABLE]
for and . Making use of (5.10), (5.11) and (5.12), (2.25) yields
[TABLE]
As before, the above expression can now be used to compute instead of (2.25).
6 Numerical examples
In this section we will compare our fast Arnoldi algorithm with the one of Barth-Manteuel and classical Arnoldi. We focus especially on the orthogonality of the obtained Arnoldi vectors. The orthogonality in the forthcoming figures is measured by a method described originally by Paige [6, 15]. Given with strictly upper triangular, we define . The norm of is used as an orthogonality measure for the columns of , i.e., where when they are orthonormal and when they are linearly dependent [15]. To make the fairest possible comparison with the Barth-Manteuffel algorithm, we implemented their pseudocode as stated in [2] and recalled in Algorithm 2. However, their pseudocode returns an orthogonal basis, while recurrence relation (3.6) returns an orthonormal basis. Therefore, we have normalized these vectors first.
We will start in §6.1 and §6.2 by discussing the special case of nearly unitary (with ) and nearly shifted unitary (with shift ), where we replaced in our BML-Arnoldi algorithm formula (2.23) for the computation of by the less expensive formulas described in §5.2, and §5.3, respectively. Subsequently we report in §6.3 about an example of a nearly Hermitian matrix discussed already in [6].
Quite often there is some correlation between loss of orthogonality between Arnoldi vectors and convergence of the GMRES residual of the shifted system with starting vector . This phenomenon is probably related to the decay properties mentioned in Remark 2.7. We therefore draw in each of the figures below the relative GMRES residual
[TABLE]
the last identity following from (2.8). Notice that the quantities are already computed in the BML-Arnoldi algorithm in the case of §6.1 and §6.2, whereas in §6.3 we have to add the computation of , here for , following the formulas given in §5.1. One may understand (6.1) as the recursive computation of the GMRES residuals following some progressive residual scheme, where the underlying least squares problem is solved by successive Givens rotations. We will refer to this residual in the forthcoming figures as the progressive residual. However, due to loss of orthogonality, it might be that these progressive residuals are badly computed. This is why each time we display also the ”exact” relative GMRES residual, obtained by computing the th iterate of GMRES for the shifted system with starting vector via the black box routine of Matlab (which does not use our Arnoldi vectors but recomputes them via full Arnoldi, and solves the least squares problem via Householder transforms). It turns out that, in all our numerical experiments, that when both Arnoldi and fast Arnoldi behave well, that the progressive residual and the GMRES residual exhibit the same convergence history.
All computations were carried out in Matlab R2015a. As a starting vector for the Krylov subspace we always consider a vector that has normally distributed random entries with mean zero and variance one.
6.1 Perturbed diagonal and unitary matrices
We consider diagonal matrices for which all but eigenvalues lie on a circle. Clearly, such matrices satisfy equation (1.1) with . We considered various cases; for each case we show the eigenvalues of the matrix and a comparison of the orthogonality of the computed Arnoldi vectors for classical Arnoldi, Barth-Manteuffel, and the fast Arnoldi method. The legend is plotted in Figure 1 and is identical for all similar graphs in this section.
In the first experiment, see Figure 1, we consider eigenvalues on three quarter of the unit circle. Clearly the full Arnoldi and fast Arnoldi perform best and in all the tests we ran the orthogonality of the computed vectors was comparable. Other experiments revealed that Barth-Manteuffel performed just slightly worse when considering eigenvalues distributed over the entire unit circle, the performance of Barth-Manteuffel started to degrade when segments were excluded from the unit circle. The progressive residual seems to align almost perfectly with the GMRES residual. We have also tested various radii and similar conclusions hold when the radius of the circle is changed. 2. 2.
In the second experiment, see Figure 2, we have shifted the unit circle in the complex plane. Barth-Manteuffel seems to have problems with this case, Arnoldi, and the fast Arnoldi method on the other hand exhibit good accuracy. The progressive residual and the GMRES residual align again almost perfectly. 3. 3.
In a third experiment, see Figure 3, all eigenvalues except for two are located on the unit circle. In this case Barth-Manteuffel outperforms our code slightly. We note that the location of the eigenvalues outside the circle does not have a significant impact on the overall picture of the accuracy. Tests revealed also that if one would shift the midpoint or exclude eigenvalues out of parts of the circle the fast Arnoldi method would outperform Barth-Manteuffel.
Overall we can conclude that the Arnoldi method is the most accurate one and the fast Arnoldi method is also typically quite close. The Barth-Manteuffel algorithm, however, exhibits quick loss of orthogonality when the circle is shifted or the eigenvalues do not span the entire circle. Moreover, the progressive residual and the GMRES residual align almost perfectly.
We also considered a real non-normal matrix which is of the form
[TABLE]
for some vectors and and a unitary matrix . Using the Sherman-Morrison formula [9], equation (6.2) yields
[TABLE]
Hence,
[TABLE]
The orthogonality of the computed Arnoldi vectors was examined for a matrix of the form (6.2) where the unitary matrix and the vectors and are randomly generated. In this case the orthogonality behaved similar to Figure 1, implying that the fast Arnoldi behaves similar to Arnoldi and Barth-Manteuffel deteriorates.
6.2 Unitary matrix from Quantum Chromodynamics
We consider a shifted unitary matrix which finds its origin in Quantum Chromodynamics (QCD)[1]. QCD is the theory which describes the fundamental interaction between quarks, which are the building blocks of protons and neutrons. This theory makes use of the Neuberger overlap operator , where and are scalars and is the Hermitian Wilson fermian matrix. As a result the Neuberger overlap operator is a shifted unitary matrix. To construct the matrix a parameter and a hopping matrix are needed. We have selected these parameters equal to the ones from [1], i.e. and as hopping matrix conf5.0-00.14x4-2600.mtx from the Matrix Market333A repository of test data for use in comparative studies of algorithms for numerical linear algebra, featuring nearly 500 sparse matrices from a variety of applications, as well as matrix generation tools and services..
To compute , we invoke the software package designed by Arnold, et al., [1], which makes use of the Zolotarev algorithm [17]. We choose parameters and for the Neuberger overlap operator. On the left of Figure 4 we have drawn the eigenvalues of the Neuberger overlap operator and observe that the density of the eigenvalues is much higher on the right than on the left.
The orthogonality of the computed Arnoldi vectors is depicted in Figure 4. Approximately the first ten iterations of the recurrence relation (3.6) fast Arnoldi and the Barth-Manteuffel algorithm show comparable accuracy. After that, the orthogonality of the vectors computed with the Barth-Manteuffel algorithm deteriorates fast at almost the same rate as classical Arnoldi. Even though the progressive residuals and the GMRES residual align, classical Arnoldi seems to suffer heavily from loss of accuracy. The fast Arnoldi method clearly outperforms the other approaches.
6.3 Departure from orthogonality
Beckermann & Reichel [4] proposed a Krylov subspace method for solving a linear system in which the coefficient matrix is nearly Hermitian. Their method, based on a short recurrence for generating an orthonormal Krylov basis, is better known as Progressive GMRES, shortly named PGMRES. For nearly Hermitian matrices, this short recurrence coincides with the recurrence relation (3.6) described above.
However, Embree, et al., [6] showed how in certain cases the PGMRES method exhibits an instability which finds its origin in the loss of orthogonality of the computed Arnoldi vectors. A specific class of examples is described and the corresponding departure from orthogonality is shown when using the recurrence relation (3.6). In the forthcoming experiments we have used the algorithm for nearly Hermitian matrices as presented in Section 5.1.
This class of matrices is of the form
[TABLE]
where
[TABLE]
with eigenvalues
uniformly distributed in the interval , 2. 2.
uniformly distributed in the interval , 3. 3.
.
We take two examples from this class and compare the loss of orthogonality of the recurrence relation (3.6) as predicted in [6] with the Barth-Manteuffel algorithm. The orthogonality of the Arnoldi vectors stored in the matrix is depicted in Figure 5.
As seen in Figure 5 recurrence relation (3.6) gives rise to significantly less orthogonal vectors than the standard Arnoldi iteration. However, it may also be observed that the Barth-Manteuffel algorithm suffers from the same loss of orthogonality as recurrence relation (3.6). We see that the loss of orthogonality emerges as soon as the progressive residual vectors start to differ significantly from the actual GMRES residual, explaining the inaccuracies in the computed vectors. Figure 6 shows the gradual loss of orthogonality between the vectors. White stands for a perfect orthogonality, black for complete loss, The colors assigned are, for black (no orthogonality) and for white (orthogonal up to machine precision). In Figure 6 we observe that is numerically orthogonal to for , , and also at later stages for , as expected from the local reorthogonalization of our algorithm. However, globally, the orthogonality gets quite quickly lost, as observed already by Embree et al. [6], who suggested Schur complement techniques to tackle this problem.
We can conclude that in this case both Barth-Manteuffel and fast Arnoldi exhibit a fast and almost identical loss of orthogonality.
7 Conclusion
An economic variant for the Arnoldi algorithm has been established for matrices whose adjoint is a low-rank perturbation of a rational function of the matrix. In the process, some aspects of the Arnoldi process are described in terms of orthogonal polynomials. This includes an explicit formula for the unitary factor in the -decomposition of a Hessenberg matrix and a decay property of the entries of this Hessenberg matrix which is related to the convergence of the GMRES algorithm. Also, the existence of a progressive GMRES residual formula has been shown, extending the findings of [4]. Furthermore, comparisons are made with the algorithm described by Barth and Manteuffel [2] for matrices whose adjoint is a low-rank perturbation of a rational function of the matrix, both theoretically and numerically.
8 Acknowledgements
Part of this research has been established during a visit at the university of Science and Technology of Lille in 2014. We thank Bernd Beckermann and Ana Matos for their hospitality.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] G. Arnold, N. Cundy, J. van den Eshof, A. Frommer, S. Krieg, Th. Lippert, and K. Shäfer. Numerical methods for the QCD overlap operator II: Optimal Krylov subspace methods, in QCD and Numerical Analysis III, eds. A. Boricci, A. Frommer, B. Joo, A. Kennedy and B. Pendleton , volume 47 of Lecture Notes in Computational Science and Engineering . Springer, Berlin, 2005.
- 2[2] T. Barth and T. Manteuffel. Multiple recursion Conjugate Gradient algorithms Part I: Sufficient conditions. SIAM Journal on Matrix Analysis and Applications , 21(3):768–796, 2000.
- 3[3] B. Beckermann. Discrete orthogonal polynomials and superlinear convergence of Krylov subspace methods in numerical linear algebra. Orthogonal Polynomials and Special Functions, Lecture Notes in Mathematics , 1883:119–185, 2006.
- 4[4] B. Beckermann and L. Reichel. The Arnoldi process and GMRES for nearly symmetric matrices. SIAM Journal on Matrix Analysis and Applications , 30(1):102–120, 2008.
- 5[5] T. Bella, V. Olshevsky, and P. Zhlobich. Classifications of recurrence relations via subclasses of (H,m)-quasiseparable matrices. Lecture Notes in Electrical Engineering , 80:23–53, 2011.
- 6[6] M. Embree, J.A. Sifuentes, K.M. Soodhalter, D.B Szyld, and F. Xue. Short-term recurrence Krylov subspace methods for nearly Hermitian matrices. SIAM Journal on Matrix Analysis and Applications , 33(2):480–500, 2012.
- 7[7] V. Faber, J. Liesen, and P. Tichý. The Faber-Manteuffel theorem for linear operators. SIAM Journal on Numerical Analysis , 46(3):1323–1337, 2008.
- 8[8] V. Faber and T. Manteuffel. Necessary and sufficient conditions for the existence of a Conjugate Gradient method. SIAM Journal on Numerical Analysis , 21(2):352–362, 1984.
