On a fast Arnoldi method for BML matrices

Bernhard Beckermann (LPP); Clara Mertens; Raf Vandebril

arXiv:1702.00671·math.NA·February 3, 2017

TL;DR

This paper introduces a fast Arnoldi method tailored for BML matrices, leveraging GMRES residuals and low-rank structures to efficiently generate Krylov bases and analyze Hessenberg matrices.

Contribution

It presents a novel short recurrence Arnoldi algorithm for BML matrices using GMRES residuals and explores the low-rank structure of the Hessenberg matrix.

Findings

01 Efficient Krylov basis generation for BML matrices

02 Short recurrence relation based on GMRES residuals

03 Low-rank structure of the Hessenberg matrix

Abstract

Matrices whose adjoint is a low rank perturbation of a rational function of the matrix naturally arise when trying to extend the well known Faber-Manteuffel theorem, which provides necessary and sufficient conditions for the existence of a short Arnoldi recurrence. We show that an orthonormal Krylov basis for this class of matrices can be generated by a short recurrence relation based on GMRES residual vectors. These residual vectors are computed by means of an updating formula. Furthermore, the underlying Hessenberg matrix has an accompanying low rank structure, which we will investigate closely.

Equations

A^{\ast} = p(A)\, q(A)^{-1} + F G^{\ast},

F = [f_{1}, f_{2}, \dots, f_{m_{3}}], \quad G = [g_{1}, g_{2}, \dots, g_{m_{3}}] \in \mathbb{C}^{n \times m_{3}}

A V_{k} = V_{k} H_{k} + h_{k+1,k} v_{k+1} e_{k}^{\ast},

\operatorname{rank} B(1 : j - \min(r, 0),\; j + \max(r, 0) : n) \leq s.

v_{k+1} = \sum_{j=k-m_{3}-m_{2}}^{k} \alpha_{k,j} A v_{j} + \sum_{j=k-m_{3}-m_{1}}^{k} \beta_{k,j} v_{j}.

A V_{k} = V_{k+1} \underline{H}_{k}, \quad \text{with } V_{k+1} := [V_{k}, v_{k+1}],

\underline{I}_{k}:=\left[\begin{array}[]{c}I_{k}\\ \mathbf{0}\end{array}\right]\in\mathbb{C}^{(k+1)\times k},\quad\underline{H}_{k}:=\left[\begin{array}[]{c}H_{k}\\ h_{k+1,k}\mathbf{e}_{k}^{\ast}\end{array}\right]=H_{k+1}\underline{I}_{k}\in\mathbb{C}^{(k+1)\times k},

p(A)\mathbf{v}_{1}=V_{k}p(H_{k})\mathbf{e}_{1},\quad\mbox{provided that $\deg p<k$},

\langle p, q \rangle_{A, v_{1}} = \langle p(A) v_{1}, q(A) v_{1} \rangle = (q(A) v_{1})^{\ast} p(A) v_{1},

q_{0}(z) = 1, \quad q_{k}(z) h_{k+1,k} = z q_{k-1}(z) - \sum_{j=0}^{k-1} q_{j}(z) h_{j+1,k},

q_{k}(z) = \frac{1}{\prod_{i=1}^{k} h_{i+1,i}} \det(z I_{k} - H_{k}),

(q_{0}(z), \dots, q_{k}(z)) \underline{H}_{k} = z (q_{0}(z), \dots, q_{k-1}(z)),

v_{k} = q_{k-1}(A) v_{1},

Q_{k+1}(\delta)^{\ast}(\underline{H}_{k}-\delta\underline{I}_{k})=\underline{R}_{k}(\delta):=\left[\begin{array}[]{c}R_{k}(\delta)\\ 0\end{array}\right]\in\mathbb{C}^{(k+1)\times k},

Q_{k+1}(\delta)^{\ast}=\Omega_{k+1}(\delta)\left[\begin{array}[]{cc}Q_{k}(\delta)^{\ast}&0\\ 0&1\end{array}\right],\quad\Omega_{k+1}(\delta)=\left[\begin{array}[]{crr}I_{k-1}&0&0\\ 0&\overline{c_{k}(\delta)}&s_{k}(\delta)\\ 0&-s_{k}(\delta)&c_{k}(\delta)\end{array}\right],

\sigma_{k}(z) := \sqrt{\sum_{j=0}^{k} |q_{j}(z)|^{2}}.

\left[\begin{array}[]{cccccc}-\frac{q_{0}(\delta)\overline{q_{1}(\delta)}}{\sigma_{1}(\delta)\sigma_{0}(\delta)}&\frac{\sigma_{0}(\delta)}{\sigma_{1}(\delta)}&0&\cdots&\cdots&0\\[5.69046pt] -\frac{q_{0}(\delta)\overline{q_{2}(\delta)}}{\sigma_{2}(\delta)\sigma_{1}(\delta)}&-\frac{q_{1}(\delta)\overline{q_{2}(\delta)}}{\sigma_{2}(\delta)\sigma_{1}(\delta)}&\frac{\sigma_{1}(\delta)}{\sigma_{2}(\delta)}&0&\\[5.69046pt] \vdots&\vdots&\ddots&\ddots&\ddots&\vdots\\[5.69046pt] \vdots&\vdots&&\ddots&\ddots&0\\[5.69046pt] -\frac{q_{0}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&-\frac{q_{1}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&\cdots&\cdots&-\frac{q_{k-1}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&\frac{\sigma_{k-1}(\delta)}{\sigma_{k}(\delta)}\\[5.69046pt] (-1)^{k}\frac{q_{0}(\delta)}{\sigma_{k}(\delta)}&(-1)^{k}\frac{q_{1}(\delta)}{\sigma_{k}(\delta)}&\cdots&\cdots&(-1)^{k}\frac{q_{k-1}(\delta)}{\sigma_{k}(\delta)}&(-1)^{k}\frac{q_{k}(\delta)}{\sigma_{k}(\delta)}\end{array}\right].

s_{k}(\delta) = \frac{\sigma_{k-1}(\delta)}{\sigma_{k}(\delta)} \quad \text{and} \quad c_{k}(\delta) = (-1)^{k} \frac{q_{k}(\delta)}{\sigma_{k}(\delta)}, \quad \text{for } k \geq 1.

e_{j}^{\ast} Q_{k+1}(\delta)^{\ast} (\underline{H}_{k} - \delta \underline{I}_{k}) e_{\ell}

= -\frac{\overline{q_{j}(\delta)}}{\sigma_{j}(\delta)\sigma_{j-1}(\delta)} (q_{0}(\delta), \ldots, q_{j-1}(\delta), \underbrace{0, \ldots, 0}_{k+1-j}) (\underline{H}_{k} - \delta \underline{I}_{k}) e_{\ell}

+ (\underbrace{0, \ldots, 0}_{j}, \frac{\sigma_{j-1}(\delta)}{\sigma_{j}(\delta)}, \underbrace{0, \ldots, 0}_{k-j}) (\underline{H}_{k} - \delta \underline{I}_{k}) e_{\ell}

= -\frac{\overline{q_{j}(\delta)}}{\sigma_{j}(\delta)\sigma_{j-1}(\delta)} (q_{0}(\delta), \ldots, q_{\ell}(\delta)) (\underline{H}_{\ell} - \delta \underline{I}_{\ell}) e_{\ell} = 0,

e_{\ell}^{\ast} Q_{k+1}(\delta)^{\ast} (\underline{H}_{k} - \delta \underline{I}_{k}) e_{\ell}

= -\frac{\overline{q_{\ell}(\delta)}}{\sigma_{\ell}(\delta)\sigma_{\ell-1}(\delta)} (q_{0}(\delta), \ldots, q_{\ell-1}(\delta), \underbrace{0, \ldots, 0}_{k+1-\ell}) (\underline{H}_{k} - \delta \underline{I}_{k}) e_{\ell}

+ (\underbrace{0, \ldots, 0}_{\ell}, \frac{\sigma_{\ell-1}(\delta)}{\sigma_{\ell}(\delta)}, \underbrace{0, \ldots, 0}_{k-\ell}) (\underline{H}_{k} - \delta \underline{I}_{k}) e_{\ell}

= -\frac{\overline{q_{\ell}(\delta)}}{\sigma_{\ell}(\delta)\sigma_{\ell-1}(\delta)} (q_{0}(\delta), \ldots, q_{\ell-1}(\delta)) (H_{\ell} - \delta I_{\ell}) e_{\ell} + \frac{\sigma_{\ell-1}(\delta)}{\sigma_{\ell}(\delta)} h_{\ell+1,\ell}

= -\frac{\overline{q_{\ell}(\delta)}}{\sigma_{\ell}(\delta)\sigma_{\ell-1}(\delta)} (-q_{\ell}(\delta) h_{\ell+1,\ell}) + \frac{\sigma_{\ell-1}(\delta)}{\sigma_{\ell}(\delta)} h_{\ell+1,\ell}

= \frac{h_{\ell+1,\ell}}{\sigma_{\ell}(\delta)\sigma_{\ell-1}(\delta)} (|q_{\ell}(\delta)|^{2} + \sigma_{\ell-1}(\delta)^{2}) = h_{\ell+1,\ell} \frac{\sigma_{\ell}(\delta)}{\sigma_{\ell-1}(\delta)} > 0.

e_{k+1}^{\ast} Q_{k+1}(\delta)^{\ast} (\underline{H}_{k} - \delta \underline{I}_{k}) e_{\ell}

= \frac{(-1)^{k}}{\sigma_{k}(\delta)} (q_{0}(\delta), q_{1}(\delta), \ldots, q_{k}(\delta)) (\underline{H}_{k} - \delta \underline{I}_{k}) e_{\ell} = 0,

Q_{k+1}(\delta)^{\ast}=\Omega_{k+1}(\delta)\left[\begin{array}[]{cc}Q_{k}(\delta)^{\ast}&0\\ 0&1\end{array}\right],

\left[\begin{array}[]{cccccc}(-1)^{k-1}\frac{q_{0}(\delta)}{\sigma_{k-1}(\delta)}&(-1)^{k-1}\frac{q_{1}(\delta)}{\sigma_{k-1}(\delta)}&\cdots&(-1)^{k-1}\frac{q_{k-1}(\delta)}{\sigma_{k-1}(\delta)}&0\\[5.69046pt] 0&0&\cdots&0&1\end{array}\right]

\left[\begin{array}[]{cccccc}-\frac{q_{0}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&-\frac{q_{1}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&\cdots&-\frac{q_{k-1}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&\frac{\sigma_{k-1}(\delta)}{\sigma_{k}(\delta)}\\[5.69046pt] (-1)^{k}\frac{q_{0}(\delta)}{\sigma_{k}(\delta)}&(-1)^{k}\frac{q_{1}(\delta)}{\sigma_{k}(\delta)}&\cdots&(-1)^{k}\frac{q_{k-1}(\delta)}{\sigma_{k}(\delta)}&(-1)^{k}\frac{q_{k}(\delta)}{\sigma_{k}(\delta)}\end{array}\right],

(- 1)^{k + 1} h_{k + 2, k + 1} q_{k + 1} (δ) / σ_{k} (δ)

Taxonomy

Topics: Matrix Theory and Algorithms · Electromagnetic Scattering and Analysis · Theoretical and Computational Physics

Full text

Abstract

Matrices whose adjoint is a low rank perturbation of a rational function of the matrix naturally arise when trying to extend the well known Faber-Manteuffel theorem [8, 7], which provides necessary and sufficient conditions for the existence of a short Arnoldi recurrence. We show that an orthonormal Krylov basis for this class of matrices can be generated by a short recurrence relation based on GMRES residual vectors. These residual vectors are computed by means of an updating formula. Furthermore, the underlying Hessenberg matrix has an accompanying low rank structure, which we will investigate closely.

1 Introduction

In this article we discuss a new variant of the Arnoldi method for a class of sparse matrices $A\in\mathbb{C}^{n\times n}$ that allows the first $k$ Arnoldi vectors to be computed in $\mathcal{O}(kn)$ operations. We refer to this class as $BML$-matrices, following the fundamental work of Barth & Manteuffel [2] and Liesen [13] on matrices $A$ whose adjoint is a low rank perturbation of a rational function of $A$. More specifically, we assume that

$$A^{\ast}=p(A)\,q(A)^{-1}+FG^{\ast},\qquad(1.1)$$

with $p,q$ polynomials of degree $m_{1}$ and $m_{2}$ , respectively, and

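$$F=[\mathbf{f}_{1},\mathbf{f}_{2},\ldots,\mathbf{f}_{m_{3}}],\quad G=[\mathbf{g}_{1},\mathbf{g}_{2},\ldots,\mathbf{g}_{m_{3}}]\in\mathbb{C}^{n\times m_{3}}$$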

matrices of full column rank. Moreover, it is assumed that the roots of $q$ are simple. By taking $p(z)/q(z)\in\{z,1/z\}$ , we see that Hermitian matrices and unitary matrices are $BML$ -matrices, and the same is true for low rank perturbations of such matrices. Furthermore, if $FG^{\ast}=0$ in (1.1), the matrix $A$ is normal [13]. In what follows we suppose that $m_{j}\ll n$ , since these quantities (as well as the sparsity pattern of $A$ ) are hidden in the constant of the above-claimed complexity result.
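As a small numerical illustration of (1.1), the following sketch checks that a rank-one perturbation of a Hermitian matrix satisfies $A^{\ast}=A+FG^{\ast}$, that is, (1.1) with $p(z)/q(z)=z$ and $m_{3}=2$. The construction $A=S+\mathbf{u}\mathbf{w}^{\ast}$ and the names `S`, `u`, `w` and `bml_factors_rank_one` are assumptions made for this example only; they do not appear in the article.

```python
import numpy as np

def bml_factors_rank_one(u, w):
    """For A = S + u w^* with S Hermitian, return F, G with A^* = A + F G^*.

    Since A^* = S + w u^* = A - u w^* + w u^*, one valid (illustrative)
    choice is F = [-u, w] and G = [w, u], so that m_3 = 2.
    """
    F = np.column_stack([-u, w])
    G = np.column_stack([w, u])
    return F, G

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = (M + M.conj().T) / 2                       # Hermitian matrix
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
w = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = S + np.outer(u, w.conj())                  # rank-one perturbation of S
F, G = bml_factors_rank_one(u, w)
# Norm of A^* - (A + F G^*); should vanish up to rounding.
residual = np.linalg.norm(A.conj().T - (A + F @ G.conj().T))
```

The choice of $F$ and $G$ is not unique; any factorization of the rank-two matrix $\mathbf{w}\mathbf{u}^{\ast}-\mathbf{u}\mathbf{w}^{\ast}$ would serve equally well.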

After $k$ steps of the Arnoldi process with initial vector $\mathbf{b}$ one obtains the expression

$$AV_{k}=V_{k}H_{k}+h_{k+1,k}\mathbf{v}_{k+1}\mathbf{e}_{k}^{\ast},\qquad(1.2)$$

with $V_{k}=[\mathbf{v}_{1},\mathbf{v}_{2},\ldots,\mathbf{v}_{k}]$ , $\mathbf{v}_{1}=\mathbf{b}/\|\mathbf{b}\|$ , $\mathbf{v}_{k+1}\in\mathbb{C}^{n}$ and $h_{k+1,k}\geq 0$ satisfying $V_{k}^{\ast}V_{k}=I_{k}$ and $V_{k}^{\ast}\mathbf{v}_{k+1}=0$ . The matrix $H_{k}$ is upper Hessenberg.
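For reference, the standard (long recurrence) Arnoldi process that produces relation (1.2) can be sketched as follows. This is the textbook modified Gram-Schmidt variant, not the fast method of this article; the function name `arnoldi` and the assumption that no breakdown occurs within `k` steps are choices made for this illustration.

```python
import numpy as np

def arnoldi(A, b, k):
    """Return V (n x (k+1)) and H_under ((k+1) x k) with A V[:, :k] = V @ H_under."""
    n = b.shape[0]
    V = np.zeros((n, k + 1), dtype=complex)
    H = np.zeros((k + 1, k), dtype=complex)
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):               # modified Gram-Schmidt sweep
            H[i, j] = np.vdot(V[:, i], w)    # v_i^* w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)      # h_{j+1,j} >= 0
        V[:, j + 1] = w / H[j + 1, j]        # assumes no breakdown (h_{j+1,j} > 0)
    return V, H
```

Each step orthogonalizes against all previous vectors, so $k$ steps cost $\mathcal{O}(k^{2}n)$ operations in addition to the matrix-vector products; this is the cost that a short recurrence avoids.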

A fast variant of the Arnoldi process exploits additional structure of the upper Hessenberg matrix $H_{k}$. For example, for Hermitian $A$ the matrix $H_{k}$ is tridiagonal and the Arnoldi process reduces to the Lanczos method. For matrices $A$ satisfying (1.1), $H_{k}$ turns out to be a rank structured matrix. To make this statement precise, we introduce the following definition.

Definition 1.1.

We say that a matrix $B\in\mathbb{C}^{n\times n}$ is $(r,s)$-upper-separable with $s\geq 0$ if, for all $j=1,2,\ldots,n-|r|$, it holds that (in Matlab notation)

$$\operatorname{rank}B(1:j-\min(r,0),\;j+\max(r,0):n)\leq s.$$

In other words, any submatrix of $B$ consisting of elements on and above the $r$th diagonal of $B$ is of rank at most $s$. The $0$th diagonal corresponds to the main diagonal, while the $r$th diagonal refers to the $r$th superdiagonal if $r>0$ and to the $(-r)$th subdiagonal if $r<0$.

Before proceeding, we comment on Definition 1.1, whose formulation is inspired by related well-established definitions. For example, matrices with both $B$ and $B^{*}$ being $(0,1)$-upper-separable (and $(1,1)$-upper-separable, respectively) are referred to as semiseparable matrices (and quasi separable matrices, respectively) [18, §1.1, §9.3.1]. As a simple example, a tridiagonal matrix is quasi separable, and its inverse is known to be semiseparable. Matrices being $(1-s,s)$-upper-separable with their adjoint being $(1-r,r)$-upper-separable are usually called $(r,s)$-semiseparable, while matrices being $(1,s)$-upper-separable with their adjoint being $(1,r)$-upper-separable are called $(r,s)$-quasi separable [18, §8.2.2 and §8.2.3].
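Definition 1.1 can be checked numerically by computing the rank of each submatrix it names. The sketch below translates the Matlab-style index ranges to zero-based NumPy slices and tests the tridiagonal example mentioned above; the function name `is_upper_separable` and the rank tolerance `tol` are assumptions of this illustration.

```python
import numpy as np

def is_upper_separable(B, r, s, tol=1e-10):
    """Check rank B(1 : j - min(r,0), j + max(r,0) : n) <= s for j = 1..n-|r|."""
    n = B.shape[0]
    for j in range(1, n - abs(r) + 1):       # j is 1-based, as in the definition
        last_row = j - min(r, 0)             # rows 1 .. j - min(r, 0)
        first_col = j + max(r, 0)            # columns j + max(r, 0) .. n
        block = B[:last_row, first_col - 1:] # shift to zero-based slicing
        if np.linalg.matrix_rank(block, tol=tol) > s:
            return False
    return True

# Tridiagonal test matrix with 2 on the diagonal and -1 next to it.
n = 8
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
T_inv = np.linalg.inv(T)
```

Numerical rank depends on the tolerance, so for matrices that are only approximately rank structured the outcome of such a test should be read with care.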

The above statement on the rank structure of $H_{k}$ can now be made exact. Assume the Arnoldi process breaks down after $N\leq n$ iterations, i.e., $\mathbf{v}_{N+1}=0$ in (1.2). We will refer to $N$ as the ‘Arnoldi termination index’ in the rest of the article. In Corollary 3.2 it is shown that for a $BML$-matrix $A$, the underlying Hessenberg matrix $H_{N}$ is $(r,s)$-upper-separable, where $r$ and $s$ are functions of $m_{j}$. However, for a general matrix $A$, there is no reason for the underlying Hessenberg matrix to be upper-separable. The upper-separable structure of the underlying Hessenberg matrix gives rise to the design of a multiple recurrence relation [2, 14], meaning that each new Arnoldi vector can be written as

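$$\mathbf{v}_{k+1}=\sum_{j=k-m_{3}-m_{2}}^{k}\alpha_{k,j}A\mathbf{v}_{j}+\sum_{j=k-m_{3}-m_{1}}^{k}\beta_{k,j}\mathbf{v}_{j}.\qquad(1.3)$$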

The smaller the quantities $m_{j}$ in (1.1), the shorter the recurrence relation becomes. In [5] the same recurrence relations are derived for the class of so-called $(H,m)$-well-free matrices. We refer to Remark 3.7 for more details.

In this article we investigate a different version of the recurrence relation (1.3) by rewriting it in terms of GMRES residual vectors, aiming to overcome some of the numerical problems which relation (1.3) entails, such as the possibility of a breakdown [14]. It will be shown how these GMRES residual vectors can be computed progressively by means of an updating formula. This partially extends the discussion on a progressive GMRES method for nearly Hermitian matrices as presented by Beckermann & Reichel [4].

The article is organized as follows. Section 2 describes the structure of the unitary factor $Q$ in the $QR$-decomposition of a Hessenberg matrix in terms of orthogonal polynomials (the results in this section are valid for a general matrix $A\in\mathbb{C}^{n\times n}$). It is well known that this unitary factor can be represented as a product of Givens rotations; see, e.g., the isometric Arnoldi process introduced by Gragg [10] and its extension to the class of shifted unitary matrices by Jagels & Reichel [12]. We will describe these Givens rotations by means of orthogonal polynomials. Furthermore, links between orthogonal polynomials and the GMRES algorithm are discussed, leading to an updating formula to compute the GMRES residuals progressively. In Section 3 the upper-separable structure of the Hessenberg matrix related to the Arnoldi process applied to a $BML$-matrix is investigated. It is shown how this upper-separable structure can be generated by the GMRES residual vectors, which allows us to construct a short recurrence relation and an accompanying algorithm. In Section 4 we compare our findings with those presented in [2]. Section 5 discusses some computational reductions that can be made in case the matrix $A$ is nearly unitary or nearly shifted unitary. Finally, Section 6 discusses the numerical performance and stability of the algorithm.

Throughout this article we will make use of the following notation. Vectors are written in bold face lower case letters, e.g., $\mathbf{x},\mathbf{y}$ , and $\mathbf{z}$ . The vector $\mathbf{e}_{k}$ denotes the $k$ th column of an identity matrix of applicable order. The standard inner product is denoted as $\langle\mathbf{x},\mathbf{y}\rangle=\mathbf{y}^{\ast}\mathbf{x}$ , with $\cdot^{\ast}$ the Hermitian conjugate. We write $\|\cdot\|$ for the induced Euclidean norm as well as the subordinate spectral matrix norm. Matrices are denoted by upper case letters $A=(a_{ij})$ , and $I_{k}$ denotes the identity matrix of order $k$ . We will frequently express formula (1.2) in the form

$$AV_{k}=V_{k+1}\underline{H}_{k},\quad\text{with }V_{k+1}:=[V_{k},\mathbf{v}_{k+1}],$$

where

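$$\underline{I}_{k}:=\left[\begin{array}{c}I_{k}\\ \mathbf{0}\end{array}\right]\in\mathbb{C}^{(k+1)\times k},\quad\underline{H}_{k}:=\left[\begin{array}{c}H_{k}\\ h_{k+1,k}\mathbf{e}_{k}^{\ast}\end{array}\right]=H_{k+1}\underline{I}_{k}\in\mathbb{C}^{(k+1)\times k},$$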

and $V_{k}=V_{k+1}\underline{I}_{k}$ , revealing the nested structure of the Arnoldi matrices $V_{k}$ and $H_{k}$ .

2 Towards a progressive GMRES method

Many Krylov space methods and in particular the Arnoldi method can be described in polynomial language which reveals some particular properties, and makes the link to the rich theory of orthogonal polynomials. For example, from (1.2) one deduces by recurrence on the degree that, for any polynomial $p$

$$p(A)\mathbf{v}_{1}=V_{k}\,p(H_{k})\,\mathbf{e}_{1},\quad\text{provided that }\deg p<k,\qquad(2.1)$$

illustrating that Krylov spaces are intimately related to polynomials. See e.g., [16, §6.6.2] for the case of Hermitian $A$ . However, polynomial language can be used also for general matrices [3, §1.3]. In §2.1 we will see that the Arnoldi vectors correspond to a (finite) family of polynomials which are orthogonal with respect to

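$$\langle p,q\rangle_{A,\mathbf{v}_{1}}=\langle p(A)\mathbf{v}_{1},q(A)\mathbf{v}_{1}\rangle=(q(A)\mathbf{v}_{1})^{\ast}p(A)\mathbf{v}_{1},\qquad(2.2)$$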

a scalar product on the set of polynomials of degree $<N$ , with $N$ the termination index of the Arnoldi method. The scalar product (2.2) induces a norm $\|\cdot\|_{A,\mathbf{v}_{1}}$ . This implies in particular the well known fact [16, Proposition 6.7] that Arnoldi vectors are normalized FOM residuals. In §2.2 we will use polynomial language to give an explicit expression for the $Q$ -factor in a $QR$ -decomposition of an upper Hessenberg matrix. Such a formula for a unitary upper Hessenberg matrix is not known in literature, though of course there is a close link with Gragg’s explicit formula in terms of Givens rotations [16, 10, §6.5.3]. This formula will enable us to deduce in Corollary 2.4 a decay property of the entries of $Q$ far from the main diagonal, and reveals immediately a rank structure for $Q$ . In §2.3 we recall that GMRES residuals can be expressed in terms of orthogonal polynomials: we are faced with a well-studied extremal problem for general orthogonal polynomials. This allows us in §2.4 to establish a well known link between (normalized) FOM and GMRES residuals [16, §6.5.5], allowing for a recursive computation of (normalized) GMRES residuals. Such a progressive GMRES implementation has been discussed before [16, §6.5.3] and [4]. The implementation presented in this article is inspired by the work of Beckermann & Reichel [4]. Both implementations make use of the decomposition of $Q$ into a product of Givens rotations, but differ in how to find the angles of these rotations. We will consider in §2.4 not only FOM and GMRES for systems of linear equations $A\mathbf{x}=\mathbf{b}$ but more generally for shifted systems $(A-\delta I)\mathbf{x}=\mathbf{b}$ for some parameters $\delta\in\mathbb{C}$ . All findings of this section hold for general matrices $A$ .
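The identity $p(A)\mathbf{v}_{1}=V_{k}p(H_{k})\mathbf{e}_{1}$ for $\deg p<k$ in (2.1) is easy to test numerically. The sketch below does so with a textbook Arnoldi routine and Horner's rule; the helper names `arnoldi` and `horner_apply`, and the coefficient ordering (lowest degree first), are assumptions of this illustration.

```python
import numpy as np

def arnoldi(A, b, k):
    """Textbook Arnoldi: A V[:, :k] = V @ H with H of size (k+1) x k."""
    n = b.shape[0]
    V = np.zeros((n, k + 1), dtype=complex)
    H = np.zeros((k + 1, k), dtype=complex)
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = np.vdot(V[:, i], w)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]        # assumes no breakdown
    return V, H

def horner_apply(coeffs, M, x):
    """Evaluate p(M) x for p(z) = sum_i coeffs[i] z^i using Horner's rule."""
    y = coeffs[-1] * x
    for c in coeffs[-2::-1]:
        y = M @ y + c * x
    return y

def both_sides_of_2_1(A, b, k, coeffs):
    """Return p(A) v_1 and V_k p(H_k) e_1; requires len(coeffs) <= k."""
    V, H = arnoldi(A, b, k)
    e1 = np.zeros(k, dtype=complex)
    e1[0] = 1.0
    lhs = horner_apply(coeffs, A, V[:, 0])
    rhs = V[:, :k] @ horner_apply(coeffs, H[:k, :], e1)
    return lhs, rhs
```

The restriction $\deg p<k$ matters: for $\deg p=k$ the right-hand side misses the component of $p(A)\mathbf{v}_{1}$ along $\mathbf{v}_{k+1}$.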

2.1 Orthogonal polynomials linked to the Arnoldi process

Let $H_{N}$ be an $N\times N$ upper Hessenberg matrix with positive real entries $h_{k+1,k}$ on the subdiagonal. We define polynomials $q_{0},q_{1},\ldots,q_{N-1}$ recursively through the formula

$$q_{0}(z)=1,\qquad q_{k}(z)\,h_{k+1,k}=z\,q_{k-1}(z)-\sum_{j=0}^{k-1}q_{j}(z)\,h_{j+1,k}.\qquad(2.3)$$

Direct computation yields the following well known link with the characteristic polynomials of the principal submatrices:

$$q_{k}(z)=\frac{1}{\prod_{i=1}^{k}h_{i+1,i}}\det(zI_{k}-H_{k}),\qquad(2.4)$$

showing that $q_{k}$ is of degree $k$ , with positive leading coefficient. In what follows we will write (2.3) in the form

$$(q_{0}(z),\ldots,q_{k}(z))\,\underline{H}_{k}=z\,(q_{0}(z),\ldots,q_{k-1}(z)),\qquad(2.5)$$

where $H_{k}$ is the $k\times k$ leading principal submatrix of $H_{N}$ and $N-1\geq k\geq 1$. One deduces from (2.3) by recurrence on $k$ that $q_{k}(H_{N})\mathbf{e}_{1}=\mathbf{e}_{k+1}$. Assume we apply the Arnoldi process to a matrix $A\in\mathbb{C}^{n\times n}$, that $N\leq n$ is the Arnoldi termination index, and that $H_{N}$ is the underlying Hessenberg matrix. Using (2.1), this implies

$$\mathbf{v}_{k}=q_{k-1}(A)\mathbf{v}_{1},\qquad(2.6)$$

and thus $q_{j}$ is indeed the $j$th orthonormal polynomial with respect to the scalar product (2.2). Also, from (2.4) we see that the $k$th FOM iterate for the shifted system $(A-\delta I)\mathbf{x}=\mathbf{b}$ exists if and only if $q_{k}(\delta)\neq 0$, and that in this case the $k$th FOM residual is given by $q_{k}(A)\mathbf{b}/q_{k}(\delta)$; compare with [16, Proposition 6.7].
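The values $q_{0}(z),\ldots,q_{k}(z)$ can be computed from the Hessenberg entries alone. The sketch below assumes the recurrence takes the form $q_{0}=1$, $q_{m}(z)h_{m+1,m}=zq_{m-1}(z)-\sum_{j<m}q_{j}(z)h_{j+1,m}$, which is the column-wise reading of (2.5), and compares the result with the determinant formula (2.4). The names `orthonormal_polys` and `random_hessenberg` are hypothetical helpers for this illustration.

```python
import numpy as np

def orthonormal_polys(H, z, k):
    """Return [q_0(z), ..., q_k(z)] from an upper Hessenberg H (at least (k+1) x k)."""
    q = np.zeros(k + 1, dtype=complex)
    q[0] = 1.0
    for m in range(1, k + 1):
        # Column m of H (1-based) is H[:, m-1]; entries h_{j+1,m} are H[j, m-1].
        s = z * q[m - 1] - np.dot(q[:m], H[:m, m - 1])
        q[m] = s / H[m, m - 1]               # divide by h_{m+1,m} > 0
    return q

def random_hessenberg(N, seed=0):
    """Random real upper Hessenberg matrix with positive subdiagonal (test data)."""
    rng = np.random.default_rng(seed)
    H = np.triu(rng.standard_normal((N, N)), -1)
    idx = np.arange(N - 1)
    H[idx + 1, idx] = np.abs(H[idx + 1, idx]) + 0.5
    return H
```

Evaluating all of $q_{0}(z),\ldots,q_{k}(z)$ this way costs $\mathcal{O}(k^{2})$ operations for a full Hessenberg matrix and involves no determinant computation.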

2.2 The $QR$ -factorization of a Hessenberg matrix

We will derive an explicit formula for the unitary factor in the $QR$ -decomposition of the upper Hessenberg matrix $\underline{H}_{k}$ in terms of the orthonormal polynomials $q_{0},...,q_{k}$ . To our knowledge, such a result is new. It could also be potentially useful for studying the convergence of the $QR$ -method with shifts.

Let $Q_{k+1}(\delta)$ be the unitary factor in the $QR$ -decomposition of $\underline{H}_{k}-\delta\underline{I}_{k}$ , i.e.,

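$$Q_{k+1}(\delta)^{\ast}(\underline{H}_{k}-\delta\underline{I}_{k})=\underline{R}_{k}(\delta):=\left[\begin{array}{c}R_{k}(\delta)\\ 0\end{array}\right]\in\mathbb{C}^{(k+1)\times k},\qquad(2.7)$$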

with $R_{k}(\delta)$ an upper triangular matrix with positive real entries on its main diagonal. It is well known (see, e.g., [16, Subsection 6.5.3]) that $Q_{k+1}(\delta)$ can be obtained as a product of Givens rotations, which are applied to $\underline{H}_{k}-\delta\underline{I}_{k}$ to annihilate the first subdiagonal. We follow [4] in requiring the matrices $Q_{k+1}(\delta)$ to have determinant 1.

Definition 2.1.

Let $Q_{1}=[1]$ , and define for $k\geq 1$ ,

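$$Q_{k+1}(\delta)^{\ast}=\Omega_{k+1}(\delta)\left[\begin{array}{cc}Q_{k}(\delta)^{\ast}&0\\ 0&1\end{array}\right],\quad\Omega_{k+1}(\delta)=\left[\begin{array}{crr}I_{k-1}&0&0\\ 0&\overline{c_{k}(\delta)}&s_{k}(\delta)\\ 0&-s_{k}(\delta)&c_{k}(\delta)\end{array}\right],$$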

with $s_{k}(\delta)\geq 0$ and $s_{k}(\delta)^{2}+|c_{k}(\delta)|^{2}=1$ , such that (2.7) holds.

Proposition 2.2.

Let $\delta\in\mathbb{C}$ , and

$$\sigma_{k}(z):=\sqrt{\sum_{j=0}^{k}|q_{j}(z)|^{2}}.$$

Then the unitary factor $Q_{k+1}(\delta)^{\ast}$ is given by

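$$\left[\begin{array}{cccccc}-\frac{q_{0}(\delta)\overline{q_{1}(\delta)}}{\sigma_{1}(\delta)\sigma_{0}(\delta)}&\frac{\sigma_{0}(\delta)}{\sigma_{1}(\delta)}&0&\cdots&\cdots&0\\ -\frac{q_{0}(\delta)\overline{q_{2}(\delta)}}{\sigma_{2}(\delta)\sigma_{1}(\delta)}&-\frac{q_{1}(\delta)\overline{q_{2}(\delta)}}{\sigma_{2}(\delta)\sigma_{1}(\delta)}&\frac{\sigma_{1}(\delta)}{\sigma_{2}(\delta)}&0&&\vdots\\ \vdots&\vdots&\ddots&\ddots&\ddots&\vdots\\ \vdots&\vdots&&\ddots&\ddots&0\\ -\frac{q_{0}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&-\frac{q_{1}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&\cdots&\cdots&-\frac{q_{k-1}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&\frac{\sigma_{k-1}(\delta)}{\sigma_{k}(\delta)}\\ (-1)^{k}\frac{q_{0}(\delta)}{\sigma_{k}(\delta)}&(-1)^{k}\frac{q_{1}(\delta)}{\sigma_{k}(\delta)}&\cdots&\cdots&(-1)^{k}\frac{q_{k-1}(\delta)}{\sigma_{k}(\delta)}&(-1)^{k}\frac{q_{k}(\delta)}{\sigma_{k}(\delta)}\end{array}\right].$$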

Moreover,

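$$s_{k}(\delta)=\frac{\sigma_{k-1}(\delta)}{\sigma_{k}(\delta)}\quad\text{and}\quad c_{k}(\delta)=(-1)^{k}\frac{q_{k}(\delta)}{\sigma_{k}(\delta)},\quad\text{for }k\geq 1.\qquad(2.8)$$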

Proof.

We leave it to the reader to check that the candidate for $Q_{k+1}(\delta)^{\ast}$ indeed has orthonormal rows. It remains to check the subdiagonal and diagonal entries of $Q_{k+1}(\delta)^{*}(\underline{H}_{k}-\delta\underline{I}_{k})$ . For $k\geq j>\ell$ we find that

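$$\begin{aligned}\mathbf{e}_{j}^{\ast}Q_{k+1}(\delta)^{\ast}(\underline{H}_{k}-\delta\underline{I}_{k})\mathbf{e}_{\ell}&=-\frac{\overline{q_{j}(\delta)}}{\sigma_{j}(\delta)\sigma_{j-1}(\delta)}\bigl(q_{0}(\delta),\ldots,q_{j-1}(\delta),\underbrace{0,\ldots,0}_{k+1-j}\bigr)(\underline{H}_{k}-\delta\underline{I}_{k})\mathbf{e}_{\ell}\\ &\quad+\bigl(\underbrace{0,\ldots,0}_{j},\tfrac{\sigma_{j-1}(\delta)}{\sigma_{j}(\delta)},\underbrace{0,\ldots,0}_{k-j}\bigr)(\underline{H}_{k}-\delta\underline{I}_{k})\mathbf{e}_{\ell}\\ &=-\frac{\overline{q_{j}(\delta)}}{\sigma_{j}(\delta)\sigma_{j-1}(\delta)}\bigl(q_{0}(\delta),\ldots,q_{\ell}(\delta)\bigr)(\underline{H}_{\ell}-\delta\underline{I}_{\ell})\mathbf{e}_{\ell}=0,\end{aligned}$$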

where the second equality is because of the fact that only the first $\ell+1\leq j$ entries of $(\underline{H}_{k}-\delta\underline{I}_{k})\mathbf{e}_{\ell}$ are nonzero and the third equality because of (2.5). Similarly, for $k\geq j=\ell$ ,

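$$\begin{aligned}\mathbf{e}_{\ell}^{\ast}Q_{k+1}(\delta)^{\ast}(\underline{H}_{k}-\delta\underline{I}_{k})\mathbf{e}_{\ell}&=-\frac{\overline{q_{\ell}(\delta)}}{\sigma_{\ell}(\delta)\sigma_{\ell-1}(\delta)}\bigl(q_{0}(\delta),\ldots,q_{\ell-1}(\delta),\underbrace{0,\ldots,0}_{k+1-\ell}\bigr)(\underline{H}_{k}-\delta\underline{I}_{k})\mathbf{e}_{\ell}\\ &\quad+\bigl(\underbrace{0,\ldots,0}_{\ell},\tfrac{\sigma_{\ell-1}(\delta)}{\sigma_{\ell}(\delta)},\underbrace{0,\ldots,0}_{k-\ell}\bigr)(\underline{H}_{k}-\delta\underline{I}_{k})\mathbf{e}_{\ell}\\ &=-\frac{\overline{q_{\ell}(\delta)}}{\sigma_{\ell}(\delta)\sigma_{\ell-1}(\delta)}\bigl(q_{0}(\delta),\ldots,q_{\ell-1}(\delta)\bigr)(H_{\ell}-\delta I_{\ell})\mathbf{e}_{\ell}+\frac{\sigma_{\ell-1}(\delta)}{\sigma_{\ell}(\delta)}h_{\ell+1,\ell}\\ &=-\frac{\overline{q_{\ell}(\delta)}}{\sigma_{\ell}(\delta)\sigma_{\ell-1}(\delta)}\bigl(-q_{\ell}(\delta)h_{\ell+1,\ell}\bigr)+\frac{\sigma_{\ell-1}(\delta)}{\sigma_{\ell}(\delta)}h_{\ell+1,\ell}\\ &=\frac{h_{\ell+1,\ell}}{\sigma_{\ell}(\delta)\sigma_{\ell-1}(\delta)}\bigl(|q_{\ell}(\delta)|^{2}+\sigma_{\ell-1}(\delta)^{2}\bigr)=h_{\ell+1,\ell}\frac{\sigma_{\ell}(\delta)}{\sigma_{\ell-1}(\delta)}>0.\end{aligned}$$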

Finally, for $k+1=j\geq\ell$ ,

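$$\mathbf{e}_{k+1}^{\ast}Q_{k+1}(\delta)^{\ast}(\underline{H}_{k}-\delta\underline{I}_{k})\mathbf{e}_{\ell}=\frac{(-1)^{k}}{\sigma_{k}(\delta)}\bigl(q_{0}(\delta),q_{1}(\delta),\ldots,q_{k}(\delta)\bigr)(\underline{H}_{k}-\delta\underline{I}_{k})\mathbf{e}_{\ell}=0,$$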

according to (2.5). To prove (2.8), observe that

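$$Q_{k+1}(\delta)^{\ast}=\Omega_{k+1}(\delta)\left[\begin{array}{cc}Q_{k}(\delta)^{\ast}&0\\ 0&1\end{array}\right],$$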

if and only if by multiplying on the left with $\left[\begin{array}[]{rr}\overline{c_{k}(\delta)}&s_{k}(\delta)\\ -s_{k}(\delta)&c_{k}(\delta)\end{array}\right]$ we transform

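$$\left[\begin{array}{ccccc}(-1)^{k-1}\frac{q_{0}(\delta)}{\sigma_{k-1}(\delta)}&(-1)^{k-1}\frac{q_{1}(\delta)}{\sigma_{k-1}(\delta)}&\cdots&(-1)^{k-1}\frac{q_{k-1}(\delta)}{\sigma_{k-1}(\delta)}&0\\ 0&0&\cdots&0&1\end{array}\right]$$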

into

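$$\left[\begin{array}{ccccc}-\frac{q_{0}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&-\frac{q_{1}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&\cdots&-\frac{q_{k-1}(\delta)\overline{q_{k}(\delta)}}{\sigma_{k}(\delta)\sigma_{k-1}(\delta)}&\frac{\sigma_{k-1}(\delta)}{\sigma_{k}(\delta)}\\ (-1)^{k}\frac{q_{0}(\delta)}{\sigma_{k}(\delta)}&(-1)^{k}\frac{q_{1}(\delta)}{\sigma_{k}(\delta)}&\cdots&(-1)^{k}\frac{q_{k-1}(\delta)}{\sigma_{k}(\delta)}&(-1)^{k}\frac{q_{k}(\delta)}{\sigma_{k}(\delta)}\end{array}\right],$$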

the latter being true for $c_{k}(\delta)$ and $s_{k}(\delta)$ as in (2.8). ∎
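The explicit form of $Q_{k+1}(\delta)^{\ast}$ in Proposition 2.2 can be verified numerically. The sketch below assembles the matrix row by row from the values $q_{j}(\delta)$, taking $\sigma_{j}(\delta)=\bigl(\sum_{i\leq j}|q_{i}(\delta)|^{2}\bigr)^{1/2}$ (the normalization under which $s_{k}^{2}+|c_{k}|^{2}=1$), and then checks unitarity and the triangular form (2.7). The function names are hypothetical, and the recurrence used for $q_{j}$ is the column-wise reading of (2.5).

```python
import numpy as np

def orthonormal_polys(H, z, k):
    """Return [q_0(z), ..., q_k(z)] from an upper Hessenberg H ((k+1) x k)."""
    q = np.zeros(k + 1, dtype=complex)
    q[0] = 1.0
    for m in range(1, k + 1):
        q[m] = (z * q[m - 1] - np.dot(q[:m], H[:m, m - 1])) / H[m, m - 1]
    return q

def q_factor_adjoint(q):
    """Assemble Q_{k+1}(delta)^* from q = [q_0(delta), ..., q_k(delta)]."""
    k = len(q) - 1
    sigma = np.sqrt(np.cumsum(np.abs(q) ** 2))   # sigma_0, ..., sigma_k
    Qs = np.zeros((k + 1, k + 1), dtype=complex)
    for j in range(1, k + 1):                    # rows 1..k (1-based)
        Qs[j - 1, :j] = -q[:j] * np.conj(q[j]) / (sigma[j] * sigma[j - 1])
        Qs[j - 1, j] = sigma[j - 1] / sigma[j]
    Qs[k, :] = (-1) ** k * q / sigma[k]          # last row
    return Qs, sigma

def triangularize(H_under, delta):
    """Return Q^*, R_under = Q^* (H_under - delta I_under), and sigma."""
    k = H_under.shape[1]
    q = orthonormal_polys(H_under, delta, k)
    Qs, sigma = q_factor_adjoint(q)
    R_under = Qs @ (H_under - delta * np.eye(k + 1, k))
    return Qs, R_under, sigma
```

No Givens rotations are formed here; the factor is obtained directly from polynomial values, which is what makes the decay and rank statements that follow visible entry by entry.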

Notice that, according to the nested structure of the Hessenberg matrices, $Q_{k+1}(\delta)^{\ast}(H_{k+1}-\delta I_{k+1})$ is also upper triangular, but its last diagonal entry given by

$$(-1)^{k+1}h_{k+2,k+1}\,q_{k+1}(\delta)/\sigma_{k}(\delta)$$

is not necessarily a positive real number. Hence, for obtaining the unitary factor in the unique $QR$ -decomposition of $H_{k+1}-\delta I_{k+1}$ we should rescale the last row of $Q_{k+1}(\delta)^{\ast}$ as given in Proposition 2.2 by a phase of modulus $1$ .

Let us consider the special case of $\delta=0$ and unitary $A$ as a running example.

Example 2.3.

Suppose that $\delta=0$ and $A$ is unitary. Then $AV_{k}=V_{k+1}\underline{H}_{k}$ and thus $\underline{H}_{k}$ has orthonormal columns, showing that $\underline{H}_{k}=Q_{k+1}(0)\underline{I}_{k}$ and $R_{k}(0)=I_{k}$ . From Proposition 2.2 we get explicit formulas for the entries of $\underline{H}_{k}$ , in particular, for unitary $A$ ,

[TABLE]

Furthermore, $AV_{k}=V_{k+1}Q_{k+1}(0)\underline{I}_{k}$ . Therefore,

[TABLE]

with $\mathbf{\tilde{v}}_{k}:=\frac{1}{\sigma_{k-1}(0)}\sum_{j=0}^{k-1}\overline{q_{j}(0)}\mathbf{v}_{j+1}$ . Also, $\mathbf{\tilde{v}}_{k+1}=s_{k}(0)\mathbf{\tilde{v}}_{k}+(-1)^{k}\overline{c_{k}(0)}\mathbf{v}_{k+1}$ . Hence, the orthonormal vectors $\mathbf{v}_{k}$ can be constructed using two short recurrence relations:

Note that $\gamma_{k}=(-1)^{k}c_{k}(0)$ and $\sigma_{k}=s_{k}(0)$ . The above double recurrence relation is known as the ‘Isometric Arnoldi algorithm’ designed by Gragg [10].

As a consequence of Proposition 2.2, according to Definition 1.1, we can derive some statements on the rank structure of the unitary Hessenberg matrix $Q_{k+1}(\delta)$ , and on a decay property of its entries.

Corollary 2.4.

$Q_{k+1}(\delta)$ is $(0,1)$-upper-separable (this implies that $Q_{k+1}(\delta)$ is quasi separable, but in general not semiseparable). Moreover, for the submatrix $\widetilde{Q}$ of $Q_{k+1}(\delta)$ formed with the first $m\leq k+1$ rows and the last $\ell\leq k-m+2$ columns we have that

[TABLE]

Proof.

The first statement follows by observing that

[TABLE]

with $L_{k+1}(\delta)$ strictly lower triangular, i.e., with zero entries on the main diagonal. This is a direct consequence of Proposition 2.2. In particular, we deduce that $\widetilde{Q}$ is of rank $1$ . From Proposition 2.2 it follows that

[TABLE]

giving the claimed result. ∎

We end this subsection by observing that Corollary 2.4 immediately implies a rank property as well as a decay of entries for resolvents of $H_{k+1}$ .

Corollary 2.5.

Suppose that $H_{k+1}-\delta I_{k+1}$ is invertible. Then $(H_{k+1}-\delta I_{k+1})^{-*}$ is $(0,1)$ -upper-separable.

Moreover, for the submatrix $\widetilde{H}$ of $(H_{k+1}-\delta I_{k+1})^{-*}$ formed with the first $m\leq k+1$ rows and the last $\ell\leq k-m+2$ columns we have that

[TABLE]

Proof.

Let us write $H_{k+1}-\delta I_{k+1}=Q_{k+1}(\delta)R$ with upper triangular and invertible $R$ . Then

[TABLE]

with $R^{-*}$ lower triangular of norm $\|(H_{k+1}-\delta I_{k+1})^{-1}\|$ . Replacing $Q_{k+1}(\delta)$ by (2.12) yields

[TABLE]

with $\widetilde{L}_{k+1}(\delta)$ strictly lower triangular; proving the first statement. This implies that

[TABLE]

and thus $\|\widetilde{H}\|=\sigma_{m-1}(\delta)\,\|\mathbf{e}_{1}^{\ast}\widetilde{H}\|$. Notice that $\mathbf{e}_{1}^{\ast}\widetilde{H}$ is obtained by multiplying the first row of $Q_{k+1}(\delta)$ with the last $\ell$ columns of $R^{-*}$. As $R^{-\ast}$ is lower triangular, $\mathbf{e}_{1}^{\ast}\widetilde{H}=\widetilde{Q}\widetilde{R}$, with $\widetilde{Q}$ a row vector formed with the last $\ell$ entries of the first row of $Q_{k+1}(\delta)$ and $\widetilde{R}$ the lower-right $\ell\times\ell$ submatrix of $R^{-\ast}$. Therefore, applying Corollary 2.4 to $\widetilde{Q}$ yields

[TABLE]

which together with $\|\widetilde{H}\|=\sigma_{m-1}(\delta)\,\|\mathbf{e}_{1}^{\ast}\widetilde{H}\|$ and $\|R^{-*}\|=\|(H_{k+1}-\delta I_{k+1})^{-1}\|$ proves the second statement. ∎

2.3 The GMRES residual and orthogonal polynomials

We will give some more details on the link between the GMRES residual vectors and orthogonal polynomials. More specifically, we will write the GMRES residual vector as a linear combination of Arnoldi vectors. In the particular case of unitary $A$ , an even nicer relation arises for the GMRES residual vectors.

In what follows we denote by $\mathbf{r}_{k}(\delta)$ the $k$ th GMRES residual for the shifted system $(A-\delta I_{n})\mathbf{x}=\mathbf{b}$ , with starting vector $\mathbf{x}_{0}=0$ , and denote by $\mathbf{w}_{k}(\delta):=\mathbf{r}_{k}(\delta)/\|\mathbf{r}_{k}(\delta)\|$ its normalized version.

Proposition 2.6.

The $k$ th GMRES residual for the shifted system $(A-\delta I_{n})\mathbf{x}=\mathbf{b}$ with starting vector $\mathbf{x}_{0}=0$ , can be written as

[TABLE]

For its normalized version we have

[TABLE]

Proof.

Since we choose as starting vector $\mathbf{x}_{0}=0$ , we find the initial GMRES residual $\mathbf{r}_{0}(\delta)=\mathbf{b}-(A-\delta I_{n})0=\mathbf{b}=\mathbf{v}_{1}\,\|\mathbf{b}\|$ . Then we have

[TABLE]

Note that $\mathbf{r}_{0}-(A-\delta I_{n})V_{k}\mathbf{y}$ can be written as $p_{k}(A)\mathbf{r}_{0}(\delta)$ with $p_{k}$ a polynomial of degree at most $k$ and $p_{k}(\delta)=1$ . Then $\mathbf{r}_{k}(\delta)=p_{k}(A)\mathbf{r}_{0}$ , $p_{k}=\widetilde{p_{k}}/\widetilde{p_{k}}(\delta)$ , with

[TABLE]

where $\mathcal{P}_{k}$ denotes the set of polynomials of degree at most $k$, and we use the norm induced by (2.2). Note that if $p(z)=\sum_{j=0}^{k}c_{j}q_{j}(z)$, then $\|p\|_{A,\mathbf{v}_{1}}^{2}=\sum_{j=0}^{k}|c_{j}|^{2}$ by orthonormality. Therefore, the Cauchy-Schwarz inequality yields

[TABLE]

where the minimum is attained for $c_{j}=\overline{q_{j}(\delta)}$. Combining (2.18) and (2.17) we conclude that (2.14) holds. In particular, $\|\mathbf{r}_{k}(\delta)\|=\|\mathbf{b}\|/\sigma_{k}(\delta)$, which together with (2.6) implies (2.15). ∎
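The argument of the proof suggests a direct numerical check. Normalizing the minimizer with $c_{j}=\overline{q_{j}(\delta)}$ so that the residual polynomial equals $1$ at $\delta$ gives $\mathbf{r}_{k}(\delta)=\frac{\|\mathbf{b}\|}{\sigma_{k}(\delta)^{2}}\sum_{j=0}^{k}\overline{q_{j}(\delta)}\,\mathbf{v}_{j+1}$; this closed form is derived here from the proof, using $\sigma_{k}(\delta)^{2}=\sum_{j\leq k}|q_{j}(\delta)|^{2}$, and is assumed rather than quoted. The sketch compares it with a residual obtained from a small least squares problem. All function names are hypothetical.

```python
import numpy as np

def arnoldi(A, b, k):
    """Textbook Arnoldi: A V[:, :k] = V @ H with H of size (k+1) x k."""
    n = b.shape[0]
    V = np.zeros((n, k + 1), dtype=complex)
    H = np.zeros((k + 1, k), dtype=complex)
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = np.vdot(V[:, i], w)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]        # assumes no breakdown
    return V, H

def orthonormal_polys(H, z, k):
    """Return [q_0(z), ..., q_k(z)] via the column-wise reading of (2.5)."""
    q = np.zeros(k + 1, dtype=complex)
    q[0] = 1.0
    for m in range(1, k + 1):
        q[m] = (z * q[m - 1] - np.dot(q[:m], H[:m, m - 1])) / H[m, m - 1]
    return q

def gmres_residual_two_ways(A, b, delta, k):
    """Return the k-th GMRES residual (least squares), the polynomial formula,
    and the predicted norm ||b|| / sigma_k(delta)."""
    n = A.shape[0]
    V, Hu = arnoldi(A, b, k)
    beta = np.linalg.norm(b)
    M = Hu - delta * np.eye(k + 1, k)        # (A - delta I) V_k = V_{k+1} M
    rhs = np.zeros(k + 1, dtype=complex)
    rhs[0] = beta
    y = np.linalg.lstsq(M, rhs, rcond=None)[0]
    r_direct = b - (A - delta * np.eye(n)) @ (V[:, :k] @ y)
    q = orthonormal_polys(Hu, delta, k)
    sigma_k = np.sqrt(np.sum(np.abs(q) ** 2))
    r_formula = beta / sigma_k**2 * (V @ np.conj(q))
    return r_direct, r_formula, beta / sigma_k
```

The dense shift `delta * np.eye(n)` is used only to keep the check short; in practice the shifted product would be applied without forming the matrix.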

Remark 2.7.

As $\sigma_{m-1}(\delta)/\sigma_{N-\ell+1}(\delta)=\|\mathbf{r}_{N-\ell+1}(\delta)\|/\|\mathbf{r}_{m-1}(\delta)\|$, we see that the decay rates in Corollary 2.4 and Corollary 2.5 are linked to the convergence of the GMRES algorithm. The faster the GMRES algorithm converges, the more pronounced the decay pattern becomes.

Remark 2.8.

By taking norms in (2.14), we see that for the relative GMRES residual

[TABLE]

More generally, for the decay rates in Corollary 2.4 and Corollary 2.5 for $\ell\leq k-m+2$ , we have

[TABLE]

Let $\Omega\subset\mathbb{C}$ be a simply connected and compact $K$ -spectral set for $A$ ; that is, $\|\pi(A)\|\leq K\,\max_{z\in\Omega}|\pi(z)|$ for all polynomials $\pi$ (and hence $\Lambda(A)\subset\Omega$ ) with $\delta\not\in\Omega$ . Also, let $\varphi$ be a map, mapping $\mathbb{C}\setminus\Omega$ conformally onto the exterior of the closed unit disk. Then it can be shown that the right-hand side of (2.19) can be bounded above by $|1/\varphi(\delta)|^{k+2-\ell-m}<1$ times a modest constant, see, e.g., [16, Chapter 6.11.2] for the case where $\Omega$ is an ellipse.

Example 2.9.

As in Example 2.3, assume the matrix $A$ is unitary. Given a polynomial $p$ of degree $k$ , its reversed polynomial $p^{\ast}$ is defined as

[TABLE]

The orthogonal polynomials $q_{k}(z)$ can be expressed as

[TABLE]

Note that $q_{k}^{\ast}(0)$ is the leading coefficient of $q_{k}$ . As $A$ is unitary, $\|q\|_{A,\mathbf{v}_{1}}=\|q^{\ast}\|_{A,\mathbf{v}_{1}}$ . Hence, (2.20) can be rewritten as

[TABLE]

Therefore,

[TABLE]

Because of (2.4), $q_{k}^{\ast}(0)\geq 0$ . Also, $\|q_{k}^{\ast}\|_{A,\mathbf{v}_{1}}=1$ . Combining this with (2.17) and (2.14), (2.21) yields

[TABLE]

Therefore, the $k$ th normalized GMRES residual $\mathbf{w}_{k}(0)$ of a unitary matrix $A$ can be expressed as

[TABLE]

2.4 A progressive GMRES residual formula

By (2.15) we have that the $k$ th normalized GMRES residual satisfies

[TABLE]

the last equality following from (2.8). This demonstrates the existence of an updating formula to compute the residual vectors progressively. This formula can also be derived by means of the $QR$ -factorization of $\underline{H}_{k}-\delta\underline{I}_{k}$ , see Proposition 2.2 and [16, Subsection 6.5.3]. The next result shows that it is not necessary to compute such a factorization for obtaining $s_{k}(\delta)$ and $c_{k}(\delta)$ , if one is willing to compute the additional scalar product (2.25).

Proposition 2.10.

Define $\tau_{k}(\delta)=\mathbf{e}_{k}^{\ast}Q_{k}(\delta)^{\ast}(H_{k}-\delta I_{k})\mathbf{e}_{k}$ . Then

$$\tau_{k}(\delta)=(-1)^{k-1}\,\mathbf{w}_{k-1}(\delta)^{\ast}(A-\delta I_{n})\mathbf{v}_{k}\qquad(2.25)$$

and

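$$s_{k}(\delta)=\frac{h_{k+1,k}}{\sqrt{h_{k+1,k}^{2}+|\tau_{k}(\delta)|^{2}}},\qquad c_{k}(\delta)=\frac{\tau_{k}(\delta)}{\sqrt{h_{k+1,k}^{2}+|\tau_{k}(\delta)|^{2}}}.\qquad(2.26)$$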

Proof.

From Proposition 2.2 and (2.15) it follows that

[TABLE]

Therefore,

[TABLE]

establishing (2.25). We now claim that

[TABLE]

which is a direct consequence of (2.16):

[TABLE]

where $\mathcal{K}=(A-\delta I_{n})\text{span}\{\mathbf{v}_{1},\ldots,\mathbf{v}_{k}\}$ .

It remains to prove (2.26). Taking inner products with $(A-\delta I_{n})\mathbf{v}_{k}$ in all terms of (2.24) and making use of (2.25) and (2.32), results in

[TABLE]

Together with $s_{k}^{2}(\delta)+|c_{k}(\delta)|^{2}=1$ , this yields (2.26). ∎

Example 2.11.

Let us return to the particular case of $\delta=0$ and unitary $A$ as discussed in Examples 2.3 and 2.9. Inserting (2.6) and (2.23) in (2.24) and identifying the underlying polynomials gives

[TABLE]

Since $q_{k-1}^{*}(z)-q_{k-1}^{*}(0)$ is $z$ times a polynomial of degree $<k-1$ and $q_{0}(z)=1$ , we get from (2.25) that

[TABLE]

and $h_{k+1,k}=s_{k}(0)$ , where we applied (2.22) and (2.9). Notice that this simplification for unitary $A$ is in accordance with (2.26). Taking the star operation in (2.34) gives the second relation

[TABLE]

We should mention that, in case of unitary $A$ , the scalar product (2.2) can be written as a scalar product in terms of a (discrete) measure supported on the unit circle. Thus we have the whole theory of orthogonal polynomials on the unit circle, and in particular relations (2.34) and (2.35) are known as the Szegő recurrence relations, a coupled two-term recurrence for orthonormal polynomials on the unit circle [11, Formula 1.2-1.7].

3 The Arnoldi process for $BML$ -matrices

We will establish a fast variant of the Arnoldi process which is applicable to the class of $BML$ -matrices as described by formula (1.1). In §3.1 we will use the explicit representation of $(H_{k+1}-\delta I_{k+1})^{-*}$ in terms of orthogonal polynomials as stated in (2.13) to demonstrate that the underlying Hessenberg matrix has an upper-separable structure. The assumption on simple poles makes it possible to easily express generators for the upper-separable structure in polynomial language. In particular, the GMRES residual vectors, which can be expressed in terms of orthogonal polynomials by (2.15), appear in these generators. In §3.2 we will see that the orthonormal basis vectors (up to a correction term incorporating the low rank perturbation in (1.1)) can be written as a linear combination of previously computed basis vectors and GMRES residual vectors. In §3.3 the results from §3.2 will be combined with those from §2.4, which make it possible to compute the necessary GMRES residual vectors progressively, into a new Arnoldi iteration for $BML$ -matrices.

3.1 Structure formula for $BML$ -matrices

A rank structure revealing formula for $BML$ -matrices will be derived in terms of the orthogonal polynomials $q_{k}$ defined in §2.1. This formula (3.3) will be the key for the design of short recurrence relations in the following subsection. We first show that the $BML$ -structure (1.1) is inherited by the underlying Hessenberg matrix.

Proposition 3.1.

Let $A$ be an invertible matrix satisfying relation (1.1), and denote by $N\leq n$ the Arnoldi termination index. Then $H_{N}$ is also a $BML$ -matrix with the same polynomials $p,q$ and indices $\widetilde{m}_{1}=m_{1}$ , $\widetilde{m}_{2}=m_{2}$ and $\widetilde{m}_{3}\leq m_{3}$ . More precisely,

[TABLE]

where $F_{k}:=V_{k}^{\ast}F$ and $G_{k}:=V_{k}^{\ast}G$ .

Proof.

By definition of $N$ we have $AV_{N}=V_{N}H_{N}$ , i.e., the columns of $V_{N}$ span an invariant subspace of $A$ . By recurrence on the degree one shows that $\pi(A)V_{N}=V_{N}\pi(H_{N})$ , or $\pi(H_{N})=V_{N}^{\ast}\pi(A)V_{N}$ , for any polynomial $\pi$ . Moreover, if $\pi(A)$ is of full rank, then so is $\pi(H_{N})$ . In particular, since we assume $q(A)$ to be invertible, so is the matrix $q(H_{N})$ . Hence, $V_{N}q(H_{N})^{-1}=q(A)^{-1}V_{N}$ . This implies that

[TABLE]

as claimed in (3.1). ∎

A combination with Corollary 2.5 gives us the following result.

Corollary 3.2.

Let $A$ be a matrix satisfying relation (1.1). Denote by $N\leq n$ the Arnoldi termination index, and $m:=\max(0,m_{1}-m_{2}+1)$ . Then $H_{N}\in\mathbb{C}^{N\times N}$ is $(m,m_{2}+m_{3})-$ upper-separable. More precisely,

[TABLE]

where $z_{1},\ldots,z_{m_{2}}$ denote the poles of the rational function $p(z)/q(z)$ .

Proof.

The fact that $H_{N}$ is $(m,m_{2}+m_{3})-$ upper-separable is already known from [14]. Below we give an alternative, more constructive, proof leading to explicit generators for the low rank part of $H_{N}$ . By assumption, the rational function $p/q$ in (1.1) has simple poles and thus has the partial fraction decomposition

[TABLE]

for some constants $z_{j}$ , $d_{j}$ and a polynomial $\pi$ of degree $m_{1}-m_{2}$ ( $\pi=0$ if $m_{1}<m_{2}$ ). Replacing $z$ by $H_{N}$ , taking adjoints and using (3.1) and (2.13) leads to

[TABLE]

with a strictly lower triangular matrix $L_{N}$ . Finally, according to the upper Hessenberg structure of $H_{N}$ , the matrix $\pi(H_{N})^{*}$ has zero entries on and above the $m$ th diagonal, establishing the upper-separable structure and formula (3.2). ∎

Remark 3.3.

The assumption on simple poles is not necessary for $H_{N}$ to have an upper-separable structure. However, once we drop this constraint, it is not clear whether or not there exists a link between the generators of the low-rank structure and the GMRES residual vectors. We refer to [14] for more information on this topic.

Note that $H_{k}$ for $k<N$ also has an upper-separable structure as it is a leading principal submatrix of $H_{N}$ . However, $H_{k}$ does not satisfy the same matrix equation as $H_{N}$ .

3.2 Short recurrence relations for $BML$ -matrices

We will now derive a short recurrence relation for $BML$ -matrices. To do so, the vector

[TABLE]

is introduced, which is equal to $A\mathbf{v}_{k}$ up to a correction term induced by the low rank perturbation in (1.1). In [4] it is shown that for nearly Hermitian $A$ , i.e., $p(z)=z$ and $q(z)=1$ in (1.1), the Arnoldi vectors satisfy a three term recurrence relation. More specifically,

[TABLE]

where $h_{k-1,k}=\mathbf{v}_{k-1}^{\ast}\mathbf{v}_{k}^{\prime}$ , $h_{k,k}=\mathbf{v}_{k}^{\ast}\mathbf{v}_{k}^{\prime}$ and $h_{k+1,k}=\mathbf{v}_{k+1}^{\ast}\mathbf{v}_{k}^{\prime}$ are entries of the underlying Hessenberg matrix. We refer to [4] for a detailed discussion. For a general $BML$ -matrix, Proposition 3.4 states that $\mathbf{v}_{k}^{\prime}$ is a linear combination of GMRES residual vectors and Arnoldi vectors, including $\mathbf{v}_{k+1}$ . As we will discuss below, this results in a short recurrence relation which reduces to (3.5) in the specific case of nearly Hermitian matrices and which can be used to compute the Arnoldi vectors in an efficient way.

Proposition 3.4.

Let $A$ be a matrix satisfying relation (1.1), and denote by $N\leq n$ the Arnoldi termination index, and $m:=\max(0,m_{1}-m_{2}+1)$ . Then for $m<k\leq N$ , the vector $\mathbf{v}_{k}^{\prime}$ as defined by (3.4) can be written as a linear combination of $\mathbf{v}_{j}$ for $j=k-m+1,...,k+1$ and of $\mathbf{w}_{k-m-1}(z_{j})$ for $j=1,...,m_{2}$ . More precisely,

[TABLE]

for some constants $a_{j,k}$ , $h_{j,k}$ entries of the underlying Hessenberg matrix $H_{N}$ , and $m<k\leq N$ .

Proof.

Notice that by construction,

[TABLE]

lies in the Krylov space spanned by the columns of $V_{k-m}$ . As a result we have

[TABLE]

Define

[TABLE]

Then $V_{k-m}=V_{N}\widehat{I}_{k-m}$ . Combining the above yields

[TABLE]

The vector $\widehat{I}_{k-m}^{\ast}\left(H_{N}-G_{N}F_{N}^{*}\right)\mathbf{e}_{k}\in\mathbb{C}^{k-m}$ consists of the first $k-m$ components of the $k$ th column of $H_{N}-G_{N}F_{N}^{*}$ . As the entries of $L_{N}$ and $\pi(H_{N})^{\ast}$ in (3.3) are zero on and above the $m$ th diagonal, it follows that

[TABLE]

which by (2.15) can be rewritten as

[TABLE]

with $a_{j,k}:=\overline{d_{j}}\sigma_{k-m-1}(z_{j})\mathbf{e}_{1}^{\ast}(H_{N}-z_{j}I_{N})^{-\ast}\mathbf{e}_{k}$ . As a result, $\mathbf{v}^{\prime}_{k}$ can be written as $\mathbf{v}_{k}^{\prime}-\mathbf{v}_{k}^{\prime\prime}$ , a linear combination of $\mathbf{v}_{k+1},...,\mathbf{v}_{k-m+1}$ (with coefficients being entries of the underlying Hessenberg matrix), plus a linear combination of $\mathbf{w}_{k-m-1}(z_{1}),...,\mathbf{w}_{k-m-1}(z_{m_{2}})$ . ∎

Remark 3.5.

Proposition 3.4 remains true if we replace $\mathbf{w}_{k-m-1}(z_{j})$ by $\mathbf{w}_{\ell}(z_{j})$ and/or $V_{k-m}G_{k-m}$ by $V_{\ell+1}G_{\ell+1}$ for $\ell\in\{k-m-1,k-m,...,k\}$ . This is a direct consequence of the observation that both $V_{k-m}G_{k-m}F^{\ast}\mathbf{v}_{k}-V_{\ell+1}G_{\ell+1}F^{\ast}\mathbf{v}_{k}$ and $\mathbf{w}_{k-m-1}(z_{j})-\sigma_{\ell}(z_{j})/\sigma_{k-m-1}(z_{j})\mathbf{w}_{\ell}(z_{j})$ are elements of $\mbox{span}(\mathbf{v}_{k-m+1},\ldots,\mathbf{v}_{\ell+1})$ if $\ell>k-m-1$ and equal to zero if $\ell=k-m-1$ . However, choosing $\ell\neq k-m-1$ destroys the orthogonality between $\mathbf{v}_{k}^{\prime}-\mathbf{v}_{k}^{\prime\prime}$ and $\mathbf{v}_{k}^{\prime\prime}$ and breaks the link with the entries of the underlying Hessenberg matrix in (3.6).

Next, let us show how the recurrence relation of Proposition 3.4 reduces to the well-known Szegő recurrence if the matrix under consideration is unitary.

Example 3.6.

Let us return to the particular case of $\delta=0$ and unitary $A$ as discussed in Examples 2.3, 2.9, and 2.11. Inserting (2.6) and (2.23) in the second Szegő relation (2.35) leads to

[TABLE]

which is exactly the variant with $\ell=k,m_{2}=m_{3}=1,m_{1}=m=0$ of Proposition 3.4 discussed in Remark 3.5.

Remark 3.7.

In [5] a class of matrices, called $(H,m)$ -well-free matrices is investigated, and it is established that these matrices satisfy the recurrence relation (1.3) with $m_{1}=m_{2},m_{3}=0$ . Briefly described, these matrices form a subset of upper-separable Hessenberg matrices which satisfy an additional constraint preventing a breakdown in the recurrence relation (1.3). Intuitively, this additional “well-free” constraint signifies that there are no rank deficiencies encountered in the low rank part of the upper-separable Hessenberg matrix. The problem of breakdown is discussed in [2] and [14], where it is overcome by making use of a set of several multiple recurrence relations instead of the single recurrence relation (1.3), and the use of an algorithm based on (1.3) which provides a set of column vectors to generate the low rank structure of the underlying Hessenberg matrix, respectively. We will, however, not need any well-free constraint to prevent a breakdown in the recurrence relation stated in Proposition 3.4, as our approach does not impose any limitations on the matrix structure beyond (1.1).

3.3 The algorithm

Algorithm 1, which we will name Fast Arnoldi throughout the text, describes a fast variant of the Arnoldi algorithm for $BML$ -matrices based on Proposition 3.4. We will give a short description of each of the components of the algorithm and print the corresponding piece of pseudocode.

The idea is to make alternate use of the recurrence relations

[TABLE]

The recurrence relation (3.7) is used to compute the next orthonormal basis vector of the Krylov subspace as a linear combination of previously computed orthonormal vectors as well as GMRES residual vectors, while the recurrence relation (3.8) is used to update the GMRES residual vectors once a new orthonormal vector is retrieved. As (3.7) is only valid for $k>m$ , the first $m$ orthonormal vectors are computed by means of the classical Arnoldi iteration.

Each time the relation (3.7) is employed, the vector $\mathbf{v}_{k}^{\prime}$ is formed, causing products between vectors and matrices to be computed. The total complexity to compute $\mathbf{v}_{k}^{\prime}$ is $\mathcal{O}(m_{3}n)+\mathcal{O}(n^{2})$ .

$\hat{F}_{k}:=\mathbf{v}_{k}^{\ast}F$ ; $\hat{G}_{k-m}:=\mathbf{v}_{k-m}^{\ast}G$ ;

$\widetilde{G}:=\widetilde{G}+\mathbf{v}_{k-m}\hat{G}_{k-m}$ ;

$\mathbf{v}^{\prime}:=A\mathbf{v}_{k}-\widetilde{G}\hat{F}^{\ast}_{k}$ ;

The coefficients $a_{j,k}$ and $h_{j,k}$ in (3.7) are the solution of the least squares problem

[TABLE]

Note that $\mathbf{v}_{i}\perp\text{span}\{\mathbf{w}_{k-m-1}(z_{1}),\ldots,\mathbf{w}_{k-m-1}(z_{m_{2}})\}$ for all $k-m+1\leq i\leq k+1$ , which allows the above least squares problem to be solved without knowing $\mathbf{v}_{k+1}$ in advance. To shorten notation in subsequent discussions, we define $M_{k}:=\left[\mathbf{w}_{k-m-1}(z_{1}),\ldots,\mathbf{w}_{k-m-1}(z_{m_{2}})\right]$ .

The coefficients $h_{j,k}$ are entries of the $k$ th column of the corresponding Hessenberg matrix and are computed as $\mathbf{v}_{j}^{\ast}\mathbf{v}_{k}^{\prime}$ , which has a computational complexity of $\mathcal{O}(mn)$ .

for $j=k,k-1,\ldots,k-m+1$ do

$h_{j,k}:=\mathbf{v}_{j}^{\ast}\mathbf{v}^{\prime}$ ; $\mathbf{v}^{\prime}:=\mathbf{v}^{\prime}-h_{j,k}\mathbf{v}_{j}$ ;

end for

Next, a $QR$ -decomposition of the matrix $M_{k}$ is computed, after which the coefficients $a_{1,k},\ldots,a_{m_{2},k}$ are retrieved by back-substitution. The complexity of this operation is $\mathcal{O}\left(m_{2}^{2}(n+1)\right)$ .

$Q:=[\mathbf{q}_{1},\ldots,\mathbf{q}_{m_{2}}],R:=(r_{i,j})\in\mathbb{C}^{m_{2}\times m_{2}}$ ,

such that $QR=M_{k}$ ;

for $j=m_{2},m_{2}-1,\ldots,1$ do

$a_{j,k}:=\left(\mathbf{q}_{j}^{\ast}\mathbf{v}^{\prime}-\sum_{\ell=j+1}^{m_{2}}r_{j,\ell}a_{\ell,k}\right)/r_{j,j}$ ;

$\mathbf{v}^{\prime}=\mathbf{v}^{\prime}-a_{j,k}\mathbf{w}_{k-m-1}(z_{j})$ ;

end for

$h_{k+1,k}=||\mathbf{v}^{\prime}||$ ; $\mathbf{v}_{k+1}:=\mathbf{v}^{\prime}/h_{k+1,k}$ ;
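The least squares step can be illustrated by the following minimal sketch, which solves $\min_{a}\|\mathbf{v}^{\prime}-M_{k}a\|$ through a thin $QR$ factorization and back-substitution. It is written in plain Python with lists to stay dependency-free; the use of modified Gram-Schmidt for the factorization and all function names are assumptions of this sketch, not part of Algorithm 1. Note that the right-hand side $Q^{\ast}\mathbf{v}^{\prime}$ is formed once, before the back-substitution starts.

```python
def inner(x, y):
    """Euclidean inner product x^* y (conjugate-linear in x)."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def thin_qr(columns):
    """Modified Gram-Schmidt QR of a tall matrix given as a list of columns."""
    size = len(columns)
    q_cols = []
    r = [[0j] * size for _ in range(size)]
    for j, col in enumerate(columns):
        w = list(col)
        for i, q in enumerate(q_cols):
            r[i][j] = inner(q, w)
            w = [wi - r[i][j] * qi for wi, qi in zip(w, q)]
        r[j][j] = abs(inner(w, w)) ** 0.5
        q_cols.append([wi / r[j][j] for wi in w])
    return q_cols, r

def residual_coefficients(m_cols, v_prime):
    """Solve min_a ||v' - M a||_2 via R a = Q^* v' (back-substitution)."""
    q_cols, r = thin_qr(m_cols)
    rhs = [inner(q, v_prime) for q in q_cols]
    size = len(m_cols)
    a = [0j] * size
    for j in range(size - 1, -1, -1):
        tail = sum(r[j][l] * a[l] for l in range(j + 1, size))
        a[j] = (rhs[j] - tail) / r[j][j]
    return a

# Hypothetical data: two columns in C^4 and a vector in their span.
M = [[1.0, 0.0, 1.0, 0.0], [1.0, 1.0, 0.0, 2.0]]
a_true = [2.0 - 1.0j, 0.5j]
v = [sum(a_true[c] * M[c][i] for c in range(2)) for i in range(4)]
a = residual_coefficients(M, v)
```

Since `v` lies in the span of the two columns, the computed coefficients reproduce `a_true` up to rounding. If $M_{k}$ is ill-conditioned, a rank-revealing factorization would be preferable to the plain Gram-Schmidt variant used here.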

Then for each $z_{i}$ , $1\leq i\leq m_{2}$ , recurrence relation (3.8) is used to update the GMRES residual vectors, which are all equal to the starting vector $\mathbf{v}_{1}$ at the beginning of the iteration ( $k=m+1$ ). Each time the relation (3.8) is employed, a matrix vector product needs to be computed. This leads to a total complexity of $\mathcal{O}(m_{2}n^{2})$ .

for $j=1,\ldots,m_{2}$ do

$\tau_{k-m}(z_{j})=(-1)^{k-m-1}\mathbf{w}_{k-m-1}(z_{j})^{\ast}(A-z_{j}I_{n})\mathbf{v}_{k-m}$ ;

$s_{k-m}(z_{j})=h_{k-m+1,k-m}/\sqrt{h_{k-m+1,k-m}^{2}+|\tau_{k-m}(z_{j})|^{2}}$ ;

$c_{k-m}(z_{j})=\tau_{k-m}(z_{j})/\sqrt{h_{k-m+1,k-m}^{2}+|\tau_{k-m}(z_{j})|^{2}}$ ;

$\mathbf{w}_{k-m}(z_{j})=s_{k-m}(z_{j})\mathbf{w}_{k-m-1}(z_{j})+(-1)^{k-m}\overline{c_{k-m}(z_{j})}\mathbf{v}_{k-m+1}$ ;

end for
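The residual update above can be checked numerically on a small example. The sketch below runs classical Arnoldi on a random matrix, applies the update for a single shift, and accumulates the product of the $s_{k}$ as the relative residual norm. The matrix, the shift, and all function names are hypothetical choices for illustration; plain Python lists are used so that the block has no dependencies.

```python
import math
import random

def inner(x, y):
    """Euclidean inner product x^* y."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def matvec(mat, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in mat]

def arnoldi(mat, b, steps):
    """Classical Arnoldi (modified Gram-Schmidt); 0-based basis and H entries."""
    beta = math.sqrt(inner(b, b).real)
    basis, hess = [[bi / beta for bi in b]], {}
    for k in range(steps):
        w = matvec(mat, basis[k])
        for j in range(k + 1):
            hess[j, k] = inner(basis[j], w)
            w = [wi - hess[j, k] * vi for wi, vi in zip(w, basis[j])]
        hess[k + 1, k] = math.sqrt(inner(w, w).real)
        basis.append([wi / hess[k + 1, k] for wi in w])
    return basis, hess

def progressive_residuals(mat, basis, hess, delta, steps):
    """Normalized GMRES residuals w_0, ..., w_steps for (A - delta I) x = b, x_0 = 0."""
    w = list(basis[0])                      # w_0 = v_1
    out, rel = [w], [1.0]
    for k in range(1, steps + 1):           # k follows the 1-based numbering of the text
        v_k, v_next = basis[k - 1], basis[k]
        shifted = [a - delta * v for a, v in zip(matvec(mat, v_k), v_k)]
        tau = (-1) ** (k - 1) * inner(w, shifted)
        h = hess[k, k - 1]                  # h_{k+1,k} in 1-based numbering
        norm = math.sqrt(h * h + abs(tau) ** 2)
        s, c = h / norm, tau / norm
        w = [s * wi + (-1) ** k * c.conjugate() * vi for wi, vi in zip(w, v_next)]
        out.append(w)
        rel.append(rel[-1] * s)             # relative residual norm, product of the s_j
    return out, rel

random.seed(0)
n, steps, delta = 8, 5, 3.0 + 1.0j
A = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)] for _ in range(n)]
b = [complex(random.gauss(0, 1), 0.0) for _ in range(n)]
V, H = arnoldi(A, b, steps)
W, rel = progressive_residuals(A, V, H, delta, steps)
```

Each `W[k]` has unit norm and is orthogonal to $(A-\delta I_{n})\mathbf{v}_{j}$ for $j\leq k$ , which characterizes the direction of the $k$ th GMRES residual within the Krylov space.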

Note that it is numerically more stable not to divide by the square root $\sqrt{h_{k-m+1,k-m}^{2}+|\tau_{k-m}(z_{j})|^{2}}$ in the computation of $s_{k-m}(z_{j})$ and $c_{k-m}(z_{j})$ , but instead to normalize $\mathbf{w}_{k-m}(z_{j})$ after each iteration (this, however, leads to $m_{2}$ additional scalar products). If we assume that the matrix under consideration is sparse, so that a matrix vector product can be computed in $\mathcal{O}(n)$ operations, the total complexity to compute the first $k$ orthonormal Arnoldi vectors can be estimated as $\mathcal{O}(kn)$ .

Remark 3.8.

If the rational function in (1.1) has only one pole, i.e., $m_{2}=1$ in (3.6), then the order in which the coefficients are determined can be reversed. More precisely, we can first compute $a_{1,k}$ as $\mathbf{w}_{k-m-1}(z_{1})^{\ast}\mathbf{v}_{k}^{\prime}$ and then orthonormalize the resulting difference $\mathbf{v}_{k}^{\prime}-a_{1,k}\mathbf{w}_{k-m-1}(z_{1})$ against $\mathbf{v}_{k-m+1},\ldots,\mathbf{v}_{k}$ to obtain $\mathbf{v}_{k+1}$ . This may influence the numerical performance.

4 Connection with the Barth-Manteuffel multiple recurrence relation

The aim of this section is to show how our work is related to that of Barth & Manteuffel in their article on ‘Multiple recursion conjugate gradient algorithms’ [2]. They introduce an economical conjugate gradient algorithm for the class of $BML$ -matrices, by making use of short recurrence relations. We will give a short summary of their findings and discuss both the differences and similarities with our approach.

To prevent the possibility of a breakdown in their so-called ‘single recurrence relation’ Barth & Manteuffel rewrote it as a set of recurrence relations, that are stated in (4.1)-(4.3).

[TABLE]

Unfortunately, they use different letters, shift indices, and construct an orthogonal, but not orthonormal basis of the Krylov space. The new normalization comes from the fact that they consider a Hessenberg matrix which has ones on its first subdiagonal. Their basis vectors $\mathbf{\underline{p}}_{0},\mathbf{\underline{p}}_{1},\ldots,\mathbf{\underline{p}}_{N-1}\in\mathbb{C}^{n}$ satisfy $\mathbf{\underline{p}}_{j}/\|\mathbf{\underline{p}}_{j}\|=\mathbf{v}_{j+1}$ . Moreover, they use the integers $(\ell,m,\kappa,\theta)$ instead of $(m_{1},m_{2},m_{3},m_{3}-1+m)$ . Also, as seen in (4.1)-(4.3), two other families of vectors with double indices are used. For simplicity and consistency we will abbreviate them as

[TABLE]

As the reader will see below, to compare our approach with [2], we will not explicitly make use of (4.1)-(4.3), but instead make use of a mathematical equivalent of (4.1)-(4.3) which is adapted to our notation and scalings. The original pseudocode used by Barth & Manteuffel is stated in Algorithm 2.

In [2, Eqn. (4.16)] the authors provide an explicit formula for the entries of the upper Hessenberg matrix $H$ :

[TABLE]

for all $j=1,2,...,k-m$ , in which the reader recognizes generators $\underline{\rho}_{i},\underline{\eta}_{i},\underline{\mu}_{i}$ and $\underline{\tau}_{i}$ for the low-rank part of $H$ . Note that, in contrast to [2], we start numbering with $i=1$ instead of $i=0$ . To be able to use the recurrence relations (4.1)-(4.3) in practice, the generators $\underline{\rho}_{i},\underline{\eta}_{i},\underline{\mu}_{i}$ and $\underline{\tau}_{i}$ need to be known in advance. Therefore, Algorithm 2 is based on a rewritten form of the recurrence relations (4.1)-(4.3) which makes it possible to compute the orthogonal basis vectors $\mathbf{p}_{i}$ and the generators $\underline{\rho}_{i},\underline{\eta}_{i},\underline{\mu}_{i}$ and $\underline{\tau}_{i}$ simultaneously.

We define $W_{k}$ and $\widehat{W}_{k}$ [2, Eqn. (4.22) and Eqn. (4.23)] as

[TABLE]

for $k=0,\ldots,n-1$ , which is the equivalent of (4.2)-(4.3). With a similar reasoning as in [2, Eqn. (5.3)] it can be proven that

[TABLE]

which is mathematically equivalent to (4.1).

Suppose now that the generators as defined in (4.4) are known. Then one can use (4.6) to compute $\mathbf{v}_{k+1}$ out of $\mathbf{v}_{k-m+1},...,\mathbf{v}_{k},W_{k-m-1}$ and $\widehat{W}_{k-m-1}$ , then use (4.5) to compute $W_{k-m},\widehat{W}_{k-m}$ out of $\mathbf{v}_{k-m+1},W_{k-m-1}$ and $\widehat{W}_{k-m-1}$ , and so on. Hence, it remains to find the generators. Two of them can be computed with an explicit formula [2, Eqn. (4.15) and (4.13)], namely

[TABLE]

The vector $\underline{\rho}_{j}$ is obtained [2, Eqn. (4.12)] as the ‘remainder’ in the polynomial division of $q_{j-1}$ (2.6) by the denominator $q$ (1.1):

[TABLE]

where we observe that $\rho_{1},...,\rho_{m_{2}}$ form the canonical basis of $\mathbb{C}^{m_{2}}$ , and thus $V_{m_{2}}^{*}W_{k}=I$ . We may rewrite the remainder in terms of the Lagrange polynomials $\ell_{h}$ of the roots $z_{1},...,z_{m_{2}}$ of $q$ , leading to

[TABLE]

Substituting the expression for $\underline{\rho}_{j}$ into (4.5) allows us to conclude that

[TABLE]

Recalling that $V_{m_{2}}^{*}W_{k}=I$ gives

[TABLE]

which makes a partial link between (4.6) and Proposition 3.4. In particular, if the matrix $A$ is unitary and no low-rank perturbation is involved, $W_{k}$ is a multiple of $\mathbf{w}_{k}(0)$ and the Barth-Manteuffel multiple recurrence relation turns out to be equivalent to the Szegő recurrence relations. The quantities $\underline{\rho}_{k+1}$ are computed recursively [2, Eqn. (5.11)] by computing all entries of $\underline{H}_{k}$ , and by taking remainders after division by $q$ in the relation $zq_{k-1}(z)=\sum_{j=1}^{k+1}H_{j,k}q_{j-1}(z)$ , the polynomial translation of the Arnoldi relation $AV_{k}=V_{k+1}\underline{H}_{k}$ . We refer to lines $6,8,12,25,27$ and $36$ of Algorithm 2.

It remains to show how to compute $\underline{\tau}_{k-m+1}$ (after having computed $\mathbf{v}_{k+1},\rho_{k-m+1},W_{k-m}$ ) and relate the term $\widehat{W}_{k-m-1}\underline{\mu}_{k}$ in (4.6) to the term $V_{k-m}G_{k-m}F^{*}\mathbf{v}_{k}=V_{k-m}V_{k-m}^{*}G\underline{\mu}_{k}$ of our short recurrence of Proposition 3.4. In fact, at this place the authors of [2] require an additional delay in the recurrence (4.6) by replacing $m$ by $m^{\prime}:=m+m_{3}\geq m$ , which is possible according to (4.4). According to [2, Eqn. (5.2)] and (4.4), the computation of $\underline{\tau}_{k-m^{\prime}+1}$ is done by solving the system

[TABLE]

We refer to line $16$ of Algorithm 2. However, there is a possible problem with this system which is not discussed in [2]. As noticed after [2, Eqn. (5.2)], it is consistent, but one cannot ensure that the matrix is invertible, i.e., we might have several solutions, each of them being a generator suitable for $H_{k}$ , but not necessarily for $H_{N}$ . This is a general problem with computing generators for $H_{N}$ in a recursive manner: there is no guarantee that $m_{3}=\mbox{rank}(F^{*})$ is equal to $\mbox{rank}(F^{*}V_{N})$ , and thus that the minimal number of generators for $H_{N}$ is equal to $m_{2}+m_{3}$ . In addition, at stage $k$ of a recursive computation, it might happen that the minimal number of generators for $H_{k}$ is strictly lower, i.e., $\mbox{rank}(F^{*}V_{k})<\mbox{rank}(F^{*}V_{N})$ , so that $m_{3}$ should depend on $k$ . Finally, the matrix of coefficients is obtained by just picking the last $m_{3}$ columns of $F^{*}V_{k}$ , which might also lower the rank. However, going through the proof of (4.4) we can derive an explicit formula for $\underline{\tau}_{j}$ . From [2, Eqn. (4.14)] it can be deduced that

[TABLE]

Combining (4.12) and (4.8), we obtain

[TABLE]

Note that (4.13) could be used to compute $\underline{\tau}_{k-m+1}$ without introducing the additional delay in (4.6). Inserting (4.13) into (4.5) gives the explicit formula

[TABLE]

In the following remark we intend to compare our approach with that of Barth & Manteuffel [2].

Remark 4.1.

(a)

Both approaches heavily use the fact that a certain upper right part of the Hessenberg matrix is of rank at most $m_{2}+m_{3}$ . In other words, one is able to express $A\mathbf{v}_{k}$ as a linear combination of $m_{2}+m_{3}$ correction vectors plus a linear combination of the last $m$ or $m^{\prime}=m+m_{3}$ columns of $V_{k+1}$ , a kind of corrected “short” recurrence. Notice however that our recurrence is “shorter” if $m_{3}>0$ .

(b)

In our approach, $m_{3}$ correction vectors are explicitly given (the term $\mathbf{v}_{k}^{\prime}-A\mathbf{v}_{k}$ ) and can be updated explicitly. They are not necessarily linearly independent. The other $m_{2}$ correction vectors are identified as GMRES residuals for shifted systems, allowing for easy updating.

In contrast, Barth & Manteuffel compute explicitly the four sequences of generators of the low-rank structure of $H_{N}$ , given in (4.4). Notice, however, that at the $k$ th step of the algorithm one can only deduce generators for $H_{k}$ and not for $H_{N}$ . As mentioned above, in order not to be obliged to correct generators found earlier, there should be an additional assumption on $\mbox{rank}(F^{*}V_{k})$ not mentioned by the authors. However, there is a variant of the Barth & Manteuffel approach: instead of using (4.11) requiring a delay in the recurrence relation (4.6), one can use (4.13), which is not mentioned in [2], to compute $\underline{\tau}_{k-m+1}$ just after having computed $\underline{\rho}_{k-m+1}$ .

(c)

In our approach, for finding the coefficients of the GMRES correction vectors, we suggest solving a least squares problem, the matrix of coefficients $M_{k}$ having as columns these $m_{2}$ (normalized) GMRES correction vectors. Notice that $M_{k}$ has full column rank (since $V_{m_{2}}^{*}M_{k}$ has), but might be ill-conditioned. Thus standard techniques (SVD dropping small singular values, or $QR$ decomposition with column pivoting and threshold) can be applied, where the residual error in solving this least squares problem leads to a loss of orthogonality for $\mathbf{v}_{k+1}$ of the same order.

In contrast, Barth & Manteuffel suggest computing one of the missing generators by solving system (4.11). The computation of the other generator $\underline{\rho}_{j+1}$ is quite involved and requires the knowledge of the whole $j$ th column of the Hessenberg matrix $\underline{H}_{k}$ (which is not necessarily computed using our approach).

(d)

If GMRES is converging fast, we believe that the normalization (4.10) is not appropriate since

[TABLE]

Also the $\underline{\eta}_{k}$ are very small due to the above-mentioned decay property of the entries of our Hessenberg matrix.

5 Some special matrices

In this section we give special attention to some classes of matrices where we can slightly reduce the computational complexity of Algorithm 1. The first class consists of matrices $A$ which satisfy an equation of the form

[TABLE]

for some $\alpha,\beta\in\mathbb{C}$ . This includes the class of normal matrices of which all but $m_{3}$ eigenvalues are collinear. We will address this kind of matrices as nearly Hermitian matrices. If $\alpha=1,\beta=0$ and $A$ is real, this corresponds to the class of nearly symmetric matrices as discussed in [4]. However, one can easily check that all results derived in [4] are also valid for a matrix of the form (5.1).

The second class consists of matrices $A$ which satisfy an equation of the form

[TABLE]

for some $\alpha,\beta,\delta\in\mathbb{C}$ . This includes the class of normal matrices of which all but $m_{3}$ eigenvalues are concyclic. If $\delta=0$ , we speak of nearly unitary matrices, if $\delta\not=0$ we speak of nearly shifted unitary matrices.

The class of matrices satisfying (5.1) or (5.2) includes all examples of $BML$ -matrices known to us which are of practical interest. By this we mean matrices that are suitably large with respect to the quantities $m_{1},m_{2}$ and $m_{3}$ . More information on matrices satisfying equation (1.1) can be found in [13].

5.1 Nearly Hermitian matrices

Assume the matrix $A$ is nearly Hermitian. Then (3.6) reduces to

[TABLE]

Define the vectors $\mathbf{p}_{k}^{\ast}\in\mathbb{C}^{m_{3}}$ recursively as

[TABLE]

for $k\geq 2$ and $\mathbf{p}_{1}^{\ast}=G_{1}$ . By recurrence on $k$ it follows that $\mathbf{p}_{k}^{\ast}=\mathbf{e}_{k}^{\ast}Q_{k}(0)^{\ast}G_{k}$ . Therefore, in combination with (2.31),

[TABLE]

Then (2.25) yields

[TABLE]

the latter equality because of (5.3) and (5.5). Expression (5.6) can now be used to compute $\tau_{k}(0)$ instead of (2.25), reducing the computational complexity. (Expression (5.6) was also proved, by a different argument, by Beckermann and Reichel [4, Proposition 4.2].)

5.2 Nearly unitary matrices

Assume the matrix $A$ is nearly unitary. Then (3.6) reduces to

[TABLE]

where $a_{1,k}=\mathbf{w}_{k-1}(0)^{\ast}\mathbf{v}_{k}^{\prime}$ and $h_{k+1,k}$ such that $\mathbf{v}_{k+1}$ is of unit length. Again we make use of the vector $\mathbf{p}_{k}^{\ast}$ as defined in (5.4). Then because of (5.5), (2.25) yields

[TABLE]

Expression (5.8) can now be used to compute $\tau_{k}(0)$ instead of (2.25).
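To make the nearly unitary recurrence concrete, the following sketch treats the simplest instance: an exactly unitary matrix, so that the low rank term vanishes and $\mathbf{v}_{k}^{\prime}=A\mathbf{v}_{k}$ , $m=0$ , and the only pole is $z_{1}=0$ . Under this assumption $\tau_{k}(0)=(-1)^{k-1}a_{1,k}$ follows from (2.25) directly, so no extra scalar product is needed. The diagonal test matrix with equispaced eigenvalues on the unit circle and the function names are hypothetical choices for illustration.

```python
import cmath
import math
import random

def inner(x, y):
    """Euclidean inner product x^* y."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def fast_arnoldi_unitary(eigs, b, steps):
    """Short-recurrence Arnoldi for the unitary matrix A = diag(eigs), |eigs[i]| = 1.

    Assumes F = G = 0 in (1.1); each step needs one product with A and the two
    vectors v_k and w_{k-1}(0) only."""
    beta = math.sqrt(inner(b, b).real)
    v = [bi / beta for bi in b]
    w = list(v)                                            # w_0(0) = v_1
    basis = [v]
    for k in range(1, steps + 1):
        v_prime = [lam * vi for lam, vi in zip(eigs, v)]   # v'_k = A v_k
        a = inner(w, v_prime)                              # a_{1,k} = w_{k-1}(0)^* v'_k
        v_prime = [x - a * wi for x, wi in zip(v_prime, w)]
        h = math.sqrt(inner(v_prime, v_prime).real)        # h_{k+1,k}
        v = [x / h for x in v_prime]                       # v_{k+1}
        tau = (-1) ** (k - 1) * a                          # tau_k(0) without extra work
        norm = math.sqrt(h * h + abs(tau) ** 2)
        s, c = h / norm, tau / norm
        w = [s * wi + (-1) ** k * c.conjugate() * vi for wi, vi in zip(w, v)]
        basis.append(v)
    return basis

random.seed(1)
n, steps = 16, 8
eigs = [cmath.exp(2j * math.pi * j / n) for j in range(n)]
b = [random.gauss(0.0, 1.0) for _ in range(n)]
V = fast_arnoldi_unitary(eigs, b, steps)
```

The returned vectors are orthonormal up to rounding although each new vector is orthogonalized against a single vector only. For a genuinely nearly unitary matrix the correction term of (3.4) would have to be subtracted when forming `v_prime`, and $\tau_{k}(0)$ would require the formula of this subsection.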

5.3 Nearly shifted unitary matrices

Assume the matrix $A$ is nearly shifted unitary. Then (3.6) reduces to

[TABLE]

where $a_{1,k}=\mathbf{w}_{k-2}(\delta)^{\ast}\mathbf{v}_{k}^{\prime}$ and $h_{k,k}$ , $h_{k+1,k}$ are entries of the corresponding Hessenberg matrix. From (2.24) we deduce that

[TABLE]

Hence, due to (2.24) and (5.9),

[TABLE]

Finally, we know that

[TABLE]

with $\mathbf{p}_{k}^{\ast}(\delta)\in\mathbb{C}^{m_{3}}$ recursively defined as

[TABLE]

for $k\geq 2$ and $\mathbf{p}_{1}^{\ast}=G_{1}$ . Making use of (5.10), (5.11) and (5.12), (2.25) yields

[TABLE]

As before, the above expression can now be used to compute $\tau_{k-1}(\delta)$ instead of (2.25).

6 Numerical examples

In this section we will compare our fast Arnoldi algorithm with the one of Barth-Manteuffel and classical Arnoldi. We focus especially on the orthogonality of the obtained Arnoldi vectors. The orthogonality in the forthcoming figures is measured by a method described originally by Paige [6, 15]. Given $V_{k}^{*}V_{k}-I=U_{k}+U_{k}^{*}$ with $U_{k}$ strictly upper triangular, we define $S_{k}=(I+U_{k})^{-1}U_{k}$ . The norm of $S_{k}$ is used as an orthogonality measure for the columns of $V_{k}$ , i.e., $\|S_{k}\|\in[0,1]$ where $\|S_{k}\|=0$ when they are orthonormal and $\|S_{k}\|=1$ when they are linearly dependent [15]. To make the fairest possible comparison with the Barth-Manteuffel algorithm, we implemented their pseudocode as stated in [2] and recalled in Algorithm 2. However, their pseudocode returns an orthogonal basis, while recurrence relation (3.6) returns an orthonormal basis. Therefore, we have normalized these vectors first.
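The orthogonality measure can be sketched in a few lines. The code below forms the strict upper triangle $U_{k}$ of $V_{k}^{*}V_{k}$ , solves $(I+U_{k})S_{k}=U_{k}$ by substitution, and estimates the spectral norm of $S_{k}$ by power iteration on $S_{k}^{*}S_{k}$ . The power iteration, its fixed start vector and the function names are assumptions of this sketch (the estimate is a lower bound that can be inaccurate for an unlucky start vector); a library routine for the matrix 2-norm would normally be used instead.

```python
import math

def inner(x, y):
    """Euclidean inner product x^* y."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def orthogonality_measure(basis, sweeps=200):
    """Estimate ||S||_2 for S = (I + U)^{-1} U, with U the strict upper
    triangle of V^* V and V = [basis[0], ..., basis[k-1]] (unit columns)."""
    k = len(basis)
    upper = [[inner(basis[i], basis[j]) if i < j else 0.0 for j in range(k)]
             for i in range(k)]
    # Solve (I + U) S = U row by row from the bottom (unit upper triangular).
    s = [row[:] for row in upper]
    for i in range(k - 1, -1, -1):
        for l in range(i + 1, k):
            s[i] = [sij - upper[i][l] * slj for sij, slj in zip(s[i], s[l])]
    # Power iteration on S^* S; the start vector is an arbitrary choice.
    x = [1.0 / (j + 1) for j in range(k)]
    sigma = 0.0
    for _ in range(sweeps):
        scale = math.sqrt(sum(abs(t) ** 2 for t in x))
        x = [t / scale for t in x]
        y = [sum(s[i][j] * x[j] for j in range(k)) for i in range(k)]
        z = [sum(s[i][j].conjugate() * y[i] for i in range(k)) for j in range(k)]
        lam = math.sqrt(sum(abs(t) ** 2 for t in z))   # approximates sigma_max^2
        if lam == 0.0:
            return 0.0
        sigma, x = math.sqrt(lam), z
    return sigma

orthonormal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
dependent = [[1.0, 0.0], [1.0, 0.0]]
print(orthogonality_measure(orthonormal))  # 0.0
print(orthogonality_measure(dependent))    # 1.0
```

The two extreme cases reproduce the end points of the interval $[0,1]$ mentioned above.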

We will start in §6.1 and §6.2 by discussing the special case of nearly unitary $A$ (with $\delta=0$ ) and nearly shifted unitary $A$ (with shift $\delta\neq 0$ ), where we replaced in our BML-Arnoldi algorithm formula (2.25) for the computation of $\tau_{k}(\delta)$ by the less expensive formulas described in §5.2 and §5.3, respectively. Subsequently we report in §6.3 on an example of a nearly Hermitian matrix discussed already in [6].

Quite often there is some correlation between loss of orthogonality between Arnoldi vectors and convergence of the GMRES residual $r_{k}(\delta)$ of the shifted system $(A-\delta I)x=b$ with starting vector $x_{0}=0$ . This phenomenon is probably related to the decay properties mentioned in Remark 2.7. We therefore draw in each of the figures below the relative GMRES residual

[TABLE]

the last identity following from (2.8). Notice that the quantities $s_{j}(\delta)$ are already computed in the BML-Arnoldi algorithm in the case $m_{3}=1$ of §6.1 and §6.2, whereas in §6.3 we have to add the computation of $s_{j}(\delta)$ , here for $\delta=0$ , following the formulas given in §5.1. One may understand (6.1) as the recursive computation of the GMRES residuals following some progressive residual scheme, where the underlying least squares problem is solved by successive Givens rotations. We will refer to this residual in the forthcoming figures as the progressive residual. However, due to loss of orthogonality, these progressive residuals might be computed inaccurately. This is why each time we also display the “exact” relative GMRES residual, obtained by computing the $k$ th iterate of GMRES for the shifted system $(A-\delta I)x=b$ with starting vector $x_{0}=0$ via the black box routine of Matlab (which does not use our Arnoldi vectors but recomputes them via full Arnoldi, and solves the least squares problem via Householder transforms). In all our numerical experiments, whenever both Arnoldi and fast Arnoldi behave well, the progressive residual and the GMRES residual exhibit the same convergence history.
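The product form of the relative residual can also be seen directly from the update of the normalized residuals in Algorithm 1. The following short derivation is a sketch that uses only that update, the orthonormality of the Arnoldi vectors, $\mathbf{w}_{0}(\delta)=\mathbf{v}_{1}$ , and the fact that the GMRES residual is orthogonal to $(A-\delta I_{n})\,\text{span}\{\mathbf{v}_{1},\ldots,\mathbf{v}_{k}\}$ .

$$
\begin{aligned}
\|\mathbf{r}_{k}(\delta)\| &= \mathbf{w}_{k}(\delta)^{\ast}\mathbf{r}_{k}(\delta)
= \mathbf{w}_{k}(\delta)^{\ast}\mathbf{b}
= \|\mathbf{b}\|\,\mathbf{w}_{k}(\delta)^{\ast}\mathbf{v}_{1},\\
\mathbf{w}_{k}(\delta)^{\ast}\mathbf{v}_{1}
&= s_{k}(\delta)\,\mathbf{w}_{k-1}(\delta)^{\ast}\mathbf{v}_{1}+(-1)^{k}c_{k}(\delta)\,\mathbf{v}_{k+1}^{\ast}\mathbf{v}_{1}
= s_{k}(\delta)\,\mathbf{w}_{k-1}(\delta)^{\ast}\mathbf{v}_{1},\\
\frac{\|\mathbf{r}_{k}(\delta)\|}{\|\mathbf{r}_{0}(\delta)\|}
&= \prod_{j=1}^{k}s_{j}(\delta).
\end{aligned}
$$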

All computations were carried out in Matlab R2015a. As a starting vector for the Krylov subspace we always consider a vector $b$ that has normally distributed random entries with mean zero and variance one.

6.1 Perturbed diagonal and unitary matrices

We consider $200\times 200$ diagonal matrices for which all but $m_{3}$ eigenvalues lie on a circle. Clearly, such matrices satisfy equation (1.1) with $m_{1}=m_{2}=1$ . We considered various cases; for each case we show the eigenvalues of the matrix and a comparison of the orthogonality of the computed Arnoldi vectors for classical Arnoldi, Barth-Manteuffel, and the fast Arnoldi method. The legend is plotted in Figure 1 and is identical for all similar graphs in this section.
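The structure of these test matrices can be checked numerically. For an eigenvalue $\lambda$ on a circle with center $c$ and radius $\rho$ one has $\bar{\lambda}=\bar{c}+\rho^{2}/(\lambda-c)$, so $A^{*}$ differs from the rational function $\bar{c}I+\rho^{2}(A-cI)^{-1}$ only in the entries belonging to off-circle eigenvalues. The following sketch verifies that this difference has rank $m_{3}$. The helper name, the particular off-circle eigenvalues and the shift are hypothetical.

```python
import numpy as np

def circle_defect_rank(eigs, center, radius, tol=1e-10):
    """Numerical rank of A^* - (conj(c) I + rho^2 (A - c I)^{-1}) for A = diag(eigs)."""
    n = len(eigs)
    A = np.diag(eigs)
    I = np.eye(n)
    rational = np.conj(center) * I + radius**2 * np.linalg.inv(A - center * I)
    defect = A.conj().T - rational
    return np.linalg.matrix_rank(defect, tol=tol)

rng = np.random.default_rng(0)
n, m3 = 200, 2
center, radius = 0.0, 1.0

# Eigenvalues on three quarters of the unit circle ...
theta = rng.uniform(0.0, 1.5 * np.pi, n)
eigs = center + radius * np.exp(1j * theta)
# ... except for m3 hypothetical eigenvalues placed off the circle.
eigs[:m3] = [1.5 + 0.5j, -0.3 + 2.0j]

rank_unshifted = circle_defect_rank(eigs, center, radius)

# Shifting the circle in the complex plane leaves the rank unchanged.
shift = 2.0 + 1.0j
rank_shifted = circle_defect_rank(eigs + shift, center + shift, radius)
```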

1. In the first experiment, see Figure 1, we consider eigenvalues on three quarters of the unit circle. Clearly the full Arnoldi and fast Arnoldi perform best, and in all the tests we ran the orthogonality of the computed vectors was comparable. Other experiments revealed that Barth-Manteuffel performed just slightly worse when the eigenvalues were distributed over the entire unit circle; its performance started to degrade when segments were excluded from the unit circle. The progressive residual seems to align almost perfectly with the GMRES residual. We have also tested various radii, and similar conclusions hold when the radius of the circle is changed.

2. In the second experiment, see Figure 2, we have shifted the unit circle in the complex plane. Barth-Manteuffel seems to have problems with this case, whereas Arnoldi and the fast Arnoldi method exhibit good accuracy. The progressive residual and the GMRES residual again align almost perfectly.

3. In a third experiment, see Figure 3, all eigenvalues except for two are located on the unit circle. In this case Barth-Manteuffel slightly outperforms our code. We note that the location of the eigenvalues outside the circle does not have a significant impact on the overall picture of the accuracy. Tests also revealed that if one shifts the midpoint or excludes eigenvalues from parts of the circle, the fast Arnoldi method outperforms Barth-Manteuffel.

Overall we can conclude that the Arnoldi method is the most accurate one and the fast Arnoldi method is also typically quite close. The Barth-Manteuffel algorithm, however, exhibits quick loss of orthogonality when the circle is shifted or the eigenvalues do not span the entire circle. Moreover, the progressive residual and the GMRES residual align almost perfectly.

We also considered a real non-normal matrix $A$ which is of the form

[TABLE]

for some vectors $\mathbf{u}$ and $\mathbf{v}$ and a unitary matrix $U$ . Using the Sherman-Morrison formula [9], equation (6.2) yields

[TABLE]

Hence,

[TABLE]

The orthogonality of the computed Arnoldi vectors was examined for a $100\times 100$ matrix $A$ of the form (6.2), where the unitary matrix $U$ and the vectors $\mathbf{u}$ and $\mathbf{v}$ are randomly generated. In this case the orthogonality behaved similarly to that in Figure 1: the fast Arnoldi method behaves similarly to Arnoldi, whereas Barth-Manteuffel deteriorates.
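The Sherman-Morrison argument can be checked numerically. The sketch below assumes, purely for illustration, a rank-one perturbation $A=U+\mathbf{u}\mathbf{v}^{T}$ of a real orthogonal matrix $U$; the precise form of (6.2) may differ. Under this assumption $A^{*}-A^{-1}$ is a sum of two rank-one terms, so its rank is at most two. All variable names and the scaling of the random vectors are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Random real orthogonal U from a QR factorization, and random vectors u, v.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
u = rng.standard_normal(n) / np.sqrt(n)
v = rng.standard_normal(n) / np.sqrt(n)

# Assumed form (an illustration, not necessarily the paper's (6.2)).
A = U + np.outer(u, v)

# Sherman-Morrison with U^{-1} = U^T:
# (U + u v^T)^{-1} = U^T - (U^T u)(v^T U^T) / (1 + v^T U^T u).
Uinv_u = U.T @ u
vT_Uinv = v @ U.T
denom = 1.0 + v @ Uinv_u
A_inv = U.T - np.outer(Uinv_u, vT_Uinv) / denom

inverse_error = np.linalg.norm(A @ A_inv - np.eye(n))

# A^* - A^{-1} = v u^T + (U^T u)(v^T U^T) / denom has rank at most two.
defect_rank = np.linalg.matrix_rank(A.T - A_inv, tol=1e-10)

# A is non-normal: A A^T and A^T A differ.
non_normality = np.linalg.norm(A @ A.T - A.T @ A)
```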

6.2 Unitary matrix from Quantum Chromodynamics

We consider a shifted unitary matrix which finds its origin in Quantum Chromodynamics (QCD) [1]. QCD is the theory which describes the fundamental interaction between quarks, which are the building blocks of protons and neutrons. This theory makes use of the Neuberger overlap operator $A=\rho I+\gamma\,\text{sign}(Q)$, where $\rho$ and $\gamma$ are scalars and $Q$ is the Hermitian Wilson fermion matrix. As a result the Neuberger overlap operator is a shifted unitary matrix. To construct the matrix $Q$, a parameter $\kappa$ and a hopping matrix are needed. We have selected these parameters equal to the ones from [1], i.e., $\kappa=0.2809$ and as hopping matrix conf5.0-00.14x4-2600.mtx from the Matrix Market (a repository of test data for use in comparative studies of algorithms for numerical linear algebra, featuring nearly 500 sparse matrices from a variety of applications, as well as matrix generation tools and services).

To compute $\text{sign}(Q)$, we invoke the software package designed by Arnold et al. [1], which makes use of the Zolotarev algorithm [17]. We choose parameters $\rho=2$ and $\gamma=1$ for the Neuberger overlap operator. On the left of Figure 4 we have drawn the eigenvalues of the Neuberger overlap operator $A$ and observe that the density of the eigenvalues is much higher on the right than on the left.
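The shifted unitary structure can be illustrated on a small example. The sketch follows the formula $A=\rho I+\gamma\,\text{sign}(Q)$ as written above, but it replaces the Zolotarev algorithm by a dense eigendecomposition and uses a random Hermitian matrix as a hypothetical stand-in for the Wilson fermion matrix. It is therefore not a reproduction of the experiment, only a check that $(A-\rho I)/\gamma$ is unitary.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60

# Hypothetical stand-in for the Hermitian matrix Q (not the QCD data).
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q = (M + M.conj().T) / 2

# Matrix sign function via eigendecomposition: sign(Q) = X sign(Lambda) X^*.
# This dense approach replaces the Zolotarev algorithm used in the paper.
w, X = np.linalg.eigh(Q)
sign_Q = (X * np.sign(w)) @ X.conj().T

rho, gamma = 2.0, 1.0
A = rho * np.eye(n) + gamma * sign_Q

# (A - rho I) / gamma should be unitary, so A is a shifted unitary matrix.
W = (A - rho * np.eye(n)) / gamma
unitarity_defect = np.linalg.norm(W.conj().T @ W - np.eye(n))
```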

The orthogonality of the computed Arnoldi vectors is depicted in Figure 4. For approximately the first ten iterations, the recurrence relation (3.6) of fast Arnoldi and the Barth-Manteuffel algorithm show comparable accuracy. After that, the orthogonality of the vectors computed with the Barth-Manteuffel algorithm deteriorates quickly, at almost the same rate as for classical Arnoldi. Even though the progressive residuals and the GMRES residual align, classical Arnoldi seems to suffer heavily from loss of accuracy. The fast Arnoldi method clearly outperforms the other approaches.

6.3 Departure from orthogonality

Beckermann & Reichel [4] proposed a Krylov subspace method for solving a linear system in which the coefficient matrix is nearly Hermitian. Their method, based on a short recurrence for generating an orthonormal Krylov basis, is better known as Progressive GMRES, or PGMRES for short. For nearly Hermitian matrices, this short recurrence coincides with the recurrence relation (3.6) described above.

However, Embree et al. [6] showed that in certain cases the PGMRES method exhibits an instability which finds its origin in the loss of orthogonality of the computed Arnoldi vectors. They describe a specific class of examples and show the corresponding departure from orthogonality when using the recurrence relation (3.6). In the forthcoming experiments we have used the algorithm for nearly Hermitian matrices as presented in Section 5.1.

This class of matrices is of the form

[TABLE]

where

[TABLE]

with eigenvalues

1. $\lambda_{1},\ldots,\lambda_{p}$ uniformly distributed in the interval $[-\beta,-\alpha]$,

2. $\lambda_{p+1},\ldots,\lambda_{n-2}$ uniformly distributed in the interval $[\alpha,\beta]$,

3. $\lambda_{n-1},\lambda_{n}=\pm\gamma i$.

We take two examples from this class and compare the loss of orthogonality of the recurrence relation (3.6) as predicted in [6] with the Barth-Manteuffel algorithm. The orthogonality of the Arnoldi vectors stored in the matrix $V_{k}$ is depicted in Figure 5.

As seen in Figure 5, recurrence relation (3.6) gives rise to significantly less orthogonal vectors than the standard Arnoldi iteration. However, it may also be observed that the Barth-Manteuffel algorithm suffers from the same loss of orthogonality as recurrence relation (3.6). We see that the loss of orthogonality emerges as soon as the progressive residual vectors start to differ significantly from the actual GMRES residual, explaining the inaccuracies in the computed vectors. Figure 6 shows the gradual loss of orthogonality between the vectors. The color scale ranges from white for $10^{-16}$ (orthogonal up to machine precision) to black for $10^{0}$ (complete loss of orthogonality). In Figure 6 we observe that $v_{j}$ is numerically orthogonal to $v_{k}$ for $j,k\leq 14$, $j\neq k$, and also at later stages for $|j-k|\leq 2$, as expected from the local reorthogonalization of our algorithm. Globally, however, the orthogonality is lost quite quickly, as already observed by Embree et al. [6], who suggested Schur complement techniques to tackle this problem.
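A map of the kind described for Figure 6 can be computed from any basis matrix $V_{k}$. The following sketch evaluates $\log_{10}|v_{j}^{*}v_{k}|$, clipped to the range $[10^{-16},10^{0}]$. The function name is hypothetical, and the two test bases are synthetic: one is orthonormal, the other is deliberately contaminated. Neither is data from the experiments above.

```python
import numpy as np

def orthogonality_map(V, floor=1e-16):
    """log10 of |V^* V - I| entrywise, clipped to [floor, 1].

    Off-diagonal entries measure |v_j^* v_k|; diagonal entries measure
    the deviation of ||v_j||^2 from one.
    """
    G = V.conj().T @ V - np.eye(V.shape[1])
    return np.log10(np.clip(np.abs(G), floor, 1.0))

rng = np.random.default_rng(3)
n, k = 120, 30

# An orthonormal basis up to rounding, obtained from a QR factorization.
V_good, _ = np.linalg.qr(rng.standard_normal((n, k)))

# Synthetic loss of orthogonality: the later vectors pick up a small
# component in the direction of the first vector (hypothetical example).
V_bad = V_good.copy()
V_bad[:, 20:] += 1e-6 * V_good[:, [0]]
V_bad /= np.linalg.norm(V_bad, axis=0)

map_good = orthogonality_map(V_good)
map_bad = orthogonality_map(V_bad)

# A single global measure: the spectral norm of V^* V - I.
global_defect = np.linalg.norm(V_bad.conj().T @ V_bad - np.eye(k), 2)
```

In such a map, values near $-16$ correspond to white and values near $0$ to black. Passing the array to a plotting routine with a reversed gray colormap would give a picture comparable in layout to the one described above.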

We can conclude that in this case both Barth-Manteuffel and fast Arnoldi exhibit a fast and almost identical loss of orthogonality.

7 Conclusion

An economical variant of the Arnoldi algorithm has been established for matrices whose adjoint is a low-rank perturbation of a rational function of the matrix. In the process, some aspects of the Arnoldi process have been described in terms of orthogonal polynomials. This includes an explicit formula for the unitary factor in the $QR$-decomposition of a Hessenberg matrix and a decay property of the entries of this Hessenberg matrix which is related to the convergence of the GMRES algorithm. Also, the existence of a progressive GMRES residual formula has been shown, extending the findings of [4]. Furthermore, the method has been compared, both theoretically and numerically, with the algorithm described by Barth and Manteuffel [2] for the same class of matrices.

8 Acknowledgements

Part of this research was carried out during a visit to the University of Science and Technology of Lille in 2014. We thank Bernd Beckermann and Ana Matos for their hospitality.

Bibliography

[1] G. Arnold, N. Cundy, J. van den Eshof, A. Frommer, S. Krieg, Th. Lippert, and K. Schäfer. Numerical methods for the QCD overlap operator II: Optimal Krylov subspace methods, in QCD and Numerical Analysis III, eds. A. Boricci, A. Frommer, B. Joo, A. Kennedy and B. Pendleton, volume 47 of Lecture Notes in Computational Science and Engineering. Springer, Berlin, 2005.
[2] T. Barth and T. Manteuffel. Multiple recursion Conjugate Gradient algorithms Part I: Sufficient conditions. SIAM Journal on Matrix Analysis and Applications, 21(3):768–796, 2000.
[3] B. Beckermann. Discrete orthogonal polynomials and superlinear convergence of Krylov subspace methods in numerical linear algebra. Orthogonal Polynomials and Special Functions, Lecture Notes in Mathematics, 1883:119–185, 2006.
[4] B. Beckermann and L. Reichel. The Arnoldi process and GMRES for nearly symmetric matrices. SIAM Journal on Matrix Analysis and Applications, 30(1):102–120, 2008.
[5] T. Bella, V. Olshevsky, and P. Zhlobich. Classifications of recurrence relations via subclasses of (H,m)-quasiseparable matrices. Lecture Notes in Electrical Engineering, 80:23–53, 2011.
[6] M. Embree, J.A. Sifuentes, K.M. Soodhalter, D.B. Szyld, and F. Xue. Short-term recurrence Krylov subspace methods for nearly Hermitian matrices. SIAM Journal on Matrix Analysis and Applications, 33(2):480–500, 2012.
[7] V. Faber, J. Liesen, and P. Tichý. The Faber-Manteuffel theorem for linear operators. SIAM Journal on Numerical Analysis, 46(3):1323–1337, 2008.
[8] V. Faber and T. Manteuffel. Necessary and sufficient conditions for the existence of a Conjugate Gradient method. SIAM Journal on Numerical Analysis, 21(2):352–362, 1984.
