On pole-swapping algorithms for the eigenvalue problem
Daan Camps, Thomas Mach, Raf Vandebril, David S. Watkins

TL;DR
This paper develops a new, flexible convergence theory for pole-swapping algorithms used in eigenvalue problems, introduces an improved swapping routine, and explores their implications for algorithm design.
Contribution
A new modular convergence theory for pole-swapping algorithms and an improved swapping routine demonstrated through analysis and numerical tests.
Findings
The new swapping routine outperforms existing methods.
The modular theory applies broadly to all pole-swapping algorithms.
Insights into bi-directional chasing and shift strategies are provided.
Abstract
Pole-swapping algorithms, which are generalizations of the QZ algorithm for the generalized eigenvalue problem, are studied. A new modular (and therefore more flexible) convergence theory that applies to all pole-swapping algorithms is developed. A key component of all such algorithms is a procedure that swaps two adjacent eigenvalues in a triangular pencil. An improved swapping routine is developed, and its superiority over existing methods is demonstrated by a backward error analysis and numerical tests. The modularity of the new convergence theory and the generality of the pole-swapping approach shed new light on bi-directional chasing algorithms, optimally packed shifts, and bulge pencils, and allow the design of novel algorithms.
| Our method | ||||||
|---|---|---|---|---|---|---|
| Van Dooren | ||||||
| Sylvester | ||||||
| Our method | ||||||
|---|---|---|---|---|---|---|
| Van Dooren | ||||||
| Sylvester | ||||||
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
On pole-swapping algorithms
for the eigenvalue problem††thanks: This research was partially supported by the Research Council KU Leuven, project C14/16/056 (Inverse-free Rational Krylov Methods: Theory and Applications).
Daan Camps222Computational Research Division, Lawrence Berkeley National Laboratory, California ([email protected]).
Thomas Mach333Department of Mathematical Sciences, Kent State University, Ohio ([email protected]).
Raf Vandebril444Department of Computer Science, KU Leuven, Belgium ([email protected]).
David S. Watkins555Department of Mathematics, Washington State University ([email protected])
Abstract
Pole-swapping algorithms, which are generalizations of the QZ algorithm for the generalized eigenvalue problem, are studied. A new modular (and therefore more flexible) convergence theory that applies to all pole-swapping algorithms is developed. A key component of all such algorithms is a procedure that swaps two adjacent eigenvalues in a triangular pencil. An improved swapping routine is developed, and its superiority over existing methods is demonstrated by a backward error analysis and numerical tests. The modularity of the new convergence theory and the generality of the pole-swapping approach shed new light on bi-directional chasing algorithms, optimally packed shifts, and bulge pencils, and allow the design of novel algorithms.
keywords:
eigenvalue, QZ algorithm, pole swapping, convergence
{AMS}
65F15, 15A18
1 Introduction
The standard algorithm for computing the eigenvalues of a small to medium-sized non-Hermitian matrix A\in\mbox{\mbox{}^{n\times n}} is still Francis’s implicitly-shifted QR algorithm [15, 34]. In many applications, eigenvalue problems arise naturally as generalized eigenvalue problems for a pencil , and for these problems the Moler-Stewart variant of Francis’s algorithm [25], commonly called the QZ algorithm, can be used. In this paper we may refer sometimes to a pencil and other times to a pair . Either way, we are talking about the same object.
A few years ago we published a generalization of the QZ algorithm [28]. More recently an even more general algorithm, the rational QZ (RQZ) algorithm, was presented by Camps, Meerbergen, and Vandebril [14]. This arose from the study of rational Arnoldi methods and is related to work of Berljafa and Güttel [6].
In this paper we discuss the RQZ algorithm and introduce several variants. We develop a new modular (and therefore more flexible) convergence theory that can be applied immediately to all variants.
We reinterpret the QZ algorithm and show that it can be viewed as a pole-swapping algorithm with poles at infinity. Moreover we will show that the algorithm [21] for optimally packed chains of bulges is a disguised implementation of pole swapping.
A key component of the RQZ and related algorithms is a procedure that swaps two adjacent eigenvalues in a triangular pencil. We present an improved swapping routine and demonstrate its superiority by numerical experiments and a backward error analysis.
Double-shift pole-swapping algorithms that can be applied to real matrix pencils exist [13, 26]. All of what is discussed in this paper for single shifts can be extended to the double-shift case, but we have not worked out every detail. The one item that will require further thought is the extension of the improved swapping routine of Section 8 to blocks larger than . A significant advantage of sticking to the complex single-shift case, as we have done here, is simplicity and clarity of presentation.
2 Hessenberg pairs
A pencil is called a regular pencil or regular pair if there is at least one complex such that is invertible. Throughout this paper we make the blanket assumption of regularity.
A matrix A\in\mbox{\mbox{}^{n\times n}} is in (upper) Hessenberg form if every entry below the first subdiagonal is zero. It is in proper Hessenberg form if every subdiagonal entry is nonzero, i.e. for , …, . A preliminary step for the algorithm is to reduce the pair to Hessenberg-triangular form. That is, is transformed by a unitary equivalence to a new pair for which is upper Hessenberg and is upper triangular. Notice that if is not properly Hessenberg, the eigenvalue problem can be split immediately into two or more independent subproblems. Thus we can always assume that we are dealing with a matrix in proper Hessenberg form.
In the new theory we deal with a more general class of Hessenberg pencils. The pair is called a Hessenberg pair if both and are Hessenberg matrices. If for some , we can immediately split the eigenvalue problem into two smaller problems. We therefore eliminate that case from further consideration. For reasons that will become apparent later, the ratios , , …, are called the poles of the Hessenberg pair . In the case , we have an infinite pole. The Hessenberg-triangular form is a special Hessenberg pair for which all of the poles are infinite.
Closely related to is the pole pair (or pole pencil ) obtained from by deleting the first row and last column. The pole pencil is upper triangular, and its eigenvalues are obviously the poles of .
Operations on Hessenberg pairs
Introducing terminology that we have used in some of our recent work [1, 2, 3, 4], we define a core transformation (or core for short) to be a unitary matrix that acts only on two adjacent rows/columns, for example,
[TABLE]
where the four asterisks form a unitary matrix. Givens rotations are examples of core transformations. Our core transformations always have subscripts that tell where the action is: acts on rows/columns and .
Following [14] we introduce two types of operations, or moves, both of which manipulate the poles in the pair. Let , …, denote the poles of the Hessenberg pair .
Changing a pole at the top or bottom. (Type I move)
We can change the pole to any value we want by applying a core transformation to the pencil on the left. Suppose we want to change to , say. Noting that only the first two entries of can be nonzero, we deduce that there is a such that the second entry of is zero. In other words,
[TABLE]
for some . If we then define and , then , which implies that . This means that is the new first pole of . The other poles remain fixed, as they are untouched by the transformation.
This operation fails only if , yielding . This happens exactly when the first columns of and are proportional. But this is not such a failure after all, as it exposes as an eigenvalue of the pencil and allows us to deflate to a smaller problem by deleting the first row and column.
In summary, if we want to replace pole by , we will either succeed in doing so or get a deflation of an eigenvalue.
**Remark **\thetheorem
When we write something like here and elsewhere, this should be viewed as shorthand for where and are any scalars for which . As a practical matter this allows us to use modest sized and even when is very large, and in particular it allows us to implement the case by taking .
The pole at the bottom can also be replaced by any other pole, say , by a similar procedure. We want to transform the pencil to with . Noting that the row vector has nonzero entries only in its last two positions, we see that there must be a core transformation that maps it to a multiple of , i.e. for some . This is the desired transformation, since it implies , which is equivalent to .
This fails only if , yielding , which happens exactly when the th rows of and are proportional. But again this is not really a failure at all, since it allows to be extracted as an eigenvalue and the problem to be deflated to a smaller one.
This discussion helps motivate the following definition. A Hessenberg pair is called a proper Hessenberg pair if three conditions hold: (i) \mbox{\mid!a_{j+1,j}!\mid}+\mbox{\mid!b_{j+1,j}!\mid}>0 for , …, , (ii) the first columns of and are not proportional, (iii) the last rows of and are not proportional. The first condition just says that for each , at least one of and is nonzero. If this condition is not satisfied, we can immediately reduce the pencil to two smaller pencils. If either of conditions (ii) and (iii) is not satisfied, we can also reduce the problem, as we know from the discussion immediately above. Therefore, we can always assume, without loss of generality, that we are working with a proper Hessenberg pair.
Proposition 2.1**.**
[14]** In a proper Hessenberg pair, the core transformation that replaces pole by satisfies
[TABLE]
*for some nonzero . *
Proof 2.2**.**
From our construction we have . Since is the first pole of the pair , we have for some . The properness assumption guarantees that both and are nonzero. Therefore , where .
Remark 2.3**.**
The insertion of the extra factor may seem mysterious. As we shall see later, this is just what is needed for a consistent convergence theory. In the product , the factor signals that the pole is entering the pencil, while the factor signals that the pole is leaving.
Proposition 2.4**.**
[14]** In a proper Hessenberg pair, the core transformation that replaces pole by satisfies
[TABLE]
*for some nonzero . *
Proof 2.5**.**
From our construction we have . Since is the last pole of the pair , we have for some nonzero . Therefore , where .
The arithmetic cost of a move of type I is just the cost of multiplying and by a single core transformation, or . If the cores are Givens rotations applied in the conventional way, the cost is about multiplications and additions, or (complex) flops. Different implementations could yield slightly different flop counts, but regardless of the details the cost will be .
Standard backward error analysis [35] shows that moves of type I are backward stable.
Interchanging two poles. (Type II move)
The second of the two allowed operations is to interchange two adjacent poles by a unitary equivalence . To understand this, consider the pole pencil obtained by discarding the first row and last column from . This pencil is upper triangular and has , …, as its eigenvalues. There are standard techniques [5, 19, 20, 27], [32, §§ 4.8, 6.6] for interchanging any two adjacent eigenvalues and . We will describe an improved method in Section 8. Each of these requires only an equivalence transformation by two core transformations and of dimension . We then enlarge these matrices by adjoining a row and column to the top of and the bottom of :
[TABLE]
Then is the desired transformation.
If the swap is done as described by Van Dooren [27], the procedure always succeeds and is backward stable in a sense. Our new swapping procedure will be shown to have improved stability. In order not to interrupt the flow of the paper, we defer the description of the new procedure, as well as a discussion of backward errors, to Section 8.
The flop count for a move of type II is about the same as for a move of type I, namely if the core transformations are implemented as Givens rotations. In any event, the flop counts for moves of type I and II are about the same, and each move costs flops.111This is the correct count for the case when only eigenvalues are being computed. If eigenvectors or some deflating subspaces are wanted as well, the transforming matrices and also need to be updated on each move. This adds about (complex) flops for a type I move and flops for a type II, but the total is still .
Remark 2.6**.**
We have one type of move that is able to change a pole at one end or the other and another type that swaps poles in the middle. It is natural to ask whether we can devise a move that changes a single pole in the middle. The answer is no. Consider a transformation
[TABLE]
where does not touch the first row and does not touch the last column. That is,
[TABLE]
Under any such transformation the poles must remain invariant. This is so because the transformation (2) is equivalent to a transformation on the pole pencil. Since the poles of are the eigenvalues of the pole pencil, they must remain fixed.
Thus any transformation meant to change a pole must touch either the first row or the last column. That’s what the moves of type I do.
3 Building an algorithm from the pieces
Suppose we want to find the eigenvalues of some regular pair . As usual, there are two steps to the process. The first is a direct method that transforms to a condensed form, in our case a Hessenberg pencil. The second step is an iterative process that uncovers the eigenvalues of the condensed form.
In some contexts the reduction phase can be skipped. As a notable example, the rational Arnoldi process [6] applied to a large matrix naturally generates, after steps, a Hessenberg pencil. The th pole of the pencil is equal to the shift that was used on the th step of the process. We can obtain estimates of the eigenvalues of the large matrix by computing the eigenvalues of the pencil. This requires no reduction; we can go directly to the iterative phase.
Reduction to a Hessenberg pencil
Moler and Stewart [25] showed how to reduce to Hessenberg-triangular form by a direct method in flops. The reduction is also described in [16, 32, 33] and elsewhere. If the resulting pair is not proper, we can split it into smaller proper pairs, so let us assume it is proper. This is a Hessenberg pencil with all poles equal to . If the user is happy to start from this configuration, s/he can move directly to the iterative phase.
If the user wants to set certain prescribed poles , …, before beginning the iterations, that is also possible. One obvious procedure is to begin by introducing at the top of the pencil by a move of type I. Then can be swapped with each of the remaining infinite poles by moves of type II until it arrives at its desired position at the bottom. The total number of moves is . Then can be introduced at the top by a move of type I. It can then be swapped with each of the remaining infinite poles until it arrives at its desired position just above . The total number of moves for this step is . Then can be introduced, and so on. Eventually we get each of , …, into its desired position. The total number of moves for this phase is about , and the total flop count is .
One can equally well introduce the poles at the bottom and swap them upward, starting with , then , and so on. The amount of work is exactly the same, about moves. Better yet, one can take and introduce , …, (in reverse order) at the top and , …, at the bottom. This cuts the number of moves in half. However one does it, the cost is .
Camps, Meerbergen, and Vandebril [14] describe a procedure that introduces the poles during the reduction to Hessenberg form. They also present an example where a good choice of poles induces a deflation in the middle of the pencil.
The iterative phase (basic algorithm)
During the discussion of moves of type I in Section 2 we defined proper Hessenberg pairs and noted that if a Hessenberg pair is not proper, it can be reduced to smaller pairs that are. We therefore assume, without loss of generality, that we have a proper Hessenberg pair with poles , …, . We now describe an iteration of the RQZ algorithm proposed in [14]. We will call this the basic algorithm.
First a shift is chosen. Any of the usual shifting strategies can be employed here. The simplest is the Rayleigh-quotient shift . Then is introduced as a pole at the top of the pencil, replacing , by a move of type I. Next is swapped with by a move of type II. Then another move of type II is used to swap with , and so on. After moves of type II, arrives at the bottom of the pencil. The poles are now , …, , and . Finally a move of type I is used to remove the pole from the bottom, replacing it by a new pole . This completes the iteration. The user has complete flexibility in the choice of . One possibility is . Another, which might be called a Rayleigh-quotient pole, is .
The cost of one iteration of the basic algorithm is moves or flops. With any of the standard shifting strategies, e.g. Rayleigh-quotient shift, repeated iterations will normally cause rapid convergence of an eigenvalue at the bottom of the pencil. Typically and quadratically, leaving as an eigenvalue and allowing deflation of the problem. After deflations, all of the eigenvalues will have been found.
There are numerous variations on the basic algorithm. For example, it can be turned upside down. We can pick a shift, say , insert it at the bottom of the pencil, and chase it to the top. Since we can do this, then why not chase shifts in both directions at once? Some possibilities along these lines will be discussed in Section 6.
Relationship to the QZ algorithm
We now show that when the basic algorithm is applied to a pair that has all poles infinity, it reduces to the single-shift version of the Moler/Stewart QZ algorithm. Consider a Hessenberg-triangular pair
[TABLE]
which has poles , , and . An iteration of the basic algorithm begins by choosing a shift and inserting it into the pair at the top by a move of type I. The transformation is , , where satisfies (1). This is exactly the same as the transformation that starts single-shift QZ [33, p. 537]. It alters the first two rows of the matrices, so the transformed matrices have the form
[TABLE]
The triangular form of has been disturbed, but this is still a Hessenberg pair. Its poles are , , . (We will continue to refer to the matrices as “" and “”, even though they change in the course of the iteration.) The next step of the basic algorithm is a move of type II that interchanges the pole with the adjacent pole , resulting in
[TABLE]
a Hessenberg pair with poles , , . The transformation has the form , , with appropriately chosen core transformations and . Let us consider now how things look if we apply the cores one at a time. Starting from the configuration shown in (3), first apply on the right. This acts on columns one and two of each matrix and produces
[TABLE]
The entry must now be zero. This is so because, as we know, after the application of on the left, must be zero, as shown in (4). The left multiplication by cannot do this job, so it must have been done by . At the same time, must produce a bulge at . This proves that is exactly the same transformation as is used at this point in the QZ bulge chase.
Now, when we apply on the left, it operates on rows two and three. It must set to zero and create a new bulge at to arrive at (4). Thus is exactly the same transformation as is used at this point in the QZ bulge chase.
The next step is a move of type II that transforms (4) to
[TABLE]
a Hessenberg pair with poles , , . The transformation has the form , . Again we could look at what happens if we apply the cores one at a time, first , then , and we would find as before that these are exactly the same transformations as in a bulge chase.
In our little example, we have now reached the bottom. In a larger example, we would continue moves of type II, pushing the pole downward, and at each step we would have the same situation. The final step is a move of type I that removes from the bottom of the pencil, replacing it by a pole . This is exactly the transform, acting on columns and , that sets (the entry entry in (5)) to zero. Again this is exactly the same as the transformation that completes the QZ bulge chase. The pair is now in Hessenberg-triangular form.
We have demonstrated that the basic algorithm reduces to the single-shift QZ algorithm in the case when all of the poles are infinite.
4 Convergence theory
In the convergence theorems in this paper we make the blanket (and generically valid) assumption that none of the poles or shifts that are mentioned are eigenvalues of the pencil. We often find it convenient to assume that is nonsingular.
The mechanism that drives all variants of Francis’s algorithm is nested subspace iteration with changes of coordinate system [33, p. 431], [34, p. 399], [1, Thm 2.2.3]. As a specific example, let us consider a single step of the QZ algorithm with shift applied to a Hessenberg-triangular pencil , yielding a new pencil with
[TABLE]
First we define some nested sequences of subspaces. For , …, , define
[TABLE]
where , …, are the standard basis vectors. Then define
[TABLE]
Thus \mbox{\mathcal{Q}}_{k} (resp. \mbox{\mathcal{Z}}_{k}) is the space spanned by the first columns of (resp. ).
Theorem 4.1**.**
A single step of the algorithm with shift effects nested subspace iterations
[TABLE]
The change of coordinate system (6) transforms both \mbox{\mathcal{Q}}_{k} and \mbox{\mathcal{Z}}_{k} back to \mbox{\mathcal{E}}_{k}.
We call this a convergence theorem even though it makes no mention of convergence. Theorems like this can be used together with the convergence theory of subspace iteration to draw conclusions about the convergence of the algorithm, as explained in [32, 33, 34] and elsewhere.
Camps, Meerbergen, and Vandebril [14, Thm. 6.1] proved a result like Theorem 4.1 for the basic algorithm. The scenario is similar. The iteration begins with a proper Hessenberg pair with poles , …, , employs a shift , and ends with a new proper Hessenberg pair with poles , …, . The old and new pairs are related by a unitary equivalence transformation of the form (6).
Theorem 4.2**.**
A single step of the basic algorithm with shift , starting with a proper Hessenberg pair with poles , …, and ending with with poles , …, effects nested subspace iterations
[TABLE]
The change of coordinate system (6) transforms both \mbox{\mathcal{Q}}_{k} and \mbox{\mathcal{Z}}_{k} back to \mbox{\mathcal{E}}_{k}.
This theorem was proved in [14], but we will also provide a proof based on our new theory in Section 5. Comparing this with Theorem 4.1, we see that the inclusion of poles gives extra freedom that might be used to improve convergence.
Now consider Theorem 4.2 in the case when all of the poles are infinite. When , the operator becomes (when appropriately rescaled) . Similarly becomes . These operators are exactly the ones that appear in Theorem 4.1, just as we would expect.
Although the QZ algorithm is a special case of the basic algorithm, there is an important difference in their implementation. The QZ algorithm acts on proper Hessenberg-triangular pencils. It is a bulge-chasing algorithm. The initial equivalence transformation of each iteration creates a bulge in the Hessenberg-triangular form. The rest of the iteration consists of equivalence transformations that chase the bulge back and forth between and until it finally disappears off of the bottom of the pencil. At that point the Hessenberg-triangular form has been restored and the iteration is complete. The QZ algorithm can also be implemented as a core-chasing algorithm, as is shown in [1] and [3], but the situation is the same: The Hessenberg-triangular form is disturbed at the beginning of the iteration and not restored until the very end.
Now let us contrast this with what happens in the basic algorithm (with infinite poles or otherwise). The basic algorithm operates on proper Hessenberg pairs, in which neither matrix is required to be triangular. Each iteration starts with a move of type I, performs a sequence of moves of type II, and ends with a move of type I. These moves do not disturb the Hessenberg form; it is preserved throughout. This implies that we can think of each move as a “mini iteration” and ask whether we can obtain a result like Theorem 4.1 or 4.2 for each individual move of type I or II. It turns out that we can.
Each move of either type is an equivalence transform of the form
[TABLE]
The case denotes a move of type I, and we have . The case also denotes a type I move, and in this case . The cases , …, are of type II. Suppose has poles , …, . A move of type II interchanges poles and . For the moves of type I, in the case , suppose the pole is replaced by a new pole ; in the case , suppose is replaced by a new pole . With this notation we can cover both types of move by a single theorem.
As above we define sequences of nested subspaces (\mbox{\mathcal{Q}}_{k}) and (\mbox{\mathcal{Z}}_{k}), where \mbox{\mathcal{Q}}_{k} (resp. \mbox{\mathcal{Z}}_{k}) is the space spanned by the first columns of (resp. ). But note that, because and are core transformations, these spaces are mostly trivial in this setting: \mbox{\mathcal{Q}}_{k}=\mbox{\mathcal{E}}_{k} except when , and \mbox{\mathcal{Z}}_{k}=\mbox{\mathcal{E}}_{k} except when .
Theorem 4.3**.**
Using notation and terminology established directly above, the move
[TABLE]
effects nested subspace iterations that are, however, mostly trivial. The nontrivial actions are
[TABLE]
and
[TABLE]
The change of coordinate system (7) transforms \mbox{\mathcal{Q}}_{j} back to \mbox{\mathcal{E}}_{j} and \mbox{\mathcal{Z}}_{j-1} back to \mbox{\mathcal{E}}_{j-1}.
The proof of Theorem 4.3 makes use of rational Krylov subspaces. Given C\in\mbox{\mbox{}^{n\times n}} and v\in\mbox{\mbox{}^{n}}, the standard Krylov subspaces \mbox{\mathcal{K}}_{j}(C,v) are defined by
[TABLE]
Given an ordered set of poles , none in the spectrum of , the rational Krylov subspaces \mbox{\mathcal{K}}_{j}(C,v,[\sigma_{1},\ldots,\sigma_{j-1}]) are defined by
[TABLE]
and in general
[TABLE]
Making the abbreviation , we can rewrite this as
[TABLE]
The span on the right-hand side involves only positive powers of , so the shifts are irrelevant; it is just the standard Krylov subspace \mbox{\mathcal{K}}_{j}(C,v). Therefore
[TABLE]
Given a pair with nonsingular, we define rational Krylov subspaces
[TABLE]
associated with the pair by
[TABLE]
and
[TABLE]
, …, . We have assumed for convenience that is nonsingular. See [14] for a definition of these spaces that does not require this assumption. We are using the symbol \mbox{\mathcal{K}}_{j} to denote several different types of Krylov subspaces. The meaning in each case is uniquely determined by the number and type of arguments.
We will make use of the following result, which is Theorem 5.6 in [14].
Proposition 4.4**.**
Let be a proper upper Hessenberg pair with poles . Let \mbox{\mathcal{E}}_{j}=\mbox{{\rm span}\left{e_{1},\ldots,e_{j}\right}}, as before. Then for , …, ,
[TABLE]
See [14] for the proof. Notice that in the \mbox{\mathcal{L}}_{j} spaces the poles are , starting from . With Proposition 4.4 in hand, we can prove Theorem 4.3.
Proof 4.5**.**
Proposition 2.1 shows that for some nonzero . This establishes the case of Theorem 4.3.
Now consider . The transformation interchanges poles and , so the ordered pole set of is
[TABLE]
Applying Proposition 4.4 to we have
[TABLE]
Therefore
[TABLE]
Noting that , using the abbreviations and , and using (8) twice, we obtain
[TABLE]
In the final step we used Proposition 4.4 again. Since , we get the desired result \mbox{\mathcal{Q}}_{j}=(A-\sigma_{j-1}B)(A-\sigma_{j}B)^{-1}\mbox{\mathcal{E}}_{j}.
In this argument we have assumed that exists. However, the result also holds for singular by a continuity argument.
Now consider the spaces \mbox{\mathcal{Z}}_{j-1}. In the case we have . Substituting and solving for , we have . The ordered pole set for is , so for some nonzero . Similarly for some nonzero . Therefore
[TABLE]
This proves that
[TABLE]
as desired.
For we have . Arguing just as we did for \mbox{\mathcal{Q}}_{j}, we have
[TABLE]
Using , and making the abbreviations and , we have
[TABLE]
Remark 4.6**.**
We used Proposition 2.1 to prove the case , but we did not use Proposition 2.4. In connection with this we remark that Theorem 4.3 immediately implies the dual results
[TABLE]
and
[TABLE]
obtained by noting that \mbox{\mathcal{U}}=C\mbox{\mathcal{S}} if and only if \mbox{\mathcal{U}}^{\perp}=(C^{*})^{-1}\mbox{\mathcal{S}}^{\perp}. We could equally well have derived the dual results first and then deduced Theorem 4.3. In that case we would use Proposition 2.4 to prove the case , and not use Proposition 2.1 at all. From Proposition 2.4 with we have immediately
[TABLE]
which implies
[TABLE]
the case of the dual result.
5 Using Theorem 4.3
In all of the convergence theorems of the previous section we have actions of the form \mbox{\mathcal{Q}}_{k}=r(AB^{-1})\mbox{\mathcal{E}}_{k} and \mbox{\mathcal{Z}}_{k}=r(B^{-1}A)\mbox{\mathcal{E}}_{k}, where is a rational function, e.g. . In the following lemma the functions and can be any functions defined on the spectrum of the pencil , but in our applications they will always be rational. In this case, being defined on the spectrum of just means that none of the poles are eigenvalues.
Lemma 5.1**.**
Consider two successive changes of coordinate system
[TABLE]
so that
[TABLE]
For , …, , if
[TABLE]
then
[TABLE]
where is the pointwise product of and . If
[TABLE]
then
[TABLE]
Proof 5.2**.**
Noting that , we have
[TABLE]
so Q\mbox{\mathcal{E}}_{k}=sr(AB^{-1})\mbox{\mathcal{E}}_{k}. The result for Z\mbox{\mathcal{E}}_{k} is proved similarly, using .
Clearly this lemma can be extended by induction to three or more successive changes of coordinate system, and that’s how we are going to use it.
Proof of Theorem 4.2
As a first application of Theorem 4.3, we show that it can be used to prove Theorem 4.2.
According to Theorem 4.2, for each the basic algorithm effects a transformation
[TABLE]
Let us see why this is so. Recall that the basic algorithm begins with a move of type I that introduces the shift as a pole at the top of the pencil. It then does a sequence of moves of type II that swap with the other poles one by one. For a given , most of these moves have no effect on \mbox{\mathcal{E}}_{k}. The only exception is the th move, the case in Theorem 4.3. This is where we need to focus.
One iteration of the basic algorithm performs the equivalence
[TABLE]
where and are products of core transformations:
[TABLE]
The core is the one that replaces pole with the shift . (together with ) swaps with , (together with ) swaps with , and so on. removes and installs a new pole . We are interested in the action of (together with ), which swaps with . Thus we factor and as
[TABLE]
where , and so on. Now we break the transformation into three parts:
[TABLE]
[TABLE]
and
[TABLE]
Because each of the cores , …, leaves \mbox{\mathcal{E}}_{k} invariant, we have
[TABLE]
We can apply Theorem 4.3 with to the transformation (10), taking into account that the poles that are swapped in the th move are and , to get
[TABLE]
Finally, noting that , …, all leave \mbox{\mathcal{E}}_{k} invariant, we have
[TABLE]
Now, applying Lemma 5.1 to the product , we get
[TABLE]
which is exactly (9).
We can prove the part of Theorem 4.2 in exactly the same way. We have
[TABLE]
and by Theorem 4.3 with ,
[TABLE]
and finally
[TABLE]
Therefore, by Lemma 5.1,
[TABLE]
Adding one to the index , we get the part of Theorem 4.2, thereby completing the proof.
Generalization of the proof
The basic algorithm is just one of many possible algorithms that make use of moves of types I and II on proper Hessenberg forms. We have already pointed out that one could run the algorithm in the opposite direction or in both directions at once. There are lots of other possibilities, and we will look at some in what follows.
From our proof of Theorem 4.2 it should now be clear that we will be able to use Theorem 4.3, together with Lemma 5.1, to analyze the action of any algorithm that acts on a proper Hessenberg pencil by moves of types I and II. Consider a transformation
[TABLE]
where and are products of core transformations generated by any sequence of moves of type I and II. If we want to find the action of on \mbox{\mathcal{E}}_{k} for some , we need only look at the core transformations of the form , i.e. the ones that act in the plane. Thus we factor into a product of the form
[TABLE]
where , , … are products of core transformations that do not act in the plane and therefore satisfy \tilde{Q}\mbox{\mathcal{E}}_{k}=\mbox{\mathcal{E}}_{k}, \check{Q}\mbox{\mathcal{E}}_{k}=\mbox{\mathcal{E}}_{k}, and so on, and , , … are cores that do act in the plane. Let us say there are such cores , …, .
The transforming matrix has a fully analogous factorization
[TABLE]
assuming we use the convention that moves of type I have the form with or with . We have \tilde{Z}\mbox{\mathcal{E}}_{k-1}=\mbox{\mathcal{E}}_{k-1}, \check{Z}\mbox{\mathcal{E}}_{k-1}=\mbox{\mathcal{E}}_{k-1}, et cetera. The transformations that act nontrivially on \mbox{\mathcal{E}}_{k-1} are , …, .
Suppose that on the move corresponding to the transformations and , the poles that get swapped are and . Then, according to Theorem 4.3, the function associated with this swap is . Let denote the product of these functions:
[TABLE]
Then, applying Lemma 5.1 to the long product of transformations defined by (12) and (13), we find that the action of on \mbox{\mathcal{E}}_{k} and of on \mbox{\mathcal{E}}_{k-1} is given by
[TABLE]
We summarize these findings as a theorem.
Theorem 5.3**.**
Consider a transformation (11), where and are products of core transformations generated by any sequence of moves of types I and II. For some suppose that of the moves acted at the th position, swapping poles and for , …, . Define a rational function by (14). Then the action of on \mbox{\mathcal{E}}_{k} and of on \mbox{\mathcal{E}}_{k-1} is given by (15). The transformation (11) transforms \mbox{\mathcal{Q}}_{k} back to \mbox{\mathcal{E}}_{k} and \mbox{\mathcal{Z}}_{k-1} back to \mbox{\mathcal{E}}_{k-1}.
6 Variations on the basic algorithm
In this section we consider algorithms built exclusively from moves of types I and II. Since the moves are backward stable, the resulting algorithms are also backward stable. We do not claim that all of the ideas presented here will result in practical algorithms; some of them are quite speculative.
The basic algorithm (like the single-shift bulge-chasing and core-chasing algorithms) takes a single shift, inserts it into the top of the pencil, and chases it to the bottom. This algorithm suffers from inefficient use of cache memory and negligible potential for parallelism. In the case of bulge-chasing algorithms the problem was remedied by selecting a large number of shifts at once, creating many small bulges one after the other, and chasing this chain of bulges together to the bottom of the matrix or pencil [9, 23, 24]. This allows the use of Level 3 BLAS and therefore efficient cache use. It also provides an opportunity for parallelism [17].
Chasing multiple shifts at once
The same remedy works for pole-swapping algorithms, as was already mentioned in [13, 14, 26]. We can choose shifts , …, , where typically .222One way to obtain shifts is to use an auxiliary routine to compute the eigenvalues of the lower-right-hand subpencil of , and use these as the shifts. Suppose the poles of are
[TABLE]
By a sequence of moves of types I and II we can replace , …, by , …, , so that the poles of the new pencil are
[TABLE]
Then we can chase these shifts together to the bottom, creating enough arithmetic to make efficient use of cache. To be precise, in the first step we would swap with , then with , and so on. Eventually we swap with , putting at the top. Then we go on to the next step.
We can pass a chain of shifts from top to bottom, and we can equally well pass a chain from bottom to top. If we wish, we can pass chains in both directions at once. Suppose we have shifts , …, that we wish to chase from top to bottom and shifts , …, that we wish to chase from bottom to top. Using moves of types I and II we can introduce them:
[TABLE]
We then chase the ’s downward and the ’s upward. The two chains pass through each other, and eventually we get to the position
[TABLE]
The reader can check that the poles in the middle, , …, , get moved around in the process, but they end up exactly where they started. At this point we can regard the iteration as complete, or we can “complete” the iteration by removing the and from the pencil and replacing them with new sets of shifts.
Let’s see what Theorem 5.3 tells us about this bi-directional procedure. Let
[TABLE]
Then for , …, we have the action
[TABLE]
The reason for this is that each of the passes downward through the th position, causing a factor , and each of the passes upward, causing a factor . This isn’t all that happens at position , but it’s all that matters. To see this, consider, for example, a position at which all of the pass through before any of the get there. Passing each downward requires also passing a upward, causing a factor . Later on, when the are being passed upward, each that was previously passed upward gets passed downward through the th position, causing a factor . The factors and cancel each other out. We know that this must happen for each because each starts and ends in the same position.
An optimistic scenario
Consider a situation in which we have in hand the information that we need to split the problem. Suppose we know a (with ) where (we think) we can split the pencil, and suppose that we have in mind an rational function
[TABLE]
that can (nearly) split it. By this we mean that r(AB^{-1})\mbox{\mathcal{E}}_{k} is (nearly) invariant under and r(B^{-1}A)\mbox{\mathcal{E}}_{k} is (nearly) invariant under . If we then take the as shifts to be passed downward and the as shifts to be passed upward, we will get both \mbox{\mathcal{Q}}_{k}=Q\mbox{\mathcal{E}}_{k}=r(AB^{-1})\mbox{\mathcal{E}}_{k} and \mbox{\mathcal{Z}}_{k}=Z\mbox{\mathcal{E}}_{k}=r(B^{-1}A)\mbox{\mathcal{E}}_{k}. The change of variables maps both of these spaces back to \mbox{\mathcal{E}}_{k}. Thus \mbox{\mathcal{E}}_{k} is (nearly) invariant under both and , which implies that (\mbox{\mathcal{E}}_{k},\mbox{\mathcal{E}}_{k}) is (nearly) a deflating subspace for . If the pencil does not quite split apart, another step with the same (or improved?) shifts may get the job done. Notice that to achieve the desired spaces \mbox{\mathcal{Q}}_{k}=r(AB^{-1})\mbox{\mathcal{E}}_{k} and \mbox{\mathcal{Z}}_{k}=r(B^{-1}A)\mbox{\mathcal{E}}_{k}, it is not necessary to pass the shifts all the way through the pencil. All that is needed is that , …, are pushed downward past position and , …, are passed upward past position .
Of course this is a very optimistic scenario. (Where do we get these special shifts?) We include it here just to indicate what might be possible and to illustrate the use of Theorem 5.3.
Practical shift strategies
A more realistic plan is to take (for example) , …, to be the eigenvalues of the lower-right-hand subpencil and , …, the eigenvalues of the upper-left-hand subpencil, which will have the effect of causing deflations near the ends of the pencil.333Notice, however, that a strategy like this should also include some provision to ensure that the upward-moving shifts are well separated from the downward-moving shifts. If some is (nearly) equal to one of the , they will (nearly) cancel each other out. An even better idea is to include aggressive early deflation [10], which is easy to implement in this context. This was already discussed in detail in [13, 14], so we will not go into it.
Steady streams of shifts
We conclude this section with one more interesting but fanciful idea. Imagine that we introduce steady streams of shifts at the top and the bottom. Eventually the streams start to pass through each other. How do we move the streams in their respective directions in an expeditious way? To answer this question let us first look at the small case , for which we have seven poles. Suppose we have at some point the poles
[TABLE]
where the shifts are moving downward and the upward. We can introduce a new shift at the top by a move of type I that removes . At the same time we can do three moves of type II to interchange with , with , and with . The result is
[TABLE]
This is one step. For the next step we use a move of type I to introduce a new shift at the bottom, removing . At the same time we do three moves of type II to interchange with , with , and with . The result is
[TABLE]
The third step is like the first, the fourth step is like the second, and so on. We can illustrate these steps schematically with a diagram.
[TABLE]
The matrix in the middle can be either or , since the same core transformations are applied to both. The cores of the first step are in black, with the move of type I marked accordingly. Each move of type II requires two cores, one on the left and one on the right. For example, the two cores marked with the symbol “” belong to a single move. The second step is in red, with the core of type I marked accordingly on the right. The third and fourth steps are marked in blue and green, respectively. Four subsequent steps are shown in grey. We are illustrating the case , which is typical of even . The odd case, which is slightly different, is left for the reader.
Before we get too excited about this elegant scheme, we must acknowledge that there are some challenges in the way of a competitive implementation. Thinking now of larger , we see that each step is rich in arithmetic and highly parallel. Each step consists of about moves or flops. To move a shift from one end of the pencil to the other requires about steps, or flops. Therefore any competitive implementation must exploit the parallelism well. Another, possibly larger, issue is this: How do we get a steady stream of good shifts to feed in at the two ends?
7 Connections to earlier work
Bulge pencils
The purpose of shifting is to accelerate convergence. In the standard Francis bulge-chasing algorithm the shifts are inserted at the top. That is, the shifts are used to help determine the initial transformation that creates the bulge. Then the shifts are forgotten, and the bulge is chased downward until it disappears off the bottom. Well-chosen shifts, inserted at the top, lead to rapid emergence of eigenvalues at the bottom of the matrix or pencil. Thus the information about the shifts is somehow transmitted in the bulge from top to bottom.
A bit more than twenty years ago one of the authors began to study the mechanism by which the shift information is conveyed in bulge-chasing algorithms. This study took some time, it seemed to be nontrivial, and it led to the discovery of the bulge pencil [29, 30, 32].
Now let’s take a fresh look at the bulge pencil in light of what we now know about pole swapping. Suppose we pick a single shift and begin chasing a bulge downward in a pair . After couple of steps we have
[TABLE]
with the bulge located at position . The subpencil outlined in (16) is the bulge pencil. Its eigenvalues are and [32, Chap. 7].444In [32] we considered (large-bulge) multishift algorithms with shifts , …, . Then the bulge pencil is and has eigenvalues , …, , and . Here we are considering only the case . If we now do one more transformation on the left, moving the bulge from to , we obtain
[TABLE]
This is a Hessenberg pair, and the eigenvalues of the bulge pencil are now in plain sight. In the position we have the pole , and in the position we have a finite pole, which we know to be the shift . What was opaque before is now transparent.
Certain structured problems require algorithms that chase bulges in both directions in order to preserve the structure. The first example of such an algorithm was the Hamiltonian QR algorithm of Byers [11, 12]. Some more recent examples are algorithms for the palindromic and even eigenvalue problems discussed in [22]. Our understanding of the bulge pencil made it possible to explain completely how to pass bulges (and the shifts that they contain) through each other in general in both structured and unstructured cases [31]. It took time and effort figure this out, but now, in light of what we know about pole swapping, we can see that passing shifts through each other is simple. It’s just a matter of swapping two eigenvalues of the pole pencil. Once again, what was opaque before is now transparent.
Tightly and optimally packed shifts
The schemes discussed in Section 6 insert not just one shift but long chains of shifts , …, into the pencil as poles and then chase them downward (or upward) in a bunch. In such a scheme it is important for efficiency to have the shifts packed as tightly together as possible. It is clear that in our current scenario we achieve this; the shifts , …, appear as adjacent poles in the Hessenberg pair, and there is no way they could be packed any closer. (The same result is achieved effortlessly when this methodology is applied to core-chasing algorithms [1].) In contrast, in the bulge-chasing scenario, the packing of bulges is not naturally optimal, and it is not obvious how to fix the problem. However, with some effort a remedy was eventually found [21]. In hindsight we can show that the remedy is a disguised implementation of a pole-swapping algorithm.
We have explained already that pole swapping reduces to bulge chasing if all poles that are not shifts are set to infinity. The philosophy is, however, different. Bulge chasing executes in each step an equivalence where the transforms on left and right act on columns and rows having the same indices, say and . Pole swapping, on the other hand, has the transformation on the left acting on rows and , while the transformation on the right acts on columns and . Pole swapping is half an equivalence off compared to bulge chasing. This lag is natural in the pole-swapping setting and appears to be the foundational strategy to get optimally packed bulges.
An optimally packed chain of two single shifts in the bulge chasing setting would, ideally, look like
[TABLE]
whereas in the pole-swapping setting it would resemble
[TABLE]
For simplicity, and without loss of generality, we restrict ourselves to two single shifts.
We have seen that getting an optimally packed chain of shifts in the pole-swapping setting is trivial. In the bulge chasing case, however, it is impossible to achieve (17). Introducing the first shift and chasing it down a row results in
[TABLE]
Introducing the second shift does not work. We end up with
[TABLE]
and both single shifts have been combined into a multishift bulge. The scheme introduced by Braman, Byers, and Mathias [8] delays the introduction of the second shift until the first has been moved two spots down. We get
[TABLE]
which are so-called tightly packed shifts. It is impossible to pack them any closer; otherwise the two bulge pencils (marked in the figure) would overlap.
A solution to pack the bulges as tight as in (17) was proposed by Karlsson, Kressner, and Lang [21]. The trick is to defer some transformations from the right. Suppose the first bulge is introduced and we would like to move it down a row; instead of executing an entire bulge-chasing step, we only execute the transformation from the left, the transformation on the right is postponed. We end up with
[TABLE]
which is nothing else than having moved the first pole down a position. Next we introduce the second shift, but we do not execute the transformation from the right. We get
[TABLE]
To start the chasing, one now brings the first shift to the right, creating a bulge and then annihlates the bulge. Thus, one does not execute an entire bulge-chase step, but again the transformation from the right is delayed. We end up with
[TABLE]
after which we can do the same with the second shift. Obviously this is just pole swapping, but the description in terms of bulges and delayed transformations conceals this fact.
Karlsson et al. [21] discussed the optimal packing of the bulges in terms of double-shift bulges. Since we have not discussed double-shift pole-swapping algorithms here, we do not explore this. The principles are, however, identical. The algorithm of Karlsson et al. [21] is a pole-swapping algorithm (with poles at infinity) avant-la-lettre.
8 The new pole-swapping procedure
We now describe the new swapping procedure that was promised at the beginning. The process of swapping two adjacent poles is equivalent to swapping two adjacent eigenvalues in the upper-triangular pole pencil. For the description it suffices to look at a subpencil. Consider therefore a upper-triangular pencil
[TABLE]
with eigenvalues and . We want to swap the eigenvalues. That is, we want to find core transformations and such that
[TABLE]
with and .
Solution in exact arithmetic
Exact method 1
This method “grabs ” and pulls it upward. Substituting in the pencil, we have
[TABLE]
from which we deduce that the vector
[TABLE]
is a right eigenvector of the pencil associated with eigenvalue . Let
[TABLE]
Direct computation shows that
[TABLE]
Thus the spaces spanned by and form a one-dimensional deflating pair for associated with the eigenvalue .
Let and be cores such that
[TABLE]
and define
[TABLE]
Then we claim that is an upper triangular pencil with the eigenvalue on top. This is verified by the calculations
[TABLE]
and
[TABLE]
This procedure fails if and only if , which happens whenever , for example. The condition implies that the eigenvalues are equal, so in this case the swap can be skipped.
Exact method 2
This method, which is the dual of the previous method, “grabs ” and pushes it downward. Substituting in the pencil, we have
[TABLE]
from which we deduce that the vector
[TABLE]
is a left eigenvector of the pencil associated with eigenvalue . Let
[TABLE]
Direct computation shows that
[TABLE]
Let and be cores such that
[TABLE]
and define
[TABLE]
Then we claim that is an upper triangular pencil with the eigenvalue on the bottom. This is verified by the calculations
[TABLE]
and
[TABLE]
This procedure fails if and only if , in which case and the swap can be skipped.
The reader can easily check that the two methods produce exactly the same and .
Solution in floating point arithmetic
In the interest of stability one should not implement either of the above procedures in practice. There are several alternatives.
Case 1
We will demonstrate below that the following procedure, which is based on exact method 1, is stable in the case \mbox{\mid!\sigma_{1}!\mid}\geq\mbox{\mid!\sigma_{2}!\mid}. Compute as in (19). Then compute such that , where \gamma=\mbox{\parallel!x!\parallel}. (Here and in what follows, the norm symbol refers to either the vector 2-norm or matrix 2-norm, depending on the context.) Then compute . Since , the first column of is . Do not compute using the vector as defined in (20). Instead compute so that . Then let
[TABLE]
Case 2
For the case when \mbox{\mid!\sigma_{1}!\mid}<\mbox{\mid!\sigma_{2}!\mid} we need a different procedure. There are multiple possibilities, the simplest of which is to apply the above procedure with the roles of and reversed. We compute as before, then use instead of to determine . Specifically, since , the first column of is . Thus we can compute so that .
This procedure is similar to that of Van Dooren [27]. His method always computes first, then uses either or to compute . The only difference is that our criterion for switching between and is different from that in [27]. This makes a difference in the backward error.
Another procedure, which is based on exact method 2, computes first. Compute the vector as in (22), then compute such that , where \zeta=\mbox{\parallel!v!\parallel}. Then compute . Since , the second row of is . Do not compute using as defined in (23). Instead compute so that . Then let
[TABLE]
This is exactly equivalent to the procedure from Case 1 applied to a “flipped” pencil. Let , the flip matrix, and consider the pencil
[TABLE]
This has the eigenvalues reversed. The condition \mbox{\mid!\sigma_{2}!\mid}>\mbox{\mid!\sigma_{1}!\mid} implies that we can stably apply the method from Case 1, and then “unflip” the result. The equation implies
[TABLE]
which shows that the roles of and are reversed in the flipped procedure. (Of course and are not exactly the same, but they contain the same information.) The “compute first” procedure that we have just outlined is a way of implementing the “flipped” procedure without actually doing the flips.
Backward error analysis
It suffices to prove backward stability in Case 1, since the options in Case 2 are both variants of Case 1.
The swapping operation is a unitary equivalence, and such transformations generally are stable [18], but there is one thing we have to check. The core is designed so that has a zero in the position. This automatically creates a zero in the position of because the first columns of and are both proportional to . This is true in exact arithmetic. We just need to check that in floating-point arithmetic the entry that is created in the position of is small enough that backward stability is not compromised by setting it to zero. For this it suffices that its magnitude be no bigger than a modest multiple of u\mbox{\parallel!A!\parallel}, where is the unit roundoff.
The swapping operation begins with the computation of in (19). In floating-point arithmetic we get
[TABLE]
where each is the result of two roundoff errors, a multiplication and a subtraction mapped back to the product terms, and therefore satisfies \mbox{\mid!\epsilon_{i}!\mid}\leq 2u+O(u^{2}). We will use the abbreviation \mbox{\mid!\epsilon_{i}!\mid}\lesssim u to mean that is no bigger than a modest constant times .
The next step is to compute . In practice we do this using fl and make additional roundoff errors in the computation. We get \tilde{Z}=\mbox{{\rm fl}(Z)} satisfying
[TABLE]
Here \tilde{\gamma}=\mbox{\parallel!\mbox{{\rm fl}}!\parallel}. A tiny relative error is made during this norm computation, and another tiny error is made when fl is divided by . These are the causes of the error , and we have \mbox{\mid!\epsilon_{5}!\mid}\lesssim u. Similarly \mbox{\mid!\epsilon_{6}!\mid}\lesssim u. For more details about this computation see [1, § 1.4].
The vector defined by (26) is our computed (and normalized) version of a right eigenvector associated with eigenvalue . For later use we wish to show that is exactly an eigenvector of a slightly perturbed pencil. Thus we seek , , , and such that
[TABLE]
Notice that we are not going to back any of the error onto or . This equation is equivalent to
[TABLE]
Filling in the values of and from (26) and (25), we can check that this equation holds if we make the assignments
[TABLE]
[TABLE]
Clearly \mbox{\mid!\tilde{\alpha}{i}-\alpha{i}!\mid}\lesssim u\mbox{\mid!\alpha_{i}!\mid} and \mbox{\mid!\tilde{\beta}{i}-\beta{i}!\mid}\lesssim u\mbox{\mid!\beta_{i}!\mid} for , . Equation (27) can be written more compactly as
[TABLE]
Thus is an eigenvector of the perturbed pencil associated with eigenvalue . We also write
[TABLE]
with and diagonal matrices satisfying \mbox{\parallel!\delta A!\parallel}\lesssim u\mbox{\parallel!A!\parallel} and \mbox{\parallel!\delta B_{1}!\parallel}\lesssim u\mbox{\parallel!B!\parallel}.
Finally we compute . In exact arithmetic is constructed so that , for some , so the first column of must be proportional to . In practice, instead of we use
[TABLE]
where \mbox{\mid!\epsilon_{i}^{\prime}!\mid}\lesssim u for , , . The computed version of is \tilde{Q}=\mbox{{\rm fl}(Q)} satisfying
[TABLE]
where \check{\zeta}=\mbox{\parallel!\check{y}!\parallel}, and and are due to the tiny roundoff errors in the calculation.
For our analysis we need to establish that there is a slightly perturbed matrix
[TABLE]
such that has an exact zero in the position. This just means that is exactly proportional to . It is easy to check that the choice
[TABLE]
does the trick. Clearly \mbox{\mid!\hat{\beta}{1}-\beta{1}!\mid}\lesssim u\,\mbox{\mid!\beta_{1}!\mid} and \mbox{\mid!\hat{\beta}{2}-\beta{2}!\mid}\lesssim u\,\mbox{\mid!\beta_{2}!\mid}, and is a diagonal matrix satisfying \mbox{\parallel!\delta B_{2}!\parallel}\lesssim u\,\mbox{\parallel!B!\parallel}.
Our final computed results are fl and fl. We have to show that the entries of these matrices are small enough that we can set them to zero without compromising backward stability. The “” part is routine. Focusing on the entry, we have
[TABLE]
where is the matrix of roundoff errors incurred in multiplying the three matrices together and satisfies \mbox{\parallel!E_{1}!\parallel}\lesssim u\,\mbox{\parallel!\tilde{Q}!\parallel}\,\mbox{\parallel!B!\parallel}\,\mbox{\parallel!\tilde{Z}!\parallel}, i.e. \mbox{\parallel!E_{1}!\parallel}\lesssim u\,\mbox{\parallel!B!\parallel}. The remaining term is
[TABLE]
The first term on the right-hand side is exactly zero by construction. The second is bounded above by \mbox{\parallel!\delta B_{2}!\parallel}\lesssim u\,\mbox{\parallel!B!\parallel}. This takes care of the “” part.
The “” part (the important part) is more delicate. We have
[TABLE]
where is the matrix of roundoff errors incurred in multiplying the three matrices together and satisfies \mbox{\parallel!E_{2}!\parallel}\lesssim u\,\mbox{\parallel!A!\parallel}. The remaining term is
[TABLE]
The second term on the right-hand side is bounded above by \mbox{\parallel!\delta A!\parallel}\lesssim u\,\mbox{\parallel!A!\parallel}, so now we can just focus on the other term. Here we make use of (28), which can be written as .
[TABLE]
The term containing is zero by construction, so now we just need to concentrate on the other term. Let . From the definitions of and we see that
[TABLE]
where \mbox{\mid!\epsilon_{i}^{\prime\prime}!\mid}\lesssim u for , . Moreover for some tiny . We also use our assumption \mbox{\mid!\sigma_{1}!\mid}\geq\mbox{\mid!\sigma_{2}!\mid} to deduce that \mbox{\mid!\beta_{1}\alpha_{2}/\beta_{2}!\mid}\leq\mbox{\mid!\alpha_{1}!\mid}. Thus
[TABLE]
so
[TABLE]
We conclude that our one remaining term, which is , satisfies
[TABLE]
We have demonstrated that
[TABLE]
so we can set these numbers to zero without compromising backward stability. The symbols hide constants, but these constants are not too large due to the small total number of operations required by the swap.
Our procedure improves on that of Van Dooren [27] in that the latter only guarantees that the two entries are bounded above by u\max\{\mbox{\parallel!A!\parallel},\mbox{\parallel!B!\parallel}\} instead of u\,\mbox{\parallel!A!\parallel} and u\,\mbox{\parallel!B!\parallel} separately. It follows that our procedure produces better results in cases where and have vastly different norms. We remind the reader that the and referred to here are the small matrices defined in (18), and not the larger matrices in which they are imbedded. Therefore we cannot solve the problem of different norms by a simple rescaling of the large matrices at the outset, as that does not guarantee equal norms in all of the little submatrices in which the swaps take place.
Numerical experiments
In most cases it does not matter which swapping procedure is used; they all perform well. In order to see a difference, they must be stress tested on pencils that have elements that vary widely in magnitude. Therefore, in the two experiments reported here, we used pencils whose nonzero entries are randomly generated complex numbers with magnitudes distributed logarithmically in the range from to .
In our first test we generated sixty-four million random upper-triangular pencils and computed the swapping transformations using three different algorithms: our method, the method of Van Dooren [27], and a method that solves the generalized Sylvester equation explicitly to determine and [7]. The computations were done in IEEE standard double-precision arithmetic, for which . Table 1 shows that our method always produces residuals \mbox{\mid!a_{21}!\mid}/\mbox{\parallel!A!\parallel} and \mbox{\mid!b_{21}!\mid}/\mbox{\parallel!B!\parallel} that are under , and more than of them are under . In contrast, the Van Dooren and Sylvester methods sometimes produce much larger residuals, approaching in a few cases. If we change the criterion and consider the residuals \mbox{\mid!a_{21}!\mid}/\Delta and \mbox{\mid!b_{21}!\mid}/\Delta, where \Delta=\max\{\mbox{\parallel!A!\parallel},\mbox{\parallel!B!\parallel}\}, then all methods perform well, as Table 2 shows. By this criterion all residuals are under . Our method and Van Dooren’s method perform about equally well, and the Sylvester method is almost as good. We conclude that if and are roughly the same, it doesn’t matter which method is used. However, in problems for which there can be large differences in magnitude between and , our method is better.
It is natural to ask whether improved backward stability of the swapping transformations actually results in more accurate computed eigenvalues of the larger pencils. To test this we considered ten thousand randomly generated upper Hessenberg pencils with logarithmically distributed entries with magnitudes varying from to . Since we do not know the exact eigenvalues of these pencils, we used MATLAB with the ADVANPIX Multiprecision Computing Toolbox555https://www.advanpix.com to compute “exact” eigenvalues in quadruple precision arithmetic. We compared these with the approximate eigenvalues computed using our method and Van Dooren’s.
Before we look at that comparison, we note we didn’t just compute the eigenvalues; in fact we computed the Schur form , where and are upper triangular. This allowed us to compute residuals
[TABLE]
which are measures of backward error. When the computation was done using our method, the residuals were always tiny, never exceeding , verifying normwise backward stability. When Van Dooren’s criterion was used, the residuals (30) were usually just as small but occasionally larger. If the denominators and in the residuals and are replaced by \Delta=\max\{\mbox{\parallel!A!\parallel},\mbox{\parallel!B!\parallel}\}, the Van Dooren residuals also become uniformly small, never exceeding .
Of course tiny backward errors do not guarantee accurate computed eigenvalues, as some of them may be ill conditioned. Moreover, decreasing the backward error does not necessarily guarantee improved eigenvalue accuracy, so we must make the comparison. Let , , , , denote the “exact” eigenvalues produced in quadruple precision, let denote the approximate eigenvalues computed by our method, and let
[TABLE]
the maximum relative error. Let denote the eigenvalues computed by Van Dooren’s method, and let denote the maximum relative error, defined analogously to as in (31).
We examined the ratios and found that just over 98% of our trials resulted in , indicating that neither method was significantly more accurate than the other. (In fact there were many cases where , since it often happens that our criterion and Van Dooren’s criterion make exactly the same decisions.) Of the remaining trials, which numbered 181, there were 145 in which our method did significantly better than Van Dooren’s, i.e. , and 36 in which . Thus our new method obtained more accurate eigenvalues in about 80% of the significant cases. More details are given in histogram form in Figure 1. The gap in the center of the figure is due to having left out the many cases for which is close to . In the interest of compactness and clarity, the figure also leaves out one “off the charts” case for which .
9 Conclusions
We have discussed the RQZ algorithm and a number of variants, which we refer to generally as pole-swapping algorithms. We have made two main contributions: 1) We have developed a flexible, modular convergence theory that can be applied to any pole-swapping algorithm. 2) We have presented a new, more accurate, swapping procedure. A backward error analysis and numerical experiments demonstrate the superiority of the new procedure.
Acknowledgment
We thank the anonymous referees for carefully reading the paper and suggesting several improvements.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] J. L. Aurentz, T. Mach, L. Robol, R. Vandebril, and D. S. Watkins , Core-Chasing Algorithms for the Eigenvalue Problem , SIAM, Philadelphia, 2018.
- 2[2] , Fast and backward stable computation of roots of polynomials, part II: backward error analysis; companion matrix and companion pencil , SIAM J. Matrix Anal. Appl., 39 (2018), pp. 1245–1269.
- 3[3] , Fast and backward stable computation of the eigenvalues and eigenvectors of matrix polynomials , Math. Comp., 88 (2019), pp. 313–347.
- 4[4] J. L. Aurentz, T. Mach, R. Vandebril, and D. S. Watkins , Fast and backward stable computation of roots of polynomials , SIAM J. Matrix Anal. Appl., 36 (2015), pp. 942–973.
- 5[5] Z. Bai and J. Demmel , On swapping diagonal blocks in real Schur form , Linear Algebra Appl., 186 (1993), pp. 73–95.
- 6[6] M. Berljafa and S. Güttel , Generalized rational Krylov decompositions with an application to rational approximation , SIAM J. Matrix Anal. Appl., 36 (2015), pp. 894–916.
- 7[7] A. Bojanczyk and P. Van Dooren , Reordering diagonal blocks in the real Schur form , in Linear Algebra for Large Scale and Real-Time Applications, M. Moonen, G. Golub, and B. D. Moor, eds., NATO ASI Series E: Applied Sciences, Springer, 1993, pp. 351–352.
- 8[8] K. Braman, R. Byers, and R. Mathias , The multishift QR algorithm. part I: Maintaining well-focused shifts and level 3 performance , SIAM J. Matrix Anal. Appl., 23 (2002), pp. 929–947.
