On the global convergence of the Jacobi method for symmetric matrices of order 4 under parallel strategies
Erna Begovic, Vjeran Hari

TL;DR
This paper proves the global convergence of certain parallel cyclic Jacobi methods for symmetric matrices of order 4, showing they consistently reduce the off-diagonal norm, and discusses the speed variability depending on matrix properties.
Contribution
It establishes the global convergence of specific parallel Jacobi strategies for 4x4 symmetric matrices, a result not previously confirmed for these methods.
Findings
The inequality S(A^{[2]}) ≤ (1 - 10^{-5}) S(A) holds for all symmetric 4x4 matrices after two cycles.
The method's convergence is guaranteed under all fully parallel strategies considered.
There exist matrices where the first cycle does not significantly reduce the off-diagonal norm, indicating variability in convergence speed.
On the Global Convergence of the Jacobi Method for Symmetric Matrices of Order 4 under Parallel Strategies
Erna Begović Kovač and Vjeran Hari
(Date: 5 March 2017)
Abstract.
The paper analyzes special cyclic Jacobi methods for symmetric matrices of order $4$. Only those cyclic pivot strategies that enable full parallelization of the method are considered. These strategies, unlike the serial pivot strategies, can force the method to be very slow or very fast within one cycle, depending on the underlying matrix. Hence, for the global convergence proof one has to consider two or three adjacent cycles. It is proved that for any symmetric matrix $A$ of order $4$ the inequality $S(A^{[2]}) \le (1-10^{-5})\,S(A)$ holds, where $A^{[2]}$ results from $A$ by applying two cycles of a particular parallel method. Here $S(A)$ stands for the Frobenius norm of the strictly upper-triangular part of $A$. The result holds for two special parallel strategies and implies the global convergence of the method under all possible fully parallel strategies. It is also proved that for every $\epsilon>0$ and $n\ge 4$ there exist a symmetric matrix $A(\epsilon)$ of order $n$ and a cyclic strategy, such that upon completion of the first cycle of the appropriate Jacobi method the inequality $S(A^{[1]}) > (1-\epsilon)\,S(A(\epsilon))$ holds.
Key words and phrases:
Eigenvalues, symmetric matrix of order 4, Jacobi method, global convergence, parallel pivot strategies
2010 Mathematics Subject Classification:
65F15, 65G99
Erna Begović Kovač, Faculty of Chemical Engineering and Technology, University of Zagreb, Marulićev trg 19, 10000 Zagreb, Croatia
Vjeran Hari, Department of Mathematics, Faculty of Science, University of Zagreb, Bijenička 30, 10000 Zagreb, Croatia
This work has been fully supported by Croatian Science Foundation under the project 3670.
1. Introduction
The Jacobi method applies a sequence of similarity transformations by plane rotations to a symmetric matrix in order to diagonalize it. The method can be described as an iterative process of the form
$$A^{(k+1)} = R_k^T A^{(k)} R_k,\qquad k\ge 0,\qquad A^{(0)}=A,$$
where $R_k$ are plane rotations and $A$ is a symmetric matrix of order $n$. The method is globally convergent if, for each starting $A$, the generated sequence $(A^{(k)},\ k\ge 0)$ converges to a diagonal matrix. Its global (asymptotic) convergence has been considered in [8, 9, 3, 12, 17] ([19, 10]) and its accuracy in [4, 5, 6, 15]. A one-sided version of the method has been studied in [13, 18] and the block versions in [7, 11, 1]. There are many papers on Jacobi methods, and further references can be found within the bibliographies of the papers cited above.
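At step $k$ the method annihilates two off-diagonal elements of $A^{(k)}$, $a_{ij}^{(k)}$ and $a_{ji}^{(k)}$, $i<j$. The element $a_{ij}^{(k)}$ is the pivot element, while $i=i(k)$ and $j=j(k)$ are the pivot indices. The way of selecting the pivot pair $(i,j)$ at each step is called the pivot strategy. The elements of $R_k$ are the same as in the identity matrix $I_n$, except for the elements at positions $(i,i)$, $(i,j)$, $(j,i)$, $(j,j)$, which are $\cos\phi_k$, $-\sin\phi_k$, $\sin\phi_k$, $\cos\phi_k$, respectively. The rotation angle $\phi_k$ is determined by the known formula
$$\tan 2\phi_k = \frac{2a_{ij}^{(k)}}{a_{ii}^{(k)}-a_{jj}^{(k)}},\qquad -\frac{\pi}{4}\le\phi_k\le\frac{\pi}{4}, \tag{1.1}$$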
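which implies
$$a_{ii}^{(k+1)} = a_{ii}^{(k)} + \tan\phi_k\, a_{ij}^{(k)},\qquad a_{jj}^{(k+1)} = a_{jj}^{(k)} - \tan\phi_k\, a_{ij}^{(k)}, \tag{1.2}$$
and
$$S^2\big(A^{(k+1)}\big) = S^2\big(A^{(k)}\big) - \big(a_{ij}^{(k)}\big)^2,\qquad k\ge 0. \tag{1.3}$$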
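Here $S(A)$ stands for the off-norm of a symmetric matrix $A$ of order $n$,
$$S(A) = \frac{\sqrt{2}}{2}\,\big\|A-\operatorname{diag}(A)\big\|_F = \Big(\sum_{i<j} a_{ij}^2\Big)^{1/2}.$$
In the definition (1.1) of the rotation angle, we assume that $\phi_k=0$ if $a_{ij}^{(k)}=0$ and $a_{ii}^{(k)}=a_{jj}^{(k)}$. This is the most natural assumption, and it can be rephrased as: if the pivot element is zero, just skip it.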
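To make the single step concrete, here is a minimal NumPy sketch of one Jacobi rotation. It assumes the standard angle formula $\tan 2\phi = 2a_{ij}/(a_{ii}-a_{jj})$ with $\phi\in[-\pi/4,\pi/4]$ and the rotation entries $\cos\phi$, $-\sin\phi$, $\sin\phi$, $\cos\phi$. The function names `off_norm`, `rotation_angle` and `jacobi_step` and the test matrix are illustrative choices, not taken from the paper.

```python
import math
import numpy as np


def off_norm(a):
    """Frobenius norm of the strictly upper-triangular part of a."""
    return float(np.linalg.norm(np.triu(a, k=1)))


def rotation_angle(a, i, j):
    """Angle phi in [-pi/4, pi/4] with tan(2*phi) = 2*a_ij / (a_ii - a_jj)."""
    if a[i, j] == 0.0:
        return 0.0  # zero pivot: skip the step
    if a[i, i] == a[j, j]:
        return math.copysign(math.pi / 4.0, a[i, j])
    return 0.5 * math.atan(2.0 * a[i, j] / (a[i, i] - a[j, j]))


def jacobi_step(a, i, j):
    """Return R^T a R, where R rotates in the (i, j) plane (0-based, i < j)."""
    a = np.array(a, dtype=float)
    phi = rotation_angle(a, i, j)
    c, s = math.cos(phi), math.sin(phi)
    r = np.eye(a.shape[0])
    r[i, i], r[i, j], r[j, i], r[j, j] = c, -s, s, c
    return r.T @ a @ r


# Illustrative symmetric matrix (an assumption, not an example from the paper).
A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 1.0],
              [2.0, 1.0, 2.0]])
B = jacobi_step(A, 0, 1)
print(abs(B[0, 1]))                          # the pivot element is annihilated
print(off_norm(A) ** 2 - off_norm(B) ** 2)   # drop in S^2, close to A[0, 1]**2
```

The second printed value illustrates why the off-norm cannot increase: each step removes the square of the current pivot element from the squared off-norm.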
Since the diagonal elements converge if the rotation angle is chosen as in the relation (1.1) (see [14]), it is easy to show that the obtained sequence $(A^{(k)},\ k\ge 0)$ converges to some diagonal matrix if and only if
$$\lim_{k\to\infty} S\big(A^{(k)}\big) = 0. \tag{1.4}$$
Therefore, the method is globally convergent if (1.4) holds for any initial $A$. Since the sequence $(S(A^{(k)}),\ k\ge 0)$ is nonincreasing, for the global convergence of the method it is sufficient to show that for any symmetric matrix $A$ we have
$$S\big(A^{(MN)}\big) \le c\,S(A),\qquad 0\le c<1, \tag{1.5}$$
where $c$ and $M$ do not depend on $A$, and $N=n(n-1)/2$ is the number of steps in one cycle. Here we prove that the relation (1.5) holds with $M=2$ or $M=3$, for the case $n=4$ and for those cyclic strategies which enable parallel processing. For these strategies one cycle (or sweep) consists of three “parallel steps”.
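The sufficiency of a bound of this kind can be written out in one line. The sketch below assumes that the bound has the form $S(A^{(MN)})\le c\,S(A)$ with $0\le c<1$ for every symmetric starting matrix, where $N=n(n-1)/2$ is the number of steps in one cycle; the symbols $c$, $M$ and $N$ are notation assumed here for illustration. Because the strategy is cyclic, the iterate $A^{((k-1)MN)}$ can be treated as a new starting matrix for the same method, so the bound can be applied repeatedly:

```latex
S\big(A^{(kMN)}\big) \;\le\; c\, S\big(A^{((k-1)MN)}\big) \;\le\; \cdots \;\le\; c^{k}\, S(A)
\;\xrightarrow[k\to\infty]{}\; 0 .
```

Since the full sequence of off-norms is nonincreasing and one of its subsequences tends to zero, the full sequence tends to zero as well.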
Why would one consider the Jacobi method for symmetric matrices of order $4$ when that problem can be solved directly? The Jacobi method is known for its high relative accuracy on well-behaved symmetric matrices, for its efficiency on nearly diagonal matrices, and for its suitability for parallel processing. So, the natural choice of a pivot strategy for matrices of order $4$ is a parallel strategy. We have discovered that parallel strategies are very special. Depending on the underlying matrix, the reduction of the off-norm per sweep can be extremely slow or fast. This knowledge can be used to improve the implementation of the algorithm. Finally, the Jacobi method for large symmetric positive definite matrices is nowadays implemented as a one-sided block algorithm. At each step the block algorithm has to solve the same eigenvalue problem, but for a much smaller matrix. For this purpose one can use an element-wise Jacobi method, or one can accelerate it by using the block algorithm which solves a $4$ by $4$ eigenvalue problem at each step.
There are several comments related to the inequality (1.5) and its proof. First, the proof presented here reveals that the reduction of the off-norm during one cycle can be arbitrarily small. It sheds light on the convergence failure of the cyclic Jacobi method discussed in [3]. We show that for every $\epsilon>0$ there is a starting matrix $A$ and a cyclic Jacobi method such that upon completion of the first cycle the inequality $S(A') > (1-\epsilon)\,S(A)$ holds, where $A'$ denotes the iterate obtained after that cycle. This fact is first proved for $n=4$ and then for any $n\ge 4$. Hence the global convergence consideration for the general cyclic Jacobi method should scrutinize more than one cycle of the process. Second, the presented result covers the most difficult part in the proof that every cyclic Jacobi method for symmetric matrices of order $4$ is globally convergent [1, 2].
The paper is divided into five sections and three appendices. In Section 2 we introduce notation and the basic concepts of the theory of equivalent strategies. We also recall some known convergence results. In Section 3 we concentrate on parallel strategies and introduce an auxiliary tool, a linear operator, which simplifies the convergence analysis. The convergence result is formulated and proved for some trivial cases. Section 4 is devoted to the global convergence proof, and Section 5 to the construction of the above-mentioned matrix and to the proofs of the related results. Since the proofs of the main results are rather complicated, we have moved all lengthy and technical proofs to Appendices A, B and C. They are related to the results from Sections 3, 4 and 5, respectively.
Some of the results presented here can be found in the unpublished thesis [1].
2. Basic concepts and notation
For the Jacobi method for symmetric matrices of order $n$, the pivot strategy can be defined as a function $I:\mathbb{N}_0\to\mathbf{P}_n$, where $\mathbb{N}_0=\{0,1,2,\ldots\}$ and $\mathbf{P}_{n}=\{(i,j)\mid 1\leq i<j\leq n\}$. We say that at step $k$, $I$ selects the pivot pair $I(k)$, which lies in $\mathbf{P}_n$. Let $I$ be a pivot strategy. If there is a positive integer $T$ such that $I(k+T)=I(k)$ for all $k\ge 0$, we say that $I$ is periodic with period $T$. If $T=N\equiv n(n-1)/2$ and $\{I(k)\mid 0\leq k\leq T-1\}=\mathbf{P}_{n}$, the pivot strategy is cyclic.
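As an illustration of these definitions, the following sketch represents a pivot strategy as a Python function of the step index and checks the cyclic property. It assumes 1-based pivot pairs and uses the row-wise ordering of $\mathbf{P}_n$ as an example; the helper names `pivot_pairs`, `cyclic_strategy` and `is_cyclic_ordering` are hypothetical.

```python
from itertools import combinations


def pivot_pairs(n):
    """All pairs (i, j) with 1 <= i < j <= n, listed row by row."""
    return list(combinations(range(1, n + 1), 2))


def cyclic_strategy(ordering):
    """Periodic strategy k -> ordering[k mod T], where T = len(ordering)."""
    period = len(ordering)
    return lambda k: ordering[k % period]


def is_cyclic_ordering(ordering, n):
    """True if the ordering has length n(n-1)/2 and visits every pair once."""
    pairs = pivot_pairs(n)
    return len(ordering) == len(pairs) and set(ordering) == set(pairs)


row_cyclic = pivot_pairs(4)
strategy = cyclic_strategy(row_cyclic)
print([strategy(k) for k in range(8)])   # the first two pairs repeat after six steps
print(is_cyclic_ordering(row_cyclic, 4))
```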
For $S\subseteq\mathbf{P}_n$, let ${\Large\mathcal{O}}(S)$ denote the set of all finite sequences made of the elements of $S$, assuming that each pair from $S$ appears at least once in each sequence from ${\Large\mathcal{O}}(S)$. Let $\mathcal{O}$ be a sequence of pairs from ${\Large\mathcal{O}}(S)$. An admissible transposition on $\mathcal{O}$ is any transposition of two adjacent pairs from $\mathcal{O}$,
$$(i_r,j_r),\,(i_{r+1},j_{r+1})\ \longrightarrow\ (i_{r+1},j_{r+1}),\,(i_r,j_r),$$
provided that $\{i_r,j_r\}\cap\{i_{r+1},j_{r+1}\}=\emptyset$. For such pairs we say that they commute, or that they are disjoint. Two sequences $\mathcal{O},\mathcal{O}^{\prime}\in{\Large\mathcal{O}}(S)$ are called
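- (i) Equivalent if one can be obtained from the other by a finite number of admissible transpositions. Then we write $\mathcal{O}\sim\mathcal{O}^{\prime}$.
- (ii) Shift-equivalent if $\mathcal{O}=[\mathcal{V}_1,\mathcal{V}_2]$ and $\mathcal{O}^{\prime}=[\mathcal{V}_2,\mathcal{V}_1]$, where $[\ ,\ ]$ stands for the concatenation of the sequences $\mathcal{V}_1$ and $\mathcal{V}_2$. We write $\mathcal{O}\stackrel{\mathsf{s}}{\sim}\mathcal{O}^{\prime}$.
- (iii) Weakly equivalent if one can find $\mathcal{O}_i$, $0\le i\le r$, from ${\Large\mathcal{O}}(S)$ such that in the sequence $\mathcal{O}=\mathcal{O}_0$, $\mathcal{O}_1$, $\ldots$, $\mathcal{O}_r=\mathcal{O}^{\prime}$, each pair of adjacent terms consists of either equivalent or shift-equivalent terms. In such a case, we write $\mathcal{O}\stackrel{\mathsf{w}}{\sim}\mathcal{O}^{\prime}$.
One can check that $\sim$, $\stackrel{\mathsf{s}}{\sim}$ and $\stackrel{\mathsf{w}}{\sim}$ are equivalence relations on ${\Large\mathcal{O}}(S)$. In our application we shall have $S=\mathbf{P}_n$.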
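The first two relations can be tested mechanically for short orderings. The sketch below is a brute-force check, assuming orderings are lists of 1-based index pairs: equivalence is decided by a breadth-first search over admissible transpositions, and shift-equivalence by comparing cyclic shifts. The function names and the three sample orderings are hypothetical and chosen only for illustration.

```python
from collections import deque


def commute(p, q):
    """Two pivot pairs commute when they share no index."""
    return set(p).isdisjoint(q)


def equivalent(o1, o2):
    """True if o2 is reachable from o1 by admissible transpositions."""
    start, goal = tuple(o1), tuple(o2)
    if sorted(start) != sorted(goal):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            return True
        for r in range(len(cur) - 1):
            if commute(cur[r], cur[r + 1]):
                # swap the adjacent commuting pairs at positions r, r + 1
                nxt = cur[:r] + (cur[r + 1], cur[r]) + cur[r + 2:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


def shift_equivalent(o1, o2):
    """True if o2 is a cyclic shift [V2, V1] of o1 = [V1, V2]."""
    o1, o2 = list(o1), list(o2)
    return len(o1) == len(o2) and any(o1[s:] + o1[:s] == o2 for s in range(len(o1)))


a = [(1, 2), (3, 4), (1, 3), (2, 4), (1, 4), (2, 3)]
b = [(3, 4), (1, 2), (2, 4), (1, 3), (2, 3), (1, 4)]
c = [(1, 3), (2, 4), (1, 2), (3, 4), (1, 4), (2, 3)]
print(equivalent(a, b), equivalent(a, c), shift_equivalent(a, a[2:] + a[:2]))
```

The search is exponential in general, so it is only meant for small cases such as orderings of the six pairs of order 4.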
Once these equivalence relations are defined on \mathcal{\mbox{\Large\mathcal{O}}}(\mathbf{P}_{n}), they can easily be transferred to the set of cyclic pivot strategies. Here is the procedure.
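Let $I$ be a cyclic pivot strategy. By $\mathcal{O}_I$ we mean the sequence of pairs $I(0)$, $I(1)$, $\ldots$, $I(N-1)$. Conversely, for $\mathcal{O}\in{\Large\mathcal{O}}(\mathbf{P}_{n})$, $\mathcal{O}=(i_0,j_0),(i_1,j_1),\ldots,(i_{N-1},j_{N-1})$, the cyclic strategy $I_{\mathcal{O}}$ generated by $\mathcal{O}$ is defined by $I_{\mathcal{O}}(k)=(i_{\tau(k)},j_{\tau(k)})$, provided that $k\equiv\tau(k)\pmod N$, $0\le\tau(k)\le N-1$, $k\ge 0$. In other words, $I_{\mathcal{O}}$ runs through $\mathcal{O}$ in the cyclic way as $k$ increases.
Two cyclic strategies $I$ and $I^{\prime}$ are equivalent (we write $I\sim I^{\prime}$), shift-equivalent ($I\stackrel{\mathsf{s}}{\sim}I^{\prime}$) and weakly equivalent ($I\stackrel{\mathsf{w}}{\sim}I^{\prime}$) if the same is true for the corresponding sequences $\mathcal{O}_I$ and $\mathcal{O}_{I^{\prime}}$. Note that for the shift-equivalent strategies we have $I^{\prime}(k)=I(k+s)$, $k\ge 0$, for some shift $s$, $0\le s\le N-1$. (We can confine ourselves to nonnegative shifts since the strategies are periodic with period $N$.)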
The importance of weakly equivalent cyclic strategies comes from the following result.
Theorem 2.1 ([17]). If the Jacobi method converges for some cyclic strategy, then it also converges for all strategies that are weakly equivalent to it.
Note that Theorem 2.1 also covers the cases of equivalent and shift-equivalent strategies. Another important result regarding the convergence under two weakly equivalent strategies is proved in [11, Lemma 4.8].
A cyclic strategy can be represented by the matrix , where
[TABLE]
and , . Instead of , we shall display to indicate that the diagonal positions are not part of the pivot sequence (see (3.1)). If , we shall also write .
3. Parallel strategies in the case $n=4$
Let $A$ be a symmetric matrix of order $4$. Since $\mathbf{P}_4$ contains six pairs, each cyclic Jacobi method applies six steps within one cycle. Among all cyclic strategies a distinguished role is played by the “parallel” ones. They enable parallel processing, so the corresponding method will be called the parallel Jacobi method (cf. [16]). Each parallel Jacobi method for symmetric matrices of order $4$ applies three parallel steps within each cycle. Every parallel step consists of two consecutive steps which can be performed concurrently. This way, instead of six sequential steps, using a parallel pivot strategy, we apply three parallel steps within one cycle.
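The claim that the two steps of one parallel step can be performed concurrently can be checked numerically. The sketch below assumes the parallel step with pivot pairs $(1,2)$ and $(3,4)$ and the standard Jacobi angle formula. It computes both rotations from the same matrix, applies them together, and compares the result with two ordinary sequential steps. The helper `jacobi_rotation` and the test matrix are illustrative assumptions.

```python
import math
import numpy as np


def jacobi_rotation(a, i, j):
    """Plane rotation R (0-based i < j) such that (R^T a R)[i, j] = 0."""
    r = np.eye(a.shape[0])
    if a[i, j] == 0.0:
        return r  # zero pivot: identity rotation
    if a[i, i] == a[j, j]:
        phi = math.copysign(math.pi / 4.0, a[i, j])
    else:
        phi = 0.5 * math.atan(2.0 * a[i, j] / (a[i, i] - a[j, j]))
    c, s = math.cos(phi), math.sin(phi)
    r[i, i], r[i, j], r[j, i], r[j, j] = c, -s, s, c
    return r


# Illustrative symmetric matrix of order 4 (not taken from the paper).
A = np.array([[4.0, 1.0, 2.0, 3.0],
              [1.0, 3.0, 1.0, 2.0],
              [2.0, 1.0, 2.0, 1.0],
              [3.0, 2.0, 1.0, 1.0]])

# Both rotations are computed from the same matrix A.
R12 = jacobi_rotation(A, 0, 1)
R34 = jacobi_rotation(A, 2, 3)
R = R12 @ R34
parallel = R.T @ A @ R

# Two ordinary Jacobi steps, the second angle computed after the first step.
step1 = R12.T @ A @ R12
R34_seq = jacobi_rotation(step1, 2, 3)
sequential = R34_seq.T @ step1 @ R34_seq

print(np.allclose(R12 @ R34, R34 @ R12))   # the two rotations commute
print(np.allclose(parallel, sequential))   # one parallel step equals two steps
```

The agreement relies on the first rotation leaving the entries at positions $(3,3)$, $(3,4)$ and $(4,4)$ untouched, so the second angle does not depend on the order of the two steps.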
As we shall see, it will be sufficient to study just two cyclic pivot strategies and , which have the following two-dimensional representations
[TABLE]
respectively. All cyclic strategies that can be fully parallelized, are shift equivalent to or . Therefore, the convergence results for all parallel strategies follow from the results for the strategies and .
Consider the sets of pairs $\{(1,2),(3,4)\}$, $\{(1,3),(2,4)\}$, $\{(1,4),(2,3)\}$. Note that the pairs within braces commute. These are the only sets that contain commuting pairs, and only they can define parallel Jacobi steps. From the first braces we see that the corresponding plane rotations commute, and also that their entries can be computed independently of each other. So, the corresponding Jacobi steps can be applied in parallel: first apply concurrently the left transformations and then the right ones, or vice versa. This corresponds to one parallel step, which consists of two subsequent ordinary Jacobi steps. The same can be said for the steps corresponding to the other two braces. This leads us to parallel strategies, which we represent by the matrices
[TABLE]
Here, the matrix entries count the parallel steps and mark the pivot positions associated with them.
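The commuting pairs for order 4 can be enumerated directly. The following sketch lists all sets of two disjoint pivot pairs and then counts the orderings of the resulting parallel steps. It treats the order of the two pairs inside one parallel step as immaterial, which is an assumption of this illustration; the function names are hypothetical.

```python
from itertools import combinations, permutations


def parallel_steps(n=4):
    """All sets of two disjoint pivot pairs (i, j), 1 <= i < j <= n."""
    pairs = list(combinations(range(1, n + 1), 2))
    return [(p, q) for p, q in combinations(pairs, 2) if set(p).isdisjoint(q)]


def parallel_strategies():
    """Every ordering of the parallel steps for n = 4, flattened to six pairs."""
    steps = parallel_steps(4)
    return [[pair for step in order for pair in step]
            for order in permutations(steps)]


steps = parallel_steps(4)
strategies = parallel_strategies()
print(steps)             # three parallel steps
print(len(strategies))   # number of orderings of the three parallel steps
for ordering in strategies:
    print(ordering)
```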
By inspecting all commuting pairs, we conclude that there are exactly six parallel strategies and they can be grouped into two clusters which are actually equivalence classes for the relation . They are defined by the following orderings from ${\Large\mathcal{O}}(\mathbf{P}_{4})$, where , ,
[TABLE]
In order to prove the global convergence of the Jacobi method under all six parallel strategies, it is sufficient to prove it for the strategies and . This follows from Theorem 2.1. Next, we show that the strategies and are closely connected, so that the method converges under one of them if and only if it converges under the other one. To this end, note that the matrices and are permutationally similar,
[TABLE]
where or . Here is the transposition which interchanges rows (columns) and if a matrix is premultiplied (postmultiplied) by it. If (3.2) holds, we say that and are permutationally equivalent (see [1, 2]).
Proposition 3.1.
Let be a symmetric matrix of order . Let be obtained by applying the cyclic Jacobi method defined by the strategy on . Let or , and let be obtained by applying the cyclic Jacobi method defined by the strategy on . Then , .
Proof.
The proof has been moved to Appendix A. ∎
Thus, Proposition 3.1 implies that the Jacobi method converges under the strategy if and only if it converges under the strategy . In particular, if the relation (1.5) holds for the method defined by , with some and , it holds for the method defined by with the same and , and vice versa. Theorem 3.4 below, shows that the relation (1.5) holds for the strategy with and .
What can be said for the method under the strategies , and , ? For these strategies, the relation (1.5) holds with the same and with larger for . In particular, for and . We shall show it for . For the other three strategies the proof is similar.
Let us apply the Jacobi method defined by the strategy to a symmetric matrix of order , thus generating the sequence of matrices , , . Let us consider cycles of the method. We display each second iterate, i.e. the iterates obtained after each of the first nine parallel steps:
[TABLE]
We concentrate on the matrix . If another Jacobi method is applied to , the one defined by the strategy , one obtains (after each two steps) the same matrices . After two sweeps, one obtains the matrix and, if the relation (1.5) holds for with and , then one has . Therefore, one obtains
[TABLE]
proving the claim.
3.1. The cyclic strategy
We focus on strategy where
[TABLE]
By this strategy, at the beginning of each cycle (except for the first cycle), the elements at the positions and are zero. Since we consider the global convergence, we can assume that the initial matrix already has the form
[TABLE]
Let denote the column partition of the identity matrix, and let
[TABLE]
The similarity transformation with and with has the following effect on the elements of a square matrix ,
[TABLE]
Thus, for each we have . We see a favorable movement of the elements lying at the pivot positions for the parallel steps. We can use it to define a new iterative process, closely related to the original Jacobi process, where the pivot elements always remain at the same positions. This will simplify the analysis.
Therefore, we introduce a linear operator which is comprised of the transformation corresponding to the first parallel step under followed by the similarity transformation with .
Definition 3.2.
Let denote the vector space of real by symmetric matrices. For let
[TABLE]
where and are Jacobi rotations which annihilate the elements and of , respectively, and is defined by the relation (3.4). The rotation angles , are from the interval , so that the formulas (1.1)–(1.3) hold.
For let and for any
[TABLE]
Thus, is a linear operator. Note that if and , then reduces to the similarity transformation with the similarity matrix . The function is not linear. However, it satisfies
[TABLE]
If is as in the relation (3.3) and , then we have
[TABLE]
with
[TABLE]
The rotation angles and are determined by
[TABLE]
First, we show that the repeated application of to yields the matrices which are closely related to Jacobi iterations under the parallel strategy .
Proposition 3.3.
Let and let be obtained by applying steps of the Jacobi method under the strategy to . Then
[TABLE]
Proof.
The proof is lengthy and technical, so we have moved it to Appendix A. ∎
In particular, the relation (3.8) implies
[TABLE]
We use Proposition 3.3 to simplify the proof of the main result which follows.
Theorem 3.4.
Let be such that , and let be obtained by applying steps of the Jacobi method under the strategy to . Then
[TABLE]
with .
Note that $12$ steps correspond to two sweeps of the method. Theorem 3.4 ensures the global convergence of the method, since the sequence of the off-norms of the iterates is nonincreasing and its subsequence taken after every second sweep converges to zero.
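The behavior over two sweeps can be observed numerically. The sketch below runs $12$ steps of a parallel cyclic Jacobi method on one symmetric matrix of order 4 and records the off-norm after every step. The ordering of the parallel steps in `ORDER` and the test matrix are assumptions made for this illustration; they are not claimed to coincide with the strategy or the examples analyzed in the paper.

```python
import math
import numpy as np

# Assumed ordering of three parallel steps (1-based pivot pairs).
ORDER = [(1, 2), (3, 4), (1, 3), (2, 4), (1, 4), (2, 3)]


def off_norm(a):
    """Frobenius norm of the strictly upper-triangular part of a."""
    return float(np.linalg.norm(np.triu(a, k=1)))


def jacobi_step(a, i, j):
    """One Jacobi step annihilating a[i, j] (0-based indices, i < j)."""
    if a[i, j] == 0.0:
        return a.copy()  # zero pivot: skip the step
    if a[i, i] == a[j, j]:
        phi = math.copysign(math.pi / 4.0, a[i, j])
    else:
        phi = 0.5 * math.atan(2.0 * a[i, j] / (a[i, i] - a[j, j]))
    c, s = math.cos(phi), math.sin(phi)
    r = np.eye(a.shape[0])
    r[i, i], r[i, j], r[j, i], r[j, j] = c, -s, s, c
    return r.T @ a @ r


def off_norms_per_step(a, sweeps):
    """Off-norms of the starting matrix and of every iterate."""
    a = np.array(a, dtype=float)
    norms = [off_norm(a)]
    for _ in range(sweeps):
        for p, q in ORDER:
            a = jacobi_step(a, p - 1, q - 1)
            norms.append(off_norm(a))
    return norms


A = np.array([[4.0, 1.0, 2.0, 3.0],
              [1.0, 3.0, 1.0, 2.0],
              [2.0, 1.0, 2.0, 1.0],
              [3.0, 2.0, 1.0, 1.0]])
norms = off_norms_per_step(A, sweeps=2)
print(norms[0], norms[6], norms[12])   # start, after one sweep, after two sweeps
```

For a generic matrix such as this one the reduction is far stronger than the worst-case factor in the theorem, which is consistent with the remark that the proof concentrates on the worst case.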
The proof of the main theorem is lengthy, hence we will devote the entire Section 4 to it. However, we first provide a lemma that covers the special cases when more than two off-diagonal elements are equal to zero. Then the relation (3.10) holds with much larger ( or ).
Lemma 3.5.
Let be such that and . If
- (i)
* and , then is diagonal.*
- (ii)
* and , then is diagonal.*
- (iii)
, then .
- (iv)
, then .
- (v)
, then .
- (vi)
, then .
Proof.
The proof has been moved to Appendix A. ∎
4. Proof of Theorem 3.4
Let . Then the assertion (3.10) of Theorem 3.4 can be expressed in the form
[TABLE]
Instead of working with matrices , , we shall work with , . Let , so that holds. If , then Theorem 3.4 holds. We assume .
Contrary to the assertion of the theorem suppose that
[TABLE]
We shall show that the relation (4.2) leads to a contradiction.
From Lemma 3.5, we conclude that all off-diagonal elements of except for and are non-zero for . Furthermore, by Lemma 3.5(i), we have , and since , , we have
[TABLE]
Let
[TABLE]
From our assumptions it follows that for at least . Note that
[TABLE]
This implies
[TABLE]
in particular
[TABLE]
Formulas (3.6) describe the transition from to for any . In this transition we denote the angles and by and , respectively. If we set , , , , , then from the formulas (3.6) we have
[TABLE]
Here, has been bounded by , as in the proof of Lemma 3.5(v). This implies
[TABLE]
Together with (4.6), the relation (4.7) yields
[TABLE]
Lemma 4.1.
Exactly one of the following two assertions holds:
- (a)
, ,
- (b)
, .
Proof.
Suppose that the two inequalities in (a) hold for some , . Then, because of the relations (4.5) and (4.3), we have
[TABLE]
and therefore
[TABLE]
Thus, the corresponding inequality in (b) cannot hold for that . Similarly, if the two inequalities in (b) hold for some , then the corresponding inequality in (a) cannot be true.
Now, let us show that if holds for , then it holds for all . From the relations (3.6) it follows
[TABLE]
The relation (4.11) implies
[TABLE]
for . Therefore, if the two inequalities in (a) hold for , then the relation (4.9) will hold for . For any of these the relation (4.10) also holds, proving that the inequalities in (b) cannot hold. Hence we can conclude that both inequalities in (a) hold for .
Similarly, if the two inequalities in (b) hold for , they hold for all and then the inequalities in (a) do not hold. ∎
We continue to prove (4.1) under the assumption (a). The case (b) will be addressed later.
4.1. The case (a)
Let us see what can be concluded for the rotation angles. Using , for , , respectively, from the relations (3.6) one easily obtains
[TABLE]
Hence,
[TABLE]
We used the definition of from (4.4). In the same way, one obtains
[TABLE]
for . Using the relations (4.5), (4.3), the assumption (a) and Lemma 4.1, we conclude that the relation (4.14) implies
[TABLE]
for . Thus
[TABLE]
Here we used . In (4.15) we have the strict inequalities when and that is certainly true for .
Lemma 4.2.
For the angles , , , we have the following relations.
- (i)
One of the following two relations holds
[TABLE]
[TABLE]
- (ii)
.
- (iii)
.
- (iv)
If , then
[TABLE]
- (v)
.
- (vi)
.
- (vii)
.
Proof.
The proof is technical and has been moved to Appendix B. ∎
From the proof, one can easily check that in the assertions of Lemma 4.2, the inequality signs and standing left to can be replaced by and respectively, provided that (which is true for ).
Let
[TABLE]
Lemma 4.1(a) implies , . From the relation (4.11) it follows
[TABLE]
Hence by Lemma 4.2(ii) we have
[TABLE]
The next step is bounding , , by simple functions of the subsequent s.
Lemma 4.3.
The quantities satisfy the following inequalities
[TABLE]
Hence we obtain
[TABLE]
Proof.
The proof has been moved to Appendix B. ∎
Lemma 4.4.
For the pivot elements of we have:
- (i)
*. *
In particular,
[TABLE]
- (ii)
\big{(}b_{13}^{(k)}+b_{24}^{(k)}\big{)}^{2}=2\delta_{k}^{2}S^{2}(B)-\big{(}b_{13}^{(k)}-b_{24}^{(k)}\big{)}^{2},\ k\geq 0.* *
Hence,
[TABLE]
and in particular
[TABLE]
- (iii)
.
Proof.
The proof is technical and has been moved to Appendix B. ∎
We come to the main part of the proof. So far, we derived the restrictions on the angles and other quantities, which are expressed in the previous lemmas. The question arises whether the diagonal elements, which enter into the definition of the angles, can allow all those limitations.
Consider the quantities
[TABLE]
Each will be expressed in two ways. On the one hand, we use (3.6) and (B.3) to obtain
[TABLE]
Hence, by the assertions (vi) and (vii) of Lemma 4.2, for we have
[TABLE]
On the other hand, we use (B.3) or (3.7) to obtain
[TABLE]
Therefore, we have
[TABLE]
Using Lemma 4.4(iii) we obtain
[TABLE]
Furthermore, using Lemma 4.2(v), (4.21) and Lemma 4.4 we can bound from the above the right hand side of the inequality (4.23) divided by . We obtain
[TABLE]
These inequalities hold for . Hence, for we have
[TABLE]
By inspecting the two cases, one checks that the case yields the contradiction with the relation (4.2) from the beginning of this proof. This is exactly what we need to prove the assertion (3.10) of the theorem. For the relation (4.24) becomes
[TABLE]
Using Lemma 4.4(ii) (actually the bound from (4.20)) the above relation implies
[TABLE]
From Lemma 4.3 we have and . Moreover,
[TABLE]
Dividing the inequality (4.25) by and using (4.26) and (4.18) we get the contradiction
[TABLE]
We used the bound for and , and for .
4.2. The case (b)
The proof is similar to the one for the case (a). We follow the lines of the proof above and modify it where necessary.
The expression on the right-hand side in the relation (4.13) can easily be brought to different form. We obtain
[TABLE]
Then the relation (4.14) becomes
[TABLE]
This implies
[TABLE]
Lemma 4.2 has to be modified and we formulate it as a new lemma.
Lemma 4.5.
For the angles , , , we have the following relations.
- (i)
One of the following two relations holds
[TABLE]
[TABLE]
- (ii)
.
- (iii)
.
- (iv)
If , then
[TABLE]
- (v)
.
- (vi)
.
- (vii)
.
Proof.
The proofs of these assertions are very similar or identical to the proofs of the corresponding assertions of Lemma 4.2. ∎
Instead of , we work with . The relation (4.12) and the assertion (ii) of Lemma 4.5 imply
[TABLE]
The statement of Lemma 4.3 does not have to be modified, but the proof needs minor changes. We have explained those changes in Appendix B under the title “Proof of Lemma 4.3 in the case (b)”.
Lemma 4.4 has to be modified.
Lemma 4.6.
For the pivot elements of we have:
- (i)
*. *
In particular,
[TABLE]
- (ii)
\big{(}b_{13}^{(k)}-b_{24}^{(k)}\big{)}^{2}=2\delta_{k}^{2}S^{2}(B)-\big{(}b_{13}^{(k)}+b_{24}^{(k)}\big{)}^{2},\ \text{for}\ k\geq 0*. *
Hence,
[TABLE]
and in particular
[TABLE]
- (iii)
.
Proof.
The proof is similar to the proof of Lemma 4.4. We have moved it to Appendix B. ∎
To prove the main assertion (4.1) we use the same as earlier. The assertions (vi) and (vii) of Lemma 4.5 yield
[TABLE]
Using (4.22) we obtain
[TABLE]
Using Lemma 4.6(iii), the left-hand side can be bounded from below by and for the case one can use Lemma 4.6(ii) to further reduce it to .
Using (4.29), (4.28), Lemma 4.5(v), and Lemma 4.6(i), the right-hand side divided by can be bounded from above by
[TABLE]
For , after dividing by , one obtains
[TABLE]
which is the same inequality as (4.25). The rest of the proof is the same as earlier. ∎
At this point we would like to make a few comments.
- •
Theorem 3.4 obviously holds with somewhat larger , e.g. one can try to complete the proof with . On the other hand, a small from the proof exposes the possibility of a very small reduction of within one cycle. This happens when the underlying matrix has a special structure. The next section deals with this issue.
- •
Although is a small decrease of the off-norm within two cycles, the result does not mean that the convergence of the method should be slow. The proof is concentrated on the worst case scenario. Typically, the slower the method is within one cycle, the faster it is in the next cycle. Example 5.1 indicates that behavior.
- •
In this convergence proof we have explicitly used the diagonal elements of , which is unusual when the reduction of is considered. Usually, only the off-diagonal elements and the bounds on rotation angles are used (e.g. [12, 17, 19, 10]). In that case the proof is valid for a more general iterative process used in the global convergence analysis of Jacobi-type processes which use nonorthogonal transformation matrices [11, 1].
5. The slow off-norm reduction within one cycle
As can be seen from the above theory, the decrease of the off-norm after one cycle of the Jacobi method under the strategy can be small. Here we give an example from [1], where the relative decrease of the off-norm after one cycle is less than .
Example 5.1.
Let
[TABLE]
with , , .
We have used MATLAB Symbolic Math Toolbox, in particular the Variable-precision arithmetic with digits, to compute the matrix iterates under the cyclic Jacobi method defined by the strategy . We display the off-norm of each iterate to significant digits. For we obtain
[TABLE]
As we can see from the table below, during the first cycle the off-norm of does not change in the first decimal places. But later it drops rapidly, especially in the th step.
[TABLE]
In general, one can always find a matrix such that the decrease of the off-norm after one cycle of the Jacobi method under the strategy is arbitrarily small and depends only on .
Proposition 5.2.
Let ,
[TABLE]
and let the cyclic Jacobi method defined by the strategy be applied to , thus generating the matrices . After completing one full sweep we have
[TABLE]
Proof.
The proof is lengthy and technical, so it has been moved to Appendix C. ∎
We end the paper with the following important theorem.
Theorem 5.3.
For every and , there exists a symmetric matrix of order , depending on and a cyclic strategy , such that
[TABLE]
Here and is obtained from by applying a full cycle of the Jacobi method under the strategy .
Proof.
Let , , and , where is from Proposition 5.2. Let . Proposition 5.2 yields to ,
[TABLE]
Since
[TABLE]
the proof is completed in this case.
If , then . Hence, we can choose to obtain .
Let and let
[TABLE]
be a symmetric matrix of order with the following properties.
- (i)
is of order such that holds when one full cycle of the Jacobi method under the strategy is applied to . This follows from (5.1) because we have proved the theorem for .
- (ii)
The block is diagonal.
The pivot strategy is defined by , , where is any ordering of the set and is any ordering of the set .
Obviously, the whole sweep on reduces to the sweep on under the strategy since all other Jacobi angles are zero. ∎
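The reduction to the leading block can be illustrated numerically. The sketch below embeds a symmetric block of order 4 into a block-diagonal matrix of order 6 whose trailing block is diagonal, sweeps first over the six pairs inside the leading block and then over all remaining pairs, and compares the result with a sweep applied to the block alone. The block `H`, the diagonal entries, the chosen orderings and the helper names are assumptions made for this illustration.

```python
import math
from itertools import combinations
import numpy as np


def off_norm(a):
    """Frobenius norm of the strictly upper-triangular part of a."""
    return float(np.linalg.norm(np.triu(a, k=1)))


def jacobi_step(a, i, j):
    """One Jacobi step annihilating a[i, j] (0-based indices, i < j)."""
    if a[i, j] == 0.0:
        return a.copy()  # zero pivot: the rotation angle is zero
    if a[i, i] == a[j, j]:
        phi = math.copysign(math.pi / 4.0, a[i, j])
    else:
        phi = 0.5 * math.atan(2.0 * a[i, j] / (a[i, i] - a[j, j]))
    c, s = math.cos(phi), math.sin(phi)
    r = np.eye(a.shape[0])
    r[i, i], r[i, j], r[j, i], r[j, j] = c, -s, s, c
    return r.T @ a @ r


def sweep(a, ordering):
    """Apply one Jacobi step for every 1-based pivot pair in the ordering."""
    a = np.array(a, dtype=float)
    for p, q in ordering:
        a = jacobi_step(a, p - 1, q - 1)
    return a


H = np.array([[4.0, 1.0, 2.0, 3.0],
              [1.0, 3.0, 1.0, 2.0],
              [2.0, 1.0, 2.0, 1.0],
              [3.0, 2.0, 1.0, 1.0]])
n = 6
A = np.zeros((n, n))
A[:4, :4] = H
A[4:, 4:] = np.diag([5.0, -2.0])

inner = [(1, 2), (3, 4), (1, 3), (2, 4), (1, 4), (2, 3)]   # pairs inside the block
outer = [p for p in combinations(range(1, n + 1), 2) if p not in inner]

A1 = sweep(A, inner + outer)
H1 = sweep(H, inner)
print(off_norm(A1), off_norm(H1))   # the two off-norms agree
```

All pivots outside the leading block are zero throughout the sweep, so those steps are skipped and the off-norm of the large matrix follows the off-norm of the block.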
Let us show that the blocks and of the matrix can be chosen such that all their entries are nonzero. Indeed, we can make other sets of the assumptions on . One such set of the assumptions is the following.
- (i)
is of order and such that
[TABLE]
holds when a full cycle of the Jacobi method under the strategy is applied to . The existence of such an follows from (5.1) because we have proved the theorem for .
- (ii)
We have , where satisfies .
- (iii)
We have and
[TABLE]
where
[TABLE]
The pivot strategy is defined as in the proof of Theorem 5.3.
To keep the paper shorter we do not give a rigorous proof, but we make a few essential remarks. After completing the sweep on the inequality (5.2) still holds, only the superscript on the left-hand side has to be replaced by , . We also have and . Due to the condition (5.3) all later angles will be bounded by some multiples of . The sum of squares of the last pivot elements will be bounded by some multiple of , which will eventually yield the required result.
Acknowledgements
The authors are thankful to the anonymous referees for their excellent remarks which improved the readability of the paper.
Appendix A Proofs related to Section 3
A.1. Proof of Proposition 3.1
We prove the proposition for . The proof for the case is similar. Since both Jacobi processes are cyclic, it is sufficient to prove the proposition for . Let
[TABLE]
where
[TABLE]
Note that . Let us inspect the product . We have
[TABLE]
It remains to show that , , , , , . Since
[TABLE]
it immediately follows from (1.1) that , . Thus, after completing the first two steps in each of the two processes, we have
[TABLE]
This shows that the relation (A.1) holds if and are replaced by and , respectively. Checking the angle formula (1.1) we find that , and therefore . The last check is the easiest one since the denominators in (1.1) for the angles and are opposite to those for the angles and . ∎
A.2. Proof of Proposition 3.3
Let us denote , . An easy calculation shows that
[TABLE]
Hence and it is sufficient to show that the relation (3.8) holds for . We shall show
[TABLE]
Consider two processes, the first one is defined by the relation , , and the second one is the Jacobi method under the strategy . These two processes generate the matrices , , and , , respectively. The rotation angles at the step of the first process will be denoted by and , . The rotation angle at the step of the Jacobi method will be denoted by , . Thus, and are used to compute , while is used to compute .
For the assertion (3.8) takes the form which is correct since and .
Let . Then
[TABLE]
By Definition 3.2 angles and are the Jacobi angles which annihilate the elements of at positions and . Therefore, we have and , and consequently . Thus, , which had to be proved.
Let . We use the fact that the assertion (3.8) holds for . Using the relations (A.2) and (3.5) one obtains
[TABLE]
For the rotation angles which annihilate the elements at positions and we have
[TABLE]
hence,
[TABLE]
The relation will hold provided that
[TABLE]
and the relation (A.5) will hold provided that
[TABLE]
It is easy to see that the relations (A.6) and (A.7) follow from the relations (3.5) and (A.4).
The proof for proceeds in the same manner as for , but with different indices. ∎
A.3. Proof of Lemma 3.5
We shall use the notation from the proof of Proposition 3.3.
- (i)
If and , then and
[TABLE]
- (ii)
Since the first two pivot elements and are zero, the corresponding rotation angles and are zero as well, and . The next two pivot elements and are the only possibly nonzero off-diagonal elements. Hence, .
- (iii)
Since , we have . The relations (3.6) imply
[TABLE]
We used the assumption . Using the relation (3.9), we have
[TABLE]
- (iv)
The proof is the same as for (iii), only and are used instead of and .
- (v)
Let . From the relations (3.6) we get
[TABLE]
We bounded the expression using the function for , on . The minimum of that function equals . Hence,
[TABLE]
- (vi)
The proof is the same as , only is used instead of . ∎
Appendix B Proofs related to Section 4
B.1. Proof of Lemma 4.2
First, note that the following two inequalities hold:
[TABLE]
- (i)
Relations (4.15) and (B.1) imply
[TABLE]
The assertion follows from the fact that the rotation angles are from the interval .
- (ii)
The assertion follows from (i).
- (iii)
Using the relations (B.2) and (B.1) we have
[TABLE]
- (iv)
From the relations (B.2) and (B.1) we have either
[TABLE]
or
[TABLE]
Hence, using (4.6) one obtains the lower bound for the tangents. The upper bound is obvious since the angles lie in the segment . For the latter assertion note that holds for any real and equality is attained only for . Recall that we can write , . Hence,
[TABLE]
- (v)
Specifying we have
[TABLE]
Since and have opposite signs, the absolute value of their sum cannot be larger than the larger term.
- (vi)
Let . Using the notation and ideas from the proof of (iv), we have
[TABLE]
This implies
[TABLE]
- (vii)
The proof is similar to the proof of (vi). If
[TABLE]
∎
B.2. Proof of Lemma 4.3
In terms of the elements of matrices and for the angle formulas (3.7) take the form
[TABLE]
The relation (4.17) and Lemma 4.1(a) imply
[TABLE]
From the relations (3.6) for we have
[TABLE]
Combining that with the angle formulas (B.3) one obtains
[TABLE]
Recall that for any we have . This implies
[TABLE]
The relations (B.5), (B.6) and (B.7) imply
[TABLE]
After squaring and summing the inequalities (B.8) and (B.9), using (4.4) and the inequality which holds for any real and , for we get
[TABLE]
Bounding the term is simple. Using (B.3), (4.4) and Lemma 4.2(v) we obtain
[TABLE]
The relations (B.10) and (B.11) imply
[TABLE]
Bounding is more demanding. From the relations (3.6) for we have
[TABLE]
From the relations (3.6) we also get
[TABLE]
Using (B.13), (B.14), (B.3), Lemma 4.2(iii), (4.4), Lemma 4.2(v), (4.16), (4.17) and (B.4) for one obtains
[TABLE]
Here, for the term is replaced by one. Next, we use the inequality and combine (B.12) and (B.15). After canceling by we have
[TABLE]
Specifically, for , one obtains
[TABLE]
Since
[TABLE]
it follows
[TABLE]
We used the fact that for . Combining the bounds for and in order to eliminate the term we obtain
[TABLE]
To bound by we just insert the upper bound for and into the expression within the parentheses. In a similar way (as ) the bound for can be obtained. To bound by we use the bounds and for and . Finally, the upper bounds for and are obtained (in this order) by inserting the best available bounds into the appropriate expressions. ∎
B.3. Proof of Lemma 4.4
- (i)
The assertion follows from (B.14), (4.16) and (4.17) or (B.4).
- (ii)
We use the parallelogram law, for real, and the definition of from (4.4). For the inequalities (4.20) follow from (4.19) and Lemma 4.3. We have
[TABLE]
- (iii)
We use the first case in Lemma 4.2(i)
[TABLE]
with . The proof of (B.16) is the same if we use , instead of , , respectively. From the relations (3.6) it follows that
[TABLE]
Hence, for we can use the assertion (ii) to obtain
[TABLE]
Combining that with (B.16) one obtains , . ∎
B.4. Proof of Lemma 4.3 in the case
The relations (B.3)–(B.12) remain the same except for the relation (B.4) in which and are replaced by and , respectively. For the relation (B.13) can be written as
[TABLE]
and instead of (B.14) we use (B.17). The relation (B.15) takes the form
[TABLE]
for . Here we used the relations (B.18), (B.17), (4.27) and the assertions (iii) and (v) of Lemma 4.5. The rest of the proof follows the remaining lines in the proof of Lemma 4.3. ∎
B.5. Proof of Lemma 4.6
The proof of the first two assertions is quite similar to the proof of the appropriate assertions of Lemma 4.4. To prove the third assertion instead of the relation (B.16) we now have
[TABLE]
where we used . The proof of (B.16) is the same if we use , instead of , , respectively. Using (B.14) and the assertion (ii) for we obtain
[TABLE]
Combining that with (B.19) we get , . ∎
Appendix C Proofs related to Section 5
C.1. Proof of Proposition 5.2
We use the operator from Definition 3.2. Let and for .
For we compute from . The elements and are annihilated and the off-norm reduction equals
[TABLE]
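The off-norm reduction of one parallel step can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes zero-based indices, the pivot pairs (0, 1) and (2, 3) chosen only for illustration (the strategy in the paper may order its steps differently), and the common angle convention tan 2φ = 2a_pq / (a_pp − a_qq) with φ in [−π/4, π/4]. The helper names `off_norm_sq`, `jacobi_rotation` and `parallel_step` are hypothetical.

```python
import numpy as np

def off_norm_sq(a):
    """Squared Frobenius norm of the strictly upper-triangular part of a."""
    return float(np.sum(np.triu(a, k=1) ** 2))

def jacobi_rotation(a, p, q):
    """Plane rotation R in the (p, q) plane with (R.T @ a @ R)[p, q] == 0.

    Assumed convention: tan(2*phi) = 2*a_pq / (a_pp - a_qq),
    with phi restricted to [-pi/4, pi/4].
    """
    r = np.eye(a.shape[0])
    if a[p, q] == 0.0:
        return r  # zero pivot element gives a zero angle
    d = a[p, p] - a[q, q]
    if d == 0.0:
        phi = np.sign(a[p, q]) * np.pi / 4.0
    else:
        phi = 0.5 * np.arctan(2.0 * a[p, q] / d)
    c, s = np.cos(phi), np.sin(phi)
    r[p, p], r[p, q] = c, -s
    r[q, p], r[q, q] = s, c
    return r

def parallel_step(a, pairs):
    """Apply the rotations for disjoint pivot pairs in one transformation."""
    r = np.eye(a.shape[0])
    for p, q in pairs:
        # Disjoint pairs act on different rows and columns, so each angle
        # can be computed from the unrotated matrix and the rotations commute.
        r = r @ jacobi_rotation(a, p, q)
    return r.T @ a @ r

if __name__ == "__main__":
    a = np.array([[4.0, 1.0, 0.5, 0.2],
                  [1.0, 3.0, 0.3, 0.7],
                  [0.5, 0.3, 2.0, 0.9],
                  [0.2, 0.7, 0.9, 1.0]])
    b = parallel_step(a, [(0, 1), (2, 3)])
    print("reduction:", off_norm_sq(a) - off_norm_sq(b))
    print("pivot sum:", a[0, 1] ** 2 + a[2, 3] ** 2)
```

Under these assumptions the two printed numbers should agree up to rounding, since each rotation annihilates its pivot element and leaves the squared sum of the remaining off-diagonal elements unchanged.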
For the rotation angles and we have
[TABLE]
Using the notation from Lemma 4.2 we have
[TABLE]
Hence
[TABLE]
and
[TABLE]
Since , for we have
[TABLE]
The relation (C.5) implies
[TABLE]
and consequently
[TABLE]
Since , we have . Therefore, using the relation (C.6) we obtain
[TABLE]
Thus,
[TABLE]
Note that
[TABLE]
hence we have to bound . From the relations (3.6) it follows that
[TABLE]
Inserting (C.9) and (C.10) into (C.8) and using (C.7) we obtain
[TABLE]
Let . We have
[TABLE]
From the relations (C.9), (C.10) and (C.7) we conclude that the pivot elements and are negative. Using the relations (C.2) and (C.7) we bound their moduli from below
[TABLE]
Moreover, using (3.6) or (B.13) and (C.3), (C.4) we obtain
[TABLE]
and
[TABLE]
Hence, we conclude that , . As in Lemma 4.2(i), we set
[TABLE]
From the relations (C.12), (C.3) and (C.4) we obtain
[TABLE]
Then, for and it holds
[TABLE]
Next we bound the off-norm reduction in the third parallel step which equals . We use the relations (4.13), (4.5) and (4.11) to obtain
[TABLE]
Finally, from (C.1), (C.8), (C.11) and (C.13) it follows
[TABLE]
References
- [1] E. Begović: Convergence of Block Jacobi Methods. Ph.D. thesis, University of Zagreb, 2014.
- [2] E. Begović Kovač, V. Hari: Jacobi method for symmetric matrices of order 4 converges for every cyclic pivot strategy. arXiv:1701.02387 [math.NA]
- [3] K. W. Brodlie, M. J. D. Powell: On the convergence of cyclic Jacobi methods. IMA J. Appl. Math. 15 (3) (1975) 279–287.
- [4] J. Demmel, K. Veselić: Jacobi’s method is more accurate than QR. SIAM J. Matrix Anal. Appl. 13 (1992) 1204–1245.
- [5] Z. Drmač, K. Veselić: New fast and accurate Jacobi SVD algorithm I. SIAM J. Matrix Anal. Appl. 29 (4) (2008) 1322–1342.
- [6] Z. Drmač, K. Veselić: New fast and accurate Jacobi SVD algorithm II. SIAM J. Matrix Anal. Appl. 29 (4) (2008) 1343–1362.
- [7] Z. Drmač: A global convergence proof of cyclic Jacobi methods with block rotations. SIAM J. Matrix Anal. Appl. 31 (3) (2009) 1329–1350.
- [8] G. E. Forsythe, P. Henrici: The cyclic Jacobi method for computing the principal values of a complex matrix. Trans. Amer. Math. Soc. 94 (1960) 1–23.
