On Matrix Rearrangement Inequalities
Rima Alaifari, Xiuyuan Cheng, Lillian B. Pierce, Stefan Steinerberger

TL;DR
This paper proves that matrix rearrangement inequalities hold for all disordered words in 2x2 matrices and for most small perturbations of the identity in larger matrices, extending previous partial results.
Contribution
It establishes the validity of matrix rearrangement inequalities for all disordered words in 2x2 matrices and for generic small perturbations in larger matrices, improving upon prior characterizations.
Findings
Rearrangement inequality holds for all disordered words in 2x2 matrices.
For larger matrices, the inequality holds for most small perturbations of the identity.
Drury's known counterexamples to the general inequality are built from 3x3 matrices, so they do not apply in the 2x2 case.
Rima Alaifari: Department of Mathematics, ETH Zürich, Rämistrasse 101, 8092 Zürich
Xiuyuan Cheng: Department of Mathematics, Duke University, 120 Science Drive, Durham NC 27708
Lillian B. Pierce: Department of Mathematics, Duke University, 120 Science Drive, Durham NC 27708
Stefan Steinerberger: Department of Mathematics, Yale University, 10 Hillhouse Avenue, New Haven, 06511 CT
Abstract.
Given two symmetric and positive semidefinite square matrices $A, B$, is it true that any matrix given as the product of $m$ copies of $A$ and $n$ copies of $B$ in a particular sequence must be dominated in the spectral norm by the ordered matrix product $A^m B^n$? For example, is
[TABLE]
Drury [10] has characterized precisely which disordered words have the property that an inequality of this type holds for all matrices $A, B$. However, the 1-parameter family of counterexamples Drury constructs for these characterizations consists of $3 \times 3$ matrices, and thus as stated the characterization applies only for $N \times N$ matrices with $N \geq 3$. In contrast, we prove that for $2 \times 2$ matrices, the general rearrangement inequality holds for all disordered words. We also show that for larger $N \times N$ matrices, the general rearrangement inequality holds for all disordered words, for most $A, B$ (in a sense of full measure) that are sufficiently small perturbations of the identity.
Key words and phrases:
Rearrangement Inequality, Linear Operators, Matrix inequalities.
2010 Mathematics Subject Classification:
15A45, 47A30, 47A63 (primary) and 39B42 (secondary).
R.A. thanks David Gontier for fruitful discussions. X.C. is partially supported by the NSF (DMS-1818945, DMS-1820827). L.P. is partially supported by CAREER grant NSF DMS-1652173 and the Alfred P. Sloan Foundation. S.S. is partially supported by the NSF (DMS-1763179) and the Alfred P. Sloan Foundation.
1. Introduction
1.1. Introduction.
Rearrangement inequalities for functions have a long history; we refer to Lieb and Loss [20] for an introduction and an example of their ubiquity in Analysis, Mathematical Physics, and Partial Differential Equations. A natural question that one could ask is whether there is an operator-theoretic variant of such rearrangement inequalities. For example, given two operators $A$ and $B$, is there an inequality
[TABLE]
where $\| \cdot \|$ is a norm on operators? In this paper, we will study the question for $A, B$ being symmetric and positive semidefinite square matrices and $\| \cdot \|$ denoting the classical operator norm
[TABLE]
We are interested in whether one could hope for a statement of the general type
[TABLE]
where
[TABLE]
with positive integers $m_i, n_i$ (except that we allow $m_1 = 0$ or $n_k = 0$). Of course, if the operators commute then any such inequality is trivially an equality. A reason why one might hope in general for such a statement to be true is that one could expect the repeated application of only one operator to lead to growth (or at least preservation) of the norms of suitable eigenvectors, while alternating applications of two operators could have the effect of projecting alternately onto two possibly different eigenbases, thus losing size of the eigenvectors.
1.2. Known results.
There are several encouraging results in this direction, some of which are by now classical in Operator Theory, and have been extended in a variety of different ways. We note:
- Heinz-Löwner inequality (Heinz [16], 1951; Löwner [21], 1934), stating that
[TABLE]
- Heinz-Kato inequality (Heinz [16], 1951; Kato [19], 1952). If $A, B$ are positive operators and $T$ is a linear operator such that $\|Tx\| \leq \|Ax\|$ and $\|T^{*}y\| \leq \|By\|$ for all $x, y$ in a Hilbert space, then
[TABLE]
- Cordes inequality (Cordes [8], 1987). For all symmetric and positive definite and all
[TABLE]
- McIntosh’s inequality (McIntosh [23], 1979) generalizes several of the earlier results and shows that for as above and an arbitrary square matrix of the same size,
[TABLE]
The last author characterized equality for several of these inequalities in [27].
- Furuta’s inequality [12] (see also [8]) shows that for any
[TABLE]
There is a large literature connected to these inequalities; we refer to [3, 6, 9, 11, 14, 17] as well as the books by Bhatia [4, 5], Cordes [8], Furuta [13], Marshall, Olkin & Arnold [22], Simon [26] and Zhan [28]. Many open problems remain. The authors themselves were motivated by a conjecture of Recht and Ré [25] who asked whether, for positive definite matrices , there is an inequality
[TABLE]
Recht and Ré [25] proved the inequality for ; Zhang [29] recently gave a proof for and being a multiple of 3. Israel, Krahmer and Ward [18] prove the inequality for ; we also refer to recent work of Albar, Junge and Zhao [1]. One way of interpreting the conjectured inequality of Recht and Ré is that repetition of matrices has a beneficial effect on the operator norm; this leads to asking about matrix rearrangement inequalities, as studied in this paper.
1.3. Statement of results.
Consider a putative inequality
[TABLE]
where and (possibly allowing or ) and are symmetric and positive semidefinite square matrices. Is it true that given any “word,” that is, a tuple of exponents , the inequality (1) holds for all such ? Drury [10] has shown that at this level of generality, the question has a negative answer. Moreover, he provides a complete characterization of conditions on the exponents for which such an inequality holds for all such (of all dimensions). For example, Drury shows that we always have
[TABLE]
while the inequality
[TABLE]
The counterexamples given by Drury to the general rearrangement inequality stem from a 1-parameter family of $3 \times 3$ matrices. In contrast, our first main result is that the general rearrangement inequality does indeed hold true for any word, for all symmetric positive semidefinite $2 \times 2$ matrices.
Theorem 1 (General Rearrangement Inequality for $2 \times 2$ Matrices).
Let $A, B$ be symmetric positive semidefinite matrices of size $2 \times 2$ and let $m_i, n_i$ be positive integers (possibly allowing $m_1 = 0$ or $n_k = 0$). Then
[TABLE]
where $m = \sum_i m_i$ and $n = \sum_i n_i$.
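As a numerical sanity check of Theorem 1 (not a proof), the following minimal sketch samples random symmetric positive semidefinite 2x2 matrices and compares the norm of a disordered word with the norm of the ordered product. The particular word, the sampling scheme, and all helper names (`matmul`, `matpow`, `spectral_norm`, `random_psd`, `word_norms`) are illustrative assumptions, not taken from the paper.

```python
import math
import random

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(X, p):
    """X**p for an integer p >= 0 (p = 0 gives the identity)."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(p):
        R = matmul(R, X)
    return R

def transpose(X):
    return [[X[0][0], X[1][0]], [X[0][1], X[1][1]]]

def spectral_norm(X):
    """Operator norm of a 2x2 matrix: sqrt of the largest eigenvalue of X^T X."""
    G = matmul(transpose(X), X)
    a, b, d = G[0][0], G[0][1], G[1][1]
    # Closed form for the largest eigenvalue of a symmetric 2x2 matrix.
    lam = (a + d + math.sqrt((a - d) ** 2 + 4.0 * b * b)) / 2.0
    return math.sqrt(max(lam, 0.0))

def random_psd(rng):
    """Random symmetric PSD 2x2 matrix of the form C^T C."""
    C = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
    return matmul(transpose(C), C)

def word_norms(A, B, ms, ns):
    """Return (norm of A^m1 B^n1 ... A^mk B^nk, norm of A^m B^n)."""
    W = [[1.0, 0.0], [0.0, 1.0]]
    for m_i, n_i in zip(ms, ns):
        W = matmul(matmul(W, matpow(A, m_i)), matpow(B, n_i))
    ordered = matmul(matpow(A, sum(ms)), matpow(B, sum(ns)))
    return spectral_norm(W), spectral_norm(ordered)

rng = random.Random(0)
ms, ns = [2, 2, 1], [1, 1, 1]      # hypothetical word AAB AAB AB
worst_rel_gap = float("inf")
for _ in range(1000):
    A, B = random_psd(rng), random_psd(rng)
    w, o = word_norms(A, B, ms, ns)
    if o > 0.0:
        # Relative gap; Theorem 1 predicts it is never negative.
        worst_rel_gap = min(worst_rel_gap, (o - w) / o)
print(worst_rel_gap)
```

Changing `ms` and `ns` tests other words; up to floating-point rounding, the smallest relative gap should stay nonnegative in the 2x2 setting.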
In light of Drury’s results, there is no hope for such general inequalities in higher dimensions. Nonetheless, one could wonder whether there is hope that, given any word , a rearrangement inequality should hold for some (or maybe even most) pairs of matrices . This is the motivation for our second result, which states that given any word, the rearrangement inequality is generically true for matrices in a sufficiently small neighborhood of the identity, for all .
Theorem 2 (General Rearrangement close to the Identity, arbitrary dimension).
Let be symmetric positive semidefinite matrices and let (possibly allowing or ). If , then there exists such that for all
[TABLE]
where $m = \sum_i m_i$ and $n = \sum_i n_i$.
Thus given any fixed word, this provides a codimension 1 family of among all relevant pairs of matrices in the neighborhood of the identity, which satisfy the rearrangement inequality for that word. We do not know whether the condition is necessary but are inclined to think that it may not be.
There are many other natural questions that come to mind. The rearrangement inequalities are invariant under multiplication with constants, which allows us to compactify the set of matrices: are such inequalities generically true (in, say, the sense that the measure of admissible matrices approaches full measure as the length of the inequality, or the number , increases)? Another question could be to determine other simple conditions on the matrices (other than assuming that they commute) that would imply the desired rearrangement inequalities hold.
2. Proof of Theorem 1
Our proof uses three different ingredients. The first ingredient is Corollary 4.4 in a paper of Ando, Hiai & Okubo [2] which states the following: let be symmetric positive semidefinite matrices of size and for , let satisfy
[TABLE]
then
[TABLE]
We remark that Ando, Hiai & Okubo [2] were motivated by the question whether such a trace inequality might be true in general: they establish the result for general positive semidefinite matrices that have at most two distinct eigenvalues. Plevnik [24] recently constructed an example showing that (2) can fail for matrices.
The second ingredient is the invariance of trace with respect to cyclic permutations, i.e.
[TABLE]
The third ingredient is the basic equation
[TABLE]
where denotes the largest eigenvalue of a matrix.
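The second and third ingredients are easy to check numerically. The sketch below assumes that the third ingredient is the standard identity that the squared operator norm of a matrix $X$ equals the largest eigenvalue of $X^T X$; the test matrices and helper names are hypothetical choices for illustration only.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def lambda_max_sym(S):
    """Largest eigenvalue of a symmetric 2x2 matrix (closed form)."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    return (a + d + math.sqrt((a - d) ** 2 + 4.0 * b * b)) / 2.0

# Hypothetical test matrices; they need not be symmetric for these identities.
X = [[1.0, 2.0], [0.5, -1.0]]
Y = [[0.0, 1.0], [3.0, 1.0]]
Z = [[2.0, -1.0], [1.0, 1.0]]

# Cyclic invariance of the trace: tr(XYZ) = tr(YZX) = tr(ZXY).
t1 = trace(matmul(matmul(X, Y), Z))
t2 = trace(matmul(matmul(Y, Z), X))
t3 = trace(matmul(matmul(Z, X), Y))

# Operator norm two ways: from the eigenvalue identity, and from the
# definition as the maximum of |Xv| over unit vectors v (sampled on a grid).
Xt = [[X[0][0], X[1][0]], [X[0][1], X[1][1]]]
norm_from_eig = math.sqrt(lambda_max_sym(matmul(Xt, X)))
steps = 20000
norm_from_def = max(
    math.hypot(X[0][0] * math.cos(t) + X[0][1] * math.sin(t),
               X[1][0] * math.cos(t) + X[1][1] * math.sin(t))
    for t in (math.pi * k / steps for k in range(steps))
)
print(t1, t2, t3, norm_from_eig, norm_from_def)
```

The grid maximum slightly underestimates the true norm, so the two values agree only up to the grid resolution.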
Let $A, B$ be symmetric positive semidefinite matrices of size $2 \times 2$. Consider now a general word
[TABLE]
where
[TABLE]
Assume the symmetric matrices and have eigenvalues (not necessarily distinct) given by
[TABLE]
We note that all these eigenvalues are nonnegative. Moreover, assuming the ordering and , we have by (4) that
[TABLE]
Thus to prove Theorem 1, it suffices to show that
[TABLE]
Defining , we employ the cyclic identity (3) followed by (2) with and
[TABLE]
followed by a second application of the cyclic identity (3) to obtain
[TABLE]
Since the trace is merely the sum of the eigenvalues, this shows that
[TABLE]
On the other hand, the determinant is multiplicative, and so
[TABLE]
It is simple to deduce from these two relations that (6) must hold.
Indeed, if , we have the desired result (6). If but then either or must vanish, by (8). If , then implies that and we have a contradiction to (7). Thus in this case we must have , and then the desired inequality (6) follows from (7). It remains to deal with the case when and are both nonzero, which implies that . Suppose contrary to (6) that for some ; then (7) implies that for some . Then by (8),
[TABLE]
which is the desired contradiction. (Alternatively one can use (7) and (8) to prove, using induction and repeated squaring of both sides of (7), that for any , For such expressions the leading term is asymptotically dominant and this shows .) This verifies (6) and hence completes the proof of Theorem 1.
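The deduction of (6) from the trace relation (7) and the determinant relation (8) uses only elementary facts about two pairs of nonnegative numbers. As a hedged restatement, with the hypothetical labels $\mu_1 \geq \mu_2 \geq 0$ for the pair of eigenvalues on the disordered side and $\nu_1 \geq \nu_2 \geq 0$ for the pair on the ordered side (this notation is ours and is introduced only for this sketch), the argument can also be phrased through a monotone function rather than by contradiction:

```latex
% Assumed notation (ours): \mu_1 \ge \mu_2 \ge 0 and \nu_1 \ge \nu_2 \ge 0.
% Hypotheses, read off from the trace and determinant relations:
\[
  \mu_1 + \mu_2 \le \nu_1 + \nu_2,
  \qquad
  \mu_1 \mu_2 = \nu_1 \nu_2 =: p .
\]
% Case p = 0: the smaller member of each pair vanishes, so
\[
  \mu_1 = \mu_1 + \mu_2 \le \nu_1 + \nu_2 = \nu_1 .
\]
% Case p > 0: both \mu_1 and \nu_1 are at least \sqrt{p}, being the larger
% factor of a product equal to p. The function
\[
  f(t) = t + \frac{p}{t}, \qquad f'(t) = 1 - \frac{p}{t^2} > 0
  \quad \text{for } t > \sqrt{p},
\]
% is strictly increasing on [\sqrt{p}, \infty), and
\[
  f(\mu_1) = \mu_1 + \mu_2 \le \nu_1 + \nu_2 = f(\nu_1)
  \quad \Longrightarrow \quad \mu_1 \le \nu_1 .
\]
```

In both cases the larger eigenvalue on the disordered side is bounded by the larger eigenvalue on the ordered side, which is the content of the final step.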
3. Proof of Theorem 2
Let be fixed symmetric positive semidefinite matrices, and assume that the tuple of exponents is fixed, with and . Let denote the corresponding word in terms of , analogous to (5). The proof idea can be summarized as follows. Let denote , and let denote . We will choose a vector with that maximizes
[TABLE]
Then as long as we can show that for this we have
[TABLE]
we can conclude that
[TABLE]
thus proving Theorem 2.
By simply multiplying out and , we will see that the leading order terms (in ) come in both cases from a matrix of the form
[TABLE]
This motivates us to show that a significant proportion of must lie in the eigenspace of the largest eigenvalue of the matrix (Lemma 1 below). This observation will suffice to examine terms up to second order in in the desired inequality (9). Next, to treat the terms of third order and higher in , we will use a second lemma (Lemma 2 below), which shows that if , for an eigenvector corresponding to the largest eigenvalue of , the third order terms provide a strict inequality. This therefore allows us to neglect all higher order terms in (as long as is sufficiently small), and that leads to the desired inequality (9).
3.1. Two Lemmata
Our first lemma states that a one-parameter family of matrices that is approximately given by the identity plus a small linear term has the property that the eigenvector corresponding to its largest eigenvalue is necessarily very close to the leading eigenspace of the linear perturbation . This statement is certainly not novel, but we provide its simple proof.
Lemma 1.
Let , where is a symmetric positive semidefinite matrix and varies, giving a one-parameter family. For each let be a vector satisfying and
[TABLE]
Let be the orthogonal projection onto the eigenspace of the largest eigenvalue of . Then there exists a constant and also such that for every ,
[TABLE]
Proof.
Let us simplify notation and write and . Observe that they are orthogonal and thus
[TABLE]
We have, expanding up to first order,
[TABLE]
in which the implicit constant depends on . We will now see that several terms simplify. If has only one eigenvalue, then the projection is merely the identity and the result follows. From now on we may suppose that has at least two distinct eigenvalues and we use to denote the largest eigenvalue of and to denote the next largest. Then
[TABLE]
Altogether we have
[TABLE]
We recall that was chosen to maximize over all . In particular, if is an eigenvector of for with , then
[TABLE]
Applying this in (11) shows that there is a constant (depending on the implicit constants in the terms, and hence on ) such that as long as is sufficiently small (again relative to the implicit constants in the terms),
[TABLE]
Using , we obtain
[TABLE]
where and therefore
[TABLE]
where for all sufficiently small, for a parameter depending only on , and hence only on . ∎
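The phenomenon behind Lemma 1 can be observed in a small example. The sketch below assumes a family of the form identity plus $\varepsilon A$ plus a second-order term, with the hypothetical choices $A = \mathrm{diag}(1, 0)$ and a symmetric off-diagonal second-order term; it measures how far the top eigenvector lies from the top eigenspace of $A$ (here the first coordinate axis). It illustrates the qualitative statement only and does not reproduce the precise rate or constants of the lemma.

```python
import math

def top_eigvec(a, b, d):
    """Unit eigenvector for the largest eigenvalue of [[a, b], [b, d]].

    Assumes the matrix is not a multiple of the identity.
    """
    lam = (a + d + math.sqrt((a - d) ** 2 + 4.0 * b * b)) / 2.0
    # Two algebraically equivalent candidates; keep the longer one for stability.
    c1 = (b, lam - a)
    c2 = (lam - d, b)
    v = c1 if math.hypot(*c1) >= math.hypot(*c2) else c2
    n = math.hypot(*v)
    return (v[0] / n, v[1] / n)

def residual(eps):
    """Distance from the top eigenvector of M_eps to the top eigenspace of A.

    M_eps = I + eps * A + eps**2 * C with A = diag(1, 0) and
    C = [[0, 1], [1, 0]] (both hypothetical). The top eigenspace of A is
    spanned by (1, 0), so the residual is the size of the second coordinate.
    """
    a, b, d = 1.0 + eps, eps * eps, 1.0
    v = top_eigvec(a, b, d)
    return abs(v[1])

ratios = {eps: residual(eps) / eps for eps in (1e-1, 1e-2, 1e-3)}
print(ratios)
```

In this example the residual divided by $\varepsilon$ stays bounded as $\varepsilon$ decreases, so the top eigenvector approaches the top eigenspace of $A$ at a rate controlled by $\varepsilon$.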
Our second lemma states rearrangement inequalities for an eigenvector of the largest eigenvalue of (motivated by Lemma 1). The argument is again elementary but the statement itself is so specific that it is presumably new.
Lemma 2.
Let be symmetric and positive semidefinite square matrices such that . Fix and let denote the largest eigenvalue of . Then there exists a constant such that for all vectors satisfying
[TABLE]
we have the inequalities
[TABLE]
[TABLE]
Proof.
We start by showing the first inequality. We claim
[TABLE]
in the sense that
[TABLE]
is positive semidefinite. Indeed, we have that
[TABLE]
with equality if and only if is an eigenvector of corresponding to eigenvalue . This holds since is symmetric and positive semidefinite and its operator norm thus coincides with its largest eigenvalue. We now suppose with satisfies (12). Solving for in , we can rewrite
[TABLE]
We now need to compare this to which we can rewrite as
[TABLE]
subtracting this from (14) we see by (13) that
[TABLE]
Now we aim to show that this inequality is strict if satisfies (12). From our previous observation about (13), we know that equality holds in this last inequality precisely when is an eigenvector of corresponding to eigenvalue . Suppose this is true. Then
[TABLE]
while on the other hand, multiplying our assumption (12) by on the left-hand side shows that
[TABLE]
Subtracting these two identities shows that , violating our assumption . We conclude that cannot be an eigenvector for corresponding to , and hence the inequality in (15) is strict, for any satisfying (12). By compactness of the unit ball , there exists a constant such that
[TABLE]
concluding the proof of the first claim. As for the second inequality, we relabel and , obtain from the first case that
[TABLE]
and note that .
∎
3.2. Conclusion of the proof of Theorem 2
We are now ready to prove Theorem 2. We recall from the beginning of §3 that we consider a particular word , with sequences of exponents and , . We let denote . We choose a vector with that maximizes and it suffices to show that , as explained in (10). We will expand the unordered product and the ordered product up to the third term, with respect to . Lemma 1 will restrict the types of vectors we will have to study, Lemma 2 will give us a strict inequality in the third order terms, and the desired inequality will follow from that.
Precisely, in the above setting we will prove that there exist positive constants depending on such that for all ,
[TABLE]
Here the implicit constant depends on . Consequently, for all sufficiently small , the left-hand side is in fact strictly positive, and Theorem 2 follows.
A simple expansion shows that
[TABLE]
[TABLE]
where
[TABLE]
and, for combinatorial coefficients depending only on the sequences of exponents and ,
[TABLE]
and
[TABLE]
Thus an expansion up to third order shows that for any ,
[TABLE]
and
[TABLE]
We now use to denote the vector maximizing among all , and we aim to show the inequality (16) for
[TABLE]
using the above expansions (17) and (18).
We will first see that terms in this difference that are at most second order in cancel exactly, in fact for any . Indeed, the term of order 0 in , that is , cancels and, since , so does the term of order . Next, for the second order terms, for any vector ,
[TABLE]
The other terms of second order, and again coincide trivially (and hence cancel in the difference) because . This shows that for any , the terms of at most second order (with respect to ) cancel in the difference (19).
We now analyze the third order terms in the difference (19), which include terms of two types, namely
[TABLE]
and
[TABLE]
For the first type of term, we can use the fact that to see the terms corresponding to vanish, and similarly for , so that
[TABLE]
Altogether, we obtain that for any , the third order contributions of the difference (19) are given by
[TABLE]
Now we specialize to considering with that maximizes . We apply Lemma 1 to conclude that there is an and a constant such that for every , we can write
[TABLE]
where is the projection of onto the eigenspace corresponding to the largest eigenvalue of , and have the following properties: and is orthogonal to , so that . (We note that both and also depend on but suppress this for simplicity of notation). In (20) we see that
[TABLE]
since the first term vanishes; thus this type of term contributes
[TABLE]
to (22). A similar expansion for the other terms (21) shows that
[TABLE]
with an implicit constant depending on . Now that we have restricted to an inner product involving only , we apply (21) for the vector and note that Lemma 2 implies that
[TABLE]
with the constant provided by the lemma. This is strictly positive for all sufficiently small with respect to . To conclude, we have proved (16), and this completes the proof of Theorem 2.
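The cancellation of all terms up to second order can be observed numerically. The sketch below assumes that the perturbed matrices have the form $I + \varepsilon A$ and $I + \varepsilon B$ (our reading of "small perturbations of the identity"), and uses hypothetical 3x3 matrices and the hypothetical word ABAB. It prints the norm gap between the ordered and disordered products divided by $\varepsilon^3$; a bounded ratio is consistent with the gap being of third order. It does not verify the sign of the gap or the hypothesis of Theorem 2.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def shifted(X, eps):
    """Return I + eps * X."""
    n = len(X)
    return [[(1.0 if i == j else 0.0) + eps * X[i][j] for j in range(n)]
            for i in range(n)]

def spectral_norm(W, iters=500):
    """Operator norm via power iteration on H = W^T W - I.

    Subtracting I keeps the eigenvalue ratio away from 1 when W is close
    to I. Assumes H has nonnegative entries, so the all-ones start vector
    is not orthogonal to the top eigenvector (true for the example below).
    """
    n = len(W)
    G = matmul(transpose(W), W)
    H = [[G[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = [sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm_y = math.sqrt(sum(t * t for t in y))
        x = [t / norm_y for t in y]
    Hx = [sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]
    rayleigh = sum(Hx[i] * x[i] for i in range(n))
    return math.sqrt(1.0 + rayleigh)

def gap(A, B, ms, ns, eps):
    """Norm of the ordered product minus norm of the disordered word."""
    n = len(A)
    Ae, Be = shifted(A, eps), shifted(B, eps)
    W = identity(n)
    for m_i, n_i in zip(ms, ns):
        for _ in range(m_i):
            W = matmul(W, Ae)
        for _ in range(n_i):
            W = matmul(W, Be)
    O = identity(n)
    for _ in range(sum(ms)):
        O = matmul(O, Ae)
    for _ in range(sum(ns)):
        O = matmul(O, Be)
    return spectral_norm(O) - spectral_norm(W)

A = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]   # diagonal PSD (hypothetical)
B = [[1.0 / 3.0] * 3 for _ in range(3)]                   # rank-one PSD (hypothetical)
ms, ns = [1, 1], [1, 1]                                    # the word ABAB
scaled = {eps: gap(A, B, ms, ns, eps) / eps ** 3 for eps in (4e-2, 2e-2, 1e-2)}
print(scaled)
```

The power iteration is a design choice to keep the sketch dependency-free; any symmetric eigenvalue solver would serve equally well.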
References
[1] W. Albar, M. Junge and M. Zhao, On the symmetrized arithmetic-geometric mean inequality for operators, arXiv:1803.02435.
[2] T. Ando, F. Hiai and K. Okubo, Trace inequalities for multiple products of two matrices. Math. Inequal. Appl. 3 (2000), no. 3, 307–318.
[3] E. Andruchow, G. Corach and D. Stojanoff, Geometrical significance of Löwner-Heinz inequality. Proc. Amer. Math. Soc. 128 (2000), no. 4, 1031–1037.
[4] R. Bhatia, Matrix analysis. Graduate Texts in Mathematics, 169. Springer-Verlag, New York, 1997.
[5] R. Bhatia, Positive definite matrices. Princeton Series in Applied Mathematics. Princeton University Press, Princeton, NJ, 2007.
[6] G. Corach, H. Porta and L. Recht, An operator inequality. Linear Algebra Appl. 142 (1990), 153–158.
[7] H. Cordes, A matrix inequality. Proc. Amer. Math. Soc. 11 (1960), 206–210.
[8] H.O. Cordes, Spectral Theory of Linear Differential Operators and Comparison Algebras. London Mathematical Society Lecture Note Series, vol. 76. Cambridge University Press, Cambridge, 1987.
