Von Neumann Type of Trace Inequalities for Schatten-Class Operators
Gunther Dirr, Frederik vom Ende

TL;DR
This paper extends von Neumann's trace inequality and eigenvalue inequalities from finite-dimensional matrices to Schatten-class operators on infinite-dimensional Hilbert spaces, utilizing recent results on the $C$-numerical range.
Contribution
It introduces a generalization of classical trace and eigenvalue inequalities to Schatten-class operators in infinite-dimensional settings, expanding their applicability.
Findings
Generalized von Neumann's trace inequality to Schatten-class operators.
Extended eigenvalue inequalities for hermitian operators.
Utilized recent $C$-numerical range results for the generalization.
Abstract
We generalize von Neumann's well-known trace inequality, as well as related eigenvalue inequalities for hermitian matrices, to Schatten-class operators between complex Hilbert spaces of infinite dimension. To this end, we exploit some recent results on the $C$-numerical range of Schatten-class operators. For the readers' convenience, we sketch the proof of these results in the Appendix.
GUNTHER DIRR, Institute of Mathematics, University of Würzburg, D-97074 Würzburg, Germany
FREDERIK VOM ENDE, Department of Chemistry, Technische Universität München, D-85747 Garching, Germany, and Munich Centre for Quantum Science and Technology (MCQST), D-80799 München, Germany
1991 Mathematics Subject Classification: 47B10, 15A42, 47A12
Keywords: $C$-numerical range; Schatten-class operators; trace inequality; von Neumann inequality
1. INTRODUCTION
In the mid thirties of the last century, von Neumann [20, Thm. 1] derived the following beautiful and widely used trace inequality for complex matrices:
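Let $A,B\in\mathbb{C}^{n\times n}$ with singular values $s_1(A)\geq s_2(A)\geq\ldots\geq s_n(A)$ and $s_1(B)\geq s_2(B)\geq\ldots\geq s_n(B)$, respectively, be given. Then
$$\max_{U,V\in\mathcal{U}_n}|\operatorname{tr}(AUBV)|=\sum_{j=1}^{n}s_j(A)s_j(B)\,,\tag{1.1}$$
where $\mathcal{U}_n$ denotes the unitary group.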
In fact, the above result can be reinterpreted as a characterization of the image of the unitary double-coset $\{AUBV\,|\,U,V\in\mathcal{U}_n\}$ under the trace-functional, i.e.
$$\{\operatorname{tr}(AUBV)\,|\,U,V\in\mathcal{U}_n\}=K_r(0)\tag{1.2}$$
with $r:=\sum_{j=1}^{n}s_j(A)s_j(B)$ and $K_r(0)$ being the closed disk of radius $r$ centred around the origin. This results from the elementary observation that the left-hand side of (1.2) is circular (simply replace $U$ by $e^{i\varphi}U$). Another well-known consequence of (1.1), a von Neumann inequality for hermitian matrices [10, Ch. 9.H.1], reads as follows.
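Let $A,B\in\mathbb{C}^{n\times n}$ hermitian with respective eigenvalues $(\lambda_j(A))_{j=1}^{n}$ and $(\lambda_j(B))_{j=1}^{n}$ be given. Then
$$\sum_{j=1}^{n}\lambda_j^{\downarrow}(A)\lambda_j^{\uparrow}(B)\leq\operatorname{tr}(AB)\leq\sum_{j=1}^{n}\lambda_j^{\downarrow}(A)\lambda_j^{\downarrow}(B)\,,\tag{1.3}$$
where the superindices $\downarrow$ and $\uparrow$ denote the decreasing and increasing sorting of the eigenvalue vectors, respectively.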
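As an illustrative numerical sketch (not part of the original argument), the matrix inequality (1.1) can be probed on diagonal matrices $A=\operatorname{diag}(a)$ and $B=\operatorname{diag}(b)$ with non-negative entries. If the unitaries are restricted to products of permutation matrices and diagonal phase matrices, the trace $\operatorname{tr}(AUBV)$ reduces to $\sum_j a_j b_{\sigma(j)}e^{i\varphi_j}$ for a permutation $\sigma$ and phases $\varphi_j$. The data `a`, `b`, the helper names and the phase grid below are assumptions chosen purely for illustration.

```python
import cmath
import itertools
import math


def monomial_trace(a, b, sigma, phases):
    """Return sum_j a[j] * b[sigma[j]] * exp(i * phases[j])."""
    return sum(a[j] * b[sigma[j]] * cmath.exp(1j * phases[j]) for j in range(len(a)))


def von_neumann_bound(a, b):
    """Right-hand side of (1.1): pair the singular values in decreasing order."""
    return sum(x * y for x, y in zip(sorted(a, reverse=True), sorted(b, reverse=True)))


# Hypothetical singular values of two diagonal 3x3 matrices.
a = [3.0, 1.0, 2.0]
b = [0.5, 4.0, 1.5]
r = von_neumann_bound(a, b)

# Scan all permutations and a coarse grid of phases (multiples of pi/4).
grid = [2 * math.pi * k / 8 for k in range(8)]
largest = 0.0
for sigma in itertools.permutations(range(3)):
    for phases in itertools.product(grid, repeat=3):
        largest = max(largest, abs(monomial_trace(a, b, sigma, phases)))

print("bound r =", r)
print("largest modulus found =", largest)
```

On this restricted family the largest modulus agrees with $r$ up to rounding, because the sorted pairing with all phases equal to zero is part of the scan. The sketch only samples a small subgroup of the unitary group, so it illustrates (1.1) rather than verifying it.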
The area of applications of von Neumann’s inequalities and, more generally, singular value decompositions (SVD) is enormous. It ranges from operator theory [6, 17] and numerics [8] to more applied fields like control theory [9], neural networks [14] as well as quantum dynamics and quantum control [7, 18]. An overview can be found in [10, 12]. The goal of this short contribution is to generalize these inequalities to Schatten-class operators on infinite-dimensional Hilbert spaces. In doing so, some recent results on the $C$-numerical range of Schatten-class operators [3, 4] turn out to be quite helpful. For the readers’ convenience, we sketch the corresponding proofs in Appendix A.
This paper is organized as follows: Section 2 introduces the key notions and concepts of this work, namely Schatten classes (Section 2.1), convergence of compact sets via the Hausdorff metric (Section 2.2), and the $C$-numerical range of Schatten-class operators (Section 2.3). Section 3 then presents the main results as mentioned above. Appendix A outlines the proof of some crucial geometrical results regarding the $C$-numerical range.
2. NOTATION AND PRELIMINARIES
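Unless stated otherwise, here and henceforth $\mathcal{X}$ and $\mathcal{Y}$ are arbitrary infinite-dimensional complex Hilbert spaces while $\mathcal{H}$ and $\mathcal{G}$ are reserved for infinite-dimensional separable complex Hilbert spaces. Moreover, let $\mathcal{B}(\mathcal{X},\mathcal{Y})$, $\mathcal{U}(\mathcal{X},\mathcal{Y})$, $\mathcal{K}(\mathcal{X},\mathcal{Y})$, $\mathcal{F}(\mathcal{X},\mathcal{Y})$ and $\mathcal{B}^p(\mathcal{X},\mathcal{Y})$ denote the set of all bounded, unitary, compact, finite-rank and $p$-th Schatten-class operators between $\mathcal{X}$ and $\mathcal{Y}$, respectively. As usual, if $\mathcal{X}$ and $\mathcal{Y}$ coincide we simply write $\mathcal{B}(\mathcal{X})$, $\mathcal{U}(\mathcal{X})$, etc.
Scalar products are conjugate linear in the first argument and linear in the second one. For an arbitrary subset $S\subseteq\mathbb{C}$, the notations $\overline{S}$ and $\operatorname{conv}(S)$ stand for its closure and convex hull, respectively. Finally, given $p,q\in[1,\infty]$, we say $p$ and $q$ are conjugate if $\frac{1}{p}+\frac{1}{q}=1$.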
2.1. INFINITE-DIMENSIONAL HILBERT SPACES AND THE SCHATTEN CLASSES
For a comprehensive introduction to Hilbert spaces of infinite dimension as well as Schatten-class operators, we refer to, e.g., [1, 11] and [5]. Here, we recall only some basic results which will be used frequently throughout this paper.
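Lemma 2.1 (Schmidt decomposition).
For each $C\in\mathcal{K}(\mathcal{X},\mathcal{Y})$, there exists a decreasing null sequence $(s_n(C))_{n\in\mathbb{N}}$ in $[0,\infty)$ as well as orthonormal systems $(f_n)_{n\in\mathbb{N}}$ in $\mathcal{X}$ and $(g_n)_{n\in\mathbb{N}}$ in $\mathcal{Y}$ such that
$$C=\sum_{n=1}^{\infty}s_n(C)\langle f_n,\cdot\rangle g_n\,,\tag{2.1}$$
where the series converges in the operator norm.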
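As the singular numbers $(s_n(C))_{n\in\mathbb{N}}$ in Lemma 2.1 are uniquely determined by $C$, the $p$-th Schatten-class $\mathcal{B}^p(\mathcal{X},\mathcal{Y})$ is (well-)defined via
$$\mathcal{B}^p(\mathcal{X},\mathcal{Y}):=\Big\{C\in\mathcal{K}(\mathcal{X},\mathcal{Y})\,\Big|\,\sum_{n=1}^{\infty}s_n(C)^p<\infty\Big\}$$
for $p\in[1,\infty)$. The Schatten-$p$-norm
$$\|C\|_p:=\Big(\sum_{n=1}^{\infty}s_n(C)^p\Big)^{1/p}$$
turns $\mathcal{B}^p(\mathcal{X},\mathcal{Y})$ into a Banach space. Moreover, for $p=\infty$, we identify $\mathcal{B}^\infty(\mathcal{X},\mathcal{Y})$ with the set of all compact operators $\mathcal{K}(\mathcal{X},\mathcal{Y})$ equipped with the norm
$$\|C\|_\infty:=\sup_{n\in\mathbb{N}}s_n(C)=s_1(C)\,.$$
Note that $\|C\|_\infty$ coincides with the ordinary operator norm $\|C\|$. Hence $\mathcal{B}^\infty(\mathcal{X},\mathcal{Y})$ constitutes a closed subspace of $\mathcal{B}(\mathcal{X},\mathcal{Y})$ and thus a Banach space, too.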
Remark 2.2.
Evidently, if $C\in\mathcal{B}^p(\mathcal{X},\mathcal{Y})$ for some $p\in[1,\infty]$ then the series (2.1) converges in the Schatten-$p$-norm.
The following results can be found in [5, Coro. XI.9.4 & Lemma XI.9.9].
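Lemma 2.3.
- (a) Let $p\in[1,\infty]$. Then for all $S,T\in\mathcal{B}(\mathcal{X})$, $C\in\mathcal{B}^p(\mathcal{X})$:
$$\|SCT\|_p\leq\|S\|\,\|C\|_p\,\|T\|\,.$$
- (b) Let $1\leq p\leq q\leq\infty$. Then $\mathcal{B}^p(\mathcal{X},\mathcal{Y})\subseteq\mathcal{B}^q(\mathcal{X},\mathcal{Y})$ and $\|C\|_p\geq\|C\|_q$ for all $C\in\mathcal{B}^p(\mathcal{X},\mathcal{Y})$.
Note that due to (a), all Schatten-classes $\mathcal{B}^p(\mathcal{X})$ constitute, just like the compact operators, a two-sided ideal in the $C^*$-algebra of all bounded operators $\mathcal{B}(\mathcal{X})$.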
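The monotonicity of the Schatten norms in $p$ stated in Lemma 2.3 (b) can be illustrated on a finite list of singular values. The following is a minimal sketch under the assumption that the operator has finite rank with singular values $s_n=1/n$, $n=1,\ldots,1000$; the function name `schatten_norm` and the chosen values of $p$ are hypothetical.

```python
import math


def schatten_norm(singular_values, p):
    """Schatten-p-norm computed from a finite list of singular values."""
    if p == math.inf:
        # For p = infinity the norm is the largest singular value.
        return max(singular_values)
    return sum(s ** p for s in singular_values) ** (1.0 / p)


# Hypothetical finite-rank example: s_n = 1/n for n = 1, ..., 1000.
s = [1.0 / n for n in range(1, 1001)]
norms = {p: schatten_norm(s, p) for p in (1, 2, 4, math.inf)}

for p, value in norms.items():
    print(f"p = {p}: norm = {value:.6f}")
```

The printed values decrease as $p$ grows, in line with $\|C\|_p\geq\|C\|_q$ for $p\leq q$.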
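Now for any $C\in\mathcal{B}^1(\mathcal{X})$, the trace of $C$ is defined via
$$\operatorname{tr}(C):=\sum_{i\in I}\langle f_i,Cf_i\rangle\,,\tag{2.2}$$
where $(f_i)_{i\in I}$ can be any orthonormal basis of $\mathcal{X}$. The trace is well-defined, as one can show that the right-hand side of (2.2) is finite and does not depend on the choice of $(f_i)_{i\in I}$. Important properties are the following, cf. [5, Lemma XI.9.14].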
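Lemma 2.4.
Let $C\in\mathcal{B}^p(\mathcal{X},\mathcal{Y})$ and $T\in\mathcal{B}^q(\mathcal{Y},\mathcal{X})$ with $p,q\in[1,\infty]$ conjugate. Then one has $CT\in\mathcal{B}^1(\mathcal{Y})$ and $TC\in\mathcal{B}^1(\mathcal{X})$ with
$$\operatorname{tr}(CT)=\operatorname{tr}(TC)\quad\text{and}\quad|\operatorname{tr}(CT)|\leq\|C\|_p\|T\|_q\,.$$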
In order to recap the well-known diagonalization result for compact normal operators, we first have to fix the term eigenvalue sequence of a compact operator $T\in\mathcal{K}(\mathcal{H})$. In general, it is obtained by arranging the (necessarily countably many) non-zero eigenvalues in decreasing order with respect to their absolute value, where each eigenvalue is repeated as many times as its algebraic multiplicity calls for (see Footnote 1). If only finitely many non-vanishing eigenvalues exist, then the sequence is filled up with zeros, see [11, Ch. 15]. For our purposes, we have to pass to a slightly modified eigenvalue sequence as follows:
- If the range of $T$ is infinite-dimensional and the kernel of $T$ is finite-dimensional, then put $\dim(\ker T)$ zeros at the beginning of the eigenvalue sequence of $T$.
- If the range and the kernel of $T$ are infinite-dimensional, mix infinitely many zeros into the eigenvalue sequence of $T$. Because in Definition 2.12 arbitrary permutations will be applied to the modified eigenvalue sequence, we do not need to specify this mixing procedure further, cf. also [3, Lemma 3.6].
- If the range of $T$ is finite-dimensional, leave the eigenvalue sequence of $T$ unchanged.
Footnote 1: By [11, Prop. 15.12], every non-zero element $\lambda$ of the spectrum of $T$ is an eigenvalue of $T$ and has a well-defined finite algebraic multiplicity $\nu_a(\lambda)$, e.g., $\nu_a(\lambda):=\dim\ker(T-\lambda I)^{n_0}$, where $n_0$ is the smallest natural number such that $\ker(T-\lambda I)^{n_0}=\ker(T-\lambda I)^{n_0+1}$.
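Lemma 2.5 ([1], Thm. VIII.4.6).
Let $T\in\mathcal{K}(\mathcal{H})$ be normal, i.e. $T^\dagger T=TT^\dagger$. Then there exists an orthonormal basis $(e_n)_{n\in\mathbb{N}}$ of $\mathcal{H}$ such that
$$T=\sum_{n=1}^{\infty}\lambda_n(T)\langle e_n,\cdot\rangle e_n\,,$$
where $(\lambda_n(T))_{n\in\mathbb{N}}$ is the modified eigenvalue sequence of $T$.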
2.2. SET CONVERGENCE
In order to transfer results about convexity and star-shapedness of the $C$-numerical range from matrices to Schatten-class operators, we need a concept of set convergence. We will use the Hausdorff metric on compact subsets (of $\mathbb{C}$) and the associated notion of convergence, see, e.g., [13].
The distance between $z\in\mathbb{C}$ and a non-empty compact subset $A\subseteq\mathbb{C}$ is given by $d(z,A):=\min_{w\in A}|z-w|$, based on which the Hausdorff metric $\Delta$ on the set of all non-empty compact subsets of $\mathbb{C}$ is defined via
$$\Delta(A,B):=\max\Big\{\max_{z\in A}d(z,B),\,\max_{z\in B}d(z,A)\Big\}\,.$$
The following characterization of the Hausdorff metric is readily verified.
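Lemma 2.6.
Let $A,B\subseteq\mathbb{C}$ be two non-empty compact sets and let $\varepsilon>0$. Then $\Delta(A,B)\leq\varepsilon$ if and only if for all $z\in A$, there exists $w\in B$ with $|z-w|\leq\varepsilon$ and vice versa.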
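With this metric one can introduce the notion of convergence for sequences of non-empty compact subsets of $\mathbb{C}$ such that the maximum- as well as the minimum-operator are continuous in the following sense.
Lemma 2.7.
Let $(A_n)_{n\in\mathbb{N}}$ be a bounded sequence of non-empty, compact subsets of $\mathbb{R}$ which converges to $A$. Then the sequences of real numbers $(\max A_n)_{n\in\mathbb{N}}$ and $(\min A_n)_{n\in\mathbb{N}}$ are convergent with
$$\lim_{n\to\infty}\max A_n=\max A\quad\text{and}\quad\lim_{n\to\infty}\min A_n=\min A\,.$$
Proof.
Let $\varepsilon>0$. By assumption, there exists $N\in\mathbb{N}$ such that $\Delta(A_n,A)<\varepsilon$ for all $n\geq N$. Hence by Lemma 2.6 one finds $a_n\in A_n$ with $|\max A-a_n|<\varepsilon$ and thus $\max A<a_n+\varepsilon\leq\max A_n+\varepsilon$. Similarly, there exists $a\in A$ such that $|\max A_n-a|<\varepsilon$, so $\max A_n<a+\varepsilon\leq\max A+\varepsilon$. Combining both estimates, we get $|\max A-\max A_n|<\varepsilon$. The case of the minimum is shown analogously. ∎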
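To make the Hausdorff metric and the continuity statement of Lemma 2.7 concrete, here is a small sketch for finite subsets of the real line. The sets `A` and `A_n(n)` are hypothetical examples, and finite sets are used only because they are trivially compact; the helper names are not taken from the paper.

```python
def dist(z, A):
    """Distance between a point z and a finite non-empty set A."""
    return min(abs(z - w) for w in A)


def hausdorff(A, B):
    """Hausdorff distance between two finite non-empty sets of numbers."""
    return max(max(dist(z, B) for z in A), max(dist(z, A) for z in B))


# Hypothetical limit set and approximating sequence.
A = [0.0, 1.0]


def A_n(n):
    return [0.0, 0.5 / n, 1.0 + 1.0 / n]


for n in (1, 10, 100):
    delta = hausdorff(A_n(n), A)
    gap_max = abs(max(A_n(n)) - max(A))
    gap_min = abs(min(A_n(n)) - min(A))
    print(f"n = {n}: Hausdorff distance = {delta:.4f}, "
          f"|max A_n - max A| = {gap_max:.4f}, |min A_n - min A| = {gap_min:.4f}")
```

In this example the Hausdorff distance equals $1/n$, and the gaps between the maxima and between the minima never exceed it, which is the mechanism used in the proof above. Since `abs` also works for complex numbers, the same two helpers apply to finite subsets of $\mathbb{C}$.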
2.3. THE $C$-NUMERICAL RANGE OF SCHATTEN-CLASS OPERATORS
In this subsection, we present a few approximation results and collect some material on the $C$-numerical range of Schatten-class operators which is of fundamental importance in Section 3. Because said results appeared only in an addendum [4] to another publication [3] on trace-class operators, we decided to sketch the proof in the appendix for the readers’ convenience.
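Definition 2.8.
Let $p,q\in[1,\infty]$ be conjugate. Then for $C\in\mathcal{B}^p(\mathcal{X})$ and $T\in\mathcal{B}^q(\mathcal{X})$, the $C$-numerical range of $T$ is defined to be
$$W_C(T):=\{\operatorname{tr}(CU^\dagger TU)\,|\,U\in\mathcal{U}(\mathcal{X})\}\,.$$
Following (1.2), for $C\in\mathcal{B}^p(\mathcal{X},\mathcal{Y})$ and $T\in\mathcal{B}^q(\mathcal{Y},\mathcal{X})$ with $p,q\in[1,\infty]$ conjugate one may actually introduce the more general set (now invoking the unitary equivalence orbit $UTV$ of $T$ instead of the unitary similarity orbit $U^\dagger TU$)
$$S_C(T):=\{\operatorname{tr}(CUTV)\,|\,U\in\mathcal{U}(\mathcal{X}),\,V\in\mathcal{U}(\mathcal{Y})\}\,.$$
Note that all traces involved are well-defined due to Lemma 2.3 and 2.4.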
Lemma 2.9.
Let , and be a sequence in which converges strongly to . Then one has , , and for with respect to the norm .
Proof.
The cases and are proven in [3, Lemma 3.2]. As the proof for is essentially the same, we sketch only the major differences. First, choose such that
[TABLE]
where satisfies and for all . The existence of the constant is guaranteed by the uniform boundedness principle. Then decompose into with finite-rank. By Lemma 2.3 one has
[TABLE]
Thus, what remains is to choose such that for all . To this end, consider the estimate
[TABLE]
Then the strong convergence of yields such that
[TABLE]
for and all . This shows as . All other assertions are an immediate consequence of for and
[TABLE]
Proposition 2.10.
Let , with conjugate and let and be sequences in and , respectively, such that Then
[TABLE]
If, additionally, then
[TABLE]
Proof.
W.l.o.g. let for some –else all the involved sets would be trivial–so we may introduce the positive but (as seen via the reverse triangle inequality) finite numbers
[TABLE]
Let . By assumption there exists such that
[TABLE]
for all . We shall first tackle (2.3), as (2.4) can be shown in complete analogy. The goal will be to satisfy the assumptions of Lemma 2.6 in order to show for all .
Let so one finds , such that satisfies . Thus for by Lemma 2.3 and 2.4
[TABLE]
for all .
Similarly, let . Then for one finds , such that satisfies . Thus for we obtain
[TABLE]
The preceding proposition together with Lemma 2.9 immediately entails the next result.
Corollary 2.11.
Let , with conjugate. Then where is the orthogonal projection onto the span of the first elements of an arbitrarily chosen orthonormal basis of .
Here we used the well-known fact that the orthogonal projections strongly converge to the identity for , cf., e.g., [3, Lemma 3.2].
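Definition 2.12 ($C$-spectrum).
Let $p,q\in[1,\infty]$ be conjugate. Then, for $C\in\mathcal{B}^p(\mathcal{H})$ with modified eigenvalue sequence $(\lambda_n(C))_{n\in\mathbb{N}}$ and $T\in\mathcal{B}^q(\mathcal{H})$ with modified eigenvalue sequence $(\lambda_n(T))_{n\in\mathbb{N}}$, the $C$-spectrum of $T$ is defined via
$$P_C(T):=\Big\{\sum_{n=1}^{\infty}\lambda_n(C)\lambda_{\sigma(n)}(T)\,\Big|\,\sigma:\mathbb{N}\to\mathbb{N}\text{ is any permutation}\Big\}\,.$$
Hölder’s inequality and the standard estimate $\sum_{n=1}^{\infty}|\lambda_n(C)|^p\leq\sum_{n=1}^{\infty}s_n(C)^p$, cf. [11, Prop. 16.31], yield
$$\sum_{n=1}^{\infty}|\lambda_n(C)\lambda_{\sigma(n)}(T)|\leq\|C\|_p\|T\|_q\,,$$
showing that the elements of $P_C(T)$ are well-defined and bounded by $\|C\|_p\|T\|_q$.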
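The Hölder-type bound on the elements of the $C$-spectrum can be checked by brute force in a finite toy setting. The sketch below assumes two diagonal (hence normal) operators with finitely many eigenvalues, for which the $\ell^p$-norm of the eigenvalue list coincides with the Schatten-$p$-norm. The eigenvalue lists, the exponents and the helper name `p_norm` are hypothetical.

```python
import itertools


def p_norm(values, p):
    """l^p-norm of a finite list of (possibly complex) numbers, p finite."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


# Hypothetical eigenvalue lists of two diagonal operators.
lam_C = [2.0, -1.0 + 1.0j, 0.5j, 0.0]
lam_T = [1.0j, 3.0, -0.5, 0.25]

# Conjugate exponents: 1/3 + 1/1.5 = 1.
p, q = 3.0, 1.5
bound = p_norm(lam_C, p) * p_norm(lam_T, q)

# Finite analogue of the C-spectrum: one sum per permutation of the indices.
spectrum = [
    sum(c * lam_T[k] for c, k in zip(lam_C, sigma))
    for sigma in itertools.permutations(range(len(lam_T)))
]

print("largest modulus in the spectrum:", max(abs(z) for z in spectrum))
print("Hoelder bound:", bound)
```

Every one of the 24 sums has modulus below the product of the two norms, as the estimate predicts.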
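Now, if the operators $C$ and $T$ are particularly “nice”, one can connect the $C$-numerical range and the $C$-spectrum of $T$ as follows:
Theorem 2.13 ([4]).
Let $C\in\mathcal{B}^p(\mathcal{H})$ and $T\in\mathcal{B}^q(\mathcal{H})$ with $p,q\in[1,\infty]$ conjugate. Then the following statements hold.
- (a) $\overline{W_C(T)}$ is star-shaped with respect to the origin.
- (b) If either $C$ or $T$ is normal with collinear eigenvalues, then $\overline{W_C(T)}$ is convex.
- (c) If $C$ and $T$ both are normal, then $P_C(T)\subseteq W_C(T)\subseteq\operatorname{conv}(\overline{P_C(T)})$. If, in addition, the eigenvalues of $C$ or $T$ are collinear then $\overline{W_C(T)}=\operatorname{conv}(\overline{P_C(T)})$.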
As stated in the beginning, a sketch of the proof can be found in Appendix A.
3. MAIN RESULTS
Considering the inequalities (1.1) and (1.3) from the introduction, it arguably is easier to generalize the former, i.e. to generalize von Neumann’s “original” trace inequality to Schatten-class operators. To start with we first investigate the finite-rank case.
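Lemma 3.1.
Let $C\in\mathcal{F}(\mathcal{X},\mathcal{Y})$, $T\in\mathcal{F}(\mathcal{Y},\mathcal{X})$ and $k:=\max\{\operatorname{rk}(C),\operatorname{rk}(T)\}$. Then $S_C(T)=K_r(0)$ where $r:=\sum_{j=1}^{k}s_j(C)s_j(T)$.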
Proof.
Defining as above, Lemma 2.1 yields orthonormal systems , in and , in such that
[TABLE]
Note that forcing both sums to have same summation range means that, potentially, some of the singular values have to be complemented by zeros, which is not of further importance.
“$\subseteq$”: Let any $U\in\mathcal{U}(\mathcal{X})$, $V\in\mathcal{U}(\mathcal{Y})$ be given. Then
[TABLE]
by direct computation. Now consider the subspaces
[TABLE]
so there exist orthonormal bases of the form
[TABLE]
of and for some , respectively. W.l.o.g. (this can be done, for example, by sufficiently expanding the “smaller” orthonormal systems in or and possibly passing to new subspaces or , which is always doable because we are in infinite dimensions; the particular choice of and is irrelevant because we only need the orthonormal systems which represent and to be contained within these finite-dimensional subspaces) we can assume and define
[TABLE]
for . This yields matrices
[TABLE]
which satisfy . By construction, one readily verifies that are orthonormal systems in so for all . Thus von Neumann’s original result (1.1) yields
[TABLE]
“$\supseteq$”: We first consider unitary operators , such that and for all . This is always possible by completing the respective orthonormal systems , to orthonormal bases , which can then be transformed into each other via some unitary. This allows us to construct such that
[TABLE]
for any , . Of course and the latter satisfies
- •
: choose , and also
- •
: choose , as cyclic shift on the first basis elements, i.e.
[TABLE]
and similarly (on ).
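Now because the unitary group on any Hilbert space is path-connected (see Footnote 3) and because the mapping $(U,V)\mapsto\operatorname{tr}(CUTV)$ is continuous, the image $S_C(T)$ has to be path-connected as well. In particular, $0$ and $r$ are path-connected within $S_C(T)$, i.e. for every $s\in[0,r]$ there exists $z\in S_C(T)$ such that $|z|=s$.
Finally, we can use the fact that $S_C(T)$ is circular, which follows easily by replacing $U$ by $e^{i\varphi}U$ with $\varphi\in[0,2\pi]$, to conclude $K_r(0)\subseteq S_C(T)$ and thus $S_C(T)=K_r(0)$. ∎
Footnote 3: The standard argument for this goes as follows, cf. [16, Proof of Thm. 12.37]: For every unitary $U$ there exists a self-adjoint operator $H$ such that $U=e^{iH}$. Then $t\mapsto e^{itH}$ is a continuous mapping of $[0,1]$ into the unitary group with $e^{i0H}=I$ and $e^{i1H}=U$. Thus every unitary operator is path-connected to the identity, which implies path-connectedness of the unitary group.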
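Theorem 3.2.
Let $C\in\mathcal{B}^p(\mathcal{X},\mathcal{Y})$, $T\in\mathcal{B}^q(\mathcal{Y},\mathcal{X})$ with $p,q\in[1,\infty]$ conjugate. Then
$$\sup_{U\in\mathcal{U}(\mathcal{X}),\,V\in\mathcal{U}(\mathcal{Y})}|\operatorname{tr}(CUTV)|=\sum_{j=1}^{\infty}s_j(C)s_j(T)\,.\tag{3.1}$$
In particular, one has $\overline{S_C(T)}=K_r(0)$ with $r:=\sum_{j=1}^{\infty}s_j(C)s_j(T)$.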
Proof.
By Lemma 2.1 , for some orthonormal systems , in and , in . This allows us to define finite rank approximations and To pass to the original operators , we use Remark 2.2 to see
[TABLE]
Because of this we may apply Proposition 2.10 and Lemma 3.1 to obtain
[TABLE]
with . Using the obvious fact for all one readily verifies with . ∎
Remark 3.3.
To see that the supremum in (3.1) is not necessarily a maximum, consider with standard basis . Now the positive definite trace-class operator as well as the compact operator satisfy
[TABLE]
for any . We know that but if this was a maximum, then by the above calculation for all . The only operators which satisfy these conditions are the left- and the right-shift, respectively, both of which are not unitary–a contradiction.
Finally, we are prepared to extend inequality (1.3) to Schatten-class operators on separable Hilbert spaces.
Theorem 3.4.
Let , both be self-adjoint with conjugate and let the positive semi-definite operators and denote the positive and negative part of , respectively (i.e. , ). Then
[TABLE]
as well as
[TABLE]
In particular, one has:
$$-\sum_{j=1}^{\infty}\big(\lambda_{j}^{\downarrow}(C^{+})\lambda_{j}^{\downarrow}(T^{-})+\lambda_{j}^{\downarrow}(C^{-})\lambda_{j}^{\downarrow}(T^{+})\big)\leq\operatorname{tr}(CT)\leq\sum_{j=1}^{\infty}\big(\lambda_{j}^{\downarrow}(C^{+})\lambda_{j}^{\downarrow}(T^{+})+\lambda_{j}^{\downarrow}(C^{-})\lambda_{j}^{\downarrow}(T^{-})\big)$$
Proof.
Let , both be self-adjoint with conjugate and first assume that has at most non-zero eigenvalues. Then the following is straightforward to show:
[TABLE]
Note that in this case the (modified) eigenvalue sequences of contains infinitely many zeros. Now let us address the general case. Choose any orthonormal eigenbasis of with corresponding modified eigenvalue sequence (Lemma 2.5). Moreover, let the projection onto the span of the first eigenvectors of . Then has at most non-zero eigenvalues and our preliminary considerations combined with Corollary 2.11 and Theorem 2.13 (c) as well as Lemma 2.7 readily imply
[TABLE]
where we used the identity . Now, the last step is to show that converges to . Let (and w.l.o.g. ). As is a sequence in we find with
[TABLE]
where for , the left-hand side becomes .
Either way, associated to this one can choose such that the first largest eigenvalues of are listed in and thus for all . Putting things together and using Hölder’s inequality yields
[TABLE]
The case of as well as the infimum-estimate are shown analogously which concludes the proof. ∎
Therefore if are self-adjoint (i.e. ), a path-connectedness argument similar to the proof of Lemma 3.1 shows with () given by (3.3) and () given by (3.2). In particular, .
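The two-sided estimate on $\operatorname{tr}(CT)$ from Theorem 3.4 can be illustrated in a finite-dimensional toy case. The sketch below assumes real diagonal matrices $C=\operatorname{diag}(c)$ and $T=\operatorname{diag}(t)$ and conjugates $T$ only by permutation matrices, so that the trace becomes $\sum_j c_j t_{\sigma(j)}$. The data and the helper names `pos_part`, `neg_part` and `paired` are hypothetical.

```python
import itertools


def pos_part(values):
    """Eigenvalues of the positive part, sorted decreasingly."""
    return sorted((max(v, 0.0) for v in values), reverse=True)


def neg_part(values):
    """Eigenvalues of the negative part, sorted decreasingly."""
    return sorted((max(-v, 0.0) for v in values), reverse=True)


def paired(x, y):
    """Sum of products of two equally long, already sorted lists."""
    return sum(u * v for u, v in zip(x, y))


# Hypothetical eigenvalues of two self-adjoint diagonal matrices.
c = [2.0, -1.0, 0.5, -3.0]
t = [1.0, -2.0, 0.0, 4.0]

upper = paired(pos_part(c), pos_part(t)) + paired(neg_part(c), neg_part(t))
lower = -(paired(pos_part(c), neg_part(t)) + paired(neg_part(c), pos_part(t)))

# Traces tr(C P T P^T) over all permutation matrices P.
traces = [
    sum(c[j] * t[sigma[j]] for j in range(len(c)))
    for sigma in itertools.permutations(range(len(t)))
]

print("lower bound:", lower, " smallest trace:", min(traces))
print("upper bound:", upper, " largest trace:", max(traces))
```

For this particular data both bounds are attained by a permutation. In general the finite-dimensional bound (1.3) can be sharper than the one built from positive and negative parts, so attainment should not be expected for arbitrary input.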
Acknowledgements.
This work was supported by the Bavarian excellence network enb via the International PhD Programme of Excellence Exploring Quantum Matter (exqm).
4. APPENDIX
A. PROOF OF THEOREM 2.13
The overall idea is to transfer properties of $W_C(T)$ from finite to infinite dimensions via the set convergence introduced in Section 2.2. However, we first need two auxiliary results to characterize the star-center of $\overline{W_C(T)}$ later on.
Lemma 4.1.
Let and be any orthonormal system in . Then
- (a)
* for all and*
- (b)
**
Proof.
(a) Consider a Schmidt decomposition of so
[TABLE]
Defining for all , using Cauchy-Schwarz and Bessel’s inequality one gets
[TABLE]
for all . On the other hand, said inequalities also imply
[TABLE]
Hence, because is decreasing by construction, an upper bound of is obtained by choosing and whenever . This shows the desired inequality. A proof of (b) can be found, e.g., in [11, Lemma 16.17]. ∎
Lemma 4.2.
Let with and let such that are conjugate. Furthermore, let be any orthonormal system in . Then
[TABLE]
Proof.
First, let , so . As is compact, by Lemma 4.1 (b) one has , hence the sequence of arithmetic means converges to zero as well. Next, let and . Moreover, we assume w.l.o.g. so . As , one can choose such that and moreover such that for all . Then, for any , Lemma 4.1 and Hölder’s inequality yield the estimate
[TABLE]
What we also need is some mechanism to associate bounded operators on with matrices. In doing so, let be some orthonormal basis of and let be the standard basis of . For any we define , and its linear extension to all of . With this, let
[TABLE]
be the operator which “cuts out” the upper block of (the matrix representation of) with respect to . The key result now is the following:
Proposition 4.3.
Let , with conjugate be given. Furthermore, let and be arbitrary orthonormal bases of . Then
[TABLE]
where and are the maps given by (4.1) with respect to and , respectively. Moreover, if $C$ and $T$ are both normal then
[TABLE]
where and are the orthonormal bases of which diagonalize and , respectively.
Proof.
For (or vice versa) proofs are given in [3, Thm. 3.1 & 3.6] which can be adjusted to by minimal modifications. ∎
With these preparations we are ready to prove our main result about the $C$-numerical range of Schatten-class operators.
Proof of Theorem 2.13.
(a): For arbitrary orthonormal bases , of as well as any , it is readily verified that
[TABLE]
Both factors converge and, by Lemma 4.2, at least one of them goes to $0$ as $k\to\infty$. Moreover, is star-shaped with respect to for all , cf. [2, Thm. 4]. Because Hausdorff convergence preserves star-shapedness [3, Lemma 2.5 (d)], Proposition 4.3 implies that $\overline{W_C(T)}$ is star-shaped with respect to $0$.
For what follows let be the orthonormal bases of which diagonalize and , respectively.
(b): W.l.o.g. let be normal with collinear eigenvalues. Since is compact (i.e. its eigenvalue sequence is a null sequence) there exists such that is self-adjoint and by Proposition 4.3 we obtain
[TABLE]
Moreover, as is hermitian for all we conclude that is convex, cf. [15]. The fact that Hausdorff convergence preserves convexity [3, Lemma 2.5 (c)] then yields the desired result.
(c): The inclusion is shown exactly like [3, Thm. 3.4–first inclusion]. For the second inclusion, we note that by assumption and are diagonal and thus normal for all . Hence [19, Coro. 2.4] tells us
[TABLE]
for all . Using that Hausdorff convergence preserves inclusions [3, Lemma 2.5 (a)], (4.2) together with Proposition 4.3 yields
[TABLE]
Finally, applying the closure and the convex hull to the inclusions yields , where the last equality is due to (b), and thus . ∎
REFERENCES
- [1] S. Berberian, Introduction to Hilbert Space, Amer. Math. Soc., Chelsea 1976.
- [2] W.S. Cheung, N.K. Tsing, The $C$-Numerical Range of Matrices is Star-Shaped, Lin. Multilin. Alg., 41 (1996), 245–250.
- [3] G. Dirr, F. vom Ende, The $C$-Numerical Range in Infinite Dimensions, Lin. Multilin. Alg., 2018, In press: https://doi.org/10.1080/03081087.2018.1515884.
- [4] G. Dirr, F. vom Ende, Authors’ Addendum to "The C-Numerical Range in Infinite Dimensions", Lin. Multilin. Alg., 2019, In press: https://doi.org/10.1080/03081087.2019.1604624.
- [5] N. Dunford, J. Schwartz, Linear Operators: Spectral Theory, Pure and Applied Mathematics, Interscience Publishers, New York 1963.
- [6] K. Fan, Maximum Properties and Inequalities for the Eigenvalues of Completely Continuous Operators, Proc. Natl. Acad. Sci. USA, 37 (1951), 760–766.
- [7] S.J. Glaser, T. Schulte-Herbrüggen, M. Sieveking, et al., Unitary Control in Quantum Ensembles: Maximising Signal Intensity in Coherent Spectroscopy, Science, 280 (1998), 421–424.
- [8] G.H. Golub, C.F. van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore 1989.
