This paper investigates inequalities between various quasi-arithmetic and quasi-geometric means of positive semidefinite matrices, establishing conditions on parameters for these inequalities to hold under different matrix orderings.
Contribution
It introduces new inequalities between matrix means and determines necessary and sufficient conditions for these inequalities across different matrix orderings.
Findings
01 Derived conditions for inequalities between matrix means.
02 Extended classical inequalities to matrix settings.
03 Analyzed inequalities under multiple matrix orderings.
Abstract
In this paper, for 0<α<1, p>0 and positive semidefinite matrices A,B≥0, we consider the quasi-extension Aα,p(A,B):=((1−α)A^p+αB^p)^{1/p} of the α-weighted arithmetic matrix mean, and the quasi-extensions Mα,p(A,B):=Mα(A^p,B^p)^{1/p} of several different α-weighted geometric-type matrix means Mα(A,B) such as the α-weighted geometric mean in Kubo and Ando's sense and two types of α-weighted version of Fiedler and Pták's spectral geometric mean, as well as the Rényi mean and the α-weighted Log-Euclidean mean. For these we examine the inequalities Aα,p(A,B)◃Aα,q(A,B) and Mα,p(A,B)◃Aα,q(A,B) of arithmetic-geometric type, where ◃ is one of…
Taxonomy
Topics: Mathematical Inequalities and Applications · Optimization and Variational Analysis · Matrix Theory and Algorithms
Full text
Various inequalities between quasi-arithmetic mean
1 Graduate School of Information Sciences, Tohoku University,
Aoba-ku, Sendai 980-8579, Japan
Abstract
In this paper, for 0<α<1, p>0 and positive semidefinite matrices A,B≥0, we consider the
quasi-extension Aα,p(A,B):=((1−α)A^p+αB^p)^{1/p} of the
α-weighted arithmetic matrix mean, and the quasi-extensions
Mα,p(A,B):=Mα(A^p,B^p)^{1/p} of several different α-weighted
geometric-type matrix means Mα(A,B) such as the α-weighted geometric mean
in Kubo and Ando’s sense and two types of α-weighted version of Fiedler and Pták’s spectral
geometric mean, as well as the Rényi mean and the α-weighted Log-Euclidean mean. For these
we examine the inequalities Aα,p(A,B)◃Aα,q(A,B) and
Mα,p(A,B)◃Aα,q(A,B) of arithmetic-geometric type, where
◃ is one of several different matrix orderings ranging from the strongest Loewner order to the
weakest order determined by the trace inequality. For each choice of the above inequalities, our goal is to
obtain, where possible, the necessary and sufficient condition on p,q,α under which the inequality holds
for all A,B≥0.
For positive semidefinite matrices (and operators) the most developed two-variable operator/matrix means
are Kubo and Ando’s operator means [29], defined for operator monotone functions
f≥0 on [0,∞) with f(1)=1 as
A σf B := A^{1/2} f(A^{−1/2}BA^{−1/2}) A^{1/2}
for positive definite matrices (and operators) A,B>0 and extended to general positive semidefinite
A,B≥0 as A σf B := lim_{ε↘0} (A+εI) σf (B+εI). Among them the most
typical ones are the α-weighted arithmetic, harmonic and geometric means ▽α,
!α and #α for 0≤α≤1, whose definitions for A,B>0 are
[TABLE]
The so-called arithmetic-geometric-harmonic mean inequality in the Loewner order is
[TABLE]
Geometric-type matrix means not in Kubo and Ando’s sense have recently been in active consideration in
matrix analysis, partly motivated by recent development of quantum divergences in quantum information.
The most familiar one is the α-weighted Log-Euclidean mean
[TABLE]
for A,B>0. The spectral geometric mean
[TABLE]
for A,B>0 was formerly introduced in [15] and extended in [30] to the α-weighted version
as
[TABLE]
which has recently been reconsidered in [17, 19, 16] from a new perspective. Another new
α-weighted version of the spectral geometric mean was also recently introduced in [11] as
[TABLE]
Yet another geometric-type mean discussed in [22, 13] is
[TABLE]
with two parameters α∈(0,1) and z>0, which is called the Rényi mean because of its close
relation with the α-z-Rényi divergence [6].
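The displayed formulas of this introduction did not survive in this copy. As a hedged reconstruction (not a verbatim quote of the paper), the three Kubo and Ando means and their inequality can be read off from the definitions repeated in Section 2.1, while the Log-Euclidean mean, the spectral geometric means of [15] and [30], and the Rényi mean are given below in their standard forms from the cited literature, for A,B>0. The second weighted spectral geometric mean from [11] is left out because its formula cannot be recovered from the surrounding text.

```latex
\begin{aligned}
A \triangledown_\alpha B &:= (1-\alpha)A + \alpha B, \\
A \,!_\alpha\, B &:= \bigl((1-\alpha)A^{-1} + \alpha B^{-1}\bigr)^{-1}, \\
A \,\#_\alpha\, B &:= A^{1/2}\bigl(A^{-1/2} B A^{-1/2}\bigr)^{\alpha} A^{1/2}, \\
A \,!_\alpha\, B &\le A \,\#_\alpha\, B \le A \triangledown_\alpha B, \\
LE_\alpha(A,B) &:= \exp\bigl((1-\alpha)\log A + \alpha \log B\bigr), \\
F(A,B) &:= (A^{-1}\# B)^{1/2}\, A\, (A^{-1}\# B)^{1/2}, \\
F_\alpha(A,B) &:= (A^{-1}\# B)^{\alpha}\, A\, (A^{-1}\# B)^{\alpha}, \\
R_{\alpha,z}(A,B) &:= \bigl(A^{\frac{1-\alpha}{2z}} B^{\frac{\alpha}{z}} A^{\frac{1-\alpha}{2z}}\bigr)^{z}.
\end{aligned}
```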
For any matrix mean M(A,B) we have its quasi-extension M(A^p,B^p)^{1/p} with a parameter p>0.
For 0<α<1 and p>0 the quasi-extension
(A^p▽αB^p)^{1/p}=((1−α)A^p+αB^p)^{1/p} of the α-weighted arithmetic
mean has been discussed by many authors under the name “operator power mean.” We note that
LEα(A,B) is invariant under taking the quasi-extension. In the present paper we denote the
quasi-extensions of A▽αB, A#αB, A!αB, Fα(A,B) and
F~α(A,B) respectively by Aα,p(A,B), Gα,p(A,B), Hα,p(A,B),
SGα,p(A,B) and SG~α,p(A,B). We also write Rα,p(A,B) for
Rα,z(A,B) with parameter p=1/z instead of z. See Section 2.1 for more details on
these quasi-extended matrix means. The quasi matrix means Gα,p, SGα,p,
SG~α,p, Rα,p as well as LEα are referred to as “quasi-geometric type
matrix means” because all of them reduce to A^{1−α}B^α when AB=BA.
The main aim of this paper is to extend the arithmetic-geometric mean inequality in (1.1) to those
for the quasi arithmetic mean and the quasi-geometric type means mentioned above. A special feature of
our study is that we consider not only the Loewner order (X≤Y for X,Y≥0) but also several other
weaker orderings such as the chaotic order (denoted by X≤chaoY), the near order recently introduced
in [12] (denoted by X≤nearY), the entrywise eigenvalue order (denoted by X≤λY),
the weak majorization (X≺wY) and the trace inequality TrX≤TrY (denoted by X≤TrY).
See Section 2.2 for the explicit definitions of these orderings. It may also be stressed that we deal
with quasi matrix means for general positive semidefinite matrices, whereas most references restrict to positive definite
matrices. Our goal is to obtain, where possible, the necessary and sufficient condition on
p,q,α under which the inequality Mα,p(A,B)◃Aα,q(A,B) holds for all
A and B, when Mα,p is one of the quasi-geometric type matrix means and ◃ is
one of the above matrix orderings.
The structure of the paper is as follows. In Section 2.1, for 0≤α≤1 and p>0 we review
the definitions of the above stated quasi matrix means Aα,p(A,B), Gα,p(A,B), etc. for
general positive semidefinite matrices A,B≥0 (with the support condition s(A)≥s(B) for
SGα,p and SG~α,p). Some general basic facts on these quasi matrix means
are summarized. In Section 2.2 we review the above mentioned matrix orderings and summarize
some of their basic properties, such as the strength relationship between them. In particular, for any quasi
matrix mean Mα,p and any matrix ordering ◃ among those stated above, we show
(Theorem 2.6) that if Mα,p(A,B)◃Aα,q(A,B) holds for all A,B>0,
then the same holds for all A,B≥0 (with s(A)≥s(B) for Mα,p=SGα,p,
SG~α,p). Section 3 provides some technical computations for specified
2×2 positive definite matrices as lemmas, which are repeatedly used in Section 4.
The main Section 4 is divided into six subsections. When Mα,p is respectively
Aα,p, LEα, Rα,p, Gα,p, SGα,p and
SG~α,p, we examine the inequality Mα,p◃Aα,q for any
choice ◃ of the above matrix orderings. At the end of each subsection a table surveying the
results of the subsection is attached for the reader’s convenience. Finally, in Section 5 several
remarks are given to supplement the characteristics and motivation of our study, additional facts, open
questions, etc. The paper contains an appendix on the Lie–Trotter–Kato product formula for operator means,
which is used in Section 2. This formula for positive semidefinite matrices is expected to be
quite useful, although it has not been published anywhere in its complete form.
2 Preliminaries
For each n∈N we write Mn for the n×n complex matrices. Let Mn+ and
Mn++ be the positive semidefinite n×n matrices and the positive definite n×n
matrices, respectively. We simply write A≥0 for A∈Mn+ and A>0 for A∈Mn++. The
n×n identity matrix is denoted by In or simply I. Let Tr be the usual trace on Mn and
∥X∥∞ be the operator norm of X∈Mn. We write “for all A,B≥0” to mean
“for all A,B∈Mn+ with any n∈N”, and “for all A,B>0” to mean “for all A,B∈Mn++
with any n∈N”. For A≥0 there are two possible conventions for A^0: one is A^0:=I
and the other is A^0:=s(A), the support projection of A. In our discussions below we adopt the latter
convention. We write A^{−1} for the generalized inverse of A, i.e., the inverse of A under the
restriction to the support of A. Moreover, for p<0 we define A^p:=(A^{−1})^{−p}.
In this preliminary section we will explain several examples of quasi matrix means and different notions of
matrix orders, which provide the basis for our discussions in the main Section 4.
2.1 Quasi matrix means
We first enumerate the definitions of quasi extensions of several binary matrix means for matrices, whose
order properties will be discussed in this paper. Let 0≤α≤1 and p>0. Let A,B∈Mn+.
(i) The α-weighted arithmetic mean is A▽αB:=(1−α)A+αB. Define
the quasi α-weighted arithmetic mean by
[TABLE]
which is also called the (α-weighted) matrix power mean.
(ii) For A,B>0 the α-weighted harmonic mean is
A!αB:=((1−α)A^{−1}+αB^{−1})^{−1}, extended to general A,B≥0 as
A!αB:=lim_{ε↘0}(A+εI)!α(B+εI). Define the
quasi α-weighted harmonic mean by
[TABLE]
(iii) For A,B>0 the α-weighted geometric mean is
A#αB:=A^{1/2}(A^{−1/2}BA^{−1/2})^α A^{1/2}, extended to A,B≥0 as
A#αB:=lim_{ε↘0}(A+εI)#α(B+εI). The geometric mean # (=#1/2) was first introduced by Pusz and Woronowicz [33], and #α, together with
the above ▽α and !α, is a typical example of Kubo and Ando’s operator means
[29]. Define the quasi α-weighted geometric mean by
[TABLE]
(iv) For A,B>0 the spectral geometric mean due to Fiedler and Pták [15] is
[TABLE]
which was extended to the α-weighted version in [30] as
[TABLE]
For recent study of the weighted spectral
geometric mean, see, e.g., [28, 17, 19, 16] where Fα(A,B) is denoted by A♮αB.
We define the quasi α-weighted spectral geometric mean by
[TABLE]
Although the definitions of Fα and SGα,p are available for any A,B≥0 with A^{−p}
in the generalized sense, they are meaningful only as long as s(A)≥s(B). Indeed, in this case we have
SGα,p(A,B)=lim_{ε↘0}SGα,p(A+εI,B+εI) as verified in
Proposition 2.2 below. When s(A)≤s(B), the situation is similar by interchanging A,B since
SGα,p(A,B)=SG1−α,p(B,A) for A,B>0 (see Remark 2.1(2) below). In this paper
we will consider SGα,p(A,B) for A,B≥0 with s(A)≥s(B).
(v) For A,B>0 another new weighted version of the spectral geometric mean was recently introduced in
[11] as
[TABLE]
We define the quasi version of this by
[TABLE]
Note that F~1/2=F1/2 (=F) and so SG~1/2,p=SG1/2,p for all p>0.
The definitions of F~α and SG~α,p are meaningful for any A,B≥0
with s(A)≥s(B) similarly to SGα,p in (iv); see Proposition 2.2 below. We will use
SG~α,p as well as SGα,p in this situation.
(vi) We consider one more quasi matrix mean defined for all A,B≥0 by
[TABLE]
which is called the Rényi mean in [13]. Note that the trace function Tr Rα,1/z(B,A)
(for general α,z>0) appears as the main component in the definition of the
α-z-Rényi divergence (see [6, 22]) in quantum information, which is the reason for
the terminology.
(vii) For A,B>0 the Log-Euclidean mean is
[TABLE]
Restricting to 0<α<1 we extend this to general A,B≥0 as
[TABLE]
where P0:=s(A)∧s(B). This extended definition is justified by Proposition 2.2 and
Theorem A.1 in Appendix A. Note that LEα(A^p,B^p)^{1/p}=LEα(A,B) holds
for all p>0, so we have no quasi extension for LEα.
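For positive definite inputs, several of the definitions above translate directly into eigendecomposition-based code. The following is a minimal NumPy sketch, not the authors' implementation: the helper names (`fun`, `mpow`, `arith`, `harm`, `geo`, `renyi`, `log_euclid`) are hypothetical, and the formulas used for Rα,p and LEα are the standard ones, (A^{(1−α)p/2}B^{αp}A^{(1−α)p/2})^{1/p} and exp((1−α)log A+α log B), assumed here because the displayed definitions are missing from this copy. The final check illustrates that for commuting A,B the geometric-type means reduce to A^{1−α}B^α.

```python
import numpy as np

def fun(A, f):
    """Apply a scalar function f to a symmetric matrix A (functional calculus)."""
    w, V = np.linalg.eigh((A + A.T) / 2)  # symmetrize against rounding noise
    return (V * f(w)) @ V.T

def mpow(A, r):
    """Real power A^r of a positive definite matrix."""
    return fun(A, lambda w: w ** r)

def arith(A, B, alpha, p):
    """Quasi arithmetic mean ((1-alpha) A^p + alpha B^p)^(1/p)."""
    return mpow((1 - alpha) * mpow(A, p) + alpha * mpow(B, p), 1 / p)

def harm(A, B, alpha, p):
    """Quasi harmonic mean ((1-alpha) A^-p + alpha B^-p)^(-1/p)."""
    return mpow((1 - alpha) * mpow(A, -p) + alpha * mpow(B, -p), -1 / p)

def geo(A, B, alpha, p):
    """Quasi geometric mean (A^p #_alpha B^p)^(1/p)."""
    Ah, Aih = mpow(A, p / 2), mpow(A, -p / 2)
    return mpow(Ah @ mpow(Aih @ mpow(B, p) @ Aih, alpha) @ Ah, 1 / p)

def renyi(A, B, alpha, p):
    """Renyi mean in the assumed form (A^((1-alpha)p/2) B^(alpha p) A^((1-alpha)p/2))^(1/p)."""
    Ah = mpow(A, (1 - alpha) * p / 2)
    return mpow(Ah @ mpow(B, alpha * p) @ Ah, 1 / p)

def log_euclid(A, B, alpha):
    """Log-Euclidean mean in the assumed form exp((1-alpha) log A + alpha log B)."""
    return fun((1 - alpha) * fun(A, np.log) + alpha * fun(B, np.log), np.exp)

# Commuting (diagonal) inputs: every geometric-type mean gives A^(1-alpha) B^alpha.
A, B, alpha, p = np.diag([1.0, 4.0]), np.diag([9.0, 16.0]), 0.5, 2.0
expected = np.diag([3.0, 8.0])
for mean in (geo(A, B, alpha, p), renyi(A, B, alpha, p), log_euclid(A, B, alpha)):
    assert np.allclose(mean, expected)
```

The sketch handles only A,B>0; the support-projection conventions for singular matrices described above are not implemented.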
In the next remark we collect a few general simple facts on the quasi matrix means defined in (i)–(vii)
above.
Remark 2.1.
(1) For Mα,p∈{Aα,p,Hα,p,Gα,p}, it is obvious by definition that
M0,p(A,B)=A and M1,p(A,B)=B for all A,B≥0. From our definitions in (iv) and (v) it is easy
to verify that for any A,B≥0 with s(A)≥s(B) and any p>0,
[TABLE]
Also, note that R0,p(A,B)=(A^{p/2}s(B)A^{p/2})^{1/p} and R1,p(A,B)=(s(A)B^p s(A))^{1/p} for all
A,B≥0 and p>0. Hence the cases α=0,1 are trivial (or quite simple) for our purpose, so
we will concentrate on 0<α<1 below.
(2) Let 0<α<1 and p>0. It is clear that any Mα,p from Aα,p, Hα,p,
Gα,p, LEα is symmetric in the sense that Mα,p(A,B)=M1−α,p(B,A)
for all A,B≥0. This is the case also for SGα,p when restricted to A,B>0; see [30].
However, this is not the case for SG~α,p.
(3) It is obvious that Aα,p and Hα,p are transformed into each other by taking inverses,
that is, Aα,p(A^{−1},B^{−1})=Hα,p(A,B)^{−1} for all A,B>0 and p>0. It is also
immediate to see that any Mα,p of the quasi matrix means given in (iii)–(vii) is invariant under
inversion, i.e., for any A,B>0,
[TABLE]
(4) We note that among the quasi matrix means in (i)–(vii), only Aα,1, Hα,1 and
Gα,1 for 0≤α≤1 are Kubo and Ando’s operator means. Indeed, although the function
Aα,p(1,x)=(1−α+αx^p)^{1/p} is operator monotone on (0,∞) for 0≤α≤1
and 0<p≤1, the corresponding operator mean is
[TABLE]
which is obviously different from Aα,p(A,B) except for α=0,1 or p=1. The situation for
Hα,p is similar. For each M from Gα,p, SGα,p,
SG~α,p, Rα,p, LEα, the function M(1,x)=x^α is operator
monotone on (0,∞) for 0≤α≤1 but the corresponding operator mean is #α, i.e.,
Gα,1.
Proposition 2.2.
Let 0<α<1, p>0, and M be any of the quasi matrix means given in (i)–(vii). Let A,B≥0,
with the additional assumption s(A)≥s(B) when M=SGα,p or SG~α,p.
Then we have
[TABLE]
Proof.
When M=Aα,p or Hα,p or Gα,p, it is easy to verify the result from the
downward continuity of Kubo and Ando’s operator means since (A+εI)^p↘A^p and
(B+εI)^p↘B^p as ε↘0. For M=Rα,p the result is obvious, and for
M=LEα it was verified in [25, Sec. 4]; see also Remark A.2 in Appendix A.
For the remaining cases, we first consider the case A>0. Since
[TABLE]
we see that SGα,p(A+εI,B+εI)→SGα,p(A,B) as ε↘0. Similarly we
have SG~α,p(A+εI,B+εI)→SG~α,p(A,B) too. For general
A,B≥0 with s(A)≥s(B) take the decomposition ℂ^n=H1⊕H2 where H1 is the
range of s(A) and H2:=H1^⊥. For M=SGα,p or SG~α,p
we can write
[TABLE]
where Ik:=I_{Hk}, k=1,2. From the case shown above it follows that
M(A1+εI1,B1+εI1)→M(A1,B1) as ε↘0. Hence the assertion follows.
∎
The next theorem says that the quasi matrix means in (i)–(vi) satisfy the Lie–Trotter–Kato product formula.
So we may consider the Log-Euclidean mean LEα as a sort of attractor for those quasi matrix
means. Note that when A,B>0, the proof is much simpler without use of Theorem A.1 and
Remark A.4.
Theorem 2.3.
Let 0<α<1, p>0, and Mα,p be any of the quasi matrix means given in (i)–(vi). Let
A,B≥0, with the additional assumption s(A)≥s(B) when Mα,p=SGα,p or
SG~α,p. Then we have
[TABLE]
Proof.
The convergences for Aα,p, Hα,p and Gα,p are special cases of
Theorem A.1, and that for Rα,p follows from (A.2) in Appendix A. For
the remaining cases, assume that s(A)≥s(B). Let P0:=s(A)∧s(B)=s(B) and H0 be the range
of P0. Note that SGα,p(A,B)={(A^{−p}#B^p)^α A^p (A^{−p}#B^p)^α}^{1/p} is supported on
H0. Set L:=(1/2){−P0(logA)P0+P0(logB)P0}, where logA is defined in the
generalized sense as logA:=s(A)(logA) and similarly for logB. Then by (A.25) in
Remark A.4 one can write A^{−p}#B^p=P0+pL+o(p) as p↘0, which implies that
(A^{−p}#B^p)^α=P0+pαL+o(p). Therefore, one has
[TABLE]
Applying this to the Taylor expansion of log(1+x) gives
[TABLE]
which yields the assertion for SGα,p. The proof for SG~α,p is similar and is
omitted.
∎
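The convergence to the Log-Euclidean mean can be observed numerically for the quasi arithmetic mean. The sketch below is an illustration only: the helper names and the test matrices are hypothetical, the Log-Euclidean mean is taken in its standard form exp((1−α)log A+α log B), and only positive definite inputs are treated.

```python
import numpy as np

def fun(A, f):
    """Apply a scalar function f to a symmetric matrix A via eigendecomposition."""
    w, V = np.linalg.eigh((A + A.T) / 2)
    return (V * f(w)) @ V.T

def arith(A, B, alpha, p):
    """Quasi arithmetic mean ((1-alpha) A^p + alpha B^p)^(1/p)."""
    Ap = fun(A, lambda w: w ** p)
    Bp = fun(B, lambda w: w ** p)
    return fun((1 - alpha) * Ap + alpha * Bp, lambda w: w ** (1 / p))

def log_euclid(A, B, alpha):
    """Log-Euclidean mean exp((1-alpha) log A + alpha log B) (assumed form)."""
    return fun((1 - alpha) * fun(A, np.log) + alpha * fun(B, np.log), np.exp)

A = np.array([[2.0, 1.0], [1.0, 3.0]])    # hypothetical test matrices
B = np.array([[1.0, -0.5], [-0.5, 4.0]])
alpha = 0.3
target = log_euclid(A, B, alpha)

# Operator-norm distance to the Log-Euclidean mean for decreasing p.
ps = (1.0, 0.1, 0.01, 0.001)
errors = [np.linalg.norm(arith(A, B, alpha, p) - target, 2) for p in ps]
for p, err in zip(ps, errors):
    print(f"p = {p:<6} operator-norm error = {err:.3e}")
```

Very small values of p amplify rounding errors through the 1/p power, so the sketch stops at p=0.001.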
2.2 Matrix order relations
Here we recall different types of order relations between matrices in Mn+. In Section 4 we
will examine several of these orderings between quasi matrix means introduced in Section 2.1.
Let X,Y∈Mn+.
(a) We write X≤Y as usual to denote the Loewner order, i.e., the positive semidefiniteness order in
the sense that Y−X is positive semidefinite.
(b) We write X≤chaoY and call it the chaotic order if s(X)≤s(Y) and
s(X)(logX)s(X)≤s(X)(logY)s(X). When X,Y>0, this simply reduces to logX≤logY.
(c) We write X≤nearY and call it the near order if s(X)≤s(Y) and X#Y^{−1}≤I,
where Y^{−1} is the generalized inverse of Y. This ordering was introduced in [12] and further
discussed in [16] for X,Y>0. Note that ≤near is not transitive (though ‘near’ transitive); see
[12, Theorem 2].
(d) Let λ(X)=(λ1(X),…,λn(X)) denote the eigenvalues of X in decreasing
order with multiplicities. We write X≤λY to denote the entrywise eigenvalue order, i.e.,
λi(X)≤λi(Y) for each i=1,…,n.
(e) The weak (sub)majorization X≺wY means that
[TABLE]
This is equivalent to ∥X∥≤∥Y∥ holding for every unitarily invariant norm ∥⋅∥; see, e.g.,
[21, Proposition 4.4.13]. More details on majorization theory are found in [2], [9, Chap. II]
and [32].
(f) The weak log-majorization X≺wlogY means that
[TABLE]
The log-majorization X≺logY means that X≺wlogY and
∏_{i=1}^n λi(X)=∏_{i=1}^n λi(Y), i.e., detX=detY.
(g) We write X≤TrY if the trace inequality TrX≤TrY holds.
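For positive definite X,Y the orderings (a), (b), (d), (e) and (g) are straightforward to test numerically. The following is a minimal sketch with hypothetical function names and an assumed numerical tolerance `TOL`; weak majorization is implemented in its standard form (partial sums of the decreasingly ordered eigenvalues), and the near order and the log-majorizations are omitted.

```python
import numpy as np

TOL = 1e-10  # numerical tolerance (an assumption of this sketch)

def eig_desc(X):
    """Eigenvalues of a symmetric matrix in decreasing order."""
    return np.linalg.eigvalsh(X)[::-1]

def logm_pd(X):
    """Logarithm of a positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(X)
    return (V * np.log(w)) @ V.T

def loewner_le(X, Y):
    """(a) X <= Y: Y - X is positive semidefinite."""
    return bool(np.linalg.eigvalsh(Y - X).min() >= -TOL)

def chaotic_le(X, Y):
    """(b) for X, Y > 0: log X <= log Y."""
    return loewner_le(logm_pd(X), logm_pd(Y))

def eig_le(X, Y):
    """(d) entrywise eigenvalue order."""
    return bool(np.all(eig_desc(X) <= eig_desc(Y) + TOL))

def weak_maj(X, Y):
    """(e) weak majorization: partial sums of decreasing eigenvalues."""
    return bool(np.all(np.cumsum(eig_desc(X)) <= np.cumsum(eig_desc(Y)) + TOL))

def trace_le(X, Y):
    """(g) Tr X <= Tr Y."""
    return bool(np.trace(X) <= np.trace(Y) + TOL)

# X and Y share the same eigenvalues, so (d), (e), (g) hold while X <= Y fails.
X = np.diag([1.0, 3.0])
Y = np.diag([3.0, 1.0])
print(loewner_le(X, Y), chaotic_le(X, Y), eig_le(X, Y), weak_maj(X, Y), trace_le(X, Y))
```

The example shows that the spectral orderings cannot distinguish two matrices with equal spectra, whereas the Loewner and chaotic orders can.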
In the next proposition we summarize the strength relationship between the matrix orderings defined in
(a)–(g) above, together with a few basic properties.
Proposition 2.4.
Let X,Y≥0.
(1) X≤Y holds if and only if log(tI+X)≤log(tI+Y) for all t>0.
(2) X≤chaoY holds if and only if s(X)≤s(Y) and X^p#Y^{−p}≤I for all p>0 (equivalently,
for all p∈(0,δ) for some δ>0), with Y^{−p} in the generalized sense.
(3) We have
[TABLE]
Proof.
(1) The ‘only if’ is obvious since logx (x>0) is operator monotone. Conversely, assume that
log(tI+X)≤log(tI+Y) for all t>0. Since
[TABLE]
we have X≤Y.
(2) This equivalence was shown in [1, Theorem 1] when X,Y>0. Assuming that P:=s(X)≤s(Y),
we may show that P(logX)P≤P(logY)P if and only if X^p#Y^{−p}≤I for all p>0 (equivalently, for
all p∈(0,δ)). Recall [4, Theorem 2.1] that
If X^p#Y^{−p}≤I for all p∈(0,δ), then (2.2) gives Pexp{P(logX)P−P(logY)P}≤I so
that P(logX)P−P(logY)P≤0. Conversely, if P(logX)P≤P(logY)P, then it follows from (2.1)
and (2.2) that
[TABLE]
so that X^p#Y^{−p}≤I for all p>0.
(3) Assume that X≤Y. Then s(X)≤s(Y) and X^p#Y^{−p}≤Y^p#Y^{−p}=s(Y)≤I for all p∈(0,1).
Hence the first implication follows from (2). The second implication is seen from (2) as well. The third was
proved in [16, Theorem 2.4], whose proof is valid in the present setting. The remaining implications are
obvious or well known; see, e.g., [21, Proposition 4.1.6] for the penultimate implication.
∎
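The displayed chain of implications in Proposition 2.4(3) is missing from this copy. Judging from the order in which the proof treats the implications, and from the earlier description of the orderings as ranging from the strongest Loewner order to the weakest trace order, it presumably reads as follows (a reconstruction under these assumptions, not a verbatim quote):

```latex
X \le Y \;\Longrightarrow\; X \le_{\mathrm{chao}} Y \;\Longrightarrow\; X \le_{\mathrm{near}} Y
\;\Longrightarrow\; X \le_{\lambda} Y \;\Longrightarrow\; X \prec_{w\log} Y
\;\Longrightarrow\; X \prec_{w} Y \;\Longrightarrow\; X \le_{\mathrm{Tr}} Y .
```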
Lemma 2.5.
Let X,Y,Xn,Yn≥0 for n∈N be given such that Xn→X and Yn→Y. For each
◃∈{≤,≤λ,≺w,≺wlog,≤Tr}, if Xn◃Yn for all n∈N,
then X◃Y. This holds for ◃=≤chao, ≤near as well under
the additional assumption s(X)≤s(Y).
Proof.
The first statement is immediately seen since λ(Xn)→λ(X) and
λ(Yn)→λ(Y). For the latter, assume that s(X)≤s(Y) and Xn≤chaoYn for all n.
By Proposition 2.4(2) one has s(Xn)≤s(Yn) and Xn^p#Yn^{−p}≤I, p>0. For any p>0,
since
[TABLE]
letting n→∞ gives (Y^{p/2}X^pY^{p/2})^{1/2}≤Y^p for all p>0. Thanks to s(X)≤s(Y), this in turn
implies that X^p#Y^{−p}≤s(Y)≤I. Hence X≤chaoY follows by Proposition 2.4(2) again. The
proof for ≤near is similar.
∎
The next theorem is quite important for the scope of our study.
Theorem 2.6.
Let 0<α<1 and p,q>0. Let Mα,p be any of the quasi matrix means in (i)–(vii) of
Section 2.1, and ◃ be any of ≤, ≤chao, ≤near, ≤λ, ≺w,
≺wlog, ≤Tr. If Mα,p(A,B)◃Aα,q(A,B) (resp.,
Hα,q(A,B)◃Mα,p(A,B)) holds for all A,B>0, then the same holds for all
A,B≥0, where s(A)≥s(B) is assumed for Mα,p=SGα,p, SG~α,p.
Proof.
Assume that Mα,p(A,B)◃Aα,q(A,B) for all A,B>0. Let A,B≥0 be arbitrary.
The assumption implies that Mα,p(A+εI,B+εI)◃Aα,q(A+εI,B+εI).
Now s(A)≥s(B) is assumed when Mα,p=SGα,p or SG~α,p. Then
by Proposition 2.2 we have Mα,p(A+εI,B+εI)→Mα,p(A,B) as well as
Aα,q(A+εI,B+εI)→Aα,q(A,B) as ε↘0. Moreover, it is clear that
s(Mα,p(A,B))≤s(A)∨s(B)=s(Aα,q(A,B)). Hence we have
Mα,p(A,B)◃Aα,q(A,B) by Lemma 2.5. The proof is similar for
Hα,q(A,B)◃Mα,p(A,B) in view of
s(Mα,p(A,B))≥s(A)∧s(B)=s(Hα,q(A,B)).
∎
Concerning the quasi matrix means and matrix orderings mentioned above, we record some general facts relevant to
our study.
Remark 2.7.
(1) Let 0<α,β<1 with α≠β, and p,q>0 be arbitrary. Let
[TABLE]
Let fα(x):=Mα(1,x) and gβ(x):=Nβ(1,x) for scalars A=1 and B=x>0.
Since fα(1)=gβ(1)=1, fα′(1)=α and gβ′(1)=β, one has
fα(1)−gβ(1)=0 and fα′(1)−gβ′(1)≠0. Therefore, the sign of
fα(x)−gβ(x) differs between the left and the right of x=1. This implies that
Mα(A,B) and Nβ(A,B) are not definitively comparable with respect to any ordering. So
we may only discuss inequalities between two of the above quasi matrix means under the same α.
(2) For any α∈(0,1) it is clear that
Gα,p(a,b)=SGα,p(a,b)=SG~α,p(a,b)=Rα,p(a,b)=LEα(a,b)=a^{1−α}b^α for scalars a,b>0, independently of p>0. For this reason we refer to
Gα,p, SGα,p, SG~α,p, Rα,p, LEα as quasi-geometric type
matrix means. Also it is immediate to see that
[TABLE]
for scalars a,b>0, and moreover Hα,p(a,b)↗a^{1−α}b^α and
Aα,p(a,b)↘a^{1−α}b^α as p↘0. Therefore, for each quasi-geometric
type mean Mα,p and for each ordering ◃ mentioned above, we may consider
possible inequalities between Mα,p,Hα,q and between
Mα,p,Aα,q in the respective directions Hα,q◃Mα,p and
Mα,p◃Aα,q only.
(3) Let 0<α<1, p>0 and Mα,p be any quasi-geometric type mean as in Theorem 2.6.
For any X,Y>0 the following are straightforward by definitions:
[TABLE]
From these, Remark 2.1(3) and Theorem 2.6 together we see that for any
◃∈{≤,≤chao,≤near,≤λ}, Hα,q(A,B)◃Mα,p(A,B)
holds for all A,B>0 if and only if Mα,p(A,B)◃Aα,q(A,B) holds for all
A,B>0. So the characterizations of inequalities between Hα,q,Mα,p are to a large
extent reduced to those between Mα,p,Aα,q.
3 Some technical computations for 2×2 matrices
In this section we perform some technical computations for specified 2×2 matrices, which will be
quite useful in the next main section. Below we will often treat 2×2 matrices with parameter θ∈R,
given in the approximate form up to o(θ^2) as
[TABLE]
where a,b,x11,x22,x12∈R, and the little-o notation o(θ^2) means that
o(θ^2)/θ^2→0 as θ→0. In the first lemma we explain how to compute the
functional calculus of Xθ based on Taylor’s theorem and Daleckii and Krein’s derivative formulas
(see, e.g., [21, Sec. 2.3]).
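The displayed form (3.1) is missing from this copy. It can be recovered from the decomposition Xθ=X0+θH+θ²K+o(θ²) used in the proof of Lemma 3.1, where X0 is diagonal with entries a,b, H carries x12 off the diagonal, and K is diagonal with entries x11,x22. Under that reading it is:

```latex
X_\theta \;=\;
\begin{bmatrix}
a + \theta^2 x_{11} & \theta x_{12} \\
\theta x_{12} & b + \theta^2 x_{22}
\end{bmatrix}
+ o(\theta^2) \qquad (\theta \to 0).
```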
Lemma 3.1.
Let Xθ be given in (3.1) with a≠b and f be a C^2-function on an interval containing
a,b. Then the functional calculus f(Xθ) is given in the approximate form as
[TABLE]
where
[TABLE]
In the above, f^{[1]}(a1,a2) is the first divided difference of f and f^{[2]}(a1,a2,a3) is the second one
(see [21, Sec. 2.2]).
Proof.
First, note that since Xθ→X0=\begin{bmatrix}a&0\\0&b\end{bmatrix} as θ→0, f(Xθ)
is well defined for any θ near 0. We write Xθ=X0+θH+θ^2K+o(θ^2) as
θ→0 where H:=\begin{bmatrix}0&x12\\x12&0\end{bmatrix} and
K:=\begin{bmatrix}x11&0\\0&x22\end{bmatrix}. By Taylor’s theorem for the functional calculus f(X)
at X0 (see [21, Theorem 2.3.1]) one has
[TABLE]
where Df(X0)(⋅) and D2f(X0)(⋅,⋅) are the first and the second Fréchet derivatives of f(X)
at X0. Thanks to Daleckii and Krein’s derivative formulas (see [21, p. 163]) one can write, with the
Schur product ∘,
Let a=1, b>0 in (3.1) and f(x)=logx on (0,∞). Then
[TABLE]
where
[TABLE]
We will repeatedly utilize the 2×2 positive definite matrices given as follows:
[TABLE]
for x,y>0 and θ∈R. As immediately verified, the approximate form of Bθ up to o(θ2)
is
[TABLE]
Lemma 3.3.
Let 0<α<1, p,q>0, and x,y>0 be such that (1−α)x^p+αy^p≠1 and
(1−α)x^q+αy^q≠1. Let A0 and Bθ be given in (3.5). Then we have
[TABLE]
where
[TABLE]
Furthermore, we have
[TABLE]
Proof.
Since Bθ^p is written as in (3.6) with y^p in place of y, it follows that
Xθ:=(1−α)A0^p+αBθ^p is of the form (3.1) with a=1,
b=(1−α)x^p+αy^p, x11=−α(1−y^p), x22=α(1−y^p) and x12=α(1−y^p).
Since (1−α)x^p+αy^p≠1 by assumption, we see by Example 3.2(1) that
Aα,p(A0,Bθ)=Xθ^{1/p} is given in the form (3.7), where
Let 0<α<1 and x,y>0 be such that x^{1−α}y^α≠1. Let A0 and Bθ be given
in (3.5). Then we have
[TABLE]
where
[TABLE]
Furthermore, when p>0 and (1−α)x^p+αy^p≠1, we have
[TABLE]
Proof.
We note that
[TABLE]
Since log x^{1−α}y^α≠0 by assumption, one can apply Example 3.2(2) with
b=log x^{1−α}y^α, x11=αlogy and x22=x12=−αlogy. Hence it
follows that LEα(A0,Bθ)=e^{Xθ} is given in the form (3.12), where
Let 0<α<1, p>0, and A0,Bθ∈M2++ be given in (3.5) with y=x>0, x≠1.
Then we have
[TABLE]
where
[TABLE]
Proof.
Since Bθ^{αp} is written as in (3.6) with x^{αp} in place of y, it is easy to
compute
[TABLE]
Then one can apply Example 3.2(1) to show (3.15) with (3.16) (as in the first
paragraph of the proof of Lemma 3.3), whose details are left to the reader.
∎
In the last lemma of this section we explain how to compute the eigenvalues of Xθ given in
(3.1).
Lemma 3.6.
Let Xθ be given in (3.1) with a≠b. Then the eigenvalues of Xθ are given in the
approximate form as
[TABLE]
Proof.
Since
[TABLE]
the eigenvalues λ±(θ) of Xθ are computed up to o(θ2) as
[TABLE]
Hence it is immediate to verify that λ(Xθ)=(λ+(θ),λ−(θ)) has expression
(3.17).
∎
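Lemma 3.6 is an instance of second-order perturbation theory for a symmetric 2×2 matrix with distinct diagonal entries. Since the displayed formula (3.17) is missing from this copy, the sketch below compares numerically computed eigenvalues with the standard second-order expressions a+θ²(x11+x12²/(a−b)) and b+θ²(x22−x12²/(a−b)); these expressions are assumed here rather than quoted from the paper, and all names and numerical values are hypothetical.

```python
import numpy as np

a, b = 2.0, 1.0                 # distinct diagonal entries (a != b)
x11, x22, x12 = 0.5, -1.0, 1.0  # hypothetical second-order and first-order coefficients

def X(theta):
    """The matrix X_theta without the o(theta^2) remainder."""
    return np.array([[a + theta**2 * x11, theta * x12],
                     [theta * x12, b + theta**2 * x22]])

def approx_eigs(theta):
    """Standard second-order perturbation approximation of the two eigenvalues."""
    shift = x12**2 / (a - b)
    return np.array([a + theta**2 * (x11 + shift), b + theta**2 * (x22 - shift)])

def max_error(theta):
    """Largest deviation between exact and approximate eigenvalues."""
    exact = np.sort(np.linalg.eigvalsh(X(theta)))
    return np.abs(exact - np.sort(approx_eigs(theta))).max()

for theta in (0.1, 0.05, 0.025):
    err = max_error(theta)
    print(f"theta = {theta:<6} max error = {err:.2e}   error / theta^2 = {err / theta**2:.2e}")
```

The printed ratio error/θ² should shrink as θ decreases, which is what an o(θ²) remainder means.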
4 Various quasi-arithmetic-geometric inequalities
Throughout the section let 0<α<1 and p,q>0. For each quasi matrix mean Mα,p from
Aα,p, LEα, Rα,p, Gα,p, SGα,p, SG~α,p
and for each matrix ordering ◃ from ≤, ≤chao, ≤near, ≤λ, ≺w,
≤Tr, we will examine the inequality Mα,p(A,B)◃Aα,q(A,B). Our goal is to
find the necessary and sufficient condition on p,q,α for the inequality to hold for all A,B>0, though
we have not succeeded in all cases. In this paper we do not deal with the inequality
Hα,q◃Mα,p (see Remark 2.7(3) and remark (4) of Section 5).
Also we do not include the ordering ≺wlog in our considerations, though it would be meaningful to
examine the differences between ≤λ, ≺wlog, ≺w (see Proposition 2.4(3))
in quasi-arithmetic-geometric mean inequalities.
The section is divided into six subsections.
4.1 Aα,p vs. Aα,q
The next theorem characterizes the inequality Aα,p≤Aα,q, which was previously shown
in [7] with restriction to α=1/2.
Theorem 4.1.
Let 0<α<1 and p,q>0. Then the following conditions are equivalent:
(i) Aα,p(A,B)≤Aα,q(A,B) for all A,B≥0;
(ii) Aα,p(A,B)≤Aα,q(A,B) for all A,B∈M2++;
(iii) p=q or 1≤p<q or 1/2≤p<1≤q.
To prove the theorem, in addition to Lemma 3.3 we need the following lemma.
Lemma 4.2.
Define A~0,B~θ∈M2+ by
[TABLE]
for θ∈R. Let 0<α<1 and 0<p,q<1. Then we have
[TABLE]
Proof.
The proof is a modification of that of [7, Lemma 3.3] where the case α=1/2 was treated. It is easy to
check that
[TABLE]
where
[TABLE]
Letting c:=a2+b2 (≤1), as in the proof of [7, Lemma 3.3] one can compute
[TABLE]
As θ→0 we estimate
[TABLE]
so that
[TABLE]
Hence we have
[TABLE]
and moreover
[TABLE]
thanks to 0<p<1. Therefore, we arrive at
[TABLE]
where
[TABLE]
Therefore, we obtain
[TABLE]
as asserted.
∎
Note that although (4.3) is written in the form of (3.1) with a=α+(1−α)2p and
b=0, we cannot apply Lemma 3.1 to prove Lemma 4.2 because x^{1/p} is not twice
differentiable at x=0 when p>1/2.
Proof of Theorem 4.1.
(iii)⟹(i) is seen from [7, Theorem 2.1]. Indeed, the proof is easy as follows. If 1≤p<q
then
[TABLE]
since x^{p/q} is operator concave and x^{1/p} is operator monotone on [0,∞). If 1/2≤p<1≤q
then
[TABLE]
since x^{1/p} is operator convex and x^{1/q} is operator concave on [0,∞).
(i)⟹(ii) is trivial.
(ii)⟹(iii). By considering the inequality for A=xI2 and B=yI2, x,y>0, (ii) implies that p≤q. Hence it suffices
to show that (ii) fails to hold when 0<p<1/2 and q>p and when 0<p<q<1.
First, assume that 0<p<1/2 and q>p. Consider A0 and Bθ in (3.5) with 0<y<1 and
x=y^2. Since (1−α)y^{2p}+αy^p<1 and (1−α)y^{2q}+αy^q<1 clearly, we can apply
Lemma 3.3. As y↘0 we estimate
[TABLE]
and
[TABLE]
Therefore, the dominant term of the big bracket [⋯] of the RHS of (3.9) is
[TABLE]
thanks to 2p<1 when y>0 is sufficiently small. For such a y>0 the RHS of (3.9) is negative
if θ is small enough, which implies that
Aα,p(A0,Bθ)≰Aα,q(A0,Bθ).
Next, assume that 0<p<q<1, and apply Lemma 4.2. Let A~0 and B~θ be
given in (4.1). From expression (4.2) it follows that
det{Aα,q(A~0,B~θ)−Aα,p(A~0,B~θ)}<0 so that
Aα,p(A~0,B~θ)≰Aα,q(A~0,B~θ) if θ is
small enough. By continuity there are A,B∈M2++ such that
Aα,p(A,B)≰Aα,q(A,B).
∎
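The sufficiency direction (iii)⟹(i) of Theorem 4.1 can be spot-checked on random positive definite matrices. The sketch below is only a numerical illustration with hypothetical helper names and sampling choices; it checks the smallest eigenvalue of Aα,q(A,B)−Aα,p(A,B) for parameter pairs satisfying (iii) and proves nothing.

```python
import numpy as np

def mpow(A, r):
    """Real power of a positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh((A + A.T) / 2)
    return (V * w ** r) @ V.T

def arith(A, B, alpha, p):
    """Quasi arithmetic mean ((1-alpha) A^p + alpha B^p)^(1/p)."""
    return mpow((1 - alpha) * mpow(A, p) + alpha * mpow(B, p), 1 / p)

def min_gap(p, q, alpha, trials=200, n=3, seed=0):
    """Smallest eigenvalue of A_{alpha,q}(A,B) - A_{alpha,p}(A,B) over random A, B > 0."""
    rng = np.random.default_rng(seed)
    worst = np.inf
    for _ in range(trials):
        G, H = rng.standard_normal((2, n, n))
        A, B = G @ G.T + 0.1 * np.eye(n), H @ H.T + 0.1 * np.eye(n)
        D = arith(A, B, alpha, q) - arith(A, B, alpha, p)
        worst = min(worst, np.linalg.eigvalsh((D + D.T) / 2).min())
    return worst

# Parameter pairs satisfying condition (iii): 1 <= p < q, or 1/2 <= p < 1 <= q.
for p, q in [(1.0, 2.0), (0.5, 1.0), (0.75, 3.0)]:
    print(f"p = {p}, q = {q}: min eigenvalue of difference = {min_gap(p, q, alpha=0.3):.3e}")
```

The same function can be used to search for violations when (iii) fails, although a random search is not guaranteed to find the near-singular configurations used in the proof.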
The inequalities between Aα,p and Aα,q with respect to other weaker orderings are
simple as stated in the next proposition.
Proposition 4.3.
Let 0<α<1 and p,q>0. Then the following conditions are equivalent:
(i) Aα,p(A,B)≤chaoAα,q(A,B) for all A,B≥0;
(ii) Aα,p(a,b)≤Aα,q(a,b) for all scalars a,b>0;
(iii) p≤q.
Hence, for any ◃∈{≤chao,≤near,≤λ,≺w,≤Tr},
Aα,p◃Aα,q holds for all A,B≥0 if and only if p≤q.
Proof.
(i)⟹(ii) is obvious, and (ii)⟹(iii) is easy since (iii) means that x^{q/p} is convex on
(0,∞). Finally, let us show (iii)⟹(i). Assume that p≤q. Let A,B≥0 be arbitrary. Note that
[TABLE]
and similarly s(Aα,q(A,B))=P. Since x^{p/q} is operator concave on [0,∞), one has
[TABLE]
so that
[TABLE]
Hence (i) holds.
∎
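For A,B>0 the chaotic order in Proposition 4.3 amounts to log Aα,p(A,B)≤log Aα,q(A,B), which is easy to test numerically. The following sketch uses hypothetical helper names and random test matrices, and picks 0<p<q<1, a range where Theorem 4.1 shows that the Loewner order can fail while the chaotic order still holds.

```python
import numpy as np

def fun(A, f):
    """Apply a scalar function f to a symmetric matrix A via eigendecomposition."""
    w, V = np.linalg.eigh((A + A.T) / 2)
    return (V * f(w)) @ V.T

def log_arith(A, B, alpha, p):
    """log A_{alpha,p}(A,B) = (1/p) log((1-alpha) A^p + alpha B^p) for A, B > 0."""
    S = (1 - alpha) * fun(A, lambda w: w ** p) + alpha * fun(B, lambda w: w ** p)
    return fun(S, np.log) / p

def chaotic_gap(A, B, alpha, p, q):
    """Smallest eigenvalue of log A_{alpha,q}(A,B) - log A_{alpha,p}(A,B)."""
    D = log_arith(A, B, alpha, q) - log_arith(A, B, alpha, p)
    return np.linalg.eigvalsh((D + D.T) / 2).min()

rng = np.random.default_rng(1)
gaps = []
for _ in range(200):
    G, H = rng.standard_normal((2, 3, 3))
    A, B = G @ G.T + 0.1 * np.eye(3), H @ H.T + 0.1 * np.eye(3)
    gaps.append(chaotic_gap(A, B, alpha=0.4, p=0.2, q=0.7))
print("smallest gap over all trials:", min(gaps))
```

A nonnegative smallest gap (up to rounding) is consistent with (iii)⟹(i).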
The next proposition says that p>0↦TrAα,p(A,B) is strictly increasing unless A=B.
Proposition 4.4.
Let 0<α<1 and 0<p<q. Then for every A,B≥0 the following conditions are equivalent:
(i) TrAα,p(A,B)=TrAα,q(A,B);
(ii) Aα,p(A,B)=Aα,q(A,B);
(iii) A=B.
Proof.
It is obvious that (iii)⟹(ii)⟹(i). To show (i)⟹(iii), assume (i) so that
TrAα,t(A,B) is constant for t∈[p,q] by Proposition 4.3. Since the function
t↦TrAα,t(A,B) is real analytic in t>0, it follows that TrAα,t(A,B) is
constant for all t>0. In particular, TrAα,1/2(A,B)=TrAα,1(A,B), which gives
Tr(A^{1/2}−B^{1/2})^2=0 so that A=B.
∎
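The last step of the proof rests on a short trace computation, written out here for convenience using only the definition of the quasi arithmetic mean and the cyclicity of the trace:

```latex
\begin{aligned}
\mathrm{Tr}\,\mathcal{A}_{\alpha,1}(A,B) - \mathrm{Tr}\,\mathcal{A}_{\alpha,1/2}(A,B)
&= \mathrm{Tr}\bigl((1-\alpha)A+\alpha B\bigr)
 - \mathrm{Tr}\bigl((1-\alpha)A^{1/2}+\alpha B^{1/2}\bigr)^{2} \\
&= \alpha(1-\alpha)\bigl(\mathrm{Tr}\,A + \mathrm{Tr}\,B - 2\,\mathrm{Tr}\,A^{1/2}B^{1/2}\bigr) \\
&= \alpha(1-\alpha)\,\mathrm{Tr}\bigl(A^{1/2}-B^{1/2}\bigr)^{2} .
\end{aligned}
```

Since 0<α<1, equality of the two traces forces Tr(A^{1/2}−B^{1/2})^2=0, hence A^{1/2}=B^{1/2} and A=B.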
The results of this subsection are summarized as follows:
4.2 LEα vs. Aα,q
The order relations between LEα and Aα,q are clarified in the next theorem and
proposition. Theorem 4.5 was previously shown in [7] with restriction to α=1/2.
Theorem 4.5.
For any α∈(0,1) and any q>0 there exist A,B∈M2++ such that
LEα(A,B)≰Aα,q(A,B).
Proof.
Let 0<α<1 and q>0. Assume by contradiction that LEα(A,B)≤Aα,q(A,B) for all
A,B∈M2++. Then LEα(A0,Bθ)≤Aα,q(A0,Bθ) holds for
A0,Bθ in (3.5) with any x,y>0 and θ∈R. Let 0<y<1 and x=y^2. Since
x^{1−α}y^α≠1 and (1−α)x^q+αy^q≠1, by Lemma 3.4 we must have
which is negative when y>0 is sufficiently small. This contradicts (4.4).
∎
Proposition 4.6.
Let 0<α<1 and q>0. Then for every A,B≥0 we have
LEα(A,B)≤chaoAα,q(A,B). Hence LEα(A,B)◃Aα,q(A,B)
holds for any ◃∈{≤chao,≤near,≤λ,≺w,≤Tr} and A,B≥0.
Proof.
When 0<p<q, by Proposition 4.3 we have Aα,p(A,B)≤chaoAα,q(A,B) for
any A,B≥0. Letting p↘0 gives LEα(A,B)≤chaoAα,q(A,B) by
Theorem 2.3 and Lemma 2.5.
∎
In the above proof, by Theorem 2.6 we may assume that A,B>0. Thus, the simpler version of
Theorem 2.3 for A,B>0 is enough to prove the above proposition. However, we cannot
completely avoid use of Theorem A.1 because it is necessary in proving Lemma 2.5 via
Proposition 2.4(2).
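The contrast between Theorem 4.5 and Proposition 4.6 can be explored numerically for A,B>0. In the sketch below, which uses hypothetical names, hypothetical parameter values and the standard form exp((1−α)log A+α log B) of the Log-Euclidean mean, the chaotic gap should be nonnegative up to rounding, while a negative Loewner gap would exhibit a pair as in Theorem 4.5. Whether the random search actually finds such a pair depends on the sample.

```python
import numpy as np

def fun(A, f):
    """Apply a scalar function f to a symmetric matrix A via eigendecomposition."""
    w, V = np.linalg.eigh((A + A.T) / 2)
    return (V * f(w)) @ V.T

def min_eig(D):
    """Smallest eigenvalue of the symmetric part of D."""
    return np.linalg.eigvalsh((D + D.T) / 2).min()

rng = np.random.default_rng(2)
alpha, q = 0.5, 0.5
chaotic_gaps, loewner_gaps = [], []
for _ in range(500):
    G, H = rng.standard_normal((2, 2, 2))
    A, B = G @ G.T + 0.05 * np.eye(2), H @ H.T + 0.05 * np.eye(2)
    # log LE_alpha(A,B)
    log_le = (1 - alpha) * fun(A, np.log) + alpha * fun(B, np.log)
    # log A_{alpha,q}(A,B)
    S = (1 - alpha) * fun(A, lambda w: w ** q) + alpha * fun(B, lambda w: w ** q)
    log_aq = fun(S, np.log) / q
    chaotic_gaps.append(min_eig(log_aq - log_le))
    loewner_gaps.append(min_eig(fun(log_aq, np.exp) - fun(log_le, np.exp)))

print("min chaotic gap :", min(chaotic_gaps))
print("min Loewner gap :", min(loewner_gaps))
```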
Proposition 4.7.
Let 0<α<1 and q>0. Then for every A,B≥0 the following conditions are equivalent:
(i) TrLEα(A,B)=TrAα,q(A,B);
(ii) LEα(A,B)=Aα,q(A,B);
(iii) A=B.
Proof.
It is obvious that (iii)⟹(ii)⟹(i). If (i) holds, then we have
TrAα,p(A,B)=TrAα,q(A,B) for any p∈(0,q] by Propositions 4.3 and
4.6. Hence (iii) follows by Proposition 4.4.
∎
The results of this subsection are summarized as follows:
4.3 Rα,p vs. Aα,q
In this subsection we discuss inequalities between Rα,p and Aα,q. The next theorem
says that the near order inequality Rα,p(A,B)≤nearAα,q(A,B) fails to hold for any
p,q>0.
Theorem 4.8.
For any α∈(0,1) and any p,q>0 there exist A,B∈M2++ such that
Rα,p(A,B)≰nearAα,q(A,B) (hence Rα,p(A,B)≰Aα,q(A,B)
and Rα,p(A,B)≰chaoAα,q(A,B)).
Proof.
Consider A0,Bθ∈M2++ given in (3.5) with y=x≠1. Let 0<α<1 and
p,q>0 be arbitrarily fixed. The estimates of Rα,p(A0,Bθ) and
Aα,q(A0,Bθ) up to o(θ) are enough to show the theorem. Assume that
Rα,p(A0,Bθ)≤nearAα,q(A0,Bθ) for any x>0 with x≠1 and any
θ∈R. Now write X:=Rα,p(A0,Bθ) and Y:=Aα,q(A0,Bθ).
Reducing (3.15) to o(θ) (see (3.4)) one has
X=\begin{bmatrix}1&θz12\\θz12&x\end{bmatrix}+o(θ) with z12 in
(3.16) (without variable p for simplicity). On the other hand, reducing (3.7) and (3.8)
with y=x≠1 one has Y=\begin{bmatrix}1&θα(1−x)\\θα(1−x)&x\end{bmatrix}+o(θ),
and applies (3.4) to have
Y^{1/2}=\begin{bmatrix}1&θα(1−x^{1/2})\\θα(1−x^{1/2})&x^{1/2}\end{bmatrix}+o(θ)
as θ→0.
Hence we have
Then, since X≤nearY is equivalent to (Y^{1/2}XY^{1/2})^{1/2}≤Y,
we must have
[TABLE]
Indeed, the last equality is seen as, with v∈R,
[TABLE]
Therefore, α(1−x)=u12/(1+x) must hold, that is,
x^{1/2}z12+α(1−x^{1/2})(1+x^{3/2})=α(1−x^2). Inserting the form of z12 in (3.16)
gives
[TABLE]
so that (x^{(1−α)p/2}−x^{(1+α)p/2})/(1−x^p)=α for all x>0, x≠1. But
this is impossible since the LHS goes to 0 (≠α) as x↘0.
∎
Proposition 4.9.
Let 0<α<1 and p,q>0. If p/2≤q, then we have
Rα,p(A,B)≤λAα,q(A,B) (hence
Rα,p(A,B)≺wAα,q(A,B)) for all A,B≥0.
Proof.
Ando’s matrix Young inequality [3] says that for any α∈(0,1) and A,B∈Mn+,
[TABLE]
equivalently, λ1/2(A1−αB2αA1−α)≤λ((1−α)A+αB). For
any p>0, by replacing A,B with Ap/2,Bp/2 respectively and taking the 2/p-power, one has
[TABLE]
that is, Rα,p(A,B)≤λAα,p/2(A,B). Hence the assertion follows from
Proposition 4.3.
∎
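The eigenvalue inequality of Proposition 4.9 can be spot-checked on random positive definite matrices. The sketch below assumes the form Rα,p(A,B)=(A(1−α)p/2BαpA(1−α)p/2)1/p read off from the proof above (the order of the factors does not affect the eigenvalues); the function names and the random test matrices are hypothetical choices made for illustration.

```python
import numpy as np

def mpow(X, t):
    """X**t for a symmetric positive definite matrix X."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * w**t) @ V.T

def arith_mean(A, B, alpha, q):
    """Quasi-arithmetic mean ((1-alpha) A^q + alpha B^q)^{1/q}."""
    return mpow((1 - alpha) * mpow(A, q) + alpha * mpow(B, q), 1 / q)

def renyi_mean(A, B, alpha, p):
    """Assumed form (A^{(1-alpha)p/2} B^{alpha p} A^{(1-alpha)p/2})^{1/p}."""
    Ah = mpow(A, (1 - alpha) * p / 2)
    return mpow(Ah @ mpow(B, alpha * p) @ Ah, 1 / p)

def eig_leq(X, Y, tol=1e-9):
    """Eigenvalue order: lambda_i(X) <= lambda_i(Y) for every i (sorted spectra)."""
    lx, ly = np.linalg.eigvalsh(X), np.linalg.eigvalsh(Y)
    return bool(np.all(lx <= ly + tol * max(1.0, np.abs(ly).max())))

def random_pd(n, rng):
    """Random positive definite matrix with eigenvalues in [0.5, 3]."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return (Q * rng.uniform(0.5, 3.0, n)) @ Q.T
```

For parameters with p/2≤q, `eig_leq(renyi_mean(A, B, alpha, p), arith_mean(A, B, alpha, q))` is expected to hold for every sampled pair; a passing check is a sanity test only.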
Theorem 4.10.
Let 0<α<1 and p,q>0. If Rα,p(A,B)≤λAα,q(A,B) holds for all
A,B∈M2++, then α(1−α)p≤q.
Proof.
Let A0 and Bθ be given in (3.5) with y=x>0 and x≠1. Then we have expression
(3.15) with (3.16). On the other hand, in the present case (where y=x), expression
(3.7) with (3.8) is simplified as
[TABLE]
where
[TABLE]
By assumption we must have Rα,p(A0,Bθ)≤λAα,q(A0,Bθ) for all
θ∈R, which implies by Lemma 3.6 that
Note that (4.7) and (4.8) are equivalent; indeed, dividing both sides of (4.7) by x
and then replacing x with x−1 transform (4.7) into (4.8). Letting x↘0 in
(4.7) gives −1/p≤−α(1−α)/q, so that α(1−α)p≤q follows.
∎
It seems that we can find no further condition other than α(1−α)p≤q from (4.7)
(equivalently, (4.8)) for all x≠1.
Proposition 4.11.
Let 0<α<1 and p,q>0. If min{1,p/2}≤q, then TrRα,p(A,B)≤TrAα,q(A,B)
holds for all A,B≥0.
Proof.
When p/2≤q, the asserted trace inequality follows from Proposition 4.9. Assume that 1≤q
and p>0 is arbitrary. For every A,B∈Mn+, by Horn’s log-majorization (see, e.g., [9, 21]) one has
[TABLE]
so that
[TABLE]
where the last inequality is due to Proposition 4.3 since 1≤q.
∎
Theorem 4.12.
Let 0<α<1 and p,q>0. If TrRα,p(A,B)≤TrAα,q(A,B) holds for all
A,B∈M2++, then min{1,α(1−α)p}≤q.
Proof.
It suffices to show that we must have α(1−α)p≤q when q<1. Consider again
A0,Bθ∈M2++ in (3.5) with y=x>0, x≠1. From Lemma 3.5 and
(4.5), (4.6) we compute
[TABLE]
Hence, if TrRα,p(A,B)≤TrAα,q(A,B) for all A,B∈M2++, then we must have
[TABLE]
for all x>0, x≠1. Since q<1 by assumption, letting x↘0 gives −1/p≤−α(1−α)/q
and hence α(1−α)p≤q follows.
∎
In particular, when α=1/2 and p=1, the next example provides the exact characterization for
TrR1/2,1(A,B)≤TrA1/2,q(A,B).
Example 4.13.
Let α=1/2 and p=1. Then the following conditions are equivalent:
(i)
TrR1/2,1(A,B)≤TrA1/2,q(A,B) for all A,B≥0;
(ii)
TrR1/2,1(A,B)≤TrA1/2,q(A,B) for all A,B∈M2++;
(iii)
q≥1/4.
Proof.
(i)⟹(ii) is trivial, and (ii)⟹(iii) follows from Theorem 4.12. Next let us show that
(iii)⟹(i) holds. For this, by Proposition 4.3 it suffices to show that
TrR1/2,1(A,B)≤TrA1/2,1/4(A,B) for all A,B≥0. We note that
TrR1/2,1(A,B)=TrA1/2B1/2 and
[TABLE]
Since
[TABLE]
we have
[TABLE]
It follows from (4.10) and (4.11) that
TrA1/2,1/4(A,B)≥(1/2)Tr(A3/4B1/4+A1/4B3/4).
Moreover, by the matrix norm inequality for the Heinz-type means (see [24]) we have
[TABLE]
so that TrA1/2B1/2≤(1/2)Tr(A3/4B1/4+A1/4B3/4). Hence (i) follows.
∎
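The trace inequality of Example 4.13 can be probed numerically. The following NumPy sketch evaluates TrA1/2,q(A,B)−TrA1/2B1/2 for positive definite inputs; `trace_gap` and `random_pd` are hypothetical helper names, and the random matrices are an assumption of this illustration.

```python
import numpy as np

def mpow(X, t):
    """X**t for a symmetric positive definite matrix X."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * w**t) @ V.T

def trace_gap(A, B, q):
    """Tr A_{1/2,q}(A,B) - Tr R_{1/2,1}(A,B), using Tr R_{1/2,1}(A,B) = Tr A^{1/2} B^{1/2}."""
    arith = mpow(0.5 * mpow(A, q) + 0.5 * mpow(B, q), 1 / q)
    return float(np.trace(arith) - np.trace(mpow(A, 0.5) @ mpow(B, 0.5)))

def random_pd(n, rng):
    """Random positive definite matrix with eigenvalues in [0.5, 3]."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return (Q * rng.uniform(0.5, 3.0, n)) @ Q.T
```

By the equivalence (i)⟺(iii), `trace_gap(A, B, q)` should be nonnegative for every q≥1/4. For q<1/4 the example guarantees that some pair in M2++ makes the gap negative, but random sampling is not guaranteed to find one.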
Proposition 4.14.
Let 0<α<1 and p,q>0 be such that min{1,p/2}≤q. Then for every A,B≥0 the following
conditions are equivalent:
(i)
TrRα,p(A,B)=TrAα,q(A,B);
(ii)
Rα,p(A,B)=Aα,q(A,B);
(iii)
A=B.
Proof.
It is obvious that (iii)⟹(ii)⟹(i). Assume (i). If p/2<q, then we have
TrAα,t(A,B)=TrAα,q(A,B) for any t∈[p/2,q] by Propositions 4.3 and 4.11. Hence
(iii) follows from Proposition 4.4. If p/2=q, then Proposition 4.9 yields
λ(Rα,p(A,B))=λ(Aα,p/2(A,B)) and hence
[TABLE]
Hence it follows from [31, Corollary 2.3] that Ap/2=Bp/2 so that A=B. If q≥1, then for
any t≥p we have TrRα,p(A,B)≤TrRα,t(A,B)≤TrAα,q(A,B) by
Araki’s log-majorization [5] and Proposition 4.11. Hence
TrRα,t(A,B)=TrAα,q(A,B) for t≥p, which extends to all t>0 thanks to real
analyticity of t↦TrRα,t(A,B) in t>0, so we can apply the above case (where p/2<q).
∎
The results of this subsection are summarized as follows:
Problem 4.15.
There is a gap between the sufficient condition and the necessary condition for
Rα,p≤λAα,q, Rα,p≺wAα,q and
Rα,p≤TrAα,q. It is also unknown whether Rα,p≤λAα,q
is strictly stronger than Rα,p≺wAα,q, or whether they are equivalent. Example 4.13
shows that the sufficient condition in Proposition 4.11 is not sharp when α=1/2 and p=1.
This suggests that the complete characterization of Rα,p≤TrAα,q is a complicated
problem.
4.4 Gα,p vs. Aα,q
In this subsection we discuss inequalities between Gα,p and Aα,q. In the theory of
operator means in Kubo and Ando’s sense [29] it is well known that
A#αB≤A▽αB for any α∈(0,1) and A,B≥0. The next theorem
characterizes the Loewner inequality Gα,p≤Aα,q, extending
#α≤▽α (for p=q=1).
Theorem 4.16.
Let 0<α<1 and p,q>0. Then the following conditions are equivalent:
(i)
Gα,p(A,B)≤Aα,q(A,B) for all A,B≥0;
(ii)
Gα,p(A,B)≤Aα,q(A,B) for all A,B∈M2++;
(iii)
1≤p≤q.
Lemma 4.17.
Let α>0 and p,q>0. Let A0 and Bθ be given in (3.5) with y=x>0 and x≠1.
Then we have
[TABLE]
where
[TABLE]
Proof.
Similarly to the proof of Lemma 3.5 it is easy to compute
[TABLE]
Then Example 3.2(1) is applied to compute (A0−p/2BθpA0−p/2)α
and hence
[TABLE]
where
[TABLE]
Furthermore, we apply Example 3.2(1) again to show that (4.12) holds with (4.13),
whose details are left to the reader.
∎
Proof of Theorem 4.16.
(iii)⟹(i). Since Ap#αBp≤Ap▽αBp and x1/p is operator monotone on
[0,∞), we have
[TABLE]
Since Aα,p(A,B)≤Aα,q(A,B) by Theorem 4.1, the result follows.
(i)⟹(ii) is trivial.
(ii)⟹(iii). Assume that (ii) holds. Note that this extends to all A,B∈M2+ by continuity. First, let
A0,Bθ∈M2++ be given as in Lemma 4.17. Then by (4.12) and (4.13)
as well as (4.5) and (4.6) we have
[TABLE]
Hence we must have
[TABLE]
(Note that (4.15) is equivalent to (4.14).) It is clear that p≤q follows from (4.14).
Now assume that p≤q, and let A0 be given in (3.5) with x>0 satisfying
(1−α)xq=1, and B~θ∈M2+ be in (4.1). Then by Example 3.2(1)
we compute
[TABLE]
where
[TABLE]
On the other hand, since B~θp=B~θ=∣ϕ⟩⟨ϕ∣ where
ϕ:=[cosθsinθ], one has
[TABLE]
so that Gα(A0p,B~θp)=∥A0−p/2ϕ∥2(α−1)∣ϕ⟩⟨ϕ∣. Hence one writes
If p<1, then (4.20) →−(1−α)2<0 as x↘0, which is impossible since
(4.20) ≥0 in the limit x↘0. Hence p≥1 must hold.
∎
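The sufficiency direction (iii)⟹(i) of Theorem 4.16 can be illustrated numerically. The sketch below uses the standard expression A#αB=A1/2(A−1/2BA−1/2)αA1/2 for positive definite matrices and Gα,p(A,B)=(Ap#αBp)1/p; the function names and the random test matrices are assumptions made for this illustration.

```python
import numpy as np

def mpow(X, t):
    """X**t for a symmetric positive definite matrix X."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * w**t) @ V.T

def geo_mean(A, B, alpha):
    """Weighted geometric mean A #_alpha B = A^{1/2} (A^{-1/2} B A^{-1/2})^alpha A^{1/2}."""
    Ah, Aih = mpow(A, 0.5), mpow(A, -0.5)
    return Ah @ mpow(Aih @ B @ Aih, alpha) @ Ah

def quasi_geo_mean(A, B, alpha, p):
    """G_{alpha,p}(A,B) = (A^p #_alpha B^p)^{1/p}."""
    return mpow(geo_mean(mpow(A, p), mpow(B, p), alpha), 1 / p)

def arith_mean(A, B, alpha, q):
    """A_{alpha,q}(A,B) = ((1-alpha) A^q + alpha B^q)^{1/q}."""
    return mpow((1 - alpha) * mpow(A, q) + alpha * mpow(B, q), 1 / q)

def loewner_gap(A, B, alpha, p, q):
    """Smallest eigenvalue of A_{alpha,q}(A,B) - G_{alpha,p}(A,B)."""
    D = arith_mean(A, B, alpha, q) - quasi_geo_mean(A, B, alpha, p)
    return float(np.linalg.eigvalsh((D + D.T) / 2).min())

def random_pd(n, rng):
    """Random positive definite matrix with eigenvalues in [0.5, 3]."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return (Q * rng.uniform(0.5, 3.0, n)) @ Q.T
```

For 1≤p≤q the value of `loewner_gap` should be nonnegative up to round-off. For p<1 or p>q the theorem asserts that some 2×2 pair gives a negative gap; this sketch does not search for such a pair.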
The next corollary is a particular case of Theorem 4.16 when q=1.
Corollary 4.18.
Let 0<α<1 and p>0. Then the following conditions are equivalent:
(i)
Gα,p(A,B)≤(1−α)A+αB for all A,B≥0;
(ii)
Gα,p(A,B)≤(1−α)A+αB for all A,B∈M2++;
(iii)
p=1.
Proposition 4.19.
Let 0<α<1. If 0<p≤q, then we have Gα,p(A,B)≤chaoAα,q(A,B) for all
A,B≥0.
Proof.
Assume that 0<p≤q. Since 1≤q/p, it follows from Theorem 4.16 that
Gα,1(Ap,Bp)≤Aα,q/p(Ap,Bp), i.e.,
[TABLE]
which implies that
[TABLE]
and
[TABLE]
so that
[TABLE]
that is, Gα,p(A,B)≤chaoAα,q(A,B).
∎
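For positive definite matrices the chaotic order X≤chaoY means logX≤logY, so Proposition 4.19 can be checked by comparing matrix logarithms directly. The following NumPy sketch does this; `mfun`, `geo_mean`, `chaotic_gap` and `random_pd` are hypothetical helper names introduced only for this illustration.

```python
import numpy as np

def mfun(X, f):
    """Functional calculus f(X) for a symmetric matrix X."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * f(w)) @ V.T

def geo_mean(A, B, alpha):
    """A #_alpha B = A^{1/2} (A^{-1/2} B A^{-1/2})^alpha A^{1/2} for positive definite A, B."""
    Ah = mfun(A, np.sqrt)
    Aih = mfun(A, lambda w: 1.0 / np.sqrt(w))
    return Ah @ mfun(Aih @ B @ Aih, lambda w: w**alpha) @ Ah

def chaotic_gap(A, B, alpha, p, q):
    """Smallest eigenvalue of log A_{alpha,q}(A,B) - log G_{alpha,p}(A,B)."""
    Ap, Bp = mfun(A, lambda w: w**p), mfun(B, lambda w: w**p)
    Aq, Bq = mfun(A, lambda w: w**q), mfun(B, lambda w: w**q)
    log_arith = mfun((1 - alpha) * Aq + alpha * Bq, np.log) / q
    log_geo = mfun(geo_mean(Ap, Bp, alpha), np.log) / p
    D = log_arith - log_geo
    return float(np.linalg.eigvalsh((D + D.T) / 2).min())

def random_pd(n, rng):
    """Random positive definite matrix with eigenvalues in [0.5, 3]."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return (Q * rng.uniform(0.5, 3.0, n)) @ Q.T
```

In contrast with the Loewner order in Theorem 4.16, no lower bound p≥1 is needed here: `chaotic_gap` is expected to be nonnegative whenever 0<p≤q.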
Theorem 4.20.
Let 0<α<1 and p,q>0. If Gα,p(A,B)≤λAα,q(A,B) holds for all
A,B∈M2++, then p≤q.
Proof.
We use A0,Bθ∈M2++ in (3.5) with y=x≠1 once again, and argue in the same
way as in the proof of Theorem 4.10. Then by assumption we must have the same inequalities as
in (4.7) and (4.8) with z11(p),z22(p),z12(p) in (4.13) instead of
(3.16). These are specified in the present case as
[TABLE]
for all x>0, x≠1. Hence we have p≤q.
∎
Proposition 4.21.
Let 0<α<1 and p,q>0 be arbitrary. Then for every A,B≥0 we have
Gα,p(A,B)≺wlogAα,q(A,B) and hence
Gα,p(A,B)≺wAα,q(A,B).
Proof.
When A,B>0, the result follows from the log-majorization Gα,p(A,B)≺logLEα(A,B)
(see [4, Corollary 2.3]) and Proposition 4.6. For general A,B≥0 one can take the limit from
the result for A+εI and B+εI as ε↘0.
∎
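Weak log-majorization compares partial products of the largest eigenvalues, so Proposition 4.21 can be tested with cumulative sums of log-eigenvalues. The sketch below is a minimal NumPy check for positive definite inputs; all function names are illustrative assumptions.

```python
import numpy as np

def mpow(X, t):
    """X**t for a symmetric positive definite matrix X."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * w**t) @ V.T

def geo_mean(A, B, alpha):
    """A #_alpha B = A^{1/2} (A^{-1/2} B A^{-1/2})^alpha A^{1/2}."""
    Ah, Aih = mpow(A, 0.5), mpow(A, -0.5)
    return Ah @ mpow(Aih @ B @ Aih, alpha) @ Ah

def quasi_geo_mean(A, B, alpha, p):
    """G_{alpha,p}(A,B) = (A^p #_alpha B^p)^{1/p}."""
    return mpow(geo_mean(mpow(A, p), mpow(B, p), alpha), 1 / p)

def arith_mean(A, B, alpha, q):
    """A_{alpha,q}(A,B) = ((1-alpha) A^q + alpha B^q)^{1/q}."""
    return mpow((1 - alpha) * mpow(A, q) + alpha * mpow(B, q), 1 / q)

def weak_logmaj(X, Y, tol=1e-9):
    """True if prod_{i<=k} lambda_i(X) <= prod_{i<=k} lambda_i(Y) for k = 1..n."""
    lx = np.sort(np.linalg.eigvalsh(X))[::-1]
    ly = np.sort(np.linalg.eigvalsh(Y))[::-1]
    return bool(np.all(np.cumsum(np.log(lx)) <= np.cumsum(np.log(ly)) + tol))

def random_pd(n, rng):
    """Random positive definite matrix with eigenvalues in [0.5, 3]."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return (Q * rng.uniform(0.5, 3.0, n)) @ Q.T
```

The point of the proposition is that no relation between p and q is required, so the check can be run with p much larger than q.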
Proposition 4.22.
Let 0<α<1 and p,q>0. Then for every A,B≥0 the following conditions are equivalent:
(i)
TrGα,p(A,B)=TrAα,q(A,B);
(ii)
Gα,p(A,B)=Aα,q(A,B);
(iii)
A=B.
Proof.
It suffices to show (i)⟹(iii). If (i) holds, then TrAα,t(A,B)=TrAα,q(A,B)
for any t∈(0,q] by Propositions 4.3 and 4.21. Hence we have (iii) by Proposition 4.4.
∎
The results of this subsection are summarized as follows:
4.5 SGα,p vs. Aα,q
In this subsection we discuss inequalities between SGα,p and Aα,q. The next theorem
says that SGα,p(A,B)≤chaoAα,q(A,B) fails to hold for any p,q>0.
Theorem 4.23.
For any α∈(0,1) and any p,q>0 there exist A,B∈M2++ such that
SGα,p(A,B)≰chaoAα,q(A,B) (hence
SGα,p(A,B)≰Aα,q(A,B)).
Proof.
For A,B>0 set Y:=Ap and X:=A−p#Bp; then X=Y−1#Bp and hence Bp=XYX by the
Riccati lemma, so that B=(XYX)1/p. Therefore, SGα,p(A,B)≤chaoAα,q(A,B) is
equivalently written as
[TABLE]
Moreover, for any X,Y>0, if we set A:=Y1/p and B:=(XYX)1/p, then Y=Ap and
X=A−p#Bp. Hence, by replacing q/p with r, it suffices to show that for any r>0 there exist
X,Y∈M2++ such that
[TABLE]
To do this, let X:=A0 and Y:=Bθ in (3.5) for x,y>0 with x2y≠1. Since
Now assume that rlog(XαYXα)≤log{(1−α)Yr+α(XYX)r} for all
X,Y∈M2++. Then we must have
[TABLE]
for all x,y>0 with x2y≠1 and x2αy≠1. Let x>0 with x≠1 be fixed, so that x2y<1
and x2αy<1 for sufficiently small y>0. Since (x2r)α<αx2r+1−α so that
log(αx2r+1−α)−2rαlogx>0, it follows from (4.30) that c11−rd11≥0.
Let us estimate c11 and d11 when y↘0. As y↘0 we have b→0 and
logb≈rlogy by (4.24). Since a11→−r+(r−1)x2 and a12→x by (4.22),
we have by (4.24)
so that (1−α)(r−1)−(1−α)2(q−1)≥0, implying that r≥1+(1−α)(q−1). Examining the
coefficient of the maximal order term xqr+2 of (4.43) we also find that
[TABLE]
so that α(r−1)+α2(q+1)−2qα2≥0, implying that r≥1+α(q−1). This is immediate
since the assumption of the theorem holds for 1−α in place of α from symmetry of
SGα,p and Aα,q (see Remark 2.1(2)). Thus the result follows.
∎
Proposition 4.25.
Let 0<α<1 and p,q>0.
(1)
If p/q≤min{α,1−α}, then we have
SGα,p(A,B)≺logRα,q(A,B) for every A,B≥0 with s(A)≥s(B).
(2)
If p/q≤2min{α,1−α}, then we have
SGα,p(A,B)≺wlogAα,q(A,B) (hence
SGα,p(A,B)≺wAα,q(A,B)) for every A,B≥0 with s(A)≥s(B).
Proof.
By continuity (Proposition 2.2) and symmetry mentioned in Remark 2.1(2) we may assume
that A,B>0 and 0<α≤1/2.
(1) For 0<α≤1/2 and A,B>0, by Araki’s log-majorization [5] we have
[TABLE]
Replacing A,B with Ap,Bp gives
SGα,p(A,B)≺logRα,1/α(Ap,Bp)1/p=Rα,p/α(A,B). The result
follows since Rα,p/α(A,B)≺logRα,q(A,B) if p/α≤q by Araki’s
log-majorization again.
(2) If p/q≤2α and so p/α≤2q, then we have
Rα,p/α(A,B)≤λAα,q(A,B) by Proposition 4.9. Combining this with (1)
gives the result.
∎
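Proposition 4.25(2) can be spot-checked in the positive definite case. The sketch below assumes the form SGα(A,B)=(A−1#B)αA(A−1#B)α, as suggested by the substitution X:=A−p#Bp, Y:=Ap in the proof of Theorem 4.23, together with SGα,p(A,B)=SGα(Ap,Bp)1/p; the function names are hypothetical.

```python
import numpy as np

def mpow(X, t):
    """X**t for a symmetric positive definite matrix X."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * w**t) @ V.T

def geo_mean(A, B, alpha):
    """A #_alpha B = A^{1/2} (A^{-1/2} B A^{-1/2})^alpha A^{1/2}."""
    Ah, Aih = mpow(A, 0.5), mpow(A, -0.5)
    return Ah @ mpow(Aih @ B @ Aih, alpha) @ Ah

def quasi_sg_mean(A, B, alpha, p):
    """Assumed form: SG_alpha(A^p, B^p)^{1/p} with SG_alpha(A,B) = X^alpha A X^alpha, X = A^{-1} # B."""
    Ap, Bp = mpow(A, p), mpow(B, p)
    Xa = mpow(geo_mean(mpow(Ap, -1.0), Bp, 0.5), alpha)
    return mpow(Xa @ Ap @ Xa, 1 / p)

def arith_mean(A, B, alpha, q):
    """A_{alpha,q}(A,B) = ((1-alpha) A^q + alpha B^q)^{1/q}."""
    return mpow((1 - alpha) * mpow(A, q) + alpha * mpow(B, q), 1 / q)

def weak_logmaj(X, Y, tol=1e-9):
    """True if prod_{i<=k} lambda_i(X) <= prod_{i<=k} lambda_i(Y) for k = 1..n."""
    lx = np.sort(np.linalg.eigvalsh(X))[::-1]
    ly = np.sort(np.linalg.eigvalsh(Y))[::-1]
    return bool(np.all(np.cumsum(np.log(lx)) <= np.cumsum(np.log(ly)) + tol))

def random_pd(n, rng):
    """Random positive definite matrix with eigenvalues in [0.5, 3]."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return (Q * rng.uniform(0.5, 3.0, n)) @ Q.T
```

With p/q≤2min{α,1−α}, `weak_logmaj(quasi_sg_mean(A, B, alpha, p), arith_mean(A, B, alpha, q))` is expected to return `True`; outside this range the outcome is not determined by the proposition.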
Remark 4.26.
As for log-majorization in (1) above, for 0<α<1 and p,q>0, it is indeed known that
SGα,p(A,B)≺logRα,q(A,B) for all A,B>0 if and only if p/q≤min{α,1−α},
and that Rα,q(A,B)≺logSGα,p(A,B) for all A,B>0 if and only if
p/q≥max{α,1−α}. The details on these facts will be provided in [23], while the ‘if’ part
of the latter was indeed shown in [19].
Theorem 4.27.
Let 0<α<1 and p,q>0. If λ1(SGα,p(A,B))≤λ1(Aα,q(A,B))
holds for all A,B∈M2++, then q/p≥max{α,1−α}. Hence, if
SGα,p(A,B)≺wAα,q(A,B) for all A,B∈M2++, then
q/p≥max{α,1−α}.
Proof.
Assume that λ1(SGα,p(A,B))≤λ1(Aα,q(A,B)) holds for all
A,B∈M2++. Then as in the first paragraph of the proof of Theorem 4.23 it follows that
[TABLE]
holds for all X,Y∈M2++, where r:=q/p. Now set X:=A0 and Y:=Bθ in (3.5)
for x,y>0 with x2y≠1. Apply Lemma 3.6 to (4.23) and (4.35) with q=1 to
find that
[TABLE]
where b,b11,b12 are in (4.24) via (4.22) and s^11,s^12 are in
(4.36) with q=1. Hence, for any x>0, if y>0 is sufficiently small, then we must have
[TABLE]
When y↘0, the above inequality becomes
[TABLE]
Letting x↘0 in the above gives −r≤−αr−(1−α)+(1−α)^2, so that r≥α.
Moreover, from symmetry we have r≥1−α too, so the result follows.
∎
Note that the sufficient condition in Proposition 4.25 is indeed stricter than the necessary condition in
Theorem 4.27, because 2min{α,1−α}<1/max{α,1−α}.
Proposition 4.28.
Let 0<α<1 and p,q>0. If q≥1 or p/q≤2min{α,1−α}, then
TrSGα,p(A,B)≤TrAα,q(A,B) holds for all A,B≥0 with s(A)≥s(B).
Proof.
If p/q≤2min{α,1−α}, then the result follows from Proposition 4.25(2). If q≥1, then
we have, with r>0 satisfying p/r≤min{α,1−α},
Theorem 4.29.
Let 0<α<1 and p,q>0. If TrSGα,p(A,B)≤TrAα,q(A,B) holds for all
A,B∈M2++, then min{1,p/2}≤q.
Proof.
It suffices to show that we must have p/2≤q when q<1. Consider once again
A0,Bθ∈M2++ in (3.5) with y=x>0, x≠1. Note that
[TABLE]
We compute
[TABLE]
where
[TABLE]
This computation is similar to the previous ones based on Example 3.2(1), so the details are left to
the reader. Furthermore, with H:=[0ξ12ξ120] and
K:=[ξ1100ξ22] we have by the Taylor expansion,
[TABLE]
from which it is easy to compute
[TABLE]
where
[TABLE]
Then by Lemma 3.6 the two eigenvalues of
A0p/2(A0−p#Bθp)2αA0p/2 are
[TABLE]
Hence we find that
[TABLE]
On the other hand, we have estimated TrAα,q(A0,Bθ) in (4.9). Therefore, we
must have
[TABLE]
for all x>0, x≠1. As x↘0 note that
[TABLE]
where the last convergence is due to the assumption q<1. Therefore, we have
−2α(1−α)/p≤−α(1−α)/q, showing that p/2≤q.
∎
Proposition 4.30.
Let 0<α<1 and p,q>0 be such that q≥1 or p/q≤2min{α,1−α}. Then for every
A,B≥0 with s(A)≥s(B) the following conditions are equivalent:
(i)
TrSGα,p(A,B)=TrAα,q(A,B);
(ii)
SGα,p(A,B)=Aα,q(A,B);
(iii)
A=B.
Proof.
It suffices to show (i)⟹(iii). Assume (i). Let r:=min{α,1−α}; then min{1,(p/r)/2}≤q.
By Propositions 4.25(1) and 4.11 we have
[TABLE]
so that TrRα,p/r(A,B)=TrAα,q(A,B). Hence (iii) follows from Proposition 4.14.
∎
The results of this subsection are summarized as follows:
Problem 4.31.
At the moment we find no sufficient condition for SGα,p≤nearAα,q to hold, while a
necessary condition is given in Theorem 4.24. It might happen that this inequality never holds for
any p,q>0. For example, when α=1/2, since the necessary condition gives p≤2q/(q+1),
we notice that SG1/2,p≤nearA1/2,q fails to hold for any q>0 if p≥2, and for any
q<1 if p=1. But it is still unknown if F(A,B)≤near(A+B)/2 (the case p=q=1) holds for
all A,B>0. The situation is similar for SGα,p≤λAα,q except the case
α=1/2. When α=1/2, since λ(F(A,B))=λ((B1/2AB1/2)1/2)
(see [15, Theorem 3.2, Item 8]), it is easy to see that if 0<p≤q, then
SG1/2,p(A,B)≤λA1/2,q(A,B) for all A,B>0. As for SGα,p≺wAα,q
and SGα,p≤TrAα,q, there is a rather big gap between the sufficient condition and the
necessary condition obtained.
4.6 SG~α,p vs. Aα,q
In this subsection we discuss inequalities between SG~α,p and Aα,q. The next
theorem says that SG~α,p(A,B)≤nearAα,q(A,B) fails to hold for any
α∈(0,1)∖{1/2} and any p,q>0.
Theorem 4.32.
For any α∈(0,1)∖{1/2} and any p,q>0 there exist A,B∈M2++ such that
SG~α,p(A,B)≰nearAα,q(A,B). Hence, for any α∈(0,1) and any
p,q>0 there exist A,B∈M2++ such that
SG~α,p(A,B)≰chaoAα,q(A,B).
Proof.
First, note that the latter assertion follows from the first and Theorem 4.23 since
SG~1/2,p=SG1/2,p. For A,B>0 set Y:=Ap and X:=A−p#αBp; then
Bp=Y−1/2(Y1/2XY1/2)1/αY−1/2. We will express this RHS as Y−1#1/αX,
though 1/α>1, in analogy with the geometric mean #α. Then
SG~α,p(A,B)≤nearAα,q(A,B) is equivalently written as
[TABLE]
where r:=q/p. Conversely, for any X,Y>0, if we set A:=Y1/p and B:=(Y−1#1/αX)1/p,
then Y=Ap and X=A−p#αBp. Hence it suffices to show that for any
α∈(0,1)∖{1/2} and any r,q>0 there exist X,Y∈M2++ for which (4.44)
is violated.
Let X:=Bθ and Y:=A0 in (3.5) with y=x2α−1>0 (α≠1/2) and x≠1.
Set
[TABLE]
In the following computations we will repeatedly apply the reduced version of Example 3.2(1) up to
o(θ). One computes
X1/2Y2(1−α)X1/2=[1θa12θa12x]+o(θ),
where
[TABLE]
and hence
[TABLE]
On the other hand, one computes
Y−1#1/αX=[1θξ12θξ12x]+o(θ),
where
[TABLE]
and hence
(Y−1#1/αX)r=[1θζ12θζ12xr]+o(θ),
where
[TABLE]
Since (1−α)Yr+α(Y−1#1/αX)r=[1θαζ12θαζ12xr]+o(θ),
one has
Now assume that L≤nearM for all X,Y∈M2++. Then we must have
(M1/2LM1/2)1/2≤M so that by (4.48) and (4.49),
[TABLE]
Therefore, we must have
[TABLE]
which becomes 1−xrαζ12=1−xa12. By (4.45) and (4.47)
this gives
[TABLE]
for all x>0 with x≠1, which fails to hold as x↘0 in either case 0<α<1/2 or
1/2<α<1.
∎
Proposition 4.33.
Let 0<α<1 and p,q>0.
(1)
If p≤αq, then we have SG~α,p(A,B)≺logRα,q(A,B) for
every A,B≥0 with s(A)≥s(B).
(2)
If p≤2αq, then we have SG~α,p(A,B)≺wlogAα,q(A,B)
(hence SG~α,p(A,B)≺wAα,q(A,B)) for every A,B≥0 with s(A)≥s(B).
Proof.
(1) By continuity (Proposition 2.2) we may assume that A,B>0. Similarly to the proof of
Proposition 4.25(1) it suffices to show that
[TABLE]
For 0<α≤1/2, by Araki’s log-majorization [5] we have
[TABLE]
The remaining proof is the same as in the proof of Proposition 4.25(1).
(2) follows from (1) and Proposition 4.9 as in the proof of Proposition 4.25(2).
∎
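Proposition 4.33(2) can be probed in the same way as Proposition 4.25(2). The sketch below assumes the form (A−1#αB)1/2A2(1−α)(A−1#αB)1/2 for the second weighted spectral geometric mean, as suggested by the expression X1/2Y2(1−α)X1/2 with X:=A−p#αBp and Y:=Ap in the proof of Theorem 4.32; function names are hypothetical.

```python
import numpy as np

def mpow(X, t):
    """X**t for a symmetric positive definite matrix X."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * w**t) @ V.T

def geo_mean(A, B, alpha):
    """A #_alpha B = A^{1/2} (A^{-1/2} B A^{-1/2})^alpha A^{1/2}."""
    Ah, Aih = mpow(A, 0.5), mpow(A, -0.5)
    return Ah @ mpow(Aih @ B @ Aih, alpha) @ Ah

def quasi_sg_tilde_mean(A, B, alpha, p):
    """Assumed form: (X^{1/2} Y^{2(1-alpha)} X^{1/2})^{1/p}, X = A^{-p} #_alpha B^p, Y = A^p."""
    Y, Bp = mpow(A, p), mpow(B, p)
    Xh = mpow(geo_mean(mpow(Y, -1.0), Bp, alpha), 0.5)
    return mpow(Xh @ mpow(Y, 2 * (1 - alpha)) @ Xh, 1 / p)

def arith_mean(A, B, alpha, q):
    """A_{alpha,q}(A,B) = ((1-alpha) A^q + alpha B^q)^{1/q}."""
    return mpow((1 - alpha) * mpow(A, q) + alpha * mpow(B, q), 1 / q)

def weak_logmaj(X, Y, tol=1e-9):
    """True if prod_{i<=k} lambda_i(X) <= prod_{i<=k} lambda_i(Y) for k = 1..n."""
    lx = np.sort(np.linalg.eigvalsh(X))[::-1]
    ly = np.sort(np.linalg.eigvalsh(Y))[::-1]
    return bool(np.all(np.cumsum(np.log(lx)) <= np.cumsum(np.log(ly)) + tol))

def random_pd(n, rng):
    """Random positive definite matrix with eigenvalues in [0.5, 3]."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return (Q * rng.uniform(0.5, 3.0, n)) @ Q.T
```

The condition p≤2αq is not invariant under α↦1−α, so the admissible pairs (p,q) differ for α=0.3 and α=0.8; this reflects the lack of symmetry of this mean.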
Remark 4.34.
Although we have shown the log-majorizations in Propositions 4.25(1) and 4.33(1) from
Araki’s log-majorization, they can also be shown by directly applying the familiar antisymmetric tensor
power technique (see [4]). Furthermore, it is known that
SG~α,p(A,B)≺logRα,q(A,B) for all A,B>0 if and only if p/q≤α,
and that Rα,q(A,B)≺logSG~α,p(A,B) for all A,B>0 if α≤1/2
and q≤p, and the same holds only if α≤1/2 and p/q≥1/2. The details on these facts will be
provided in [23].
Theorem 4.35.
Let 0<α<1 and p,q>0. If
λ1(SG~α,p(A,B))≤λ1(Aα,q(A,B)) holds for all
A,B∈M2++, then q/p≥1−α. Hence, if
SG~α,p(A,B)≺wAα,q(A,B) for all A,B∈M2++, then
q/p≥1−α.
Proof.
Assume that λ1(SG~α,p(A,B))≤λ1(Aα,q(A,B)) holds
for all A,B∈M2++, and argue as in the first paragraph of the proof of Theorem 4.32. Then,
since λ((X1/2Y2(1−α)X1/2)r/q)=λ((Y1−αXY1−α)r/q), we have
[TABLE]
holds for all X,Y∈M2++, where r:=q/p. Now set X:=Bθ and Y:=A0 in (3.5)
for x,y>0 with xy≠1 and x1−αy≠1. We compute
[TABLE]
On the other hand, using Example 3.2(1) we compute
[TABLE]
where
[TABLE]
and hence
[TABLE]
where
[TABLE]
Therefore,
[TABLE]
Let x,y<1. Note that x2(1−α)y<1 and
(1−α)xr+αxα(1−α)ryαr<1. Then by Lemma 3.6 we have
[TABLE]
so that
[TABLE]
This implies that
[TABLE]
Letting y↘0 gives
[TABLE]
Hence we must have
[TABLE]
Letting x↘0 further gives 0≤α(r−1)+α^2, i.e., r≥1−α.
∎
Proposition 4.36.
Let 0<α<1 and p,q>0. If min{1,p/(2α)}≤q, then
TrSG~α,p(A,B)≤TrAα,q(A,B) holds for all A,B≥0 with s(A)≥s(B).
Proof.
If p≤2αq, then the result follows from Proposition 4.33(2). If q≥1, then we have, with
r>0 satisfying p/r≤α,
From this and (4.9) for TrAα,q(A0,Bθ) we must have
[TABLE]
for all x>0, x=1. As x↘0, noting that η11(p)→−α, η12(p)→0 and
[TABLE]
we find that the LHS of (4.50) goes to −α/p, while the RHS goes to
−α(1−α)/q since q<1. Therefore, −α/p≤−α(1−α)/q, showing
that p≤q/(1−α).
∎
The following is seen similarly to Proposition 4.30.
Proposition 4.38.
Let 0<α<1 and p,q>0 be such that min{1,p/(2α)}≤q. Then for every A,B≥0 with
s(A)≥s(B) the following conditions are equivalent:
(i)
TrSG~α,p(A,B)=TrAα,q(A,B);
(ii)
SG~α,p(A,B)=Aα,q(A,B);
(iii)
A=B.
The results of this subsection are summarized as follows. Note that the conditions here are not symmetric
under interchanging α and 1−α, unlike those in the previous subsections. This is a reflection
of the fact that SG~α,p is not symmetric in the sense of Remark 2.1(2), unlike
other quasi matrix means.
Problem 4.39.
We find no necessary condition for SG~α,p≤nearAα,q and no sufficient
condition for SG~α,p≤λAα,q. The proof of Theorem 4.32 cannot
apply to the case α=1/2. Indeed, when α=1/2, since SG~1/2,p=SG1/2,p,
the problem has already been pointed out in Problem 4.31. Moreover, as for
SG~α,p≺wAα,q and SG~α,p≤TrAα,q,
there is a big gap between the sufficient condition and the necessary condition, similarly to those for
SGα,p and Aα,q.
5 Concluding remarks
(1) For each quasi matrix mean Mα,p∈{Aα,p,LEα,Rα,p,Gα,p,SGα,p,SG~α,p} and for each matrix order
◃∈{≤,≤chao,≤near,≤λ,≺w,≤Tr}, we have aimed at finding the
necessary and sufficient condition on p,q,α under which the inequality
Mα,p(A,B)◃Aα,q(A,B) holds for all A,B>0. When
Mα,p=Aα,p,LEα,Gα,p, our objective has been fully achieved
as seen in the tables at the end of Sections 4.1, 4.2 and 4.4. However, when
Mα,p=Rα,p,SGα,p,SG~α,p, this has not been completely done
as seen in the tables of Sections 4.3, 4.5 and 4.6, where there is a gap
between the sufficient condition and the necessary condition for some of our target inequalities. Therefore,
the problem is still left open for those cases as explained in Problems 4.15, 4.31 and
4.39. We are especially concerned with the question whether p/2≤q is the necessary and
sufficient condition for Rα,p≤λAα,q to hold or not. This is indeed equivalent to
saying whether ∣AαpB(1−α)p∣1/p≤λαA+(1−α)B holds for all A,B>0
only if p≤1 (in other words, whether Ando’s matrix Young inequality
∣AαB1−α∣≤λαA+(1−α)B is the best possible case or not).
(2) We have considered an inequality Mα,p◃Aα,q in the two directions of the
sufficiency part (to show the inequality under some condition) and the necessity part (to find a necessary
condition for the inequality to hold). The former direction is a more or less easy task by applying well known
facts or methods in matrix analysis. Thus we have presented the results in this direction as propositions. On
the other hand, the latter direction is computation-oriented, where we provide a counter-example with use of
a specific pair of 2×2 matrices, typically the pair A0,Bθ given in (3.5). We have
prepared in Section 3 some technical computations on A0,Bθ based on Taylor’s theorem
for matrix functions, which are repeatedly used in the main Section 4. In this way, the results in the
latter direction are much involved with plenty of computations though elementary in nature, so we have
presented those as theorems. In order to settle the aforementioned open questions by improving the current
necessary conditions, we probably need to seek a more sophisticated example of 2×2 matrix pair,
or otherwise a matrix pair of higher degrees, though computations with matrix pairs of 3×3 or higher
seem difficult to perform by hand. Also, it seems that numerical computations are not very helpful
because the problem is to find the best possible necessary condition.
(3) It is intuitively clear that the implications stated in Proposition 2.4(3) are all strict. This fact has been
exemplified in our study of inequalities of quasi matrix means. In fact, the strictness of those implications
except for ≤chao⇒≤near is manifest in the tables at the end of Sections
4.1–4.4. As for ≤chao⇒≤near, the following remark is worth noting:
Due to Proposition 2.4(2) this implication is equivalent to the so-called Ando–Hiai inequality [4]
(i.e., for X,Y>0, X#Y≤I⇒Xp#Yp≤I for p≥1); see also [13, Proposition 4].
Therefore, its strictness corresponds to X#Y≤I⇏Xp#Yp≤I for 0<p<1.
(4) We can apply the method explained in Remark 2.7(3) to have the characterization of
Hα,q◃Hα,p from that of Aα,p◃Aα,q given in
Section 4.1. Furthermore, we have the sufficient condition (or the necessary condition) for
Hα,q◃Mα,p from that for Mα,p◃Aα,q given in
Sections 4.2–4.6 for any
Mα,p∈{Gα,p,SGα,p,SG~α,p,Rα,p,LEα} and
◃∈{≤,≤chao,≤near,≤λ}. But the inequalities
Hα,q≺wMα,p and Hα,q≤TrMα,p are not touched in this
paper. As for Hα,p◃Aα,q, we have Hα,p≤chaoAα,q
for any p,q>0 since Proposition 4.19 yields
Hα,p≤chaoGα,r≤chaoAα,q with r=min{p,q}. We have also
Hα,p≤Aα,q for any p,q≥1 since Theorem 4.16 yields
Hα,p≤Gα,1≤Aα,q. An interesting problem left open is whether
Hα,p≤Aα,q holds only if p,q≥1 or not. Here note that for any q>0,
Hα,p≤Aα,q fails to hold at least for sufficiently small p>0, because otherwise
letting p↘0 gives LEα≤Aα,q thanks to Theorem 2.3 and it contradicts
Theorem 4.5.
(5) The study of this paper was partly motivated by Milan Mosonyi’s question to the author, asking whether
there exists, for 0<α<1, a ‘reasonable’ α-weighted geometric-type mean M(A,B)
(A,B>0) other than #α, where ‘reasonable’ is used in the sense that M satisfies
(i) M(A,B)=A1−αBα if AB=BA,
(ii) tensor multiplicative M(A1⊗A2,B1⊗B2)=M(A1,B1)⊗M(A2,B2),
(iii) block additive M(A1⊕A2,B1⊕B2)=M(A1,B1)⊕M(A2,B2),
and (iv) arithmetic-geometric inequality M(A,B)≤(1−α)A+αB.
We examined the possibility of the quasi-geometric type means discussed in this paper to satisfy condition
(iv) as it is obvious that they satisfy the other conditions (i)–(iii). But it turned out that none of them other
than #α satisfies (iv); see the tables of Sections 4.2–4.6 and
Corollary 4.18 for Gα,p. Meanwhile, Mosonyi and his coauthors settled the question as
follows in [14]: For any α∈(0,1), if an α-weighted matrix mean
M:Mn++×Mn++→Mn++ (n∈N) satisfies (i),
(ii′) (weakly) tensor multiplicative M(A⊗n,B⊗n)=M(A,B)⊗n,
(ii′′) scalar tensor multiplicative M(aA,bB)=M(a,b)M(A,B) for a,b∈(0,∞), (iii) and (iv), then
M(A,B)=A#αB for all A,B>0. This result establishes a remarkable new characterization of the
operator mean #α.
(6) In Theorem A.1 of the appendix below, we present the Lie–Trotter–Kato product formula for operator
means in the positive semidefinite matrix case, which we have used in Section 2. The author has
known Theorem A.1 for many years without publishing it, though it was briefly stated in [8]
without proof. This product formula for operator means in the positive semidefinite case seems unfamiliar
even to matrix analysis experts, while that in the positive definite matrix case is rather well known (see, e.g.,
[25, Lemma 3.3], [20, Sec. 4.3]). So it would be worthwhile for us to take this opportunity to
present its complete description.
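Conditions (ii), (iii) and (iv) of remark (5) are easy to confirm numerically for the operator mean #α itself. The following NumPy sketch does so for positive definite matrices; the helper names `geo_mean`, `direct_sum`, `check_reasonable` and `random_pd` are assumptions of this illustration, and the check says nothing about uniqueness, which is the content of the result of [14].

```python
import numpy as np

def mpow(X, t):
    """X**t for a symmetric positive definite matrix X."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * w**t) @ V.T

def geo_mean(A, B, alpha):
    """A #_alpha B = A^{1/2} (A^{-1/2} B A^{-1/2})^alpha A^{1/2}."""
    Ah, Aih = mpow(A, 0.5), mpow(A, -0.5)
    return Ah @ mpow(Aih @ B @ Aih, alpha) @ Ah

def direct_sum(X, Y):
    """Block diagonal matrix X (+) Y."""
    Z = np.zeros((X.shape[0], Y.shape[0]))
    return np.block([[X, Z], [Z.T, Y]])

def check_reasonable(A1, A2, B1, B2, alpha, tol=1e-9):
    """Return (tensor multiplicative, block additive, arithmetic-geometric) flags for #_alpha."""
    M1, M2 = geo_mean(A1, B1, alpha), geo_mean(A2, B2, alpha)
    tensor = np.allclose(geo_mean(np.kron(A1, A2), np.kron(B1, B2), alpha), np.kron(M1, M2))
    block = np.allclose(geo_mean(direct_sum(A1, A2), direct_sum(B1, B2), alpha), direct_sum(M1, M2))
    D = (1 - alpha) * A1 + alpha * B1 - M1
    amgm = np.linalg.eigvalsh((D + D.T) / 2).min() >= -tol
    return bool(tensor), bool(block), bool(amgm)

def random_pd(n, rng):
    """Random positive definite matrix with eigenvalues in [0.5, 3]."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return (Q * rng.uniform(0.5, 3.0, n)) @ Q.T
```

Replacing `geo_mean` by one of the quasi-geometric type means with p≠1 would be a natural experiment: conditions (ii) and (iii) should still pass, while (iv) is the condition that the tables of Section 4 show to fail for suitable pairs.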
Acknowledgements
The author thanks Milan Mosonyi for invitation to the workshop at the Erdős Center in July, 2024 and for
valuable suggestions which helped to improve this paper.
Appendix A The operator mean version of the Lie–Trotter–Kato product formula for positive semidefinite
matrices
The famous Lie–Trotter–Kato product formula originally established in [35, 27] says that if H
and K are lower bounded self-adjoint operators in a Hilbert space ℋ, then (e−H/me−K/m)m
converges in strong operator topology to e−(H∔K)P0 as m→∞, where H∔K is the
form sum (see [26]) and P0 is the orthogonal projection onto the closure of the domain
of H∔K. According to the proof in [26] (see also [20, Theorem 3.6]) the formula can be
modified in the symmetric form with a continuous parameter as
[TABLE]
Furthermore, it is known [27, Sec. 5] that the formula is valid even if H and K have non-dense
domains.
For Hermitian matrices H and K the product formula simply becomes the Lie formula
[TABLE]
which has plenty of applications in matrix analysis. The unitary orbital version of this (without limit) is also
worth noting [34, 18]. For positive semidefinite (not necessarily positive definite) matrices A,B we
consider H:=−logA and K:=−logB defined under the restriction to the ranges of the support projections
A0:=s(A) and B0:=s(B), respectively. Applying (A.1) (for non-dense domains) to these H,K
we have
[TABLE]
where P0:=A0∧B0. (Footnote 2: There seems to be no literature that provides a proof of (A.2) specialized to the matrix setting.)
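The Lie formula for Hermitian matrices, (eH/meK/m)m→eH+K as m→∞, can be observed numerically. The sketch below is restricted to real symmetric matrices for simplicity; `herm_exp`, `lie_product` and `lie_error` are hypothetical helper names, and the error is expected to decay roughly like 1/m for non-commuting H and K.

```python
import numpy as np

def herm_exp(H):
    """Matrix exponential of a real symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh((H + H.T) / 2)
    return (V * np.exp(w)) @ V.T

def lie_product(H, K, m):
    """(e^{H/m} e^{K/m})^m for real symmetric H, K and a positive integer m."""
    step = herm_exp(H / m) @ herm_exp(K / m)
    return np.linalg.matrix_power(step, m)

def lie_error(H, K, m):
    """Operator-norm distance between (e^{H/m} e^{K/m})^m and e^{H+K}."""
    return float(np.linalg.norm(lie_product(H, K, m) - herm_exp(H + K), 2))
```

For example, with `H = np.array([[1.0, 0.5], [0.5, -1.0]])` and `K = np.array([[0.0, 1.0], [1.0, 0.5]])`, calling `lie_error(H, K, m)` for m = 10, 100, 1000 should give a decreasing sequence.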
This appendix aims to supply the operator mean version of (A.2) for matrices A,B≥0.
Throughout the appendix let A,B be n×n positive semidefinite matrices. Define logA in the
generalized sense as logA:=(logA)A0 restricted on the range of A0 (and zero on the range of
A0⊥=I−A0), and similarly logB:=(logB)B0. We write P0:=A0∧B0 as above.
Now, let σ be a Kubo–Ando operator mean with the representing operator monotone function f
on (0,∞), and let α:=f′(1). Note that 0≤α≤1 and if α=0 (resp., α=1) then
AσB=A (resp., AσB=B) so that (ApσBp)1/p=A (resp.,
(ApσBp)1/p=B) for all A,B≥0 and p>0. So in the rest we assume that 0<α<1.
Theorem A.1.
With the above assumptions, for every A,B≥0,
[TABLE]
Remark A.2.
Note [25, Sect. 4] that the RHS of (A.3) is written as
[TABLE]
so that we may write
[TABLE]
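In the positive definite case with σ=#α, where P0=I, the limit in question is the classical one (ApσBp)1/p→exp((1−α)logA+αlogB)=LEα(A,B) as p↘0. The following NumPy sketch illustrates this special case only; it does not cover the semidefinite situation that is the actual subject of Theorem A.1, and the helper names are assumptions.

```python
import numpy as np

def mfun(X, f):
    """Functional calculus f(X) for a symmetric matrix X."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * f(w)) @ V.T

def geo_mean(A, B, alpha):
    """A #_alpha B = A^{1/2} (A^{-1/2} B A^{-1/2})^alpha A^{1/2}."""
    Ah = mfun(A, np.sqrt)
    Aih = mfun(A, lambda w: 1.0 / np.sqrt(w))
    return Ah @ mfun(Aih @ B @ Aih, lambda w: w**alpha) @ Ah

def quasi_geo_mean(A, B, alpha, p):
    """(A^p #_alpha B^p)^{1/p} for positive definite A, B."""
    Ap, Bp = mfun(A, lambda w: w**p), mfun(B, lambda w: w**p)
    return mfun(geo_mean(Ap, Bp, alpha), lambda w: w**(1.0 / p))

def log_euclidean(A, B, alpha):
    """LE_alpha(A,B) = exp((1-alpha) log A + alpha log B)."""
    return mfun((1 - alpha) * mfun(A, np.log) + alpha * mfun(B, np.log), np.exp)

def limit_error(A, B, alpha, p):
    """Operator-norm distance between (A^p #_alpha B^p)^{1/p} and LE_alpha(A,B)."""
    return float(np.linalg.norm(quasi_geo_mean(A, B, alpha, p) - log_euclidean(A, B, alpha), 2))
```

Evaluating `limit_error` for a fixed non-commuting pair and decreasing p should show the distance shrinking towards zero.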
The next lemma is essential to prove the theorem. The proof of the lemma is a slight
modification of that of [25, Lemma 4.1].
Lemma A.3.
For each p∈(0,p0) with some p0>0, a Hermitian matrix Z(p) is given in the
2×2 block form as
[TABLE]
where Z0(p) is m×m, Z1(p) is l×l and Z2(p) is m×l.
Assume:
(a)
Z0(p)→Z0 as p↘0,
(b)
sup{∥Z2(p)∥∞:p∈(0,p0)}<∞,
(c)
there is a δ>0 such that pZ1(p)≤−δIl for all p∈(0,p0).
Then
[TABLE]
Proof.
We list the eigenvalues of Z(p) in decreasing order (with multiplicities) as
[TABLE]
together with the corresponding orthonormal eigenvectors
[TABLE]
so that we write
[TABLE]
Furthermore, let μ1(p)≥⋯≥μm(p) be the eigenvalues of Z0(p) and
μ1≥⋯≥μm be the eigenvalues of Z0. Then μi(p)→μi as p↘0
thanks to assumption (a). By the majorization result for eigenvalues in
[2, Corollary 7.2] we have
[TABLE]
Since
[TABLE]
thanks to assumptions (a)–(c), it follows that for m<i≤m+l, pλi(p)<−δ/2 for any
p>0 sufficiently small so that
[TABLE]
Hence it suffices to prove that for any sequence (p0>)pk↘0 there exist a subsequence
{pk′} of {pk} and vectors v1,…,vm∈Cm such that we have for 1≤i≤m
[TABLE]
Indeed, it then follows that v1,…,vm are orthonormal vectors in Cm, so from
(A.4) and (A.6) we obtain
[TABLE]
Now, replacing {pk} with a subsequence, we may assume that ui(pk) itself converges to some
ui∈Cm⊕Cl for 1≤i≤m. Writing ui(pk)=vi(k)⊕wi(k) in
Cm⊕Cl, we have
[TABLE]
by assumption (c). For i=1, since μ1(pk)≤λ1(pk) by (A.5) for r=1, it follows
from (A.10) that
[TABLE]
so that
[TABLE]
as k→∞ (pk↘0) due to assumptions (a) and (b). Hence we have w1(k)→0 so that
u1(pk)→u1=v1⊕0 in Cm⊕Cl (hence v1(k)→v1) for some v1∈Cm.
From (A.10) again we furthermore have
[TABLE]
Therefore, λ1(pk)→⟨v1,Z0v1⟩=μ1 and hence Z0v1=μ1v1 since μ1 is
the largest eigenvalue of Z0. Next, when m≥2 and i=2, since λ2(pk) is bounded below
by (A.5) for r=2, it follows as above that w2(k)→0 and hence u2(pk)→u2=v2⊕0
for some v2∈Cm. Therefore,
[TABLE]
so that λ2(pk)→⟨v2,Z0v2⟩=μ2 and Z0v2=μ2v2 since μ2 is the largest
eigenvalue of Z0 restricted to {v1}⊥∩Cm. Repeating this argument we obtain
v1,…,vm∈Cm for which (A.7)–(A.9) hold for 1≤i≤m.
∎
Proof of Theorem A.1.
Let us divide the proof into two steps. In the proof below we use the α-weighted arithmetic mean
▽α and the α-weighted harmonic mean !α. Note that
[TABLE]
Step 1. First, we prove the theorem in the case where PσQ=P∧Q for all orthogonal projections
P,Q (this is the case, for instance, when σ=!α or #α the weighted geometric mean,
see [29, Theorem 3.7]). Let ℋ0 be the range of P0 (=A0!αB0=A0σB0).
From the operator monotonicity of logx (x>0) it follows that for every p>0,
[TABLE]
For every ε>0 we have
[TABLE]
where A−p=(A−1)p and B−p=(B−1)p are defined via the generalized inverses. Therefore,
[TABLE]
since the support projection of A0⊥▽αB0⊥ is
A0⊥∨B0⊥=P0⊥. In the above, {⋯}−1 is the generalized inverse (with
support ℋ0) and the inequality is due to [10, Corollary 2.3]. Letting ε↘0 in
(A.12) gives
Step 2. For a general operator mean σ the integral representation theorem [29, Theorems 3.4, 3.7]
says that there are θ,β∈[0,1] and an operator mean τ such that
[TABLE]
and PτQ=P∧Q for all orthogonal projections P,Q. Moreover, τ has the representing
operator monotone function g on (0,∞) for which γ:=g′(1)∈(0,1) and
[TABLE]
We may assume that 0<θ≤1 since the case θ=0 was shown in Step 1. Moreover, when
θ=1, we have β=α∈(0,1). At the moment, assume that 0<θ≤1 and 0<β<1.
Let A,B≥0 be given, and note that
A0σB0=θA0▽βB0+(1−θ)(A0∧B0) has the support projection
A0∨B0. Let ℋ, ℋ0 and ℋ1 denote the ranges of A0∨B0,
P0=A0∧B0 and A0∨B0−P0, respectively, so that ℋ=ℋ0⊕ℋ1. Note that
the support of ApσBp for any p>0 is ℋ. We will describe
(1/p)log(ApσBp)|ℋ in the 2×2 block form with respect to the decomposition
ℋ=ℋ0⊕ℋ1. Let
[TABLE]
It follows from Step 1 that limp↘0(ApτBp)1/p=P0eY0P0 and hence
[TABLE]
In the above, the fourth equality follows since log(eY0+o(1))=Y0+o(1). On the other hand,
we have
[TABLE]
Therefore, we have
[TABLE]
Setting
[TABLE]
we write
[TABLE]
where C is a positive definite contraction on ℋ and H is a Hermitian operator on ℋ. Note that
the eigenspace of C for the eigenvalue 1 is ℋ0. Hence, with a basis consisting of orthonormal
eigenvectors for C we may assume that C is diagonal so that C=diag(c1,…,cm+l) with
[TABLE]
where m=dimℋ0 and m+l=dimℋ.
Applying Taylor’s theorem (see, e.g., [21, Theorem 2.3.1]) to log(C+pH+o(p)) we have
[TABLE]
where D(logx)(C)(⋅) denotes the Fréchet derivative of the functional calculus by logx at C.
Daleckii and Krein’s derivative formula (see, e.g., [21, Theorem 2.3.1]) says that
[TABLE]
where ∘ is the Schur product and (logci−logcj)/(ci−cj) is understood as 1/ci when ci=cj.
We write D(logx)(C)(H) in the 2×2 block form on ℋ0⊕ℋ1 as
$\begin{bmatrix}Z_0&Z_2^*\\ Z_2&Z_1\end{bmatrix}$ where Z0:=P0HP0|ℋ0. By
(A.21)–(A.23) we can write
[TABLE]
where
[TABLE]
This 2×2 block form of Z(p):=(1/p)log(ApσBp)|ℋ satisfies assumptions (a)–(c)
of Lemma A.3 for p∈(0,p0) with a sufficiently small p0>0. Therefore, the lemma implies that
[TABLE]
on ℋ=ℋ0⊕ℋ1. Finally, we have
[TABLE]
thanks to (A.20). Hence the desired limit formula follows.
For the remaining case where 0<θ<1 and β=0 or 1, the proof is similar to the above when
we take as ℋ the range of A0 (for β=0) or B0 (for β=1) instead of the range of
A0∨B0.
∎
Remark A.4.
Assume that the operator mean σ satisfies PσQ=P∧Q for all orthogonal projections
P,Q, that is, the representing function f satisfies f(0+)=0 and limx→∞f(x)/x=0 (see
[29, Theorem 3.7]). This is the case when σ=#α for instance. Then, from the proof of
Step 1 above, for any A,B≥0 we have a slightly improved form of (A.3) as follows:
[TABLE]
Indeed, set L:=(1−α)P0(logA)P0+αP0(logB)P0. By (A.13) and (A.16)
one has
References
[1] T. Ando, On some operator inequalities, Math. Ann. 279 (1987), 157–159.
[2] T. Ando, Majorizations, doubly stochastic matrices, and comparison of eigenvalues, Linear Algebra Appl. 118 (1989), 163–248.
[3] T. Ando, Matrix Young inequalities, in: Operator Theory in Function Spaces and Banach Lattices, Oper. Theory Adv. Appl. Vol. 75, C. B. Huijsmans et al. (eds), Birkhäuser, Basel, 1995, pp. 33–38.
[4] T. Ando and F. Hiai, Log majorization and complementary Golden–Thompson type inequalities, Linear Algebra Appl. 197 (1994), 113–131.
[5] H. Araki, On an inequality of Lieb and Thirring, Lett. Math. Phys. 19 (1990), 167–170.
[6] K. M. R. Audenaert and N. Datta, α-z-relative entropies, J. Math. Phys. 56 (2015), 022202, 16 pp.
[7] K. M. R. Audenaert and F. Hiai, On matrix inequalities between the power means: counterexamples, Linear Algebra Appl. 439 (2013), 1590–1604.
[8] K. M. R. Audenaert and F. Hiai, Reciprocal Lie–Trotter formula, Linear and Multilinear Algebra 64 (2016), 1220–1235.