Geometric mean flows and the Cartan barycenter on the Wasserstein space over positive definite matrices
Fumio Hiai, Yongdo Lim

TL;DR
This paper introduces flows on the Wasserstein space of probability measures over positive definite matrices and studies the differentiability of the associated Cartan barycentric trajectory, with applications to a Lie-Trotter formula, a unitarily invariant norm inequality, and a fixed point theorem related to the Karcher equation.
Contribution
It develops a new class of flows on Wasserstein space over positive definite matrices, establishing differentiability, a Lie-Trotter formula, and fixed point results related to the Karcher equation.
Findings
Established differentiability of Cartan barycentric trajectories.
Derived a version of the Lie-Trotter formula for these flows.
Proved a fixed point theorem related to the Karcher equation.
Abstract
We introduce a class of flows on the Wasserstein space of probability measures with finite first moment on the Cartan-Hadamard Riemannian manifold of positive definite matrices, and consider the problem of differentiability of the corresponding Cartan barycentric trajectory. As a consequence we have a version of Lie-Trotter formula and a related unitarily invariant norm inequality. Furthermore, a fixed point theorem related to the Karcher equation and the Cartan barycentric trajectory is also presented as an application.
Taxonomy
Topics: Geometric Analysis and Curvature Flows · Advanced Differential Geometry Research · Point processes and geometric inequalities
Fumio Hiai and Yongdo Lim
Tohoku University (Emeritus), Hakusan 3-8-16-303, Abiko 270-1154, Japan
Department of Mathematics, Sungkyunkwan University, Suwon 440-746, Korea
2010 Mathematics Subject Classification. 15A42, 47A64, 47B65, 47L07
Key words and phrases. Positive definite matrix, Probability measure, Riemannian trace metric, Cartan barycenter, Wasserstein distance, Lie-Trotter formula
1. Introduction and main theorem
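Let $\mathbb{P}_m$ be the set of $m\times m$ positive definite matrices, which is a smooth Riemannian manifold with the Riemannian trace metric $\langle X,Y\rangle_A=\mathrm{tr}\,A^{-1}XA^{-1}Y$, where $A\in\mathbb{P}_m$ and $X,Y\in\mathbb{H}_m$, the Euclidean space of $m\times m$ Hermitian matrices equipped with the inner product $\langle X,Y\rangle=\mathrm{tr}\,XY$. Then $\mathbb{P}_m$ is a Cartan-Hadamard Riemannian manifold, a simply connected complete Riemannian manifold with non-positive sectional curvature (the canonical $2$-tensor is non-negative). The Riemannian distance between $A,B\in\mathbb{P}_m$ with respect to the above metric is given by $d(A,B)=\|\log A^{-1/2}BA^{-1/2}\|_2$, where $\|X\|_2=(\mathrm{tr}\,X^2)^{1/2}$ for $X\in\mathbb{H}_m$, and the unique (up to parametrization) geodesic joining $A$ and $B$ is given as the curve of weighted geometric means
$$A\#_tB=A^{1/2}\bigl(A^{-1/2}BA^{-1/2}\bigr)^tA^{1/2},\qquad 0\le t\le1. \tag{1.1}$$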
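As a numerical illustration of the distance and the geodesic just introduced, the following minimal sketch evaluates the standard closed forms $d(A,B)=\|\log A^{-1/2}BA^{-1/2}\|_2$ and $A\#_tB=A^{1/2}(A^{-1/2}BA^{-1/2})^tA^{1/2}$ with NumPy. The helper names `herm`, `sym_fun`, `geo_mean` and `dist`, and the two sample matrices, are hypothetical choices made for this example and do not come from the paper.

```python
import numpy as np

def herm(X):
    """Hermitian part of X; removes rounding asymmetry."""
    return (X + X.conj().T) / 2

def sym_fun(X, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(herm(X))
    return (V * f(w)) @ V.conj().T

def geo_mean(A, B, t):
    """Weighted geometric mean A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah = sym_fun(A, np.sqrt)
    Aih = sym_fun(A, lambda w: 1.0 / np.sqrt(w))
    return herm(Ah @ sym_fun(Aih @ B @ Aih, lambda w: w ** t) @ Ah)

def dist(A, B):
    """Trace-metric distance: 2-norm of the log-eigenvalues of A^{-1/2} B A^{-1/2}."""
    Aih = sym_fun(A, lambda w: 1.0 / np.sqrt(w))
    lam = np.linalg.eigvalsh(herm(Aih @ B @ Aih))
    return np.sqrt(np.sum(np.log(lam) ** 2))

# Two sample positive definite matrices (illustrative values only).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 3.0]])

mid = geo_mean(A, B, 0.5)
print("d(A, B)       =", dist(A, B))
print("d(A, A #_t B) =", dist(A, mid), "for t = 0.5")
```

The printed values reflect the geodesic property $d(A,A\#_tB)=t\,d(A,B)$; the sketch works for real symmetric or complex Hermitian input.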
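Let $\mathcal{P}(\mathbb{P}_m)$ denote the set of all probability measures on the Borel sets of $\mathbb{P}_m$, and $\mathcal{P}^1(\mathbb{P}_m)$ be the set of $\mu\in\mathcal{P}(\mathbb{P}_m)$ with finite first moment, i.e., for some (equivalently for all) $Y\in\mathbb{P}_m$, $\int_{\mathbb{P}_m}d(X,Y)\,d\mu(X)<\infty$. For $\mu\in\mathcal{P}^1(\mathbb{P}_m)$ the Cartan barycenter $G(\mu)$ is uniquely defined as
$$G(\mu):=\operatorname*{arg\,min}_{Z\in\mathbb{P}_m}\int_{\mathbb{P}_m}\bigl[d^2(Z,X)-d^2(Y,X)\bigr]\,d\mu(X)$$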
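independently of the choice of a fixed $Y\in\mathbb{P}_m$ (see [21]). For every $\mu\in\mathcal{P}^1(\mathbb{P}_m)$, $G(\mu)$ is characterized by the Karcher equation
$$X=G(\mu)\iff\int_{\mathbb{P}_m}\log\bigl(X^{-1/2}AX^{-1/2}\bigr)\,d\mu(A)=0, \tag{1.2}$$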
which is equivalent to the gradient zero equation for the function $Z\mapsto\int_{\mathbb{P}_m}\bigl[d^2(Z,X)-d^2(Y,X)\bigr]\,d\mu(X)$ on $\mathbb{P}_m$. See [9, Theorem 3.1].
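When $A_1,\dots,A_n\in\mathbb{P}_m$ and $w=(w_1,\dots,w_n)$ is a weight vector (i.e., $w_j\ge0$, $\sum_{j=1}^nw_j=1$), we denote by $G(w;A_1,\dots,A_n)$ the Cartan mean of a finitely supported measure $\mu=\sum_{j=1}^nw_j\delta_{A_j}$, where $\delta_A$ is the point measure of mass $1$ (or the Dirac mass) at $A$. In particular, $G(w;A,B)$ with $w=(1-t,t)$ for $0\le t\le1$ coincides with the weighted geometric mean $A\#_tB$ in (1.1). For $n\ge3$ we have no such formula, and properties of $G(w;A_1,\dots,A_n)$ have to be established by indirect arguments. The multivariate mean has been the subject of intensive study in the past ten years, e.g., [20, 4, 14, 5, 18, 23].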
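Since no closed formula is available for three or more matrices, the Cartan mean of a finitely supported measure is usually computed iteratively. The sketch below uses a damped Riemannian gradient iteration on the Karcher equation $\sum_jw_j\log(X^{-1/2}A_jX^{-1/2})=0$. This is one common numerical scheme and is not a method taken from the paper; the function names, the step size 0.5, the stopping tolerance and the sample matrices are all assumptions made for illustration.

```python
import numpy as np

def herm(X):
    """Hermitian part of X; removes rounding asymmetry."""
    return (X + X.conj().T) / 2

def sym_fun(X, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(herm(X))
    return (V * f(w)) @ V.conj().T

def geo_mean(A, B, t):
    """Weighted geometric mean A #_t B (closed form for two matrices)."""
    Ah = sym_fun(A, np.sqrt)
    Aih = sym_fun(A, lambda w: 1.0 / np.sqrt(w))
    return herm(Ah @ sym_fun(Aih @ B @ Aih, lambda w: w ** t) @ Ah)

def karcher_residual(X, mats, weights):
    """Left-hand side of the Karcher equation at X."""
    Xih = sym_fun(X, lambda w: 1.0 / np.sqrt(w))
    return sum(w * sym_fun(Xih @ A @ Xih, np.log) for w, A in zip(weights, mats))

def karcher_mean(mats, weights, step=0.5, tol=1e-12, max_iter=1000):
    """Damped gradient iteration X <- X^{1/2} exp(step * S) X^{1/2}, S = residual."""
    X = sum(w * A for w, A in zip(weights, mats))   # start at the arithmetic mean
    for _ in range(max_iter):
        S = karcher_residual(X, mats, weights)
        if np.linalg.norm(S) < tol:
            break
        Xh = sym_fun(X, np.sqrt)
        X = herm(Xh @ sym_fun(step * S, np.exp) @ Xh)
    return X

# Sample data (illustrative values only).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 3.0]])
C = np.array([[3.0, -1.0], [-1.0, 1.0]])

G3 = karcher_mean([A, B, C], [0.5, 0.3, 0.2])
G2 = karcher_mean([A, B], [0.7, 0.3])            # should agree with A #_{0.3} B
print("Karcher residual norm:", np.linalg.norm(karcher_residual(G3, [A, B, C], [0.5, 0.3, 0.2])))
print("two-point check:", np.linalg.norm(G2 - geo_mean(A, B, 0.3)))
```

For two matrices the iteration reproduces the weighted geometric mean, which gives a simple sanity check before using it on larger supports.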
We now introduce a class of flows induced by the weighted geometric mean map on the probability measure space .
Definition 1.1.
For , and , define by
[TABLE]
i.e., the push-forward of by the homeomorphic map defined by , where we use the notation given in (1.1) without restricting to (indeed, the expression in (1.1) is meaningful for all ). We also define the Cartan barycentric trajectory of (1.3) by
[TABLE]
The one-parameter family provides a flow on and is also considered as a -valued Markov process (see Theorem 2.4 and Remark 2.5 for more details).
The main result of the paper is the following:
Theorem 1.2.
Let and . Then the map defined by (1.4) is locally Lipschitz continuous on and differentiable at with
[TABLE]
The proof of the theorem will be presented in Section 3. The theorem has an important consequence on the Lie-Trotter formula for the Cartan barycenter, as shown in the rest of this introductory section.
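The differentiability statement can be probed numerically for a finitely supported measure $\mu=\sum_jw_j\delta_{A_j}$, for which the push-forward by $X\mapsto A\#_tX$ is $\sum_jw_j\delta_{A\#_tA_j}$. The sketch below compares a central difference quotient of the trajectory at $t=0$ with the expression $A^{1/2}\bigl(\sum_jw_j\log(A^{-1/2}A_jA^{-1/2})\bigr)A^{1/2}$. That closed form is an assumption of this example: it is what one obtains by combining the case $A=I$ (derivative $\int\log X\,d\mu(X)$) with the congruence invariance of the Cartan barycenter. All function names, the iteration used for the barycenter and the sample data are hypothetical.

```python
import numpy as np

def herm(X):
    return (X + X.conj().T) / 2

def sym_fun(X, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(herm(X))
    return (V * f(w)) @ V.conj().T

def geo_mean(A, B, t):
    """A #_t B, meaningful for every real t."""
    Ah = sym_fun(A, np.sqrt)
    Aih = sym_fun(A, lambda w: 1.0 / np.sqrt(w))
    return herm(Ah @ sym_fun(Aih @ B @ Aih, lambda w: w ** t) @ Ah)

def karcher_mean(mats, weights, step=0.5, tol=1e-12, max_iter=1000):
    """Damped gradient iteration for the Karcher equation (illustrative scheme)."""
    X = sum(w * A for w, A in zip(weights, mats))
    for _ in range(max_iter):
        Xih = sym_fun(X, lambda w: 1.0 / np.sqrt(w))
        S = sum(w * sym_fun(Xih @ A @ Xih, np.log) for w, A in zip(weights, mats))
        if np.linalg.norm(S) < tol:
            break
        Xh = sym_fun(X, np.sqrt)
        X = herm(Xh @ sym_fun(step * S, np.exp) @ Xh)
    return X

# Base point A and a three-point measure (illustrative values only).
A = np.array([[1.5, 0.5], [0.5, 1.0]])
atoms = [np.array([[2.0, 1.0], [1.0, 2.0]]),
         np.array([[1.0, 0.0], [0.0, 3.0]]),
         np.array([[3.0, -1.0], [-1.0, 1.0]])]
weights = [0.5, 0.3, 0.2]

def beta(t):
    """Cartan barycentric trajectory t -> G(A #_t mu) for the discrete measure above."""
    return karcher_mean([geo_mean(A, X, t) for X in atoms], weights)

h = 1e-3
numeric = (beta(h) - beta(-h)) / (2 * h)          # central difference at t = 0

Ah = sym_fun(A, np.sqrt)
Aih = sym_fun(A, lambda w: 1.0 / np.sqrt(w))
closed = Ah @ sum(w * sym_fun(Aih @ X @ Aih, np.log) for w, X in zip(weights, atoms)) @ Ah

err = np.linalg.norm(numeric - closed)
print("difference quotient vs. closed form:", err)
```

A small discrepancy here is consistent with the theorem but of course proves nothing; the check is only meant to make the objects in the statement concrete.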
For general square matrices $X$ and $Y$ the well-known Lie-Trotter formula expresses
$$e^{X+Y}=\lim_{n\to\infty}\bigl(e^{X/n}e^{Y/n}\bigr)^n.$$
The symmetric form with a continuous parameter is also well-known as
[TABLE]
for . This formula has also been known in many other situations; for example, see [10, 8, 7, 1, 3] for and other means. The Lie-Trotter formula for the Cartan mean of a finite number of (or a finitely supported measure ) is
[TABLE]
as given in [6, 11]. In [9], the authors have extended this Lie-Trotter formula for a certain sub-class of in such a way that
[TABLE]
for any satisfying for some . Here, denotes the operator norm of , while any two norms on are equivalent due to finite dimensionality.
The action of -th power on is defined by the push-forward measure of by the matrix -th power on , that is,
[TABLE]
for any Borel set , which is indeed comparable to the case in (1.6) since for . When is the identity matrix , we have with . Theorem 1.2 implies that so that
[TABLE]
Therefore,
[TABLE]
This provides the following extension of the above Lie-Trotter formula to the most general case of .
Corollary 1.3.
The formula (1.7) holds true for every $\mu\in\mathcal{P}^1(\mathbb{P}_m)$.
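For a finitely supported measure the Lie-Trotter formula reads $\lim_{t\to0}G(\mu^t)^{1/t}=\exp\sum_jw_j\log A_j$, where $\mu^t=\sum_jw_j\delta_{A_j^t}$. The following sketch evaluates the left-hand side for a few values of $t$ and reports the distance to the right-hand side. The barycenter is computed with a damped gradient iteration, which is an illustrative numerical choice and not part of the paper; names and sample matrices are hypothetical.

```python
import numpy as np

def herm(X):
    return (X + X.conj().T) / 2

def sym_fun(X, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(herm(X))
    return (V * f(w)) @ V.conj().T

def karcher_mean(mats, weights, step=0.5, tol=1e-12, max_iter=1000):
    """Damped gradient iteration for the Karcher equation (illustrative scheme)."""
    X = sum(w * A for w, A in zip(weights, mats))
    for _ in range(max_iter):
        Xih = sym_fun(X, lambda w: 1.0 / np.sqrt(w))
        S = sum(w * sym_fun(Xih @ A @ Xih, np.log) for w, A in zip(weights, mats))
        if np.linalg.norm(S) < tol:
            break
        Xh = sym_fun(X, np.sqrt)
        X = herm(Xh @ sym_fun(step * S, np.exp) @ Xh)
    return X

# Three-point measure (illustrative values only).
atoms = [np.array([[2.0, 1.0], [1.0, 2.0]]),
         np.array([[1.0, 0.0], [0.0, 3.0]]),
         np.array([[3.0, -1.0], [-1.0, 1.0]])]
weights = [0.5, 0.3, 0.2]

def lie_trotter(t):
    """G(mu^t)^{1/t}: push the atoms forward by X -> X^t, average, then take the 1/t power."""
    powered = [sym_fun(X, lambda w: w ** t) for X in atoms]
    return sym_fun(karcher_mean(powered, weights), lambda w: w ** (1.0 / t))

# Log-Euclidean mean exp(sum_j w_j log A_j), the claimed limit as t -> 0.
limit = sym_fun(sum(w * sym_fun(X, np.log) for w, X in zip(weights, atoms)), np.exp)

errors = {t: np.linalg.norm(lie_trotter(t) - limit) for t in (0.5, 0.1, 0.01)}
for t, e in errors.items():
    print(f"t = {t:5.2f}   ||G(mu^t)^(1/t) - limit|| = {e:.3e}")
```

The value at $t=1$ would be the Cartan mean itself, so the table can be read as an interpolation between the Cartan mean and the log-Euclidean mean.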
It turns out [9, Corollary 4.5] that $|||G(\mu^t)^{1/t}|||$ is increasing as $t\searrow0$ for any unitarily invariant norm $|||\cdot|||$. As a byproduct of Corollary 1.3 we have:
Corollary 1.4.
Let $\mu\in\mathcal{P}^1(\mathbb{P}_m)$. Then for every unitarily invariant norm $|||\cdot|||$ and for every $t>0$,
$$|||G(\mu^t)^{1/t}|||\le\Big|\Big|\Big|\exp\int_{\mathbb{P}_m}\log X\,d\mu(X)\Big|\Big|\Big|,$$
and $|||G(\mu^t)^{1/t}|||$ increases to $|||\exp\int_{\mathbb{P}_m}\log X\,d\mu(X)|||$ as $t\searrow0$.
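The monotonicity in this corollary can be observed on a small example. The sketch below tabulates three unitarily invariant norms (operator, trace and Frobenius) of $G(\mu^t)^{1/t}$ for decreasing $t$ and compares them with the same norms of $\exp\sum_jw_j\log A_j$. The measure, the chosen values of $t$, the function names and the iteration used for the barycenter are assumptions made for this illustration only.

```python
import numpy as np

def herm(X):
    return (X + X.conj().T) / 2

def sym_fun(X, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(herm(X))
    return (V * f(w)) @ V.conj().T

def karcher_mean(mats, weights, step=0.5, tol=1e-12, max_iter=1000):
    """Damped gradient iteration for the Karcher equation (illustrative scheme)."""
    X = sum(w * A for w, A in zip(weights, mats))
    for _ in range(max_iter):
        Xih = sym_fun(X, lambda w: 1.0 / np.sqrt(w))
        S = sum(w * sym_fun(Xih @ A @ Xih, np.log) for w, A in zip(weights, mats))
        if np.linalg.norm(S) < tol:
            break
        Xh = sym_fun(X, np.sqrt)
        X = herm(Xh @ sym_fun(step * S, np.exp) @ Xh)
    return X

def norms(P):
    """Operator, trace (nuclear) and Frobenius norms; all three are unitarily invariant."""
    return np.array([np.linalg.norm(P, 2), np.linalg.norm(P, 'nuc'), np.linalg.norm(P, 'fro')])

# Three-point measure (illustrative values only).
atoms = [np.array([[2.0, 1.0], [1.0, 2.0]]),
         np.array([[1.0, 0.0], [0.0, 3.0]]),
         np.array([[3.0, -1.0], [-1.0, 1.0]])]
weights = [0.5, 0.3, 0.2]

def powered_mean(t):
    """G(mu^t)^{1/t} for the discrete measure above."""
    powered = [sym_fun(X, lambda w: w ** t) for X in atoms]
    return sym_fun(karcher_mean(powered, weights), lambda w: w ** (1.0 / t))

ts = [1.0, 0.5, 0.25, 0.1]                        # decreasing toward 0
table = np.array([norms(powered_mean(t)) for t in ts])
limit_norms = norms(sym_fun(sum(w * sym_fun(X, np.log) for w, X in zip(weights, atoms)), np.exp))

for t, row in zip(ts, table):
    print(f"t = {t:4.2f}  op = {row[0]:.6f}  trace = {row[1]:.6f}  frob = {row[2]:.6f}")
print("limit   ", limit_norms)
```

Each column should be nondecreasing from top to bottom and bounded by the last row, up to rounding.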
2. Geometric mean flows on the probability measure space
Let and . For every , define as in Definition 1.1, that is, is the push-forward of by , .
Lemma 2.1.
We have for every .
Proof.
It is immediate to see that
[TABLE]
When , we have
[TABLE]
[TABLE]
Therefore, by (2.1) we have
[TABLE]
which implies that since for all and
[TABLE]
When , the argument is similar since . ∎
Note that , where is defined in (1.8), and When , since if and only if , i.e., , we see that
[TABLE]
for any Borel set . Moreover, note that if , then .
The -Wasserstein distance on is defined by
[TABLE]
where is the set of all couplings for , i.e., whose marginals are and . Recall (see [21]) that is a complete metric space with the metric and that the set of uniform probability measures with finite support (i.e., the measures of the form ) is dense in . An important fact called the fundamental contraction property in [21] (also [9, Theorem 2.3]) is that the Cartan barycenter is a Lipschitz map with Lipschitz constant ; namely, for every ,
[TABLE]
The next lemma, given in [17, Lemma 2.2] in a more general setting, will be used below.
Lemma 2.2.
Let be a Lipschitz map with Lipschitz constant . Then the push-forward map , , is Lipschitzian with respect to with Lipschitz constant .
Lemma 2.3.
For every and
[TABLE]
Proof.
It is known (see [2]) that
[TABLE]
By the triangle inequality, for every ,
[TABLE]
For , in , it is known (see Introduction of [22]) that
[TABLE]
where is the permutation group on . Therefore, for every we find a so that
[TABLE]
Hence the required inequality holds for all . Since by (2.3), we see by Lemma 2.2 that is Lipschitzian with Lipschitz constant . Since is dense in , the result follows. ∎
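For two uniform measures with the same number $n$ of atoms, the $1$-Wasserstein distance reduces to a minimum over permutations of the averaged pairwise distances, which makes the fundamental contraction property $d(G(\mu),G(\nu))\le d_1^W(\mu,\nu)$ easy to test on small examples. The sketch below does this by brute force for $n=3$. The function names, the sample atoms and the iteration used for the barycenter are hypothetical choices for this illustration, and the brute-force search is only sensible for very small $n$.

```python
import numpy as np
from itertools import permutations

def herm(X):
    return (X + X.conj().T) / 2

def sym_fun(X, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(herm(X))
    return (V * f(w)) @ V.conj().T

def dist(A, B):
    """Trace-metric distance d(A, B)."""
    Aih = sym_fun(A, lambda w: 1.0 / np.sqrt(w))
    lam = np.linalg.eigvalsh(herm(Aih @ B @ Aih))
    return np.sqrt(np.sum(np.log(lam) ** 2))

def karcher_mean(mats, weights, step=0.5, tol=1e-12, max_iter=1000):
    """Damped gradient iteration for the Karcher equation (illustrative scheme)."""
    X = sum(w * A for w, A in zip(weights, mats))
    for _ in range(max_iter):
        Xih = sym_fun(X, lambda w: 1.0 / np.sqrt(w))
        S = sum(w * sym_fun(Xih @ A @ Xih, np.log) for w, A in zip(weights, mats))
        if np.linalg.norm(S) < tol:
            break
        Xh = sym_fun(X, np.sqrt)
        X = herm(Xh @ sym_fun(step * S, np.exp) @ Xh)
    return X

def w1_uniform(mu, nu):
    """1-Wasserstein distance between uniform measures on the atom lists mu and nu."""
    n = len(mu)
    return min(sum(dist(mu[j], nu[s[j]]) for j in range(n)) / n
               for s in permutations(range(n)))

# Atoms of two uniform three-point measures (illustrative values only).
mu = [np.array([[2.0, 1.0], [1.0, 2.0]]),
      np.array([[1.0, 0.0], [0.0, 3.0]]),
      np.array([[3.0, -1.0], [-1.0, 1.0]])]
nu = [np.array([[1.5, 0.5], [0.5, 1.0]]),
      np.array([[1.0, 0.2], [0.2, 2.0]]),
      np.array([[2.5, 0.0], [0.0, 0.8]])]
uniform = [1.0 / 3] * 3

w1 = w1_uniform(mu, nu)
gap = dist(karcher_mean(mu, uniform), karcher_mean(nu, uniform))
print(f"d(G(mu), G(nu)) = {gap:.6f}  <=  d_1^W(mu, nu) = {w1:.6f}")
```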
Theorem 2.4.
For each the map defined by
[TABLE]
is a continuous flow satisfying
[TABLE]
Moreover, for a fixed , the map is locally Lipschitz continuous with respect to , that is, for every there exists a constant such that
[TABLE]
Proof.
It is immediate to see that for every , which yields
[TABLE]
This is nothing but (2.4). Continuity follows from Lemma 2.3.
Let be fixed. Lemma 2.3 shows in particular that for every with . When , since and , we have
[TABLE]
which immediately gives
[TABLE]
Moreover,
[TABLE]
Hence the result holds for .
For any and write and with . Then by (2.5) we can write and with , By the above case with in place of we have
[TABLE]
for some constant . Hence the result follows with . ∎
Remark 2.5.
Theorem 2.4 says that () is a multiplicative -flow on . Modifying as (), we have an additive -flow on starting at () and attracted to (as ). This flow is also considered as a -valued Markov stochastic process (with smooth sample paths) on the probability space .
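At the level of single atoms, the multiplicative flow property comes from the identity $A\#_s(A\#_tX)=A\#_{st}X$, valid for all real $s,t$; for a finitely supported measure the push-forward acts atom by atom, so the same identity holds for the measures. The sketch below checks it numerically, including a negative parameter. Function names and sample matrices are hypothetical.

```python
import numpy as np

def herm(X):
    return (X + X.conj().T) / 2

def sym_fun(X, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(herm(X))
    return (V * f(w)) @ V.conj().T

def geo_mean(A, B, t):
    """A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}, for any real t."""
    Ah = sym_fun(A, np.sqrt)
    Aih = sym_fun(A, lambda w: 1.0 / np.sqrt(w))
    return herm(Ah @ sym_fun(Aih @ B @ Aih, lambda w: w ** t) @ Ah)

def push_forward(A, atoms, t):
    """Atoms of A #_t mu for a finitely supported mu (the weights are unchanged)."""
    return [geo_mean(A, X, t) for X in atoms]

# Base point and atoms (illustrative values only).
A = np.array([[1.5, 0.5], [0.5, 1.0]])
atoms = [np.array([[2.0, 1.0], [1.0, 2.0]]),
         np.array([[1.0, 0.0], [0.0, 3.0]])]

s, t = 1.3, -0.7
two_steps = push_forward(A, push_forward(A, atoms, t), s)   # A #_s (A #_t mu)
one_step = push_forward(A, atoms, s * t)                    # A #_{st} mu
flow_err = max(np.linalg.norm(P - Q) for P, Q in zip(two_steps, one_step))
print("max atom discrepancy:", flow_err)
```

Setting $t=1$ returns the original atoms and $t=0$ collapses every atom to $A$, matching the roles of the identity and of the Dirac mass at $A$ in the flow.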
3. Proof of Theorem 1.2
In the following we fix and For notational simplicity we write for (with ), which is uniquely characterized by the Karcher equation (see (1.2))
[TABLE]
that is,
[TABLE]
We set , where is defined by . (Note that is with in the notation in [13].) Moreover, let
[TABLE]
where the action of -th power on is defined by the push-forward measure of by the matrix -th power on , that is,
Lemma 3.1.
For , and . Then and
[TABLE]
Proof.
Since , we have , where . Therefore,
[TABLE]
Now, we recall (see [13]) that the Cartan barycenter has the invariance property , i.e., . Hence (3.2) follows. ∎
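The invariance property invoked in this proof is the congruence invariance of the Cartan barycenter: for an invertible matrix $M$, the barycenter of the measure transported by $X\mapsto MXM^*$ is $MG(\mu)M^*$. The sketch below checks this for a three-point measure. The matrix $M$, the atoms, the function names and the iterative scheme are assumptions made for this example.

```python
import numpy as np

def herm(X):
    return (X + X.conj().T) / 2

def sym_fun(X, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(herm(X))
    return (V * f(w)) @ V.conj().T

def karcher_mean(mats, weights, step=0.5, tol=1e-12, max_iter=1000):
    """Damped gradient iteration for the Karcher equation (illustrative scheme)."""
    X = sum(w * A for w, A in zip(weights, mats))
    for _ in range(max_iter):
        Xih = sym_fun(X, lambda w: 1.0 / np.sqrt(w))
        S = sum(w * sym_fun(Xih @ A @ Xih, np.log) for w, A in zip(weights, mats))
        if np.linalg.norm(S) < tol:
            break
        Xh = sym_fun(X, np.sqrt)
        X = herm(Xh @ sym_fun(step * S, np.exp) @ Xh)
    return X

# Measure and an invertible, non-unitary congruence (illustrative values only).
atoms = [np.array([[2.0, 1.0], [1.0, 2.0]]),
         np.array([[1.0, 0.0], [0.0, 3.0]]),
         np.array([[3.0, -1.0], [-1.0, 1.0]])]
weights = [0.5, 0.3, 0.2]
M = np.array([[1.0, 0.5], [0.0, 1.0]])

lhs = karcher_mean([M @ X @ M.conj().T for X in atoms], weights)   # G(M mu M^*)
rhs = M @ karcher_mean(atoms, weights) @ M.conj().T                # M G(mu) M^*
inv_err = np.linalg.norm(lhs - rhs)
print("congruence invariance discrepancy:", inv_err)
```

Taking $M=A^{-1/2}$ is what reduces the general base point $A$ to the identity in the argument that follows.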
To prove Theorem 1.2, we may and do assume that from (3.2). In this case, (with ) and (1.5) is simply .
Lemma 3.2.
For any there exists a constant such that for every and every ,
[TABLE]
Proof.
For any , by Lemma 2.3 we have
[TABLE]
Applying this to the fundamental contraction property (2.2) and using the exponential metric increasing property (EMI) (see [2, Theorem 6.1.4])
[TABLE]
we have
[TABLE]
In particular, for all . For any and we find that
[TABLE]
where . ∎
Lemma 3.3.
There exists a constant such that
[TABLE]
for every and every .
Proof.
From , we may assume that . Since
[TABLE]
we have
[TABLE]
Let and . Then we have
[TABLE]
Note here that , , and are all uniformly bounded for by Lemma 3.2. Combining the above estimates together with (2.1), we find a constant such that
[TABLE]
∎
Proof of Theorem 1.2. For let . We will prove that converges as . Since is bounded by Lemma 3.2, it suffices to prove that a limit point of as is unique. Note that for each
[TABLE]
Now, assume that for a sequence with , so that
[TABLE]
as . By (3.1) we have
[TABLE]
Thanks to Lemma 3.3, the Lebesgue convergence theorem can be applied to (3.3) so that we obtain
[TABLE]
Therefore, is a unique limit point of as . This means that is differentiable at with the derivative . Since , we find that is differentiable at and
[TABLE]
which is the desired conclusion (as we assumed that ). ∎
Theorem 3.4.
Let and let be as in Theorem 1.2. Then the following are equivalent:
- (i)
;
- (ii)
;
- (iii)
for all (equivalently, for some) ;
- (iv)
for all (equivalently, for some) .
Proof.
For every , the Karcher equation (1.2) is equivalent to thanks to (1.5). Hence we have (i)(ii). Moreover, we note that
[TABLE]
Therefore, it immediately follows that (ii)–(iv) are equivalent. ∎
Corollary 3.5.
Let and let Then is differentiable at with
When , i.e., has finite second moment, the equivalence of (ii) and (iii) of Theorem 3.4 was shown in [13, Theorem 3.1]. The Karcher equation or equivalently has played a crucial role in the Riemannian geometric approach to multivariate geometric means as in [20, 18, 16], which has been extended to the Cartan barycenter in [12, 13, 9]. For a finitely supported measure , the fixed point Cartan mean equation appeared in [18] and [16]. The formula (1.5) is evidently new and deserves attention because of its relation to the Karcher equation.
4. Final remarks and open problems
(1) In the present paper, we first prove the differentiability of the Cartan barycentric trajectory at and then use it to prove the Lie-Trotter formula for . One can also proceed in the opposite way. Indeed, we have a direct proof of the Lie-Trotter formula in Corollary 1.3, which in turn shows Theorem 1.2 immediately. It is worth noting that the Lebesgue convergence theorem is essential in our direct proof of (1.7) for , as it is in the proof of Theorem 1.2 in Section 3.
(2) We are also interested in the extension of Theorem 1.2 to any , that is, in the differentiability problem of and, in this case, in the form of the derivative . It does not seem possible to generalize the above proof for to the case for at . However, under the stronger assumption that with some , we can prove the differentiability of for , though the expression of is much more complicated.
(3) Given , it is well-known (see [15]) that the (Euclidean) gradient of the function $\psi(X):=\frac{1}{2}\int_{\mathbb{P}_m}\bigl[d^2(X,A)-d^2(Y,A)\bigr]\,d\mu(A)$ at is
[TABLE]
and the Riemannian gradient of at is . Hence the Riemannian gradient flow on is introduced as the solution of the Cauchy problem
[TABLE]
with initial value . In [19], Lim and Pálfia have discussed this gradient flow (called an ODE flow there) and obtained its description by using the resolvent operator defined by
[TABLE]
for and . Note that is the Cartan barycentric trajectory of the arithmetic mean flow on . When , from the arithmetic-geometric mean inequality , we can see that in the partial order on considered in [13, 9]. By the monotonicity property of the Cartan barycenter (see [9, Theorem 3.2]) we have for . It might be interesting to find more relations of the trajectory with and the gradient flow.
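The arithmetic-geometric mean inequality used in this remark states that $A\#_tX\le(1-t)A+tX$ in the Loewner order for $0\le t\le1$. The sketch below verifies it on a sample pair by checking that the smallest eigenvalue of the difference is nonnegative over a grid of $t$. Function names, the grid and the sample matrices are hypothetical.

```python
import numpy as np

def herm(X):
    return (X + X.conj().T) / 2

def sym_fun(X, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(herm(X))
    return (V * f(w)) @ V.conj().T

def geo_mean(A, B, t):
    """Weighted geometric mean A #_t B."""
    Ah = sym_fun(A, np.sqrt)
    Aih = sym_fun(A, lambda w: 1.0 / np.sqrt(w))
    return herm(Ah @ sym_fun(Aih @ B @ Aih, lambda w: w ** t) @ Ah)

# Sample pair (illustrative values only).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
X = np.array([[1.0, 0.0], [0.0, 3.0]])

# Smallest eigenvalue of (1 - t) A + t X - A #_t X on a grid of t in [0, 1].
grid = np.linspace(0.0, 1.0, 11)
min_eigs = [np.linalg.eigvalsh((1 - t) * A + t * X - geo_mean(A, X, t)).min() for t in grid]
print("smallest eigenvalue of the gap over the grid:", min(min_eigs))
```

The gap vanishes at the endpoints $t=0$ and $t=1$, where both means reduce to $A$ and $X$ respectively.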
Acknowledgments
The work of F. Hiai was supported by Grant-in-Aid for Scientific Research (C) 17K05266. The work of Y. Lim was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST), Nos. 2015R1A3A2031159 and 2016R1A5A1008055.
- [1] E. Ahn, S. Kim and Y. Lim, An extended Lie-Trotter formula and its applications, Linear Algebra Appl. 427 (2007), 190–196.
- [2] R. Bhatia, Positive definite matrices, Princeton Series in Applied Mathematics, Princeton University Press, Princeton, NJ, 2007.
- [3] R. Bhatia and P. Grover, Norm inequalities related to the matrix geometric mean, Linear Algebra Appl. 437 (2012), 726–733.
- [4] R. Bhatia and J. Holbrook, Riemannian geometry and matrix geometric means, Linear Algebra Appl. 413 (2006), 594–618.
- [5] R. Bhatia and R. Karandikar, Monotonicity of the matrix geometric mean, Math. Ann. 353 (2012), 1453–1467.
- [6] J. I. Fujii, M. Fujii and Y. Seo, The Golden-Thompson-Segal type inequalities related to the weighted geometric mean due to Lawson-Lim, J. Math. Inequal. 3 (2009), 511–518.
- [7] T. Furuta, Convergence of logarithmic trace inequalities via generalized Lie-Trotter formulae, Linear Algebra Appl. 396 (2005), 353–372.
- [8] F. Hiai, Log-majorizations and norm inequalities for exponential operators, in Linear Operators, J. Janas, F. H. Szafraniec and J. Zemánek (eds.), Banach Center Publications, Vol. 38, 1997, pp. 119–181.
