Warped metrics for location-scale models
Salem Said, Yannick Berthoumieu

Laboratoire IMS (CNRS - UMR 5218), Université de Bordeaux
Email: salem.said;[email protected]
Abstract
This paper argues that a class of Riemannian metrics, called warped metrics, plays a fundamental role in statistical problems involving location-scale models. The paper reports three new results : i) the Rao-Fisher metric of any location-scale model is a warped metric, provided that this model satisfies a natural invariance condition, ii) the analytic expression of the sectional curvature of this metric, iii) the exact analytic solution of the geodesic equation of this metric. The paper applies these new results to several examples of interest, where it shows that warped metrics turn location-scale models into complete Riemannian manifolds of negative sectional curvature. This is a very suitable situation for developing algorithms which solve problems of classification and on-line estimation. Thus, by revealing the connection between warped metrics and location-scale models, the present paper paves the way to the introduction of new efficient statistical algorithms.
Keywords: Rao-Fisher metric, warped metric, location-scale model, sectional curvature, geodesic equation
1 Introduction : definition and two examples
This paper argues that a class of Riemannian metrics, called warped metrics, is natural and useful to statistical problems involving location-scale models. A warped metric is defined as follows [1]. Let $M$ be a Riemannian manifold with Riemannian metric $Q$. Consider the manifold $\mathcal{M} = M \times (0,\infty)$, equipped with the Riemannian metric,
$$ds^2(z) = \alpha^2(\sigma)\,d\sigma^2 + \beta^2(\sigma)\,Q(dx) \qquad (1)$$
where each $z \in \mathcal{M}$ is a couple $(x,\sigma)$ with $x \in M$ and $\sigma \in (0,\infty)$. The Riemannian metric (1) is called a warped metric on $\mathcal{M}$. The functions $\alpha$ and $\beta$ have strictly positive values and are part of the definition of this metric.
The main claim of this paper is that warped metrics arise naturally as Rao-Fisher metrics for a variety of location-scale models. Here, to begin, two examples of this claim are given. Example 1 is classic, while Example 2, to our knowledge, is new in the literature. As of now, the reader is advised to think of $\mathcal{M}$ as a statistical manifold, where $x$ is a location parameter and $\sigma$ is either a scale parameter or a concentration parameter.
Example 1 (univariate normal model) : let $M = \mathbb{R}$, with $Q(dx) = dx^2$ the canonical metric of $\mathbb{R}$. If each $z = (x,\sigma)$ in $\mathcal{M}$ is identified with the univariate normal density of mean $x$ and standard deviation $\sigma$, then the resulting Rao-Fisher metric on $\mathcal{M}$ is given by [2]
$$ds^2(z) = \sigma^{-2}\,dx^2 + 2\,\sigma^{-2}\,d\sigma^2 \qquad (2)$$
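The Rao-Fisher metric of Example 1 can be checked numerically. The following is a minimal sketch, assuming the standard fact that the Fisher information of the univariate normal model, in the coordinates (mean x, standard deviation sigma), is diagonal with entries 1/sigma^2 and 2/sigma^2. The function names are illustrative and not taken from the paper.

```python
import math

def normal_expectation(f, half_width=12.0, steps=4000):
    """E[f(T)] for T ~ N(0, 1), by the trapezoidal rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        t = -half_width + k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * f(t) * math.exp(-0.5 * t * t)
    return total * h / math.sqrt(2.0 * math.pi)

def fisher_information_normal(x, sigma):
    """Fisher information matrix of N(x, sigma^2) in the coordinates (x, sigma).

    Writing y = x + sigma * t with t ~ N(0, 1), the score components are
        d/dx     log p = t / sigma
        d/dsigma log p = (t^2 - 1) / sigma
    so the result does not depend on x (translation invariance).
    """
    score_x = lambda t: t / sigma
    score_s = lambda t: (t * t - 1.0) / sigma
    g_xx = normal_expectation(lambda t: score_x(t) ** 2)
    g_xs = normal_expectation(lambda t: score_x(t) * score_s(t))
    g_ss = normal_expectation(lambda t: score_s(t) ** 2)
    return [[g_xx, g_xs], [g_xs, g_ss]]
```

The vanishing off-diagonal term is what makes the metric split into a "location" part and a "scale" part, which is the defining feature of a warped metric.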
Example 2 (von Mises-Fisher model) : let , the unit sphere with its canonical metric induced from . Identify in with the von Mises-Fisher density of mean direction and concentration parameter [3]. The resulting Rao-Fisher metric on is given by
[TABLE]
Remark a : note that is a scale parameter in Example 1, but a concentration parameter in Example 2. Accordingly, at , the metric (2) becomes infinite, while the metric (3) remains finite and degenerates to . Thus, (3) gives a Riemannian metric on the larger Riemannian manifold , which contains , obtained by considering as a radial coordinate and as the origin of .
2 A general theorem : from Rao-Fisher to warped metrics
Examples 1 and 2 of the previous section are special cases of Theorem 1, given here. To state this theorem, let be an irreducible Riemannian homogeneous space, under the action of a group of isometries [4]. Denote by the action of on . Then, assume each in can be identified uniquely and regularly with a probability density on , with respect to the Riemannian volume element, such that the following property is verified,
[TABLE]
The densities form a statistical model on , where is a location parameter and can be chosen as either a scale or a concentration parameter, (roughly, a scale parameter is the inverse of a concentration parameter).
In the statement of Theorem 1, and denotes the Riemannian gradient vector field of , with respect to . Moreover, denotes the length of this vector field, as measured by the metric .
Theorem 1 (warped metrics)
The Rao-Fisher metric of the statistical model is a warped metric of the form (1), defined by
[TABLE]
where denotes expectation with respect to . Due to property (4), the two expectations appearing in (5) do not depend on the parameter , so and are well-defined functions of .
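As an illustration of the two expectations in Theorem 1, the following sketch evaluates them for the von Mises-Fisher model on the two-dimensional sphere. It rests on assumptions that should be read as such: that (5) takes the form alpha^2 = E[(d/dsigma log p)^2] and beta^2 = E[|grad_x log p|^2] / dim M, that the sphere is S^2 (so dim M = 2), and that the density is proportional to exp(kappa <x, y>). Under these assumptions, t = <x, y> has density proportional to exp(kappa t) on [-1, 1]. All function names are hypothetical.

```python
import math

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule (steps must be even)."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def vmf_s2_expectation(f, kappa):
    """E[f(T)] where T = <x, y> and y follows a von Mises-Fisher law on S^2."""
    weight = lambda t: math.exp(kappa * (t - 1.0))  # shifted to avoid overflow
    z = simpson(weight, -1.0, 1.0)
    return simpson(lambda t: f(t) * weight(t), -1.0, 1.0) / z

def warped_coefficients_vmf_s2(kappa):
    """Return (alpha^2, beta^2) for the assumed form of Theorem 1 on S^2."""
    mean_t = vmf_s2_expectation(lambda t: t, kappa)
    # d/dkappa log p = t - E[t], so alpha^2 is the variance of t.
    alpha2 = vmf_s2_expectation(lambda t: (t - mean_t) ** 2, kappa)
    # The gradient in x on the sphere is kappa * (y - <x, y> x),
    # whose squared length is kappa^2 * (1 - t^2); divide by dim M = 2.
    beta2 = kappa ** 2 * vmf_s2_expectation(lambda t: 1.0 - t * t, kappa) / 2.0
    return alpha2, beta2
```

For S^2 these expectations have the closed forms alpha^2 = 1/kappa^2 - 1/sinh^2(kappa) and beta^2 = kappa coth(kappa) - 1, which the numerical values can be compared against.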
Remark b : the proof of Theorem 1 cannot be given here, due to lack of space. It relies strongly on the assumption that the Riemannian homogeneous space is irreducible. In particular, this allows the application of Schur’s lemma, from the theory of group representations [5]. To say that is an irreducible Riemannian homogeneous space means that the following property is verified : if is the stabiliser in of , then the isotropy representation is an irreducible representation of in the tangent space .
Remark c : if the assumption that is irreducible is relaxed, then Theorem 1 generalises to a similar statement, involving so-called multiply warped metrics. Roughly, this is because a homogeneous space which is not irreducible, may still decompose into a direct product of irreducible homogeneous spaces [4].
Remark d : statistical models on which verify (4) often arise under an exponential form,
[TABLE]
where is a natural parameter, and is the cumulant generating function of the statistic . Then, for assumption (4) to hold, it is necessary and sufficient that
[TABLE]
Both examples 1 and 2 are of the form (6), as is Example 3, in the following section, which deals with the Riemannian Gaussian model [6][7].
3 Curvature equations and the extrinsic geometry of
For each , there is an embedding of into , as the surface . This embedding yields an extrinsic geometry of , given by the first and second fundamental forms [8].
The first fundamental form is the restriction of the metric of to the tangent bundle of . This will be denoted for . It is clear from (1) that
[TABLE]
This extrinsic Riemannian metric on is a scaled version of its intrinsic metric . It induces an extrinsic Riemannian distance given by
[TABLE]
where is the intrinsic Riemannian distance, induced by the metric .
The extrinsic distance (9) is a generalisation of the famous Mahalanobis distance. In fact, replacing in Example 1 yields the classical expression of the Mahalanobis distance . The significance of this distance can be visualised as follows : if is a dispersion parameter, the extrinsic distance between two otherwise fixed points will decrease as increases, as if the space were contracting, (for a concentration parameter, there is an expansion, rather than a contraction).
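The scaling behaviour just described can be made concrete with a short sketch. It assumes that the extrinsic distance is the intrinsic distance multiplied by the warping function beta(sigma), which follows from the extrinsic metric being the intrinsic one scaled by beta^2, and that beta(sigma) = 1/sigma in Example 1. The helper names are illustrative only.

```python
def extrinsic_distance(intrinsic_distance, beta, sigma):
    """Distance at level sigma: the intrinsic distance rescaled by beta(sigma)."""
    return beta(sigma) * intrinsic_distance

def mahalanobis_1d(x0, x1, sigma):
    """Example 1: M is the real line and beta(sigma) = 1 / sigma."""
    return extrinsic_distance(abs(x1 - x0), lambda s: 1.0 / s, sigma)
```

With two fixed means, increasing sigma makes `mahalanobis_1d` smaller, which is the contraction effect mentioned for a dispersion parameter.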
The second fundamental form is given by the tangent component of the covariant derivative of the unit normal to the surface . This unit normal is where is the vertical distance coordinate, given by . Using Koszul’s formula [9], it is possible to express the second fundamental form,
[TABLE]
for any tangent to . Knowledge of the second fundamental form is valuable, as it yields the relationship between extrinsic and intrinsic curvatures of .
Proposition 1 (curvature equations)
Let and denote the sectional curvatures of and . The following are true
[TABLE]
[TABLE]
for any linearly independent tangent to .
Remark e : here, Equation (11) is the Gauss curvature equation. Roughly, it shows that embedding into adds negative curvature. Equation (12) is the mixed curvature equation. If the intrinsic sectional curvature is negative, then (11) and (12) show that the sectional curvature of is negative if and only if is a convex function of the vertical distance .
Return to example 1 : here, is one-dimensional, so the Gauss equation (11) does not provide any information. The mixed curvature equation gives the curvature of the two-dimensional manifold . In this equation, , and it follows that
[TABLE]
so has constant negative curvature. In fact, it was observed long ago that the metric (2) is essentially the Poincaré half-plane metric [2].
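A quick numerical check of this constant curvature is sketched here. It assumes the mixed curvature equation has the standard warped-product form K = -(1/beta) d^2 beta / dr^2, with r the vertical distance. For the univariate normal model, with metric coefficients alpha = sqrt(2)/sigma and beta = 1/sigma, one gets r = sqrt(2) log(sigma) and beta(r) = exp(-r / sqrt(2)). The function names are hypothetical.

```python
import math

def beta_of_r(r):
    """Warping function of Example 1, expressed in the vertical distance r."""
    return math.exp(-r / math.sqrt(2.0))

def mixed_curvature(beta, r, h=1e-3):
    """Approximate -(1/beta) * d^2 beta / dr^2 with a central difference."""
    second = (beta(r + h) - 2.0 * beta(r) + beta(r - h)) / (h * h)
    return -second / beta(r)
```

Evaluating `mixed_curvature(beta_of_r, r)` at different values of r gives the same negative value, -1/2, in agreement with the well-known curvature of the Fisher metric of the normal model.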
Return to example 2 : in Example 2, so is constant. It follows from the Gauss equation that each sphere has constant extrinsic curvature, equal to
[TABLE]
Upon replacing the expressions of and based on (3), this is found to be strictly negative for ,
[TABLE]
Thus, the Rao-Fisher metric (3) induces a negative extrinsic curvature on each spherical surface . In fact, by studying the mixed curvature equation (12), it is seen that the whole manifold equipped with the Rao-Fisher metric (3) is a manifold of negative sectional curvature.
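The sign of this extrinsic curvature can be explored numerically in the simplest case. The sketch below assumes the sphere is S^2 with intrinsic curvature 1, that the Gauss equation has the standard warped-product form K = (K_M - (d beta / dr)^2) / beta^2, and that the coefficients are alpha^2 = 1/kappa^2 - 1/sinh^2(kappa) and beta^2 = kappa coth(kappa) - 1, where kappa is the concentration parameter. These closed forms are assumptions of the sketch, not quotations from (3).

```python
import math

def vmf_s2_extrinsic_curvature(kappa):
    """Extrinsic sectional curvature of the sphere at concentration kappa > 0."""
    alpha2 = 1.0 / kappa ** 2 - 1.0 / math.sinh(kappa) ** 2
    beta2 = kappa / math.tanh(kappa) - 1.0
    # Derivative of beta^2 with respect to kappa.
    dbeta2 = 1.0 / math.tanh(kappa) - kappa / math.sinh(kappa) ** 2
    # d beta / dr = (d beta / d kappa) / alpha, since dr = alpha d kappa.
    dbeta_dr_sq = dbeta2 ** 2 / (4.0 * beta2 * alpha2)
    return (1.0 - dbeta_dr_sq) / beta2
```

In this sketch the curvature is close to zero for small kappa and approaches -1/4 for large kappa. Very small values of kappa should be avoided, since the closed forms suffer from cancellation there.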
Example 3 (Riemannian Gaussian model) : a Riemannian Gaussian distribution may be defined on any Riemannian symmetric space of non-positive curvature. It is given by the probability density with respect to Riemannian volume
[TABLE]
where the normalising constant admits a general expression, which was given in [7]. If is an irreducible Riemannian symmetric space, then Theorem 1 above applies to the Riemannian Gaussian model (16), leading to a warped metric with
[TABLE]
where and . The result of equation (17) is here published for the first time. Consider now the special case where is the hyperbolic plane. The analytic expression of and can be found from (17) using
[TABLE]
which was derived in [6]. Here, denotes the error function. Then, replacing (17) in the curvature equations (11) and (12) yields the same result as for Example 2 : the manifold equipped with the Rao-Fisher metric (17) is a manifold of negative sectional curvature.
Remark f (a conjecture) : based on the three examples just considered, it seems reasonable to conjecture that warped metrics arising from Theorem 1 will always lead to manifolds of negative sectional curvature.
4 Solution of the geodesic equation : conservation laws
If the assumptions of Theorem 1 are slightly strengthened, then an analytic solution of the geodesic equation of the Riemannian metric (1) on can be obtained, by virtue of the existence of a sufficient number of conservation laws. To state this precisely, let and denote respectively the scalar products defined by the metrics and .
Two kinds of conservation laws hold along any affinely parameterised geodesic curve in , with respect to the metric . These are conservation of energy and conservation of moments [10]. If the geodesic is expressed as a couple where and , then the energy of this geodesic is
[TABLE]
where the dot denotes differentiation with respect to , and the Riemannian length of as measured by the metric .
On the other hand, if is any element of the Lie algebra of the group of isometries acting on , the corresponding moment of the geodesic is
[TABLE]
where is the vector field on given by . The equation of the geodesic is given as follows.
Proposition 2 (conservation laws and geodesics)
For any geodesic , its energy and its moment for any are conserved quantities, remaining constant along this geodesic. If is an irreducible Riemannian symmetric space, the equation of the geodesic is the following,
[TABLE]
[TABLE]
where denotes the Riemannian exponential mapping of the metric on , and is the function , with .
Remark g : under the assumption that is an irreducible Riemannian symmetric space, the second part of Proposition 2, stating the equations of and is a corollary of the first part, stating the conservation of energy and moment. The proof, as usual not given due to lack of space, relies on a technique of lifting the geodesic equation to the Lie algebra of the group of isometries .
Remark h : here, Equation (21) states that describes a geodesic curve in the space , with respect to the metric , at a variable speed equal to . Equation (22) states that describes the one-dimensional motion of a particle of energy and mass , in a potential field .
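The two conservation laws can be observed on Example 1 by integrating the geodesic equation numerically. This is a sketch under stated assumptions: the metric is (dx^2 + 2 dsigma^2) / sigma^2, the energy is taken as the squared speed (any constant factor is a convention), the moment is the one attached to translations in x, and a classical fourth-order Runge-Kutta scheme stands in for the analytic solution of Proposition 2.

```python
def geodesic_rhs(state):
    """Geodesic equation of (dx^2 + 2 dsigma^2) / sigma^2 as a first-order system."""
    x, s, vx, vs = state
    ax = 2.0 * vx * vs / s
    a_s = (2.0 * vs * vs - vx * vx) / (2.0 * s)
    return (vx, vs, ax, a_s)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, scale):
        return tuple(b + scale * k for b, k in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        y + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for y, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def energy(state):
    """Squared speed of the curve, measured by the metric."""
    _, s, vx, vs = state
    return (vx * vx + 2.0 * vs * vs) / (s * s)

def moment(state):
    """Conserved quantity attached to translations in x: beta^2(sigma) * dx/dt."""
    _, s, vx, _ = state
    return vx / (s * s)

def integrate(state, t_end, steps):
    """Integrate the geodesic from t = 0 to t = t_end."""
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state
```

Starting from `(x, sigma, dx/dt, dsigma/dt) = (0.0, 1.0, 1.0, 0.5)`, both `energy` and `moment` stay constant along the computed curve up to integration error, and sigma remains positive.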
Remark i (completeness of ) : from Equation (22) it is possible to see that any geodesic in is defined for all , if and only if the following conditions are verified
[TABLE]
where the missing integration bounds are arbitrary. The first condition ensures that may not escape to within a finite time, while the second condition ensures the same for . The two conditions (23), taken together, are necessary and sufficient for to be a complete Riemannian manifold.
Return to Example 2 : for the von Mises-Fisher model of Example 2, the second condition in (23) is verified, but not the first. Therefore, a geodesic in may escape to within a finite time. However, is also a geodesic in the larger manifold , which contains as its origin. If arrives at at some finite time, it will just go through this point and immediately return to . In fact, is a complete Riemannian manifold which has as an isometrically embedded submanifold.
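The contrast between Examples 1 and 2 can be illustrated by computing vertical distances numerically. This sketch assumes that the two conditions in (23) amount to the divergence of the integral of alpha near 0 and near infinity. It uses alpha = sqrt(2)/sigma for the normal model, and the S^2 closed form alpha^2 = 1/kappa^2 - 1/sinh^2(kappa) for the von Mises-Fisher model. Both the reading of (23) and the function names are assumptions of this sketch.

```python
import math

def alpha_normal(s):
    """Example 1: alpha(sigma) = sqrt(2) / sigma."""
    return math.sqrt(2.0) / s

def alpha_vmf_s2(k):
    """Example 2 on S^2, with guards for very small and very large kappa."""
    if k < 1e-3:
        return math.sqrt(1.0 / 3.0 - k * k / 15.0)  # series near kappa = 0
    if k > 50.0:
        return 1.0 / k  # 1 / sinh^2 is negligible here
    return math.sqrt(1.0 / k ** 2 - 1.0 / math.sinh(k) ** 2)

def vertical_distance(alpha, a, b, steps=20000):
    """Integral of alpha from a to b, by the midpoint rule in u = log(sigma)."""
    lo, hi = math.log(a), math.log(b)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        s = math.exp(lo + (k + 0.5) * h)
        total += alpha(s) * s
    return total * h
```

For the normal model, `vertical_distance(alpha_normal, eps, 1.0)` grows like sqrt(2) log(1/eps) as eps tends to 0, so the boundary sigma = 0 is infinitely far away. For the von Mises-Fisher model the same quantity stays bounded, so kappa = 0 is reached in finite length, while the distance to large kappa still grows without bound.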
5 The road to applications: classification and estimation
The theoretical results of the previous sections have established that warped metrics are natural statistical objects arising in connection with location-scale models which are invariant under some group action. Precisely, Theorem 1 states that warped metrics appear as Rao-Fisher metrics for all location-scale models which verify the group invariance condition (4).
Analytical knowledge of the Rao-Fisher metric of a statistical model is potentially useful to many applications, in particular to problems of classification and efficient on-line estimation. However, in order for such applications to be realised, it is necessary for the Rao-Fisher metric to be well-behaved. Propositions 1 and 2 above seem to indicate such good behaviour for warped metrics on location-scale models.
Indeed, as conjectured in Remark f, the curvature equations of Proposition 1 would indicate that the sectional curvature of these warped metrics is always negative. Then, if the conditions for completeness, given in Remark i based on Proposition 2, are verified, the location-scale models equipped with these warped metrics appear as complete Riemannian manifolds of negative curvature. This is a favourable scenario (which at least holds for the von Mises-Fisher model of Example 2), under which many algorithms can be implemented.
For classification problems, it becomes straightforward to find the analytic expression of Rao’s Riemannian distance, and to compute Riemannian centres of mass, whose existence and uniqueness will be guaranteed. These form the building blocks of many classification methodologies.
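For the univariate normal model of Example 1, Rao's distance has a well-known closed form, since the Rao-Fisher metric is a rescaled Poincaré half-plane metric. The following sketch uses that closed form in a simple nearest-prototype classifier. The classifier is only an illustration of how the distance might be used, not a method described in the paper, and the function names are hypothetical.

```python
import math

def rao_distance_normal(x1, s1, x2, s2):
    """Rao distance between N(x1, s1^2) and N(x2, s2^2).

    Uses the Fisher metric (dx^2 + 2 dsigma^2) / sigma^2, which is twice the
    Poincare half-plane metric in the coordinates (x / sqrt(2), sigma).
    """
    num = (x1 - x2) ** 2 + 2.0 * (s1 - s2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (4.0 * s1 * s2))

def classify(point, prototypes):
    """Index of the prototype (x, sigma) closest to point in Rao distance."""
    distances = [rao_distance_normal(point[0], point[1], px, ps)
                 for px, ps in prototypes]
    return distances.index(min(distances))
```

When the two means coincide, the distance reduces to sqrt(2) |log(s2 / s1)|, the length of the vertical geodesic joining the two points.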
For efficient on-line estimation, Amari’s natural gradient algorithm turns out to be identical to the stochastic Riemannian gradient algorithm, defined using the Rao-Fisher metric. Then, analytical knowledge of the Rao-Fisher metric, (which is here a warped metric), and of its completeness and curvature properties, yields an elegant formulation of the natural gradient algorithm, and a geometrical means of proving its efficiency and understanding its convergence properties.
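A minimal sketch of such an on-line scheme for the univariate normal model is given here, under several assumptions that are not taken from the paper. The inverse Fisher matrix is diag(sigma^2, sigma^2 / 2). The step size is 1 / (t + 10), where the offset 10 is an arbitrary choice to damp the first steps. The update of sigma is multiplicative, which keeps sigma positive and follows the vertical geodesic, but is only a retraction and not the full Riemannian exponential map.

```python
import math
import random

def natural_gradient_step(x, sigma, y, step):
    """One natural gradient ascent step on log p(y | x, sigma)."""
    # Euclidean score of the normal log-density.
    score_x = (y - x) / sigma ** 2
    score_s = ((y - x) ** 2 - sigma ** 2) / sigma ** 3
    # Multiply by the inverse Fisher matrix diag(sigma^2, sigma^2 / 2).
    nat_x = sigma ** 2 * score_x
    nat_s = 0.5 * sigma ** 2 * score_s
    new_x = x + step * nat_x
    # Multiplicative update: sigma stays positive (an assumed retraction).
    new_sigma = sigma * math.exp(step * nat_s / sigma)
    return new_x, new_sigma

def online_estimate(samples, x0, sigma0, offset=10.0):
    """Run the recursion over the samples, with step size 1 / (t + offset)."""
    x, sigma = x0, sigma0
    for t, y in enumerate(samples, start=1):
        x, sigma = natural_gradient_step(x, sigma, y, 1.0 / (t + offset))
    return x, sigma
```

A possible usage, with simulated data:

```python
rng = random.Random(0)
samples = [rng.gauss(2.0, 0.5) for _ in range(20000)]
x_hat, sigma_hat = online_estimate(samples, 0.0, 1.0)
```

Note that the natural gradient in x is simply y - x, so the location update behaves like a running average, independently of the current value of sigma.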
References
[1] Petersen, P.: Riemannian geometry (2nd edition). Springer (2006)
[2] Atkinson, C., Mitchell, A.: Rao’s distance measure. Sankhya Ser. A 43 (1981) 345–365
[3] Mardia, K.V., Jupp, P.E.: Directional statistics. John Wiley & Sons Ltd. (2000)
[4] Kobayashi, S., Nomizu, K.: Foundations of differential geometry, Volume II. John Wiley & Sons, Inc. (1969)
[5] Chevalley, C.: Theory of Lie groups, Volume I. Princeton University Press (1946)
[6] Said, S., Bombrun, L., Berthoumieu, Y., Manton, J.H.: Riemannian Gaussian distributions on the space of symmetric positive definite matrices (accepted). IEEE Trans. Inf. Theory (2016)
[7] Said, S., Hajri, H., Bombrun, L., Vemuri, B.C.: Gaussian distributions on Riemannian symmetric spaces : statistical learning with structured covariance matrices (under review). IEEE Trans. Inf. Theory (2017)
[8] Do Carmo, M.P.: Riemannian geometry (1st edition). Birkhäuser (1992)
