Is affine invariance well defined on SPD matrices? A principled continuum of metrics
Yann Thanwerdas (UCA, Inria, EPIONE), Xavier Pennec (UCA, Inria, EPIONE)

TL;DR
This paper explores the theoretical landscape of metrics on SPD matrices, introducing a continuum of affine-invariant metrics and examining principles to guide metric selection for applications.
Contribution
It introduces a continuum of affine-invariant metrics on SPD matrices, including power-affine and deformed-affine metrics, and investigates principles for selecting appropriate metrics.
Findings
Introduces a continuum of metrics on SPD matrices.
Analyzes principles guiding metric choice.
Provides theoretical insights into affine invariance.
Abstract
Symmetric Positive Definite (SPD) matrices have been widely used in medical data analysis and a number of different Riemannian metrics were proposed to compute with them. However, there are very few methodological principles guiding the choice of one particular metric for a given application. Invariance under the action of the affine transformations was suggested as a principle. Another concept is based on symmetries. However, the affine-invariant metric and the recently proposed polar-affine metric are both invariant and symmetric. Comparing these two cousin metrics leads us to introduce much wider families: power-affine and deformed-affine metrics. Within this continuum, we investigate other principles to restrict the family size.
Université Côte d’Azur, Inria, Epione, France
Keywords:
SPD matrices, Riemannian symmetric space.
1 Introduction
Symmetric positive definite (SPD) matrices have been used in many different contexts. In diffusion tensor imaging for instance, a diffusion tensor is a 3-dimensional SPD matrix [1, 2, 3]; in brain-computer interfaces (BCI) [4], in functional MRI [5] or in computer vision [6], an SPD matrix can represent a covariance matrix of a feature vector, for example a spatial covariance of electrodes or a temporal covariance of signals in BCI. In order to make statistical operations on SPD matrices like interpolations, computing the mean or performing a principal component analysis, it has been proposed to consider the set of SPD matrices as a manifold and to provide it with some geometric structures like a Riemannian metric, a transitive group action or some symmetries. These structures can be more or less natural depending on the context of the applications, and they can provide closed-form formulas and consistent algorithms [2, 7].
Many Riemannian structures have been introduced over the manifold of SPD matrices [7]: Euclidean, log-Euclidean, affine-invariant, Cholesky, square root, power-Euclidean, Procrustes… Each of them has different mathematical properties that can fit the data in some problems but can be inappropriate in some other contexts: for example the curvature can be null, positive, negative, constant, not constant, covariantly constant… These properties of the curvature have important consequences for the way we interpolate two points, for the consistency of algorithms, and more generally for every statistical operation one could want to do with SPD matrices. Therefore, a natural question one can ask is: given the practical context of an application, how should one choose the metric on SPD matrices? Are there some relations between the mathematical properties of the geometric structure and the intrinsic properties of the data?
In this context, the affine-invariant metric [2, 8, 3] was introduced to give an invariant computing framework under affine transformations of the variables. This metric endows the manifold of SPD matrices with a structure of a Riemannian symmetric space. Such spaces have a covariantly constant curvature, thus they share some convenient properties with constant curvature spaces but with fewer constraints. It was actually shown that there exists not only one metric but a one-parameter family of metrics that are invariant under these affine transformations [9]. More recently, [10, 11, 12] introduced another Riemannian symmetric structure that does not belong to the previous one-parameter family: the polar-affine metric.
In this work, we unify these two frameworks by showing that the polar-affine metric is a square deformation of the affine-invariant metric (Section 2). We generalize in Section 3.1 this construction to a family of power-affine metrics that comprises the two previous metrics, and in Section 3.2 to the wider family of deformed-affine metrics. Finally, we propose in Section 4 a theoretical approach to the choice of subfamilies of the deformed-affine metrics with relevant properties.
2 Affine-invariant versus polar-affine
The affine-invariant metric [2, 8, 3] and the polar-affine metric [12] are different but they both provide a Riemannian symmetric structure to the manifold of SPD matrices. Moreover, both are presented as very natural constructions. The former uses only the action of the real general linear group $GL_n$ on covariance matrices. The latter uses the canonical left action of $GL_n$ on the left coset space $GL_n/O_n$ and the polar decomposition $GL_n \simeq SPD_n \times O_n$, where $O_n$ is the orthogonal group. Furthermore, the affine-invariant framework is exhaustive in the sense that it provides all the metrics invariant under the chosen action [9] whereas the polar-affine framework only provides one invariant metric.
In this work, we show that the two frameworks coincide on the same quotient manifold but differ because of the choice of the diffeomorphism between this quotient and the manifold of SPD matrices. In particular, we show that there exists a one-parameter family of polar-affine metrics and that any polar-affine metric is a square deformation of an affine-invariant metric.
In 2.1 and 2.2, we build the affine-invariant metrics $g^1$ and the polar-affine metric $g^2$ in a unified way, using indexes $i \in \{1, 2\}$ to differentiate them. First, we give explicitly the action $\eta^i: GL_n \times SPD_n \to SPD_n$ and the quotient diffeomorphism $\tau^i: GL_n/O_n \to SPD_n$; then, we explain the construction of the orthogonal-invariant scalar product $g^i_{I_n}$ that characterizes the metric $g^i$; finally, we give the expression of the metrics $g^1$ and $g^2$. In 2.3, we summarize the results and we focus on the Riemannian symmetric structures of $SPD_n$.
2.1 The one-parameter family of affine-invariant metrics
2.1.1 Affine action and quotient diffeomorphism
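In many applications, one would like the analysis of covariance matrices to be invariant under affine transformations $X \mapsto AX + B$ of the random vector $X \in \mathbb{R}^n$, where $A \in GL_n$ and $B \in \mathbb{R}^n$. Then the covariance matrix $\Sigma = \mathrm{Cov}(X)$ is modified under the transformation $\Sigma \mapsto A \Sigma A^\top$. This transformation can be thought of as a transitive Lie group action $\eta^1$ of the general linear group on the manifold of SPD matrices:
$$\eta^1: GL_n \times SPD_n \to SPD_n, \quad (A, \Sigma) \mapsto \eta^1_A(\Sigma) = A \Sigma A^\top.$$
This transitive action induces a diffeomorphism between the manifold $SPD_n$ and the quotient of the acting group $GL_n$ by the stabilizing group $\{A \in GL_n \mid \eta^1(A, \Sigma) = \Sigma\}$ at any point $\Sigma$. It reduces to the orthogonal group $O_n$ at $\Sigma = I_n$ so we get the quotient diffeomorphism $\tau^1$:
$$\tau^1: GL_n/O_n \to SPD_n, \quad [A] = A.O_n \mapsto \eta^1(A, I_n) = A A^\top.$$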
2.1.2 Orthogonal-invariant scalar product
We want to endow the manifold $\mathcal{M} = SPD_n$ with a metric $g^1$ invariant under the affine action $\eta^1$, i.e. an affine-invariant metric. As the action is transitive, the metric at any point $\Sigma$ is characterized by the metric at one given point, here $I_n$. As the metric is affine-invariant, this scalar product $g^1_{I_n}$ has to be invariant under the stabilizing group of $I_n$. As a consequence, the metric $g^1$ is characterized by a scalar product $g^1_{I_n}$ on the tangent space $T_{I_n}\mathcal{M}$ that is invariant under the action of the orthogonal group.
The tangent space $T_{I_n}\mathcal{M}$ is canonically identified with the vector space $\mathrm{Sym}_n$ of symmetric matrices by the differential of the canonical embedding $\mathcal{M} \hookrightarrow \mathrm{Sym}_n$. Thus we are now looking for all the scalar products on symmetric matrices that are invariant under the orthogonal group. Such scalar products are given by the following formula [9], where $\alpha > 0$ and $\beta > -\alpha/n$: for all tangent vectors $V_1, V_2 \in T_{I_n}\mathcal{M}$, $g^1_{I_n}(V_1, V_2) = \alpha\,\mathrm{tr}(V_1 V_2) + \beta\,\mathrm{tr}(V_1)\,\mathrm{tr}(V_2)$.
2.1.3 Affine-invariant metrics
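To give the expression of the metric, we need a linear isomorphism between the tangent space $T_\Sigma\mathcal{M}$ at any point $\Sigma$ and the tangent space $T_{I_n}\mathcal{M}$. Since the action $\eta^1_{\Sigma^{-1/2}}$ sends $\Sigma$ to $I_n$, its differential given by $T_\Sigma \eta^1_{\Sigma^{-1/2}}: V \in T_\Sigma\mathcal{M} \mapsto \Sigma^{-1/2} V \Sigma^{-1/2} \in T_{I_n}\mathcal{M}$ is such a linear isomorphism. Combining this transformation with the expression of the metric at $I_n$ and reordering the terms in the trace, we get the general expression of the affine-invariant metric: for all tangent vectors $V_1, V_2 \in T_\Sigma\mathcal{M}$,
$$g^1_\Sigma(V_1, V_2) = \alpha\,\mathrm{tr}(\Sigma^{-1} V_1 \Sigma^{-1} V_2) + \beta\,\mathrm{tr}(\Sigma^{-1} V_1)\,\mathrm{tr}(\Sigma^{-1} V_2).$$
As the geometry of the manifold is not much affected by a scalar multiplication of the metric, we often drop the parameter $\alpha$, as if it were equal to 1, and we consider that this is a one-parameter family indexed by $\beta > -1/n$.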
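The invariance property can be checked numerically. The following is a minimal sketch, assuming the affine-invariant metric has the standard form $\alpha\,\mathrm{tr}(\Sigma^{-1}V_1\Sigma^{-1}V_2) + \beta\,\mathrm{tr}(\Sigma^{-1}V_1)\,\mathrm{tr}(\Sigma^{-1}V_2)$ and that the affine action is $\Sigma \mapsto A\Sigma A^\top$ on points and $V \mapsto AVA^\top$ on tangent vectors. The helper names (`affine_invariant_metric`, `random_spd`, `random_sym`) are illustrative and not taken from the paper.

```python
import numpy as np

def affine_invariant_metric(sigma, v1, v2, alpha=1.0, beta=0.0):
    """alpha*tr(S^-1 V1 S^-1 V2) + beta*tr(S^-1 V1)*tr(S^-1 V2)."""
    a = np.linalg.solve(sigma, v1)  # S^-1 V1
    b = np.linalg.solve(sigma, v2)  # S^-1 V2
    return alpha * np.trace(a @ b) + beta * np.trace(a) * np.trace(b)

def random_spd(rng, n):
    """Random SPD matrix (well conditioned thanks to the n*I shift)."""
    m = rng.standard_normal((n, n))
    return m @ m.T + n * np.eye(n)

def random_sym(rng, n):
    """Random symmetric matrix, used as a tangent vector."""
    m = rng.standard_normal((n, n))
    return (m + m.T) / 2

rng = np.random.default_rng(0)
n = 3
sigma, v1, v2 = random_spd(rng, n), random_sym(rng, n), random_sym(rng, n)
# Small perturbation of the identity: a generic element of GL_n.
a = np.eye(n) + 0.2 * rng.standard_normal((n, n))

g_before = affine_invariant_metric(sigma, v1, v2, beta=0.5)
# Transport the point and both tangent vectors by the affine action of A.
g_after = affine_invariant_metric(a @ sigma @ a.T, a @ v1 @ a.T, a @ v2 @ a.T, beta=0.5)
print(np.isclose(g_before, g_after))
```

Using `np.linalg.solve` instead of an explicit inverse is only a numerical convenience; the quantity computed is the same trace expression.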
2.2 The polar-affine metric
2.2.1 Quotient diffeomorphism and affine action
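In [12], instead of defining a metric directly on the manifold of SPD matrices, a metric is defined on the left coset space $GL_n/O_n = \{[A] = A.O_n, A \in GL_n\}$, on which the general linear group $GL_n$ naturally acts by the left action $\eta^0: (A, [A']) \mapsto [AA']$. Then this metric is pushed forward on the manifold $SPD_n$ into the polar-affine metric $g^2$ thanks to the polar decomposition $A \in GL_n \mapsto (\sqrt{AA^\top}, \sqrt{AA^\top}^{-1}A) \in SPD_n \times O_n$ or more precisely by the quotient diffeomorphism $\tau^2$:
$$\tau^2: GL_n/O_n \to SPD_n, \quad A.O_n \mapsto \sqrt{AA^\top}.$$
This quotient diffeomorphism induces an action of the general linear group $GL_n$ on the manifold $SPD_n$, under which the polar-affine metric will be invariant:
$$\eta^2: GL_n \times SPD_n \to SPD_n, \quad (A, \Sigma) \mapsto \eta^2_A(\Sigma) = \sqrt{A \Sigma^2 A^\top}.$$
It is characterized by $\eta^2(A, \tau^2(A'.O_n)) = \tau^2(\eta^0(A, A'.O_n))$ for $A, A' \in GL_n$.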
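As a sanity check on this construction, here is a small sketch assuming the quotient map is $A.O_n \mapsto \sqrt{AA^\top}$ and the induced action is $(A, \Sigma) \mapsto \sqrt{A\Sigma^2A^\top}$. It verifies that the quotient map does not depend on the coset representative, that the action is compatible with left multiplication on the quotient, and that it composes like a group action. The names `tau2` and `eta2` are illustrative.

```python
import numpy as np

def spd_sqrt(s):
    """Square root of an SPD matrix via its eigendecomposition."""
    w, u = np.linalg.eigh((s + s.T) / 2)  # symmetrize against rounding
    return (u * np.sqrt(w)) @ u.T

def tau2(a):
    """Quotient map A.O_n -> sqrt(A A^T), the SPD factor of the polar decomposition."""
    return spd_sqrt(a @ a.T)

def eta2(a, sigma):
    """Induced action (A, Sigma) -> sqrt(A Sigma^2 A^T)."""
    return spd_sqrt(a @ sigma @ sigma @ a.T)

rng = np.random.default_rng(1)
n = 3
a = np.eye(n) + 0.2 * rng.standard_normal((n, n))
b = np.eye(n) + 0.2 * rng.standard_normal((n, n))
q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # an orthogonal matrix

# tau2 is constant on the coset B.O_n.
well_defined = np.allclose(tau2(b @ q), tau2(b))
# Acting on SPD_n matches left multiplication on the quotient.
equivariant = np.allclose(eta2(a, tau2(b)), tau2(a @ b))
# Acting by B then by A is the same as acting by AB.
sigma = tau2(b @ a)
group_action = np.allclose(eta2(a, eta2(b, sigma)), eta2(a @ b, sigma))
print(well_defined, equivariant, group_action)
```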
2.2.2 Orthogonal-invariant scalar product
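The polar-affine metric $g^2$ is characterized by the scalar product $g^2_{I_n}$ on the tangent space $T_{I_n}\mathcal{M}$. This scalar product is obtained by pushforward of a scalar product on the tangent space $T_{[I_n]}(GL_n/O_n)$. It is itself induced by the Frobenius scalar product on $\mathfrak{gl}_n = T_{I_n}GL_n$, defined by $\langle v | w \rangle_{\mathrm{Frob}} = \mathrm{tr}(v w^\top)$, which is orthogonal-invariant. This is summarized on the following diagram.
[TABLE]
Finally, we get the scalar product $g^2_{I_n}(V_1, V_2) = \mathrm{tr}(V_1 V_2)$ for $V_1, V_2 \in T_{I_n}\mathcal{M}$.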
2.2.3 Polar-affine metric
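Since the action $\eta^2_{\Sigma^{-1}}$ sends $\Sigma$ to $I_n$, a linear isomorphism between tangent spaces is given by the differential of the action $T_\Sigma \eta^2_{\Sigma^{-1}}: V \in T_\Sigma\mathcal{M} \mapsto \frac{1}{2}\Sigma^{-1}\, T_\Sigma\mathrm{pow}_2(V)\, \Sigma^{-1} \in T_{I_n}\mathcal{M}$, where $T_\Sigma\mathrm{pow}_2(V) = \Sigma V + V \Sigma$. Combined with the above expression of the scalar product at $I_n$, we get the following expression for the polar-affine metric: for all tangent vectors $V_1, V_2 \in T_\Sigma\mathcal{M}$,
$$g^2_\Sigma(V_1, V_2) = \frac{1}{4}\,\mathrm{tr}\left(\Sigma^{-2}\, T_\Sigma\mathrm{pow}_2(V_1)\, \Sigma^{-2}\, T_\Sigma\mathrm{pow}_2(V_2)\right).$$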
2.3 The underlying Riemannian symmetric manifold
In the affine-invariant framework, we started from defining the affine action $\eta^1$ (on covariance matrices) and we inferred the quotient diffeomorphism $\tau^1$. In the polar-affine framework, we started from defining the quotient diffeomorphism $\tau^2$ (corresponding to the polar decomposition) and we inferred the affine action $\eta^2$. The two actually correspond to the same underlying affine action $\eta^0$ on the quotient $GL_n/O_n$. Then there is also a one-parameter family of affine-invariant metrics on the quotient $GL_n/O_n$ and a one-parameter family of polar-affine metrics on the manifold $SPD_n$. This is stated in the following theorems.
Theorem 2.1 (Polar-affine is a square deformation of affine-invariant)
1. There exists a one-parameter family of affine-invariant metrics on the quotient $GL_n/O_n$.
2. This family is in bijection with the one-parameter family of affine-invariant metrics on the manifold of SPD matrices thanks to the diffeomorphism $\tau^1: A.O_n \mapsto AA^\top$. The corresponding action is $\eta^1: (A, \Sigma) \mapsto A \Sigma A^\top$.
3. This family is also in bijection with a one-parameter family of polar-affine metrics on the manifold of SPD matrices thanks to the diffeomorphism $\tau^2: A.O_n \mapsto \sqrt{AA^\top}$. The corresponding action is $\eta^2: (A, \Sigma) \mapsto \sqrt{A \Sigma^2 A^\top}$.
4. The diffeomorphism $\mathrm{pow}_2: (SPD_n, 4g^2) \to (SPD_n, g^1)$, $\Sigma \mapsto \Sigma^2$, is an isometry between polar-affine metrics $g^2$ (scaled by a factor 4) and affine-invariant metrics $g^1$.
In other words, performing statistical analyses (e.g. a principal component analysis) with the polar-affine metric on covariance matrices is equivalent to performing these statistical analyses with the classical affine-invariant metric on the square of our covariance matrix dataset.
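The square-deformation statement can be probed numerically. The sketch below assumes the scalar product at the identity is $\mathrm{tr}(V_1V_2)$ and that the polar-affine action is $(A, \Sigma) \mapsto \sqrt{A\Sigma^2A^\top}$. It computes the squared norm of a tangent vector in two independent ways: by transporting it to the identity with the action of $\Sigma^{-1}$ (differential approximated by central finite differences), and as one quarter of the pullback by $\Sigma \mapsto \Sigma^2$ of the affine-invariant metric with $\alpha = 1$, $\beta = 0$. Function names are illustrative.

```python
import numpy as np

def spd_sqrt(s):
    """Square root of an SPD matrix via its eigendecomposition."""
    w, u = np.linalg.eigh((s + s.T) / 2)
    return (u * np.sqrt(w)) @ u.T

def eta2(a, sigma):
    """Polar-affine action (A, Sigma) -> sqrt(A Sigma^2 A^T)."""
    return spd_sqrt(a @ sigma @ sigma @ a.T)

def polar_affine_sq_norm_fd(sigma, v, h=1e-5):
    """Squared norm of V at Sigma obtained by invariance: push V to the
    identity with the action of Sigma^-1, then take tr(W^2)."""
    inv = np.linalg.inv(sigma)
    w = (eta2(inv, sigma + h * v) - eta2(inv, sigma - h * v)) / (2 * h)
    return np.trace(w @ w)

def pullback_sq_norm(sigma, v):
    """(1/4) * squared norm of the image of V under the differential of
    Sigma -> Sigma^2, measured by the affine-invariant metric at Sigma^2."""
    s2 = sigma @ sigma
    dv = sigma @ v + v @ sigma       # differential of Sigma -> Sigma^2
    m = np.linalg.solve(s2, dv)      # Sigma^-2 dV
    return 0.25 * np.trace(m @ m)

rng = np.random.default_rng(2)
n = 3
m0 = rng.standard_normal((n, n))
sigma = m0 @ m0.T + n * np.eye(n)    # SPD base point
v0 = rng.standard_normal((n, n))
v = (v0 + v0.T) / 2                  # symmetric tangent vector

norm_fd = polar_affine_sq_norm_fd(sigma, v)
norm_pb = pullback_sq_norm(sigma, v)
print(norm_fd, norm_pb)
```

The two values agree up to finite-difference error, which is the infinitesimal version of the factor 4 in the isometry.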
All the metrics mentioned in Theorem 2.1 endow their respective space with a structure of a Riemannian symmetric manifold. We recall the definition of that geometric structure and we give the formal statement.
Definition 1 (Symmetric manifold, Riemannian symmetric manifold)
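A manifold $\mathcal{M}$ is symmetric if it is endowed with a family of involutions $(s_x)_{x \in \mathcal{M}}$ called symmetries such that $s_x \circ s_y \circ s_x = s_{s_x(y)}$ and $x$ is an isolated fixed point of $s_x$. It implies that $T_x s_x = -\mathrm{Id}_{T_x\mathcal{M}}$. A Riemannian manifold $(\mathcal{M}, g)$ is symmetric if it is endowed with a family of symmetries that are isometries of $\mathcal{M}$, i.e. that preserve the metric: $g_{s_x(y)}(T_y s_x(v), T_y s_x(w)) = g_y(v, w)$ for $v, w \in T_y\mathcal{M}$.
Theorem 2.2 (Riemannian symmetric structure on $SPD_n$)
The Riemannian manifold $(SPD_n, g^1)$, where $g^1$ is an affine-invariant metric, is a Riemannian symmetric space with symmetry $s_\Sigma: \Lambda \mapsto \Sigma \Lambda^{-1} \Sigma$. The Riemannian manifold $(SPD_n, g^2)$, where $g^2$ is a polar-affine metric, is also a Riemannian symmetric space whose symmetry is $s_\Sigma: \Lambda \mapsto \sqrt{\Sigma^2 \Lambda^{-2} \Sigma^2}$.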
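A quick numerical illustration of these symmetries, assuming they are $\Lambda \mapsto \Sigma\Lambda^{-1}\Sigma$ in the affine-invariant case and $\Lambda \mapsto \sqrt{\Sigma^2\Lambda^{-2}\Sigma^2}$ in the polar-affine case. Distances are taken with $\alpha = 1$, $\beta = 0$, and the polar-affine distance is computed as half the affine-invariant distance between squared matrices, which is what the factor 4 in Theorem 2.1 implies. All function names are illustrative.

```python
import numpy as np

def spd_fun(s, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, u = np.linalg.eigh((s + s.T) / 2)
    return (u * fun(w)) @ u.T

def dist_ai(s, l):
    """Affine-invariant distance (alpha=1, beta=0)."""
    isq = spd_fun(s, lambda w: w ** -0.5)
    m = isq @ l @ isq
    lam = np.linalg.eigvalsh((m + m.T) / 2)
    return np.sqrt(np.sum(np.log(lam) ** 2))

def dist_pa(s, l):
    """Polar-affine distance: half the affine-invariant distance of the squares."""
    return 0.5 * dist_ai(s @ s, l @ l)

def sym_ai(sigma, lam):
    """Affine-invariant symmetry at sigma."""
    return sigma @ np.linalg.inv(lam) @ sigma

def sym_pa(sigma, lam):
    """Polar-affine symmetry at sigma."""
    s2 = sigma @ sigma
    return spd_fun(s2 @ np.linalg.inv(lam @ lam) @ s2, np.sqrt)

def random_spd(rng, n):
    m = rng.standard_normal((n, n))
    return m @ m.T + n * np.eye(n)

rng = np.random.default_rng(3)
n = 3
sigma, l1, l2 = (random_spd(rng, n) for _ in range(3))

for sym, dist in ((sym_ai, dist_ai), (sym_pa, dist_pa)):
    involution = np.allclose(sym(sigma, sym(sigma, l1)), l1)
    fixed_point = np.allclose(sym(sigma, sigma), sigma)
    isometry = np.isclose(dist(sym(sigma, l1), sym(sigma, l2)), dist(l1, l2))
    print(sym.__name__, involution, fixed_point, isometry)
```

Preservation of pairwise distances is used here as a proxy for the isometry property, since it avoids computing differentials.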
This square deformation of affine-invariant metrics can be generalized into a power deformation to build a family of affine-invariant metrics that we call power-affine metrics. It can even be generalized into any diffeomorphic deformation of SPD matrices. We now develop these families of affine-invariant metrics.
3 Families of affine-invariant metrics
There is a theoretical interest in building families comprising some of the known metrics on SPD matrices to understand how one can be deformed into another. For example, power-Euclidean metrics [13] comprise the Euclidean metric and tend to the log-Euclidean metric [14] when the power tends to 0. We recall that the log-Euclidean metric is the pullback of the Euclidean metric on symmetric matrices by the symmetric matrix logarithm $\log: SPD_n \to \mathrm{Sym}_n$. There is also a practical interest in defining families of metrics: for example, it is possible to optimize the power to better fit the data with a certain distribution [13].
First, we generalize the square deformation by deforming the affine-invariant metrics with a power function to define the power-affine metrics. Then we deform the affine-invariant metrics by any diffeomorphism to define the deformed-affine metrics.
3.1 The two-parameter family of power-affine metrics
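We recall that $\mathcal{M} = SPD_n$ is the manifold of SPD matrices. For a power $\theta \neq 0$, we define the $\theta$-power-affine metric $g^\theta$ as the pullback by the diffeomorphism $\mathrm{pow}_\theta: \Sigma \mapsto \Sigma^\theta$ of the affine-invariant metric, scaled by a factor $1/\theta^2$.
Equivalently, the $\theta$-power-affine metric is the metric invariant under the $\theta$-affine action $\eta^\theta: (A, \Sigma) \mapsto (A \Sigma^\theta A^\top)^{1/\theta}$ whose scalar product at $I_n$ coincides with the scalar product $g^1_{I_n}: (V_1, V_2) \mapsto \alpha\,\mathrm{tr}(V_1 V_2) + \beta\,\mathrm{tr}(V_1)\,\mathrm{tr}(V_2)$. The $\theta$-affine action induces an isomorphism $V \in T_\Sigma\mathcal{M} \mapsto \frac{1}{\theta}\Sigma^{-\theta/2}\, T_\Sigma\mathrm{pow}_\theta(V)\, \Sigma^{-\theta/2} \in T_{I_n}\mathcal{M}$ between tangent spaces. The $\theta$-power-affine metric is given by:
$$g^\theta_\Sigma(V_1, V_2) = \frac{\alpha}{\theta^2}\,\mathrm{tr}\left(\Sigma^{-\theta}\, T_\Sigma\mathrm{pow}_\theta(V_1)\, \Sigma^{-\theta}\, T_\Sigma\mathrm{pow}_\theta(V_2)\right) + \frac{\beta}{\theta^2}\,\mathrm{tr}\left(\Sigma^{-\theta}\, T_\Sigma\mathrm{pow}_\theta(V_1)\right)\mathrm{tr}\left(\Sigma^{-\theta}\, T_\Sigma\mathrm{pow}_\theta(V_2)\right).$$
Because a scaling factor is of low importance, we can set $\alpha = 1$ and consider that this family is a two-parameter family indexed by $\beta > -1/n$ and $\theta \neq 0$.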
We have chosen to define the metric $g^\theta$ so that the power function $\mathrm{pow}_\theta: (\mathcal{M}, \theta^2 g^\theta) \to (\mathcal{M}, g^1)$ is an isometry. Why this factor $\theta^2$? The first reason is consistency with previous works: the analogous power-Euclidean metrics have been defined with that scaling [13]. The second reason is continuity: when the power tends to 0, the power-affine metric tends to the log-Euclidean metric.
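Theorem 3.1 (Power-affine tends to log-Euclidean for $\theta \to 0$)
Let $\Sigma \in \mathcal{M}$ and $V, W \in T_\Sigma\mathcal{M}$. Then $\lim_{\theta \to 0} g^\theta_\Sigma(V, W) = g^{LE}_\Sigma(V, W)$ where the log-Euclidean metric is $g^{LE}_\Sigma(V, W) = \alpha\,\mathrm{tr}(T_\Sigma\log(V)\, T_\Sigma\log(W)) + \beta\,\mathrm{tr}(T_\Sigma\log(V))\,\mathrm{tr}(T_\Sigma\log(W))$.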
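The limit can be observed numerically. The sketch below assumes the power-affine metric is the pullback of the affine-invariant metric by $\Sigma \mapsto \Sigma^\theta$ divided by $\theta^2$, and that the log-Euclidean metric is the pullback of $\alpha\,\mathrm{tr}(XY) + \beta\,\mathrm{tr}(X)\,\mathrm{tr}(Y)$ by the matrix logarithm. Differentials of matrix functions are computed with the Daleckii-Krein (Loewner matrix) formula, which is a standard result and not something taken from the paper; the function names are illustrative.

```python
import numpy as np

def spectral_diff(sigma, v, fun, dfun):
    """Differential at sigma, in direction v, of the matrix function acting
    as `fun` on eigenvalues (Daleckii-Krein / Loewner-matrix formula)."""
    w, u = np.linalg.eigh(sigma)
    fw = fun(w)
    dw = w[:, None] - w[None, :]
    df = fw[:, None] - fw[None, :]
    close = np.abs(dw) < 1e-12
    # Divided differences off the diagonal, derivative where eigenvalues coincide.
    loewner = np.where(close, dfun(w)[:, None], df / np.where(close, 1.0, dw))
    return u @ (loewner * (u.T @ v @ u)) @ u.T

def power_affine(sigma, v1, v2, theta, alpha=1.0, beta=0.0):
    """theta-power-affine metric at sigma."""
    w, u = np.linalg.eigh(sigma)
    inv_pow = (u * w ** (-theta)) @ u.T  # Sigma^-theta
    pw = lambda x: x ** theta
    dpw = lambda x: theta * x ** (theta - 1)
    a = inv_pow @ spectral_diff(sigma, v1, pw, dpw)
    b = inv_pow @ spectral_diff(sigma, v2, pw, dpw)
    return (alpha * np.trace(a @ b) + beta * np.trace(a) * np.trace(b)) / theta ** 2

def log_euclidean(sigma, v1, v2, alpha=1.0, beta=0.0):
    """Log-Euclidean metric at sigma."""
    d1 = spectral_diff(sigma, v1, np.log, lambda x: 1.0 / x)
    d2 = spectral_diff(sigma, v2, np.log, lambda x: 1.0 / x)
    return alpha * np.trace(d1 @ d2) + beta * np.trace(d1) * np.trace(d2)

rng = np.random.default_rng(4)
n = 3
m0 = rng.standard_normal((n, n))
sigma = m0 @ m0.T + n * np.eye(n)
v = rng.standard_normal((n, n)); v = (v + v.T) / 2
w_vec = rng.standard_normal((n, n)); w_vec = (w_vec + w_vec.T) / 2

target = log_euclidean(sigma, v, w_vec, beta=0.3)
for theta in (1.0, 0.1, 0.01, 0.001):
    print(theta, power_affine(sigma, v, w_vec, theta, beta=0.3) - target)
```

For $\theta = 1$ the function `power_affine` reduces to the affine-invariant metric, so the same code also gives a way to interpolate between the two geometries.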
3.2 The continuum of deformed-affine metrics
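In the following, we call a diffeomorphism $f: SPD_n \to SPD_n$ a deformation. We define the $f$-deformed-affine metric $g^f$ as the pullback by the diffeomorphism $f$ of the affine-invariant metric, so that $f: (\mathcal{M}, g^f) \to (\mathcal{M}, g^1)$ is an isometry. (Regarding the discussion before Theorem 3.1, $g^{\mathrm{pow}_\theta} = \theta^2 g^\theta$.)
The $f$-deformed-affine metric is invariant under the $f$-affine action $\eta^f: (A, \Sigma) \mapsto f^{-1}(A f(\Sigma) A^\top)$. It is given by $g^f_\Sigma(V, V) = \alpha\,\mathrm{tr}\left((f(\Sigma)^{-1}\, T_\Sigma f(V))^2\right) + \beta\,\mathrm{tr}\left(f(\Sigma)^{-1}\, T_\Sigma f(V)\right)^2$ where $T_\Sigma f(V)$ is the differential of $f$ at $\Sigma$ applied to $V$. The basic Riemannian operations are obtained by pulling back the affine-invariant operations.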
Theorem 3.2 (Basic Riemannian operations)
For SPD matrices $\Sigma, \Lambda \in \mathcal{M}$ and a tangent vector $V \in T_\Sigma\mathcal{M}$, we have at all time $t \in \mathbb{R}$:
$$\begin{array}{|l|l|}
\hline
\mathrm{Geodesics} & \gamma^{f}_{(\Sigma,V)}(t) = f^{-1}\left(f(\Sigma)^{1/2}\exp\left(t\, f(\Sigma)^{-1/2}\, T_{\Sigma}f(V)\, f(\Sigma)^{-1/2}\right) f(\Sigma)^{1/2}\right) \\
\hline
\mathrm{Logarithm} & \mathrm{Log}^{f}_{\Sigma}(\Lambda) = (T_{\Sigma}f)^{-1}\left(f(\Sigma)^{1/2}\log\left(f(\Sigma)^{-1/2} f(\Lambda) f(\Sigma)^{-1/2}\right) f(\Sigma)^{1/2}\right) \\
\hline
\mathrm{Distance} & d_{f}(\Sigma,\Lambda) = d_{1}(f(\Sigma), f(\Lambda)) = \sqrt{\sum_{k=1}^{n} (\log\lambda_{k})^{2}} \\
\hline
\end{array}$$
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of the symmetric matrix $f(\Sigma)^{-1/2} f(\Lambda) f(\Sigma)^{-1/2}$, and the distance is written for $\alpha = 1$, $\beta = 0$.
All tensors are modified thanks to the pushforward and pullback operators, e.g. the Riemann tensor of the $f$-deformed metric is $R^f(X, Y)Z = f^*\left(R(f_*X, f_*Y)(f_*Z)\right)$. As a consequence, the deformation does not affect the values taken by the sectional curvature, so these metrics have non-positive sectional curvature like the affine-invariant metric.
From a computational point of view, it is very interesting to notice that the identification $T_\Sigma\mathcal{M} \simeq T_{f(\Sigma)}\mathcal{M}$ given by $V \mapsto V' = T_\Sigma f(V)$ simplifies the above expressions by removing the differential $T_\Sigma f$. This change of basis avoids numerical approximations of the differential but one must keep in mind that $V \neq V'$ in general. This identification was already used for the polar-affine metric ($f = \mathrm{pow}_2$) in [12] without being explicitly mentioned.
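The operations of Theorem 3.2 can be sketched without any differential when a geodesic is specified by its two endpoints rather than by an initial velocity. The following is a minimal example under the assumption $\alpha = 1$, $\beta = 0$, with the deformation supplied as a pair of callables `f` and `f_inv` (hypothetical names). It is illustrated with a power deformation, but any diffeomorphism of SPD matrices with a known inverse could be passed instead.

```python
import numpy as np

def spd_fun(s, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, u = np.linalg.eigh((s + s.T) / 2)
    return (u * fun(w)) @ u.T

def ai_geodesic(s, l, t):
    """Affine-invariant geodesic from s (t=0) to l (t=1)."""
    sq = spd_fun(s, np.sqrt)
    isq = spd_fun(s, lambda w: w ** -0.5)
    return sq @ spd_fun(isq @ l @ isq, lambda w: w ** t) @ sq

def ai_distance(s, l):
    """Affine-invariant distance (alpha=1, beta=0)."""
    isq = spd_fun(s, lambda w: w ** -0.5)
    m = isq @ l @ isq
    lam = np.linalg.eigvalsh((m + m.T) / 2)
    return np.sqrt(np.sum(np.log(lam) ** 2))

def deformed_geodesic(f, f_inv, s, l, t):
    """Deformed-affine geodesic: map to the affine-invariant world, move, map back."""
    return f_inv(ai_geodesic(f(s), f(l), t))

def deformed_distance(f, s, l):
    """Deformed-affine distance: affine-invariant distance between images."""
    return ai_distance(f(s), f(l))

theta = 0.5
f = lambda s: spd_fun(s, lambda w: w ** theta)            # power deformation
f_inv = lambda s: spd_fun(s, lambda w: w ** (1 / theta))  # its inverse

rng = np.random.default_rng(5)
n = 3
m1, m2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
sigma = m1 @ m1.T + n * np.eye(n)
lam = m2 @ m2.T + n * np.eye(n)

mid = deformed_geodesic(f, f_inv, sigma, lam, 0.3)
print(deformed_distance(f, sigma, mid) / deformed_distance(f, sigma, lam))
```

The printed ratio shows the constant-speed parametrization of the geodesic: the point at time $t$ sits at a fraction $t$ of the total distance.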
4 Interesting subfamilies of deformed-affine metrics
Some deformations have already been used in applications. For example, the family where was proposed to map the anisotropy of water measured by diffusion tensors to the one of the diffusion of tumor cells in tumor growth modeling [15]. The inverse function or the adjugate function were also proposed in the context of DTI [16, 17]. Let us find some properties satisfied by some of these examples. We define the following subsets of the set of diffeomorphisms of .
(Spectral) . Spectral deformations are characterized by their values on sorted diagonal matrices so the deformations described above are spectral: .
For a spectral deformation , so we can uniquely define a smooth diffeomorphism by .
(Univariate) . The power functions are univariate. Any polynomial null at 0, with non-positive roots and positive coefficient , also gives rise to a univariate deformation.
(Diagonally-stable) . The deformations described above and the univariate deformations are clearly diagonally-stable: and .
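(Log-linear) . The adjugate function and the power functions are log-linear deformations. More generally, the functions $f_{\lambda,\mu}: \Sigma \mapsto (\det \Sigma)^{(\lambda-\mu)/n}\, \Sigma^\mu$ for $\lambda, \mu \neq 0$, are log-linear deformations. We can notice that the $f_{\lambda,\mu}$-deformed-affine metric belongs to the one-parameter family of $\mu$-power-affine metrics, with a value of $\beta$ determined by $\lambda$, $\mu$ and $n$.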
The deformations just introduced are also spectral and the following result states that they are the only spectral log-linear deformations.
Theorem 4.1 (Characterization of the power-affine metrics)
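If $f$ is a spectral log-linear diffeomorphism, then there exist real numbers $\lambda, \mu \neq 0$ such that $f(\Sigma) = (\det \Sigma)^{(\lambda-\mu)/n}\, \Sigma^\mu$ for all $\Sigma$, and the $f$-deformed-affine metric is a $\mu$-power-affine metric.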
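The interest of this theorem comes from the fact that the group of spectral deformations and the vector space of log-linear deformations have large dimensions while their intersection is reduced to a two-parameter family. This strong result is a consequence of the theory of Lie group representations because the combination of the spectral property and the linearity makes $F = \log \circ f \circ \exp$ a homomorphism of $O_n$-modules (see the sketch of proof below).
Sketch of the proof. Thanks to Lie group representation theory, the linear map $F = \log \circ f \circ \exp: \mathrm{Sym}_n \to \mathrm{Sym}_n$ appears as a homomorphism of $O_n$-modules for the representation $\rho: P \in O_n \mapsto (V \mapsto P V P^\top) \in GL(\mathrm{Sym}_n)$. Once shown that $\mathrm{Sym}_n = \mathrm{span}(I_n) \oplus \ker\mathrm{tr}$ is a $\rho$-irreducible decomposition of $\mathrm{Sym}_n$ and that each summand is stable by $F$, then according to Schur’s lemma, $F$ is homothetic on each subspace, i.e. there exist $\lambda, \mu \neq 0$ such that for $V \in \mathrm{Sym}_n$, $F(V) = \lambda \frac{\mathrm{tr}(V)}{n} I_n + \mu \left(V - \frac{\mathrm{tr}(V)}{n} I_n\right)$, so $f(\Sigma) = (\det \Sigma)^{(\lambda-\mu)/n}\, \Sigma^\mu$.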
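The conclusion of the proof sketch can be illustrated numerically. The code below assumes the two-parameter family has the form $\Sigma \mapsto (\det\Sigma)^{(\lambda-\mu)/n}\,\Sigma^\mu$, which is what exponentiating a map acting as $\lambda$ on multiples of the identity and as $\mu$ on traceless matrices produces. It checks that this deformation is linear in log coordinates, commutes with orthogonal conjugation (the spectral property), and recovers the adjugate for $\lambda = n - 1$, $\mu = -1$. The names `f_lam_mu` and `log_linear_form` are illustrative.

```python
import numpy as np

def spd_fun(s, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, u = np.linalg.eigh((s + s.T) / 2)
    return (u * fun(w)) @ u.T

def f_lam_mu(sigma, lam, mu):
    """Deformation Sigma -> det(Sigma)^((lam - mu)/n) * Sigma^mu."""
    n = sigma.shape[0]
    return np.linalg.det(sigma) ** ((lam - mu) / n) * spd_fun(sigma, lambda w: w ** mu)

def log_linear_form(v, lam, mu):
    """Linear map on symmetric matrices: lam on the trace part, mu on the traceless part."""
    n = v.shape[0]
    trace_part = np.trace(v) / n * np.eye(n)
    return lam * trace_part + mu * (v - trace_part)

rng = np.random.default_rng(6)
n = 3
m0 = rng.standard_normal((n, n))
sigma = m0 @ m0.T + n * np.eye(n)
q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # orthogonal matrix
lam, mu = 2.0, -0.5

# Log-linearity: log(f(Sigma)) equals the linear form applied to log(Sigma).
lhs = spd_fun(f_lam_mu(sigma, lam, mu), np.log)
rhs = log_linear_form(spd_fun(sigma, np.log), lam, mu)
# Spectral property: f commutes with conjugation by an orthogonal matrix.
conj = f_lam_mu(q @ sigma @ q.T, lam, mu)
# Adjugate det(Sigma) * Sigma^-1 as the member with lam = n - 1, mu = -1.
adjugate = np.linalg.det(sigma) * np.linalg.inv(sigma)

print(np.allclose(lhs, rhs),
      np.allclose(conj, q @ f_lam_mu(sigma, lam, mu) @ q.T),
      np.allclose(adjugate, f_lam_mu(sigma, n - 1.0, -1.0)))
```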
5 Conclusion
We have shown that the polar-affine metric is a square deformation of the affine-invariant metric and that this process can be generalized to any power function or any diffeomorphism on SPD matrices. It follows that the principles of invariance and symmetry are not sufficient to distinguish all these metrics, so we should find other principles to limit the scope of acceptable metrics in statistical computing. We have proposed a few characteristics (spectral, diagonally-stable, univariate, log-linear) that include some functions on tensors previously introduced. Future work will focus on studying the effect of such deformations on real data and on extending this family of metrics to positive semi-definite matrices. Finding families that comprise two non-cousin metrics could also help understand the differences between them and bring principles to make choices in applications.
Acknowledgements.
This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant G-Statistics agreement No 786854). This work has been supported by the French government, through the UCAJEDI Investments in the Future project managed by the National Research Agency (ANR) with the reference number ANR-15-IDEX-01.
- [1] C. Lenglet, M. Rousson, R. Deriche, and O. Faugeras. Statistics on the Manifold of Multivariate Normal Distributions: Theory and Application to Diffusion Tensor MRI Processing. J. of Math. Imaging and Vision, 25(3):423–444, October 2006.
- [2] X. Pennec, P. Fillard, and N. Ayache. A Riemannian Framework for Tensor Computing. International Journal of Computer Vision, 66(1):41–66, January 2006.
- [3] T. Fletcher and S. Joshi. Riemannian Geometry for the Statistical Analysis of Diffusion Tensor Data. Signal Processing, 87:250–262, 2007.
- [4] A. Barachant, S. Bonnet, M. Congedo, and C. Jutten. Classification of covariance matrices using a Riemannian-based kernel for BCI applications. Elsevier Neurocomputing, 112:172–178, 2013.
- [5] F. Deligianni, G. Varoquaux, B. Thirion, E. Robinson, D. Sharp, D. Edwards, and D. Rueckert. A Probabilistic Framework to Infer Brain Functional Connectivity from Anatomical Connections. IPMI Conference, pages 296–307, 2011.
- [6] G. Cheng and B. Vemuri. A novel dynamic system in the space of SPD matrices with applications to appearance tracking. SIIMS, 6(16):592–615, 2013.
- [7] I. Dryden, A. Koloydenko, and D. Zhou. Non-Euclidean Statistics for Covariance Matrices with Applications to Diffusion Tensor Imaging. Annals of Applied Statistics, 3:1102–1123, 2009.
- [8] C. Lenglet, M. Rousson, and R. Deriche. DTI segmentation by statistical surface evolution. IEEE Transactions on Medical Imaging, 25:685–700, 2006.
