Riemannian Gaussian distributions on the space of positive-definite quaternion matrices
Salem Said, Nicolas Le Bihan, Jonathan H. Manton

Abstract
Recently, Riemannian Gaussian distributions were defined on spaces of positive-definite real and complex matrices. The present paper extends this definition to the space of positive-definite quaternion matrices. In order to do so, it develops the Riemannian geometry of the space of positive-definite quaternion matrices, which is shown to be a Riemannian symmetric space of non-positive curvature. The paper gives original formulae for the Riemannian metric of this space, its geodesics, and distance function. Then, it develops the theory of Riemannian Gaussian distributions, including the exact expression of their probability density, their sampling algorithm and statistical inference.
Salem Said (1), Nicolas Le Bihan (2), Jonathan H. Manton (3)
1. Laboratoire IMS (CNRS - UMR 5218)
2. Gipsa-lab (CNRS - UMR 5216)
3. The University of Melbourne, Dept. of Electrical and Electronic Engineering
Keywords: Riemannian Gaussian distribution, quaternion, positive-definite matrix, symplectic group, Riemannian barycentre
1 Introduction
The Riemannian geometry of the spaces $\mathcal{P}_n$ and $\mathcal{H}_n$, respectively of $n \times n$ positive-definite real and complex matrices, is well-known to the information science community [1, 2]. These spaces have the property of being Riemannian symmetric spaces of non-positive curvature [3, 4],
$$ \mathcal{P}_n = \mathrm{GL}(n,\mathbb{R})/\mathrm{O}(n) \qquad \mathcal{H}_n = \mathrm{GL}(n,\mathbb{C})/\mathrm{U}(n) $$
where $\mathrm{GL}(n,\mathbb{R})$ and $\mathrm{GL}(n,\mathbb{C})$ denote the real and complex linear groups, and $\mathrm{O}(n)$ and $\mathrm{U}(n)$ the orthogonal and unitary groups. Using this property, Riemannian Gaussian distributions were recently introduced on $\mathcal{P}_n$ and $\mathcal{H}_n$ [5, 6]. The present paper introduces the Riemannian geometry of the space $\mathcal{Q}_n$ of $n \times n$ positive-definite quaternion matrices, which is also a Riemannian symmetric space of non-positive curvature [4],
$$ \mathcal{Q}_n = \mathrm{GL}(n,\mathbb{H})/\mathrm{Sp}(n) $$
where $\mathrm{GL}(n,\mathbb{H})$ denotes the quaternion linear group, and $\mathrm{Sp}(n)$ the compact symplectic group. It then studies Riemannian Gaussian distributions on $\mathcal{Q}_n$. The main results are the following: Proposition 1 gives the Riemannian metric of the space $\mathcal{Q}_n$, Proposition 2 expresses this metric in terms of polar coordinates on the space $\mathcal{Q}_n$, Proposition 3 uses Proposition 2 to compute the moment generating function of a Riemannian Gaussian distribution on $\mathcal{Q}_n$, and Propositions 4 and 5 describe the sampling algorithm and maximum likelihood estimation of Riemannian Gaussian distributions on $\mathcal{Q}_n$. Motivation for studying matrices from $\mathcal{Q}_n$ comes from their potential use in multidimensional bivariate signal processing [7].
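2 Quaternion matrices, $\mathrm{GL}(n,\mathbb{H})$ and $\mathrm{Sp}(n)$
Recall the non-commutative division algebra of quaternions, denoted $\mathbb{H}$, is made up of elements $q = q_0 + q_1 i + q_2 j + q_3 k$ where $q_0, q_1, q_2, q_3 \in \mathbb{R}$, and the imaginary units $i, j, k$ satisfy the relations [8]
$$ i^2 = j^2 = k^2 = ijk = -1 \tag{1} $$
The real part of $q$ is $\mathrm{Re}(q) = q_0$, its conjugate is $\bar{q} = q_0 - q_1 i - q_2 j - q_3 k$ and its squared norm is $|q|^2 = q\bar{q}$. The multiplicative inverse of $q \neq 0$ is given by $q^{-1} = \bar{q}/|q|^2$.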
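The quaternion operations recalled above can be illustrated with a minimal Python sketch. The class name `Quaternion` and its field names are illustrative choices, not notation from the paper; the product is the usual Hamilton product implied by the relations between the imaginary units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quaternion:
    """Quaternion w + x i + y j + z k (illustrative helper class)."""
    w: float  # real part
    x: float  # coefficient of i
    y: float  # coefficient of j
    z: float  # coefficient of k

    def __mul__(self, o):
        # Hamilton product, obtained by expanding with i^2 = j^2 = k^2 = ijk = -1
        return Quaternion(
            self.w * o.w - self.x * o.x - self.y * o.y - self.z * o.z,
            self.w * o.x + self.x * o.w + self.y * o.z - self.z * o.y,
            self.w * o.y - self.x * o.z + self.y * o.w + self.z * o.x,
            self.w * o.z + self.x * o.y - self.y * o.x + self.z * o.w,
        )

    def conj(self):
        # Conjugation flips the sign of the three imaginary coefficients
        return Quaternion(self.w, -self.x, -self.y, -self.z)

    def norm2(self):
        # Squared norm, equal to the real number q * conj(q)
        return self.w ** 2 + self.x ** 2 + self.y ** 2 + self.z ** 2

    def inverse(self):
        # Inverse of a non-zero quaternion: conj(q) / |q|^2
        n2, c = self.norm2(), self.conj()
        return Quaternion(c.w / n2, c.x / n2, c.y / n2, c.z / n2)


I = Quaternion(0, 1, 0, 0)
J = Quaternion(0, 0, 1, 0)
K = Quaternion(0, 0, 0, 1)
```

With these definitions, `I * J` equals `K` while `J * I` equals the negative of `K`, which is the non-commutativity that makes quaternion matrix calculus differ from the real and complex cases.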
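The set $M_n(\mathbb{H})$ consists of $n \times n$ quaternion matrices $A$ [9]. These are arrays $A = (A_{ij}\,;\, i,j = 1,\ldots,n)$ where $A_{ij} \in \mathbb{H}$. The product $C = AB$ of $A, B \in M_n(\mathbb{H})$ is the element of $M_n(\mathbb{H})$ with
$$ C_{ij} = \sum_{l=1}^{n} A_{il} B_{lj} \tag{2} $$
A quaternion matrix $A$ is said to be invertible if it has a multiplicative inverse $A^{-1}$ with $AA^{-1} = A^{-1}A = I$, where $I$ is the identity matrix. The conjugate-transpose of $A$ is $A^\dagger$, which is a quaternion matrix with $A^\dagger_{ij} = \bar{A}_{ji}$.
The rules for computing with quaternion matrices are quite different from the rules for computing with real or complex matrices [9]. For example, in general, $\mathrm{tr}(AB) \neq \mathrm{tr}(BA)$, and $(AB)^T \neq B^T A^T$ where $T$ denotes the transpose. For the results in this paper, only the following rules are needed [9],
$$ (AB)^{-1} = B^{-1}A^{-1} \qquad (AB)^\dagger = B^\dagger A^\dagger \qquad \mathrm{Re}\,\mathrm{tr}(AB) = \mathrm{Re}\,\mathrm{tr}(BA) \tag{3} $$
$\mathrm{GL}(n,\mathbb{H})$ consists of the set of invertible quaternion matrices $A \in M_n(\mathbb{H})$. The subset of $A \in \mathrm{GL}(n,\mathbb{H})$ such that $A^{-1} = A^\dagger$ is denoted $\mathrm{Sp}(n)$.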
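One common way to experiment with quaternion matrices numerically is to store $A = A_1 + A_2 j$ as a pair of complex matrices and to map it to a $2n \times 2n$ complex matrix. The sketch below assumes this representation, which is a standard device but is not part of the paper; the helper names `chi`, `qmul`, `qdagger` and `qtranspose` are hypothetical. It shows that the product and the conjugate-transpose behave as expected, while the transpose rule fails in general.

```python
import numpy as np


def chi(A1, A2):
    """2n x 2n complex representation of the quaternion matrix A1 + A2 j."""
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])


def qmul(A, B):
    """Product of quaternion matrices stored as pairs (A1, A2), using j z = conj(z) j."""
    A1, A2 = A
    B1, B2 = B
    return A1 @ B1 - A2 @ B2.conj(), A1 @ B2 + A2 @ B1.conj()


def qdagger(A):
    """Conjugate-transpose: entry (i, j) is the quaternion conjugate of entry (j, i)."""
    A1, A2 = A
    return A1.conj().T, -A2.T


def qtranspose(A):
    """Plain transpose, without quaternion conjugation."""
    A1, A2 = A
    return A1.T, A2.T


def random_qmatrix(n, rng):
    """Random n x n quaternion matrix, returned as its two complex parts."""
    def draw():
        return rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return draw(), draw()


rng = np.random.default_rng(0)
A, B = random_qmatrix(3, rng), random_qmatrix(3, rng)

# chi turns quaternion products into ordinary complex matrix products
product_ok = np.allclose(chi(*qmul(A, B)), chi(*A) @ chi(*B))
# (AB)^dagger = B^dagger A^dagger holds
dagger_ok = np.allclose(chi(*qdagger(qmul(A, B))),
                        chi(*qdagger(B)) @ chi(*qdagger(A)))
# (AB)^T = B^T A^T fails for generic quaternion matrices
transpose_ok = np.allclose(chi(*qtranspose(qmul(A, B))),
                           chi(*qmul(qtranspose(B), qtranspose(A))))
# the real part of the trace is unchanged when the factors are swapped
retrace_ok = np.isclose(np.trace(qmul(A, B)[0]).real,
                        np.trace(qmul(B, A)[0]).real)
```

Because `chi` is compatible with products and conjugate-transposes, computations on quaternion matrices can be delegated to ordinary complex linear algebra routines.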
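It follows from (3) that $\mathrm{GL}(n,\mathbb{H})$ and $\mathrm{Sp}(n)$ are groups under the operation of matrix multiplication, defined by (2). However, one has more. Both these groups are real Lie groups. Usually, $\mathrm{GL}(n,\mathbb{H})$ is called the quaternion linear group, and $\mathrm{Sp}(n)$ the compact symplectic group. In fact, $\mathrm{Sp}(n)$ is a compact connected Lie subgroup of $\mathrm{GL}(n,\mathbb{H})$ [10].
The Lie algebras of these two Lie groups are given by
$$ \mathfrak{gl}(n,\mathbb{H}) = M_n(\mathbb{H}) \qquad \mathfrak{sp}(n) = \{ X \in M_n(\mathbb{H}) \,:\, X + X^\dagger = 0 \} \tag{4} $$
with the bracket operation $[X,Y] = XY - YX$. The Lie group exponential is identical to the quaternion matrix exponential
$$ \exp(X) = \sum_{m \geq 0} \frac{X^m}{m!} \tag{5} $$
For $A \in \mathrm{GL}(n,\mathbb{H})$ and $X \in \mathfrak{gl}(n,\mathbb{H})$, let $\mathrm{Ad}(A)\,X = AXA^{-1}$. Then,
$$ \exp(\mathrm{Ad}(A)\,X) = A \exp(X) A^{-1} \tag{6} $$
as can be seen from (5).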
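The step from (5) to (6) can be spelled out as a short worked derivation. It assumes only that $A$ is an invertible quaternion matrix, that $X$ is any quaternion matrix, and that the real scalars $1/m!$ commute with quaternion matrices.

```latex
\begin{aligned}
(A X A^{-1})^m &= A X (A^{-1} A) X \cdots (A^{-1} A) X A^{-1} = A X^m A^{-1}, \\
\exp(A X A^{-1}) &= \sum_{m \geq 0} \frac{(A X A^{-1})^m}{m!}
  = A \Big( \sum_{m \geq 0} \frac{X^m}{m!} \Big) A^{-1}
  = A \exp(X) A^{-1}.
\end{aligned}
```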
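3 The space $\mathcal{Q}_n$ and its Riemannian metric
The space $\mathcal{Q}_n$ consists of all quaternion matrices $S \in M_n(\mathbb{H})$ which verify $S = S^\dagger$ and
$$ \sum_{i,j=1}^{n} \bar{x}_i\, S_{ij}\, x_j > 0 \quad \text{for all non-zero } (x_1,\ldots,x_n) \in \mathbb{H}^n \tag{7} $$
In other words, $\mathcal{Q}_n$ is the space of positive-definite quaternion matrices. Note that, due to the condition $S = S^\dagger$, the sum in (7) is a real number.
Define now the action of $\mathrm{GL}(n,\mathbb{H})$ on $\mathcal{Q}_n$ by $A \cdot S = A S A^\dagger$ for $A \in \mathrm{GL}(n,\mathbb{H})$ and $S \in \mathcal{Q}_n$. This is a left action, and is moreover transitive. Indeed [9], each $S \in \mathcal{Q}_n$ can be diagonalized by some $K \in \mathrm{Sp}(n)$,
$$ S = K \exp(r)\, K^{-1} = \exp(\mathrm{Ad}(K)\, r) \quad ; \quad r \text{ real diagonal matrix} \tag{8} $$
where the second equality follows from (6). Thus, each $S \in \mathcal{Q}_n$ can be written $S = AA^\dagger$ for some $A \in \mathrm{GL}(n,\mathbb{H})$ (for instance $A = K\exp(r/2)$, since $K^{-1} = K^\dagger$), which is the same as $S = A \cdot I$.
For $A \in \mathrm{GL}(n,\mathbb{H})$, note that $A \cdot I = I$ iff $AA^\dagger = I$, which means that $A \in \mathrm{Sp}(n)$. Therefore, as a homogeneous space under the left action of $\mathrm{GL}(n,\mathbb{H})$,
$$ \mathcal{Q}_n = \mathrm{GL}(n,\mathbb{H})/\mathrm{Sp}(n) \tag{9} $$
The space $\mathcal{Q}_n$ is a real differentiable manifold. In fact, if $\mathfrak{p}_n$ is the real vector space of $X \in M_n(\mathbb{H})$ such that $X = X^\dagger$, then it can be shown $\mathcal{Q}_n$ is an open subset of $\mathfrak{p}_n$. Therefore, $\mathcal{Q}_n$ is a manifold, and for each $S \in \mathcal{Q}_n$ the tangent space $T_S\mathcal{Q}_n$ may be identified with $\mathfrak{p}_n$. Moreover, $\mathcal{Q}_n$ can be equipped with a Riemannian metric as follows.
Define on $\mathfrak{p}_n$ the $\mathrm{Sp}(n)$-invariant scalar product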
[TABLE]
For $u, v$ in $T_S\mathcal{Q}_n$, let
[TABLE]
where $A$ is any element of $\mathrm{GL}(n,\mathbb{H})$ such that $S = A \cdot I$.
Proposition 1 (Riemannian metric)
*(i) For each $S \in \mathcal{Q}_n$, formula (11) defines a scalar product on $T_S\mathcal{Q}_n$, which is independent of the choice of $A$.
(ii) Moreover,*
[TABLE]
*which yields a Riemannian metric on $\mathcal{Q}_n$.
(iii) This Riemannian metric is invariant under the action of $\mathrm{GL}(n,\mathbb{H})$ on $\mathcal{Q}_n$.*
The proof of Proposition 1 only requires the fact that (11) is a scalar product on , and application of the rules (3). It is here omitted for lack of space.
4 The metric in polar coordinates
In order to provide analytic expressions in Sections 5 and 6, we now introduce the expression of the Riemannian metric (12) in terms of polar coordinates. For $S \in \mathcal{Q}_n$, the polar coordinates of $S$ are the pair $(r, K)$ appearing in the decomposition (8). It is an abuse of language to call them coordinates, as they are not unique. However, this terminology is natural and used quite often in the literature [5, 6].
The expression of the metric (12) in terms of the polar coordinates is here given in Proposition 2. This requires the following notation. For , let be the quaternion-valued differential form on ,
[TABLE]
Note that, by differentiating the identity , it follows that . Proposition 2 expresses the length element corresponding to the Riemannian metric (12).
Proposition 2 (the metric in polar coordinates)
In terms of the polar coordinates $(r, K)$, the length element corresponding to the Riemannian metric (12) is given by,
[TABLE]
where $r_1, \ldots, r_n$ denote the diagonal elements of the matrix $r$.
The proof of this proposition cannot be given here, due to lack of space.
Proposition 2 is valuable for understanding the Riemannian geometry of the space $\mathcal{Q}_n$. Precisely, it can be used to infer, with almost no calculation, the expressions of geodesics and of distance on this space. Indeed, it becomes clear from (14) that the shortest curve connecting the identity $I$ to a diagonal (and therefore real) element $\exp(r)$ is given by $\gamma(t) = \exp(t\,r)$ for $t \in [0,1]$. Using this simple result, and the fact that the metric (12) is invariant under the action of $\mathrm{GL}(n,\mathbb{H})$ on $\mathcal{Q}_n$, the equation of the minimising geodesic curve connecting two elements $S, Q \in \mathcal{Q}_n$ can be obtained,
$$ \gamma(t) = S^{1/2} \left( S^{-1/2}\, Q\, S^{-1/2} \right)^{t} S^{1/2} \quad ; \quad t \in [0,1] \tag{15} $$
Accordingly, the distance between $S$ and $Q$ is
$$ d(S,Q) = \left\Vert \log\left( S^{-1/2}\, Q\, S^{-1/2} \right) \right\Vert \tag{16} $$
where $\Vert \cdot \Vert$ is the norm corresponding to the scalar product (10).
In (15) and (16) matrix functions, such as elevation to a power and logarithm, are computed via the decomposition (8), where the functions are applied to the diagonal matrix $\exp(r)$.
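The computation of matrix functions through a diagonalisation, as described above, can be sketched in Python for Hermitian positive-definite complex matrices. Two assumptions are made here and are not taken from the paper: the geodesic is written in the standard affine-invariant form $S^{1/2}(S^{-1/2} Q S^{-1/2})^t S^{1/2}$, and the norm is taken to be the square root of `scale` times the sum of squared log-eigenvalues. The function names and the `scale` parameter are hypothetical. A quaternion matrix could be handled through its $2n \times 2n$ complex representation, in which each eigenvalue appears twice, so `scale=0.5` would then correspond to a scalar product of the form $\mathrm{Re}\,\mathrm{tr}(XY)$.

```python
import numpy as np


def herm_fun(S, f):
    """Apply the real function f to a Hermitian matrix through its eigenvalues."""
    w, V = np.linalg.eigh((S + S.conj().T) / 2)  # symmetrise against round-off
    return (V * f(w)) @ V.conj().T


def geodesic(S, Q, t):
    """Point at time t on the curve from S (t = 0) to Q (t = 1)."""
    S_half = herm_fun(S, np.sqrt)
    S_inv_half = herm_fun(S, lambda w: 1.0 / np.sqrt(w))
    M = S_inv_half @ Q @ S_inv_half
    return S_half @ herm_fun(M, lambda w: w ** t) @ S_half


def distance(S, Q, scale=1.0):
    """sqrt(scale * sum of squared log-eigenvalues of S^{-1/2} Q S^{-1/2})."""
    S_inv_half = herm_fun(S, lambda w: 1.0 / np.sqrt(w))
    M = S_inv_half @ Q @ S_inv_half
    w = np.linalg.eigvalsh((M + M.conj().T) / 2)
    return np.sqrt(scale * np.sum(np.log(w) ** 2))
```

For example, with `S` the identity and `Q = np.diag([np.e, np.e**2])`, the log-eigenvalues are 1 and 2, so `distance(S, Q)` is the square root of 5 and the midpoint `geodesic(S, Q, 0.5)` is the diagonal matrix with entries `exp(0.5)` and `e`.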
5 Riemannian Gaussian distributions on $\mathcal{Q}_n$
It is possible to define Riemannian Gaussian distributions on any Riemannian symmetric space of non-positive curvature [6]. This is indeed the case of the space $\mathcal{Q}_n$, as can be seen from its representation (9) as a quotient space, by consulting the tables which classify irreducible Riemannian symmetric spaces of type III [4].
Accordingly, it is possible to define Riemannian Gaussian distributions on $\mathcal{Q}_n$. Precisely, a Riemannian Gaussian distribution on $\mathcal{Q}_n$ with Riemannian barycentre $\bar{S} \in \mathcal{Q}_n$ and dispersion parameter $\sigma > 0$ has the following probability density
$$ p(S \,|\, \bar{S}, \sigma) = \frac{1}{Z(\sigma)} \exp\left[ -\frac{d^{\,2}(S,\bar{S})}{2\sigma^2} \right] \tag{17} $$
with respect to the Riemannian volume element of $\mathcal{Q}_n$, here denoted $dv(S)$. In this probability density, $d(S,\bar{S})$ is the Riemannian distance given by (16).
The first step to understanding this definition is computing the normalising constant $Z(\sigma)$. This is given by the integral,
$$ Z(\sigma) = \int_{\mathcal{Q}_n} \exp\left[ -\frac{d^{\,2}(S,\bar{S})}{2\sigma^2} \right] dv(S) \tag{18} $$
As shown in [6], this does not depend on $\bar{S}$, and therefore it is possible to take $\bar{S} = I$. From the decomposition (8) and formula (16), it follows that
[TABLE]
Given this simple expression, it seems reasonable to pursue the computation of the integral (18) in polar coordinates. This is achieved in the following Proposition 3. For the statement, write the quaternion-valued differential form of (13) as where are real-valued.
Proposition 3 (normalising constant)
(i) In terms of the polar coordinates $(r, K)$, the Riemannian volume element $dv(S)$ corresponding to the Riemannian metric (12) is given by
[TABLE]
(ii) The integral appearing in (18) is given by
[TABLE]
This proposition is a corollary of Proposition 2. Formula (20) is a straightforward consequence of formula (14). Furthermore, (21) is an immediate application of (19) and (20).
6 Sampling and inference
The present section describes two aspects of Riemannian Gaussian distributions on $\mathcal{Q}_n$: i) sampling from these distributions, ii) maximum likelihood estimation of these distributions.
The first of these aspects is given in Proposition 4 below. This relies on the use of the polar coordinates $(r, K)$ which appear in the decomposition (8).
Proposition 4 (Gaussian distribution in polar coordinates)
Let $K$ and $r = (r_1, \ldots, r_n)$ be independent random variables, with their values in $\mathrm{Sp}(n)$ and $\mathbb{R}^n$ respectively. Assume $K$ is uniformly distributed on $\mathrm{Sp}(n)$, and $r$ has the following probability density, with respect to the Lebesgue measure on $\mathbb{R}^n$,
[TABLE]
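If $S$ is given by (8), where the matrix $r$ has diagonal elements $(r_1, \ldots, r_n)$, then $S$ has a Riemannian Gaussian distribution (17) with Riemannian barycentre $I$ and dispersion parameter $\sigma$. Moreover, for any $\bar{S} \in \mathcal{Q}_n$ and $A \in \mathrm{GL}(n,\mathbb{H})$ such that $\bar{S} = A \cdot I$, the matrix $A \cdot S$ has a Riemannian Gaussian distribution with Riemannian barycentre $\bar{S}$ and dispersion parameter $\sigma$.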
Proposition 4 provides a sampling algorithm for Riemannian Gaussian distributions on $\mathcal{Q}_n$. Indeed, the proposition states that in order to obtain $S$ with Riemannian Gaussian distribution of barycentre $\bar{S}$ and dispersion $\sigma$, it is enough to know how to sample from a Riemannian Gaussian distribution with barycentre $I$. In turn, this is done using polar coordinates, through decomposition (8).
In this decomposition, $K$ must be sampled from a uniform distribution on $\mathrm{Sp}(n)$, and $r$ with diagonal elements $(r_1, \ldots, r_n)$ from the multivariate density (22). Sampling from a uniform distribution on $\mathrm{Sp}(n)$ can be achieved as follows: let $Z$ be an $n \times n$ quaternion matrix whose elements are independent normal proper quaternion random variables [11], and write $Z = KP$ for the polar decomposition of $Z$ [9]. Then, $K$ has a uniform distribution on $\mathrm{Sp}(n)$. On the other hand, sampling from the multivariate density (22) can be carried out using a Metropolis-Hastings algorithm, which is included in most statistical software [12].
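The two sampling steps described above can be sketched in Python under stated assumptions. The quaternion matrix is stored through its $2n \times 2n$ complex representation (a standard device, not part of the paper), the unitary polar factor is obtained from a singular value decomposition, and the Metropolis-Hastings step is a plain random-walk variant. The names `chi`, `sample_sp` and `metropolis` are hypothetical, and `log_density` is a user-supplied function standing in for the logarithm of density (22), which is not reproduced here.

```python
import numpy as np


def chi(A1, A2):
    """2n x 2n complex representation of the quaternion matrix A1 + A2 j."""
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])


def sample_sp(n, rng):
    """Unitary polar factor of a Gaussian quaternion matrix, as a 2n x 2n complex matrix."""
    # Four independent real Gaussian components per quaternion entry
    Z1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Z2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    # If Z = U diag(s) Vh then Z = (U Vh)(V diag(s) Vh), so U Vh is the polar factor
    U, _, Vh = np.linalg.svd(chi(Z1, Z2))
    return U @ Vh


def metropolis(log_density, r0, n_steps, step, rng):
    """Random-walk Metropolis-Hastings; returns the state after n_steps moves."""
    r = np.asarray(r0, dtype=float)
    lp = log_density(r)
    for _ in range(n_steps):
        proposal = r + step * rng.standard_normal(r.shape)
        lp_proposal = log_density(proposal)
        # Accept with probability min(1, density ratio); the proposal is symmetric
        if np.log(rng.uniform()) < lp_proposal - lp:
            r, lp = proposal, lp_proposal
    return r
```

A draw with barycentre $I$ would then be assembled as `K @ np.diag(np.exp(np.concatenate([r, r]))) @ K.conj().T` in the complex representation, where each $r_j$ is repeated because eigenvalues appear twice in that representation.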
Consider now maximum likelihood estimation of Riemannian Gaussian distributions. This is given by the following Proposition 5. This proposition brings out the important role of the function $Z(\sigma)$ defined by (18) and (21). Precisely, this is the moment generating function of the Riemannian Gaussian distribution (17). If $\eta = -1/(2\sigma^2)$ and $\psi(\eta) = \log Z(\sigma)$, then $\psi(\eta)$ is a strictly convex function, which is the cumulant generating function of the distribution (17).
Proposition 5 (Maximum likelihood estimation)
Let $S_1, \ldots, S_N$ be independent samples from a Riemannian Gaussian distribution with density (17). Based on these samples, the maximum likelihood estimate of $\bar{S}$ is the sample Riemannian barycentre $\hat{S}_N$,
$$ \hat{S}_N = \mathrm{argmin}_{S \in \mathcal{Q}_n} \sum_{m=1}^{N} d^{\,2}(S_m, S) \tag{23} $$
where the distance $d(S_m, S)$ is given by (16). Moreover, the maximum likelihood estimate of $\sigma$ is $\hat{\sigma}_N$,
[TABLE]
where $\Phi$ is the reciprocal function of $\psi'$, the derivative of $\psi$.
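The reasoning behind Proposition 5 can be sketched as a short derivation. It assumes the density has the form $Z(\sigma)^{-1}\exp[-d^{\,2}(S,\bar{S})/2\sigma^2]$ and uses the parametrisation $\eta = -1/(2\sigma^2)$, $\psi(\eta) = \log Z(\sigma)$. The symbols $\ell$ (log-likelihood) and $\rho$ (mean squared distance to the barycentre) are introduced here for illustration only.

```latex
\begin{aligned}
\ell(\bar{S}, \eta) &= \eta \sum_{m=1}^{N} d^{\,2}(S_m, \bar{S}) \;-\; N\,\psi(\eta),
  \qquad \eta = -\frac{1}{2\sigma^2} < 0, \\
\hat{S}_N &= \mathrm{argmin}_{S} \sum_{m=1}^{N} d^{\,2}(S_m, S)
  \quad \text{(since } \eta < 0 \text{, maximising } \ell \text{ over } \bar{S} \text{ minimises the sum)}, \\
\rho &= \frac{1}{N} \sum_{m=1}^{N} d^{\,2}(S_m, \hat{S}_N),
  \qquad \frac{\partial \ell}{\partial \eta} = 0 \iff \psi'(\hat{\eta}) = \rho, \\
\hat{\eta} &= \Phi(\rho) \text{ with } \Phi = (\psi')^{-1},
  \qquad \hat{\sigma}_N^{2} = -\frac{1}{2\hat{\eta}}.
\end{aligned}
```

Strict convexity of $\psi$ makes $\eta \mapsto \eta\rho - \psi(\eta)$ strictly concave, so the stationary point, when it exists, is the unique maximiser.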
Proposition 5 indicates how the maximum likelihood estimates $\hat{S}_N$ and $\hat{\sigma}_N$ can be computed. First, $\hat{S}_N$ is the sample Riemannian barycentre of $S_1, \ldots, S_N$. Its existence and uniqueness are guaranteed by the fact that $\mathcal{Q}_n$ is a Riemannian manifold of non-positive curvature. In practice, it can be computed using a Riemannian gradient descent algorithm [13, 14]. Once $\hat{S}_N$ has been obtained, $\hat{\sigma}_N$ is found by direct application of (24). This only requires knowledge of the cumulant generating function $\psi$, which can be tabulated using the Monte Carlo method of [15].
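A Riemannian gradient descent for the sample barycentre can be sketched as follows. This is a minimal illustration rather than the algorithm of [13, 14]: it assumes Hermitian positive-definite complex inputs (a quaternion matrix could be passed through its $2n \times 2n$ complex representation), the standard affine-invariant exponential and logarithm maps, a unit step size and an arithmetic-mean starting point. The function names are hypothetical.

```python
import numpy as np


def herm_fun(S, f):
    """Apply the real function f to a Hermitian matrix through its eigenvalues."""
    w, V = np.linalg.eigh((S + S.conj().T) / 2)  # symmetrise against round-off
    return (V * f(w)) @ V.conj().T


def barycentre(samples, n_iter=100, tol=1e-10, step=1.0):
    """Gradient descent on the sum of squared distances to the samples."""
    S = sum(samples) / len(samples)  # arithmetic mean is positive-definite
    for _ in range(n_iter):
        S_half = herm_fun(S, np.sqrt)
        S_inv_half = herm_fun(S, lambda w: 1.0 / np.sqrt(w))
        # Average of the samples mapped to the tangent space at S;
        # this is the descent direction expressed in whitened coordinates
        G = sum(herm_fun(S_inv_half @ X @ S_inv_half, np.log)
                for X in samples) / len(samples)
        # Move along the geodesic starting at S in the direction G
        S = S_half @ herm_fun(step * G, np.exp) @ S_half
        if np.linalg.norm(G) < tol:
            break
    return S
```

For two commuting samples such as `np.diag([1.0, 4.0])` and `np.diag([4.0, 1.0])`, the result is their geometric mean, twice the identity, which differs from the arithmetic mean used as the starting point.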
[1] Pennec, X.: Intrinsic statistics on Riemannian manifolds: basic tools for geometric measurements. J. Math. Imaging Vis. 25 (1) (2006) 127–154
[2] Chebbi, Z., Moakher, M.: Means of Hermitian positive-definite matrices based on the log-determinant alpha-divergence function. Linear Algebra Appl. 436 (7) (2012) 1872–1889
[3] Helgason, S.: Differential geometry, Lie groups, and symmetric spaces. American Mathematical Society (2001)
[4] Besse, A.L.: Einstein manifolds, (first edition). Springer Verlag (2007)
[5] Said, S., Bombrun, L., Berthoumieu, Y., Manton, J.H.: Riemannian Gaussian distributions on the space of symmetric positive definite matrices (accepted). IEEE Trans. Inf. Theory (2016)
[6] Said, S., Hajri, H., Bombrun, L., Vemuri, B.C.: Gaussian distributions on Riemannian symmetric spaces: statistical learning with structured covariance matrices (under review). IEEE Trans. Inf. Theory (2017)
[7] Flamant, J., Le Bihan, N., Chainais, P.: Time-frequency analysis of bivariate signals (under review). Applied and Computational Harmonic Analysis (2017)
[8] Conway, J.H., Smith, D.A.: On quaternions and octonions, their geometry, arithmetic and symmetry. CRC Press (2003)
