Riemannian Gaussian distributions on the space of positive-definite quaternion matrices

Salem Said; Nicolas Le Bihan; Jonathan H. Manton

arXiv:1703.09940·math.ST·March 30, 2017·GSI


TL;DR

This paper extends Riemannian Gaussian distributions to positive-definite quaternion matrices by developing their geometric properties, providing formulas, sampling methods, and inference techniques for this new space.

Contribution

It introduces the Riemannian geometry of positive-definite quaternion matrices and formulates Gaussian distributions on this space, including density, sampling, and inference methods.

Findings

1. Derived the Riemannian metric and geodesics for quaternion matrices
2. Provided explicit formulas for Riemannian Gaussian densities
3. Developed sampling algorithms and statistical inference methods

Abstract

Recently, Riemannian Gaussian distributions were defined on spaces of positive-definite real and complex matrices. The present paper extends this definition to the space of positive-definite quaternion matrices. In order to do so, it develops the Riemannian geometry of the space of positive-definite quaternion matrices, which is shown to be a Riemannian symmetric space of non-positive curvature. The paper gives original formulae for the Riemannian metric of this space, its geodesics, and distance function. Then, it develops the theory of Riemannian Gaussian distributions, including the exact expression of their probability density, their sampling algorithm and statistical inference.

Equations

$$\mathcal{P}_{n}=\mathrm{GL}(n,\mathbb{R})/\mathrm{O}(n) \qquad \mathcal{H}_{n}=\mathrm{GL}(n,\mathbb{C})/\mathrm{U}(n)$$

$$\mathcal{Q}_{n}=\mathrm{GL}(n,\mathbb{H})/\mathrm{Sp}(n)$$

$$\mathrm{i}^{2}=\mathrm{j}^{2}=\mathrm{k}^{2}=\mathrm{i}\mathrm{j}\mathrm{k}=-1$$

$$C_{ij}=\sum_{l=1}^{n}A_{il}B_{lj}$$

$$(AB)^{-1}=B^{-1}A^{-1} \qquad (AB)^{\dagger}=B^{\dagger}A^{\dagger} \qquad \mathrm{Re}\,\mathrm{tr}(AB)=\mathrm{Re}\,\mathrm{tr}(BA)$$

$$\mathfrak{gl}(n,\mathbb{H})=M_{n}(\mathbb{H}) \qquad \mathfrak{sp}(n)=\{X\in\mathfrak{gl}(n,\mathbb{H})\,|\,X+X^{\dagger}=0\}$$

$$\exp(X)=\sum_{m\geq 0}\frac{X^{m}}{m!} \qquad X\in\mathfrak{gl}(n,\mathbb{H})$$

$$A\exp(X)A^{-1}=\exp(\mathrm{Ad}(A)\cdot X)$$

$$\sum_{i,j=1}^{n}\bar{x}_{i}\,S_{ij}\,x_{j}>0 \quad \text{for all non-zero } (x_{1},\ldots,x_{n})\in\mathbb{H}^{n}$$

$$S=K\exp(R)K^{-1}=\exp(\mathrm{Ad}(K)\cdot R)\,; \quad R \text{ real diagonal matrix}$$

$$\mathcal{Q}_{n}=\mathrm{GL}(n,\mathbb{H})/\mathrm{Sp}(n)$$

$$\langle X\,|\,Y\rangle=\mathrm{Re}\,\mathrm{tr}(XY^{\dagger}) \qquad X,Y\in\mathfrak{gl}(n,\mathbb{H})$$

$$(u,v)_{S}=\left\langle (A^{-1})\,u\,(A^{-1})^{\dagger}\,\middle|\,(A^{-1})\,v\,(A^{-1})^{\dagger}\right\rangle$$

$$(u,v)_{S}=\mathrm{Re}\,\mathrm{tr}\left(S^{-1}u\,S^{-1}v\right)$$

$$\theta_{ij}(K)=\sum_{l=1}^{n}K^{\dagger}_{il}\,dK_{lj}$$

$$ds^{2}(R,K)=\sum_{i=1}^{n}dr_{i}^{2}+8\sum_{i<j}\sinh^{2}\left(|r_{i}-r_{j}|/2\right)|\theta_{ij}|^{2}$$

$$\gamma(t)=S^{\frac{1}{2}}\left(S^{-\frac{1}{2}}\,Q\,S^{-\frac{1}{2}}\right)^{t}S^{\frac{1}{2}}$$

$$d(S,Q)=\left\|\log\left(S^{-\frac{1}{2}}\,Q\,S^{-\frac{1}{2}}\right)\right\|$$

$$p(S\,|\,\breve{S},\sigma)=\frac{1}{Z(\sigma)}\exp\left[-\frac{d^{2}(S,\breve{S})}{2\sigma^{2}}\right]$$

$$Z(\sigma)=\int_{\mathcal{Q}_{n}}\exp\left[-\frac{d^{2}(S,\breve{S})}{2\sigma^{2}}\right]dv(S)$$

$$d^{2}(S,I)=\sum_{i=1}^{n}r_{i}^{2}$$

$$dv(R,K)=8^{n(n-1)}\prod_{i<j}\sinh^{4}\left(|r_{i}-r_{j}|/2\right)\prod_{i=1}^{n}dr_{i}\,\bigwedge_{i<j}\theta^{a}_{ij}\bigwedge_{i<j}\theta^{b}_{ij}\bigwedge_{i<j}\theta^{c}_{ij}\bigwedge_{i<j}\theta^{d}_{ij}$$

$$Z(\sigma)=\mathrm{Const.}\times\int_{\mathbb{R}^{n}}\exp\left(-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}r_{i}^{2}\right)\prod_{i<j}\sinh^{4}\left(|r_{i}-r_{j}|/2\right)\prod_{i=1}^{n}dr_{i}$$

$$p(r_{1},\ldots,r_{n})\propto\exp\left(-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}r_{i}^{2}\right)\prod_{i<j}\sinh^{4}\left(|r_{i}-r_{j}|/2\right)$$

$$\hat{S}_{N}=\mathrm{argmin}_{S\in\mathcal{Q}_{n}}\sum_{i=1}^{N}d^{2}(S_{i},S)$$

$$\hat{\eta}_{N}=\left(\psi^{\prime}\right)^{-1}\left(\frac{1}{N}\sum_{i=1}^{N}d^{2}(S_{i},\hat{S}_{N})\right)$$


Taxonomy

Topics: Morphological variations and asymmetry · Statistical Mechanics and Entropy · Random Matrices and Applications

Full text

Salem Said (1), Nicolas Le Bihan (2), Jonathan H. Manton (3)

1. Laboratoire IMS (CNRS - UMR 5218), 2. Gipsa-lab (CNRS - UMR 5216), 3. The University of Melbourne, Dept. of Electrical and Electronic Engineering


Keywords: Riemannian Gaussian distribution, quaternion, positive-definite matrix, symplectic group, Riemannian barycentre

1 Introduction

The Riemannian geometry of the spaces $\mathcal{P}_{n}$ and $\mathcal{H}_{n}\,$ , respectively of $n\times n$ positive-definite real and complex matrices, is well-known to the information science community [1, 2]. These spaces have the property of being Riemannian symmetric spaces of non-positive curvature [3, 4],

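$$\mathcal{P}_{n}=\mathrm{GL}(n,\mathbb{R})/\mathrm{O}(n) \qquad \mathcal{H}_{n}=\mathrm{GL}(n,\mathbb{C})/\mathrm{U}(n)$$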

where $\mathrm{GL}(n,\mathbb{R})$ and $\mathrm{GL}(n,\mathbb{C})$ denote the real and complex linear groups, and $\mathrm{O}(n)$ and $\mathrm{U}(n)$ the orthogonal and unitary groups. Using this property, Riemannian Gaussian distributions were recently introduced on $\mathcal{P}_{n}$ and $\mathcal{H}_{n}$ [5, 6]. The present paper introduces the Riemannian geometry of the space $\mathcal{Q}_{n}$ of $n\times n$ positive-definite quaternion matrices, which is also a Riemannian symmetric space of non-positive curvature [4],

$$\mathcal{Q}_{n}=\mathrm{GL}(n,\mathbb{H})/\mathrm{Sp}(n)$$

where $\mathrm{GL}(n,\mathbb{H})$ denotes the quaternion linear group, and $\mathrm{Sp}(n)$ the compact symplectic group. It then studies Riemannian Gaussian distributions on $\mathcal{Q}_{n}$. The main results are the following: Proposition 1 gives the Riemannian metric of the space $\mathcal{Q}_{n}$, Proposition 2 expresses this metric in terms of polar coordinates on the space $\mathcal{Q}_{n}$, Proposition 3 uses Proposition 2 to compute the moment generating function of a Riemannian Gaussian distribution on $\mathcal{Q}_{n}$, and Propositions 4 and 5 describe the sampling algorithm and maximum likelihood estimation of Riemannian Gaussian distributions on $\mathcal{Q}_{n}$. Motivation for studying matrices from $\mathcal{Q}_{n}$ comes from their potential use in multidimensional bivariate signal processing [7].

2 Quaternion matrices, $\mathrm{GL}(\mathbb{H})$ and $\mathrm{Sp}(n)$

Recall the non-commutative division algebra of quaternions, denoted $\mathbb{H}$ , is made up of elements $q=q_{0}\,+\,q_{1}\,\mathrm{i}+\,q_{2}\,\mathrm{j}+\,q_{3}\,\mathrm{k}$ where $q_{0},q_{1},q_{2},q_{3}\in\mathbb{R}$ , and the imaginary units $\mathrm{i},\mathrm{j},\mathrm{k}$ satisfy the relations [8]

$$\mathrm{i}^{2}=\mathrm{j}^{2}=\mathrm{k}^{2}=\mathrm{i}\mathrm{j}\mathrm{k}=-1 \tag{1}$$

The real part of $q$ is $\mathrm{Re}(q)=q_{0}\,$ , its conjugate is $\bar{q}=q_{0}\,-\,q_{1}\,\mathrm{i}-\,q_{2}\,\mathrm{j}-\,q_{3}\,\mathrm{k}$ and its squared norm is $|q|^{2}=q\bar{q}\,$ . The multiplicative inverse of $q\neq 0$ is given by $q^{-1}=\bar{q}/|q|^{2}\,$ .
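The definitions above can be checked numerically with a minimal sketch. It assumes a quaternion is stored as a NumPy array `[q0, q1, q2, q3]`; the helper names `qmul`, `qconj`, `qnorm2` and `qinv` are illustrative and do not come from the paper.

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions stored as [q0, q1, q2, q3]."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return np.array([
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p0 * q1 + p1 * q0 + p2 * q3 - p3 * q2,
        p0 * q2 - p1 * q3 + p2 * q0 + p3 * q1,
        p0 * q3 + p1 * q2 - p2 * q1 + p3 * q0,
    ])

def qconj(q):
    """Conjugate: flip the sign of the three imaginary components."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def qnorm2(q):
    """Squared norm |q|^2 = q * conj(q), which is a real number."""
    return float(np.dot(q, q))

def qinv(q):
    """Multiplicative inverse of a non-zero quaternion."""
    return qconj(q) / qnorm2(q)

# Imaginary units
i = np.array([0.0, 1.0, 0.0, 0.0])
j = np.array([0.0, 0.0, 1.0, 0.0])
k = np.array([0.0, 0.0, 0.0, 1.0])
```

With these helpers, `qmul(i, j)` and `qmul(j, i)` differ by a sign, which is the non-commutativity referred to in the text.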

The set $M_{n}(\mathbb{H})$ consists of $n\times n$ quaternion matrices $A$ [9]. These are arrays $A=(A_{ij}\,;\,i,j=1,\ldots,n)$ where $A_{ij}\in\mathbb{H}$ . The product $C=AB$ of $A,B\in M_{n}(\mathbb{H})$ is the element of $M_{n}(\mathbb{H})$ with

$$C_{ij}=\sum_{l=1}^{n}A_{il}B_{lj} \tag{2}$$

A quaternion matrix $A$ is said to be invertible if it has a multiplicative inverse $A^{-1}$ with $AA^{-1}=A^{-1}A=I$ where $I$ is the identity matrix. The conjugate-transpose of $A$ is $A^{\dagger}$, which is the quaternion matrix with $A^{\dagger}_{ij}=\bar{A}_{ji}$.

The rules for computing with quaternion matrices are quite different from the rules for computing with real or complex matrices [9]. For example, in general, $\mathrm{tr}(AB)\neq\mathrm{tr}(BA)$ , and $(AB)^{T}\neq B^{\,T}A^{T}$ where T denotes the transpose. For the results in this paper, only the following rules are needed [9],

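$$(AB)^{-1}=B^{-1}A^{-1} \qquad (AB)^{\dagger}=B^{\dagger}A^{\dagger} \qquad \mathrm{Re}\,\mathrm{tr}(AB)=\mathrm{Re}\,\mathrm{tr}(BA) \tag{3}$$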
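The contrast between $\mathrm{tr}(AB)\neq\mathrm{tr}(BA)$ and the rules that do hold can be illustrated with a short sketch. It assumes a quaternion matrix $A=A_{1}+A_{2}\,\mathrm{j}$ is stored as a pair of complex matrices `(A1, A2)`, a representation chosen here for convenience and not taken from the paper; all function names are hypothetical.

```python
import numpy as np

def qmat_random(n, rng):
    """Random quaternion matrix A = A1 + A2 j, stored as a pair of complex matrices."""
    A1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return A1, A2

def qmat_mul(A, B):
    """Matrix product, using j z = conj(z) j for complex z."""
    A1, A2 = A
    B1, B2 = B
    return A1 @ B1 - A2 @ B2.conj(), A1 @ B2 + A2 @ B1.conj()

def qmat_dagger(A):
    """Conjugate-transpose: (A1 + A2 j)^dagger = A1^H - A2^T j."""
    A1, A2 = A
    return A1.conj().T, -A2.T

def qmat_trace(A):
    """Quaternion-valued trace as (complex part, j part)."""
    A1, A2 = A
    return np.trace(A1), np.trace(A2)

rng = np.random.default_rng(0)
A, B = qmat_random(3, rng), qmat_random(3, rng)
tr_AB = qmat_trace(qmat_mul(A, B))
tr_BA = qmat_trace(qmat_mul(B, A))
# The real part of the trace is the real part of the first component.
re_tr_AB, re_tr_BA = tr_AB[0].real, tr_BA[0].real
```

For generic random matrices, `tr_AB` and `tr_BA` differ as quaternions while `re_tr_AB` and `re_tr_BA` agree up to rounding.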

$\mathrm{GL}(n,\mathbb{H})$ consists of the set of invertible quaternion matrices $A\in M_{n}(\mathbb{H})$ . The subset of $A\in\mathrm{GL}(n,\mathbb{H})$ such that $A^{-1}=A^{\dagger}$ is denoted $\mathrm{Sp}(n)\subset\mathrm{GL}(n,\mathbb{H})$ .

It follows from (3) that $\mathrm{GL}(n,\mathbb{H})$ and $\mathrm{Sp}(n)$ are groups under the operation of matrix multiplication, defined by (2). More is true: both of these groups are real Lie groups. Usually, $\mathrm{GL}(n,\mathbb{H})$ is called the quaternion linear group, and $\mathrm{Sp}(n)$ the compact symplectic group. In fact, $\mathrm{Sp}(n)$ is a compact connected Lie subgroup of $\mathrm{GL}(n,\mathbb{H})$ [10].

The Lie algebras of these two Lie groups are given by

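$$\mathfrak{gl}(n,\mathbb{H})=M_{n}(\mathbb{H}) \qquad \mathfrak{sp}(n)=\{X\in\mathfrak{gl}(n,\mathbb{H})\,|\,X+X^{\dagger}=0\} \tag{4}$$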

with the bracket operation $[X,Y]=XY-YX$ . The Lie group exponential is identical to the quaternion matrix exponential

$$\exp(X)=\sum_{m\geq 0}\frac{X^{m}}{m!} \qquad X\in\mathfrak{gl}(n,\mathbb{H}) \tag{5}$$

For $A\in\mathrm{GL}(n,\mathbb{H})$ and $X\in\mathfrak{gl}(n,\mathbb{H})$ , let $\mathrm{Ad}(A)\cdot X=AXA^{-1}\,$ . Then,

$$A\exp(X)A^{-1}=\exp(\mathrm{Ad}(A)\cdot X) \tag{6}$$

as can be seen from (5).
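For completeness, the step from the series (5) to this identity can be written out. The real coefficients $1/m!$ commute with quaternion matrices, and in the product $(AXA^{-1})^{m}$ every inner factor $A^{-1}A$ cancels, so that $AX^{m}A^{-1}=(AXA^{-1})^{m}$. Therefore,

$$A\exp(X)A^{-1}=\sum_{m\geq 0}\frac{A\,X^{m}A^{-1}}{m!}=\sum_{m\geq 0}\frac{(AXA^{-1})^{m}}{m!}=\exp(\mathrm{Ad}(A)\cdot X)$$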

3 The space $\mathcal{Q}_{n}$ and its Riemannian metric

The space $\mathcal{Q}_{n}$ consists of all quaternion matrices $S\in M_{n}(\mathbb{H})$ which verify $S=S^{\dagger}$ and

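$$\sum_{i,j=1}^{n}\bar{x}_{i}\,S_{ij}\,x_{j}>0 \quad \text{for all non-zero } (x_{1},\ldots,x_{n})\in\mathbb{H}^{n} \tag{7}$$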

In other words, $\mathcal{Q}_{n}$ is the space of positive-definite quaternion matrices. Note that, due to the condition $S=S^{\dagger}$ , the sum in (7) is a real number.

Define now the action of $\mathrm{GL}(n,\mathbb{H})$ on $\mathcal{Q}_{n}$ by $A\cdot S=ASA^{\dagger}$ for $A\in\mathrm{GL}(n,\mathbb{H})$ and $S\in\mathcal{Q}_{n}\,$ . This is a left action, and is moreover transitive. Indeed [9], each $S\in\mathcal{Q}_{n}$ can be diagonalized by some $K\in\mathrm{Sp}(n)$ ,

$$S=K\exp(R)K^{-1}=\exp(\mathrm{Ad}(K)\cdot R)\,; \quad R \text{ real diagonal matrix} \tag{8}$$

where the second equality follows from (6). Thus, each $S\in\mathcal{Q}_{n}$ can be written $S=AA^{\dagger}$ for some $A\in\mathrm{GL}(n,\mathbb{H})$ , which is the same as $S=A\cdot I$ .

For $A\in\mathrm{GL}(n,\mathbb{H})$ , note that $A\cdot I=I$ iff $AA^{\dagger}=I$ , which means that $A\in\mathrm{Sp}(n)$ . Therefore, as a homogeneous space under the left action of $\mathrm{GL}(n,\mathbb{H})$ ,

$$\mathcal{Q}_{n}=\mathrm{GL}(n,\mathbb{H})/\mathrm{Sp}(n) \tag{9}$$

The space $\mathcal{Q}_{n}$ is a real differentiable manifold. In fact, if $\mathfrak{p}_{n}$ is the real vector space of $X\in\mathfrak{gl}(n,\mathbb{H})$ such that $X=X^{\dagger}$ , then it can be shown $\mathcal{Q}_{n}$ is an open subset of $\mathfrak{p}_{n}\,$ . Therefore, $\mathcal{Q}_{n}$ is a manifold, and for each $S\in\mathcal{Q}_{n}$ the tangent space $T_{S}\mathcal{Q}_{n}$ may be identified with $\mathfrak{p}_{n}$ . Moreover, $\mathcal{Q}_{n}$ can be equipped with a Riemannian metric as follows.

Define on $\mathfrak{gl}(n,\mathbb{H})$ the $\mathrm{Sp}(n)$ -invariant scalar product

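$$\langle X\,|\,Y\rangle=\mathrm{Re}\,\mathrm{tr}(XY^{\dagger}) \qquad X,Y\in\mathfrak{gl}(n,\mathbb{H}) \tag{10}$$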

For $u,v$ in $T_{S}\mathcal{Q}_{n}\simeq\mathfrak{p}_{n}\,$ , let

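$$(u,v)_{S}=\left\langle (A^{-1})\,u\,(A^{-1})^{\dagger}\,\middle|\,(A^{-1})\,v\,(A^{-1})^{\dagger}\right\rangle \tag{11}$$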

where $A$ is any element of $\mathrm{GL}(n,\mathbb{H})$ such that $S=A\cdot I\,$ .

Proposition 1 (Riemannian metric)

(i) For each $S\in\mathcal{Q}_{n}$, formula (11) defines a scalar product on $T_{S}\mathcal{Q}_{n}\simeq\mathfrak{p}_{n}$, which is independent of the choice of $A$.

(ii) Moreover,

$$(u,v)_{S}=\mathrm{Re}\,\mathrm{tr}\left(S^{-1}u\,S^{-1}v\right) \tag{12}$$

which yields a Riemannian metric on $\mathcal{Q}_{n}$.

(iii) This Riemannian metric is invariant under the action of $\mathrm{GL}(n,\mathbb{H})$ on $\mathcal{Q}_{n}$.

The proof of Proposition 1 only requires the fact that (10) is a scalar product on $\mathfrak{p}_{n}$, and application of the rules (3). It is omitted here for lack of space.
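Although the proof is omitted, the invariance claimed in Proposition 1 (iii) can be checked numerically. The sketch below is not the authors' method: it assumes the complex adjoint representation, in which a quaternion matrix $A_{1}+A_{2}\,\mathrm{j}$ is mapped to the $2n\times 2n$ complex matrix with blocks $A_{1}$, $A_{2}$, $-\bar{A}_{2}$, $\bar{A}_{1}$. This map respects products and conjugate-transposes, and halving the complex trace recovers $\mathrm{Re}\,\mathrm{tr}$. The names `chi`, `rand_q`, `re_tr` and `metric` are hypothetical.

```python
import numpy as np

def chi(A1, A2):
    """Complex adjoint of the quaternion matrix A1 + A2 j (assumed representation)."""
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])

def rand_q(n, rng):
    """Embedded random quaternion matrix, invertible with probability one."""
    A1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return chi(A1, A2)

def re_tr(X):
    """Re tr of a quaternion matrix, read off its 2n x 2n complex adjoint."""
    return 0.5 * np.trace(X).real

def metric(S, u, v):
    """Riemannian metric: (u, v)_S = Re tr(S^{-1} u S^{-1} v)."""
    S_inv = np.linalg.inv(S)
    return re_tr(S_inv @ u @ S_inv @ v)

rng = np.random.default_rng(1)
n = 3
B = rand_q(n, rng)
S = B @ B.conj().T                    # S = B B^dagger is positive-definite
U = rand_q(n, rng)
u = U + U.conj().T                    # tangent vector: X = X^dagger
V = rand_q(n, rng)
v = V + V.conj().T
A = rand_q(n, rng)                    # generic element of GL(n, H)

def act(X):
    """Action A . X = A X A^dagger, applied to points and tangent vectors alike."""
    return A @ X @ A.conj().T

before = metric(S, u, v)
after = metric(act(S), act(u), act(v))
```

Up to rounding, `before` and `after` agree, and `metric(S, u, u)` is positive for non-zero `u`.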

4 The metric in polar coordinates

In order to provide analytic expressions in Sections 5 and 6, we now introduce the expression of the Riemannian metric (12) in terms of polar coordinates. For $S\in\mathcal{Q}_{n}$ , the polar coordinates of $S$ are the pair $(R,K)$ appearing in the decomposition (8). It is an abuse of language to call them coordinates, as they are not unique. However, this terminology is natural and used quite often in the literature [5, 6].

The expression of the metric (12) in terms of the polar coordinates $(R,K)$ is here given in Proposition 2. This requires the following notation. For $i,j=1\,,\ldots,\,n$ , let $\theta_{ij}$ be the quaternion-valued differential form on $\mathrm{Sp}(n)$ ,

$$\theta_{ij}(K)=\sum_{l=1}^{n}K^{\dagger}_{il}\,dK_{lj} \tag{13}$$

Note that, by differentiating the identity $K^{\dagger}K=I$ , it follows that $\theta_{ij}=-\bar{\theta}_{ji}\,$ . Proposition 2 expresses the length element corresponding to the Riemannian metric (12).

Proposition 2 (the metric in polar coordinates)

In terms of the polar coordinates $(R,K)$ , the length element corresponding to the Riemannian metric (12) is given by,

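$$ds^{2}(R,K)=\sum_{i=1}^{n}dr_{i}^{2}+8\sum_{i<j}\sinh^{2}\left(|r_{i}-r_{j}|/2\right)|\theta_{ij}|^{2} \tag{14}$$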

where $r_{i}$ denote the diagonal elements of the matrix $R$ .

The proof of this proposition cannot be given here, due to lack of space.

Proposition 2 is valuable to understanding the Riemannian geometry of the space $\mathcal{Q}_{n}$ . Precisely, it can be used to infer, with almost no calculation, the expressions of geodesics and of distance, on this space. Indeed, it becomes clear from (14) that the shortest curve connecting the identity $I\in\mathcal{Q}_{n}$ to a diagonal (and therefore real) element $a\in\mathcal{Q}_{n}$ , is given by $t\mapsto a^{t}$ for $t\in[0,1]$ . Using this simple result, and the fact that the metric (12) is invariant under the action of $\mathrm{GL}(n,\mathbb{H})$ on $\mathcal{Q}_{n}$ , the equation of the minimising geodesic curve $\gamma(t)$ connecting two elements $S,Q\in\mathcal{Q}_{n}$ can be obtained,

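$$\gamma(t)=S^{\frac{1}{2}}\left(S^{-\frac{1}{2}}\,Q\,S^{-\frac{1}{2}}\right)^{t}S^{\frac{1}{2}} \tag{15}$$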

Accordingly, the distance between $S$ and $Q$ is

$$d(S,Q)=\left\|\log\left(S^{-\frac{1}{2}}\,Q\,S^{-\frac{1}{2}}\right)\right\| \tag{16}$$

where $\|\cdot\|$ is the norm corresponding to the scalar product (10).

In (15) and (16), matrix functions such as raising to a power and the logarithm are computed via the decomposition (8), by applying the function to the diagonal matrix $\exp(R)$.
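The geodesic and distance formulas can be sketched in code under one assumption that is not in the paper: positive-definite quaternion matrices are handled through their $2n\times 2n$ complex adjoint, which is Hermitian positive-definite with every eigenvalue of the quaternion matrix appearing twice. The factor $1/2$ in `distance` compensates for this doubling. The names `chi`, `herm_fun`, `geodesic` and `distance` are illustrative.

```python
import numpy as np

def chi(A1, A2):
    """Complex adjoint of the quaternion matrix A1 + A2 j (assumed representation)."""
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])

def herm_fun(S, f):
    """Apply f to the eigenvalues of a Hermitian matrix, keeping its eigenvectors."""
    w, V = np.linalg.eigh(S)
    return (V * f(w)) @ V.conj().T

def geodesic(S, Q, t):
    """Point at time t on the curve S^{1/2} (S^{-1/2} Q S^{-1/2})^t S^{1/2}."""
    S_half = herm_fun(S, np.sqrt)
    S_inv_half = herm_fun(S, lambda w: 1.0 / np.sqrt(w))
    M = S_inv_half @ Q @ S_inv_half
    return S_half @ herm_fun(M, lambda w: w ** t) @ S_half

def distance(S, Q):
    """Norm of log(S^{-1/2} Q S^{-1/2}); 0.5 undoes the eigenvalue doubling."""
    S_inv_half = herm_fun(S, lambda w: 1.0 / np.sqrt(w))
    M = S_inv_half @ Q @ S_inv_half
    M = 0.5 * (M + M.conj().T)        # remove rounding asymmetry
    w = np.linalg.eigvalsh(M)
    return np.sqrt(0.5 * np.sum(np.log(w) ** 2))

# Two example points, built as B B^dagger from random quaternion matrices.
rng = np.random.default_rng(2)
n = 2
def rand_point():
    B = chi(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)),
            rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
    return B @ B.conj().T
S, Q = rand_point(), rand_point()
midpoint = geodesic(S, Q, 0.5)
```

As a sanity check, `geodesic(S, Q, 0)` and `geodesic(S, Q, 1)` recover the endpoints, and `distance(S, midpoint)` is half of `distance(S, Q)`.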

5 Riemannian Gaussian distributions on $\mathcal{Q}_{n}$

It is possible to define Riemannian Gaussian distributions on any Riemannian symmetric space of non-positive curvature [6]. This is indeed the case of the space $\mathcal{Q}_{n}\,$ , as can be seen from its representation (9) as a quotient space, by consulting the tables which classify irreducible Riemannian symmetric spaces of type III [4].

Accordingly, it is possible to define Riemannian Gaussian distributions on $\mathcal{Q}_{n}$ . Precisely, a Riemannian Gaussian distribution on $\mathcal{Q}_{n}$ with Riemannian barycentre $\breve{S}\in\mathcal{Q}_{n}$ and dispersion parameter $\sigma>0$ has the following probability density

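$$p(S\,|\,\breve{S},\sigma)=\frac{1}{Z(\sigma)}\exp\left[-\frac{d^{2}(S,\breve{S})}{2\sigma^{2}}\right] \tag{17}$$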

with respect to the Riemannian volume element of $\mathcal{Q}_{n}$ , here denoted $dv$ . In this probability density, $d(S,\,\breve{S})$ is the Riemannian distance given by (16).

The first step to understanding this definition is computing the normalising constant $Z(\sigma)$ . This is given by the integral,

$$Z(\sigma)=\int_{\mathcal{Q}_{n}}\exp\left[-\frac{d^{2}(S,\breve{S})}{2\sigma^{2}}\right]dv(S) \tag{18}$$

As shown in [6], this does not depend on $\breve{S}$, and therefore it is possible to take $\breve{S}=I$. From the decomposition (8) and formula (16), it follows that

$$d^{2}(S,I)=\sum_{i=1}^{n}r_{i}^{2} \tag{19}$$
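The intermediate steps are short. Since $K\in\mathrm{Sp}(n)$ satisfies $K^{-1}=K^{\dagger}$, the logarithm of $S=K\exp(R)K^{-1}$ is $KRK^{\dagger}$, which is equal to its own conjugate-transpose because $R$ is real and diagonal. Using the scalar product $\langle X\,|\,Y\rangle=\mathrm{Re}\,\mathrm{tr}(XY^{\dagger})$ and the rule $\mathrm{Re}\,\mathrm{tr}(AB)=\mathrm{Re}\,\mathrm{tr}(BA)$,

$$d^{2}(S,I)=\left\|KRK^{\dagger}\right\|^{2}=\mathrm{Re}\,\mathrm{tr}\left(KRK^{\dagger}\,KRK^{\dagger}\right)=\mathrm{Re}\,\mathrm{tr}\left(KR^{2}K^{\dagger}\right)=\mathrm{Re}\,\mathrm{tr}\left(R^{2}K^{\dagger}K\right)=\mathrm{tr}\left(R^{2}\right)=\sum_{i=1}^{n}r_{i}^{2}$$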

Given this simple expression, it seems reasonable to pursue the computation of the integral (18) in polar coordinates. This is achieved in the following Proposition 3. For the statement, write the quaternion-valued differential form $\theta_{ij}$ of (13) as $\theta_{ij}=\theta^{a}_{ij}+\theta^{b}_{ij}\,\mathrm{i}+\theta^{c}_{ij}\,\mathrm{j}+\theta^{d}_{ij}\,\mathrm{k}$ where $\theta^{a}_{ij},\theta^{b}_{ij},\theta^{c}_{ij},\theta^{d}_{ij}$ are real-valued.

Proposition 3 (normalising constant)

(i) In terms of the polar coordinates $(R,K)$ , the Riemannian volume element $dv(S)$ corresponding to the Riemannian metric (12) is given by

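$$dv(R,K)=8^{n(n-1)}\prod_{i<j}\sinh^{4}\left(|r_{i}-r_{j}|/2\right)\prod_{i=1}^{n}dr_{i}\,\bigwedge_{i<j}\theta^{a}_{ij}\bigwedge_{i<j}\theta^{b}_{ij}\bigwedge_{i<j}\theta^{c}_{ij}\bigwedge_{i<j}\theta^{d}_{ij} \tag{20}$$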

(ii) The integral $Z(\sigma)$ appearing in (18) is given by

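$$Z(\sigma)=\mathrm{Const.}\times\int_{\mathbb{R}^{n}}\exp\left(-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}r_{i}^{2}\right)\prod_{i<j}\sinh^{4}\left(|r_{i}-r_{j}|/2\right)\prod_{i=1}^{n}dr_{i} \tag{21}$$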

This proposition is a corollary of Proposition 2. Formula (20) is a straightforward consequence of formula (14). Furthermore, (21) is an immediate application of (19) and (20).
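The integral over $\mathbb{R}^{n}$ in Proposition 3 (ii) can be approximated with a naive Monte Carlo sketch. This is an illustration only, not a method from the paper or its references: the Gaussian factor is treated as an unnormalised normal density with variance $\sigma^{2}$, so the integral equals $(2\pi\sigma^{2})^{n/2}$ times the expectation of the $\sinh^{4}$ product. The unspecified constant factor is left out, and the function name and its parameters are hypothetical.

```python
import numpy as np

def z_over_const(sigma, n, num_samples=400_000, seed=0):
    """Monte Carlo estimate of Z(sigma) / Const. for n x n matrices."""
    rng = np.random.default_rng(seed)
    # Draw r with independent N(0, sigma^2) components.
    r = sigma * rng.standard_normal((num_samples, n))
    # All index pairs i < j.
    iu, ju = np.triu_indices(n, k=1)
    gaps = np.abs(r[:, iu] - r[:, ju])
    # Product over i < j of sinh^4(|r_i - r_j| / 2), one value per sample.
    weights = np.prod(np.sinh(gaps / 2.0) ** 4, axis=1)
    # Multiply back the normalising factor of the Gaussian density.
    return (2.0 * np.pi * sigma ** 2) ** (n / 2.0) * weights.mean()
```

The variance of this estimator grows quickly with $\sigma$ and $n$, because the $\sinh^{4}$ factor is dominated by rare large gaps, so the sketch is only reasonable for small values of both.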

6 Sampling and inference

The present section describes two aspects of Riemannian Gaussian distributions on $\mathcal{Q}_{n}$ : i) sampling from these distributions, ii) maximum likelihood estimation of these distributions.

The first of these aspects is given in Proposition 4 below. This relies on the use of polar coordinates $(R,K)$ which appear in the decomposition (8).

Proposition 4 (Gaussian distribution in polar coordinates)

Let $K$ and $r$ be independent random variables, with their values in $\mathrm{Sp}(n)$ and $\mathbb{R}^{n}$ respectively. Assume $K$ is uniformly distributed on $\mathrm{Sp}(n)$ , and $r$ has the following probability density, with respect to the Lebesgue measure on $\mathbb{R}^{n}$ ,

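$$p(r_{1},\ldots,r_{n})\propto\exp\left(-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}r_{i}^{2}\right)\prod_{i<j}\sinh^{4}\left(|r_{i}-r_{j}|/2\right) \tag{22}$$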

If $S$ is given by (8), where the matrix $R$ has diagonal elements $r_{i\,}$ , then $S$ has a Riemannian Gaussian distribution (17) with Riemannian barycentre $\breve{S}=I$ and dispersion parameter $\sigma$ . Moreover, for any $\breve{S}\in\mathcal{Q}_{n}$ and $A\in\mathrm{GL}(n,\mathbb{H})$ such that $A\cdot I=\breve{S}$ , if $Q=A\cdot S$ then $Q$ has Riemannian Gaussian distribution with Riemannian barycentre $\breve{S}$ and dispersion parameter $\sigma$ .

Proposition 4 provides a sampling algorithm for Riemannian Gaussian distributions on $\mathcal{Q}_{n}$ . Indeed, the proposition states that in order to obtain $Q$ with Riemannian Gaussian distribution of barycentre $\breve{S}$ and dispersion $\sigma$ , it is enough to know how to sample $S$ from a Riemannian Gaussian distribution with barycentre $I$ . In turn, this is done using polar coordinates, through decomposition (8).

In this decomposition, $K$ must be sampled from a uniform distribution on $\mathrm{Sp}(n)$, and $R$ with diagonal elements $r_{i}$ from the multivariate density (22). Sampling from a uniform distribution on $\mathrm{Sp}(n)$ can be achieved as follows: let $Z$ be an $n\times n$ quaternion matrix whose elements are independent normal proper quaternion random variables [11], and write $Z=KP$ for the polar decomposition of $Z$ [9]. Then, $K$ has a uniform distribution on $\mathrm{Sp}(n)$. On the other hand, sampling from the multivariate density (22) can be carried out using a Metropolis-Hastings algorithm, which is included in most statistical software [12].
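The sampling steps described above can be put together in a short sketch. Several choices in it are assumptions rather than the paper's prescriptions: quaternion matrices are embedded as $2n\times 2n$ complex matrices through the complex adjoint, the polar factor $K$ is obtained from a singular value decomposition, and the Metropolis-Hastings step is a plain random walk with hand-picked `steps` and `scale`. All function names are hypothetical.

```python
import numpy as np

def chi(A1, A2):
    """Complex adjoint of the quaternion matrix A1 + A2 j (assumed representation)."""
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])

def sample_sp(n, rng):
    """Uniform K in Sp(n): polar factor of a Gaussian quaternion matrix Z = K P."""
    Z1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Z2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    U, _, Vh = np.linalg.svd(chi(Z1, Z2))
    return U @ Vh                      # K = U V^H, with P = V diag(s) V^H

def log_density(r, sigma):
    """Logarithm of the unnormalised density of r = (r_1, ..., r_n)."""
    iu, ju = np.triu_indices(len(r), k=1)
    gaps = np.abs(r[iu] - r[ju])
    with np.errstate(divide="ignore"):  # log(0) = -inf when two r_i coincide
        log_sinh = np.log(np.sinh(gaps / 2.0))
    return -np.sum(r ** 2) / (2.0 * sigma ** 2) + 4.0 * np.sum(log_sinh)

def sample_r(n, sigma, rng, steps=2000, scale=0.5):
    """Random-walk Metropolis-Hastings; returns the final state of the chain."""
    r = sigma * np.arange(1, n + 1, dtype=float)   # distinct starting values
    lp = log_density(r, sigma)
    for _ in range(steps):
        proposal = r + scale * sigma * rng.standard_normal(n)
        lp_proposal = log_density(proposal, sigma)
        if np.log(rng.uniform()) < lp_proposal - lp:
            r, lp = proposal, lp_proposal
    return r

def sample_gaussian(A, sigma, rng):
    """Draw Q = A . S, where A is the embedded matrix with A . I = barycentre."""
    n = A.shape[0] // 2
    K = sample_sp(n, rng)
    r = sample_r(n, sigma, rng)
    # Embedded exp(R): each diagonal entry exp(r_i) appears twice.
    S = K @ np.diag(np.exp(np.tile(r, 2))) @ K.conj().T
    return A @ S @ A.conj().T
```

For example, `sample_gaussian(np.eye(4), 0.5, np.random.default_rng(0))` draws an embedded $2\times 2$ quaternion matrix with barycentre $I$. Each call runs a fresh chain, which is simple but wasteful; a longer single chain with burn-in and thinning would be the usual refinement.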

Consider now maximum likelihood estimation of Riemannian Gaussian distributions. This is given by the following Proposition 5. This proposition brings out the important role of the function $Z(\sigma)$ defined by (18) and (21). Precisely, this is the moment generating function of the Riemannian Gaussian distribution (17). If $\eta=-1/2\sigma^{2}$ and $\psi(\eta)=\log Z(\sigma)$ , then $\psi(\eta)$ is a strictly convex function, which is the cumulant generating function of the distribution (17).

Proposition 5 (Maximum likelihood estimation)

Let $S_{1}\,,\ldots,\,S_{N}\,$ be independent samples from a Riemannian Gaussian distribution with density (17). Based on these samples, the maximum likelihood estimate of $\breve{S}$ is the sample Riemannian barycentre $\hat{S}_{N}$ ,

$$\hat{S}_{N}=\mathrm{argmin}_{S\in\mathcal{Q}_{n}}\sum_{i=1}^{N}d^{2}(S_{i},S) \tag{23}$$

where the distance $d(S_{i},\,S)$ is given by (16). Moreover, the maximum likelihood estimate of $\eta=-1/2\sigma^{2}$ is $\hat{\eta}_{N}$ ,

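$$\hat{\eta}_{N}=\left(\psi^{\prime}\right)^{-1}\left(\frac{1}{N}\sum_{i=1}^{N}d^{2}(S_{i},\hat{S}_{N})\right) \tag{24}$$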

where $\left(\,\psi^{\prime}\,\right)^{-1}$ is the inverse function of $\psi^{\prime}$, the derivative of $\psi$.

Proposition 5 indicates how the maximum likelihood estimates $\hat{S}_{N}$ and $\hat{\eta}_{N}$ can be computed. First, $\hat{S}_{N}$ is the sample Riemannian barycentre of $S_{1},\ldots,S_{N}$. Its existence and uniqueness are guaranteed by the fact that $\mathcal{Q}_{n}$ is a Riemannian manifold of non-positive curvature. In practice, it can be computed using a Riemannian gradient descent algorithm [13, 14]. Once $\hat{S}_{N}$ has been obtained, $\hat{\eta}_{N}$ is found by direct application of (24). This only requires knowledge of the cumulant generating function $\psi(\eta)$, which can be tabulated using the Monte Carlo method of [15].
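A minimal sketch of the barycentre computation is given below. It is not the algorithm of [13, 14]: it uses a fixed-step gradient iteration of the kind commonly applied to Hermitian positive-definite matrices, and assumes the samples are supplied as $2n\times 2n$ complex adjoint embeddings of quaternion matrices, hence the factor $1/2$ in the squared distance. The names and the default `step`, `iters` and `tol` are hypothetical, and convergence with `step=1.0` is not guaranteed for widely dispersed samples.

```python
import numpy as np

def herm_fun(S, f):
    """Apply f to the eigenvalues of a Hermitian matrix, keeping its eigenvectors."""
    w, V = np.linalg.eigh(S)
    return (V * f(w)) @ V.conj().T

def log_map(S, Q):
    """log(S^{-1/2} Q S^{-1/2}), the direction from S towards Q."""
    S_inv_half = herm_fun(S, lambda w: 1.0 / np.sqrt(w))
    return herm_fun(S_inv_half @ Q @ S_inv_half, np.log)

def barycentre(samples, step=1.0, iters=100, tol=1e-12):
    """Fixed-step Riemannian gradient descent on the sum of squared distances."""
    S = samples[0]
    for _ in range(iters):
        S_half = herm_fun(S, np.sqrt)
        # Average direction towards the samples (descent direction).
        G = sum(log_map(S, Q) for Q in samples) / len(samples)
        S = S_half @ herm_fun(step * G, np.exp) @ S_half
        S = 0.5 * (S + S.conj().T)     # remove rounding asymmetry
        if np.linalg.norm(G) < tol:
            break
    return S

def mean_sq_distance(samples, S_hat):
    """Average of d^2(S_i, S_hat); 0.5 undoes the embedding's eigenvalue doubling."""
    return float(np.mean([0.5 * np.linalg.norm(log_map(S_hat, Q)) ** 2
                          for Q in samples]))
```

The value returned by `mean_sq_distance` is the quantity to which $\left(\psi^{\prime}\right)^{-1}$ is applied in order to obtain $\hat{\eta}_{N}$; inverting $\psi^{\prime}$ needs a tabulation of $\psi$, which is outside this sketch.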

Bibliography

[1] Pennec, X.: Intrinsic statistics on Riemannian manifolds: basic tools for geometric measurements. J. Math. Imaging Vis. 25 (1) (2006) 127–154
[2] Chebbi, Z., Moakher, M.: Means of Hermitian positive-definite matrices based on the log-determinant alpha-divergence function. Linear Algebra Appl. 436 (7) (2012) 1872–1889
[3] Helgason, S.: Differential geometry, Lie groups, and symmetric spaces. American Mathematical Society (2001)
[4] Besse, A.L.: Einstein manifolds, (first edition). Springer Verlag (2007)
[5] Said, S., Bombrun, L., Berthoumieu, Y., Manton, J.H.: Riemannian Gaussian distributions on the space of symmetric positive definite matrices (accepted). IEEE Trans. Inf. Theory (2016)
[6] Said, S., Hajri, H., Bombrun, L., Vemuri, B.C.: Gaussian distributions on Riemannian symmetric spaces: statistical learning with structured covariance matrices (under review). IEEE Trans. Inf. Theory (2017)
[7] Flamant, J., Le Bihan, N., Chainais, P.: Time-frequency analysis of bivariate signals (under review). Applied and Computational Harmonic Analysis (2017)
[8] Conway, J.H., Smith, D.A.: On quaternions and octonions, their geometry, arithmetic and symmetry. CRC Press (2003)
