Intrinsic data depth for Hermitian positive definite matrices

Joris Chau; Hernando Ombao; Rainer von Sachs

arXiv:1706.08289·stat.ME·November 12, 2019


TL;DR

This paper introduces new statistical data depth functions for collections of Hermitian positive definite matrices, leveraging Riemannian geometry to enable robust inference and characterization of centrality in such matrix data.

Contribution

It develops the first intrinsic data depth functions for Hermitian positive definite matrices, with computationally efficient methods and practical applications in confidence region construction.

Findings

01. Proposed two fast data depth functions satisfying key properties.

02. Demonstrated robustness and efficiency of the depth functions.

03. Applied depth-based methods to analyze covariance matrices from a clinical trial.


Tables

Table 1: Average sizes and empirical coverages of depth-based percentile bootstrap confidence regions (CR) for $B=5\,000$ bootstrap samples and $N=1\,000$ simulations, using pdMean() and pdDepth(). ZD = intrinsic zonoid depth, GDD = geodesic distance depth; both with $(n=100, d=2)$.

Distribution	Region	ZD Ave.-$\beta_{*}$	ZD Ave.-Size (SE $\times 10^{-5}$)	ZD Coverage	GDD Ave.-$\beta_{*}$	GDD Ave.-Size (SE $\times 10^{-5}$)	GDD Coverage
5-GND	$80\%$-CR	0.0181	0.171 (0.78)	0.760	0.810	0.171 (0.21)	0.805
5-GND	$90\%$-CR	0.0064	0.196 (0.95)	0.889	0.794	0.195 (0.25)	0.901
5-GND	$95\%$-CR	0.0023	0.214 (1.20)	0.935	0.780	0.216 (0.30)	0.957
2-GND	$80\%$-CR	0.0181	0.205 (1.71)	0.796	0.775	0.207 (0.61)	0.825
2-GND	$90\%$-CR	0.0064	0.236 (2.19)	0.897	0.756	0.237 (0.60)	0.898
2-GND	$95\%$-CR	0.0023	0.260 (2.80)	0.947	0.740	0.264 (0.70)	0.950
1.5-GND	$80\%$-CR	0.0181	0.228 (2.64)	0.798	0.755	0.230 (0.91)	0.828
1.5-GND	$90\%$-CR	0.0065	0.262 (3.55)	0.892	0.734	0.263 (0.88)	0.914
1.5-GND	$95\%$-CR	0.0023	0.284 (4.73)	0.925	0.716	0.294 (0.92)	0.952

Equations

$\langle h_{1},h_{2}\rangle_{p}$

$\delta_{R}(p_{1},p_{2})$

$\textnormal{Exp}_{p}(h)$

$\textnormal{Log}_{p}(q)$

$\delta_{R}(p_{1},p_{2})=\|\textnormal{Log}_{p_{1}}(p_{2})\|_{p_{1}}=\|\textnormal{Log}_{p_{2}}(p_{1})\|_{p_{2}},\quad\forall\,p_{1},p_{2}\in\mathcal{M}$

$\textnormal{conv}(\mathcal{S}):=\Big\{p\in\mathcal{S}\,:\,p=\textnormal{Exp}_{p}\Big(\int_{\mathcal{S}}\textnormal{Log}_{p}(x)w(x)\>\lambda(dx)\Big),\ w:\mathcal{S}\to[0,1],\ \int_{\mathcal{S}}w(x)\>\lambda(dx)=1\Big\}$

$P_{p}(\mathcal{M})$

$\mu=\boldsymbol{E}_{\nu}[X]:=\arg\min_{y\in\textnormal{supp}(\nu)}\int_{\mathcal{M}}\delta_{R}(y,x)^{2}\>\nu(dx)$

$\boldsymbol{E}_{\nu}[\textnormal{Log}_{\mu}(X)]$

$m=\textnormal{GM}_{\nu}(X):=\arg\min_{y\in\textnormal{supp}(\nu)}\int_{\mathcal{M}}\delta_{R}(y,x)\>d\nu(x)$

$\boldsymbol{E}_{\nu}\left[\frac{\textnormal{Log}_{m}(X)}{\delta_{R}(m,X)}\right]$

$\textnormal{D}(\nu,y)$

$\textnormal{D}(\nu,\mu)$

$\textnormal{D}(\nu,\textnormal{Exp}_{\mu}(t_{1}h))$

$\textnormal{iD}(\nu,y_{1})$

$\lim_{M\to\infty}\sup_{\|\textnormal{Log}(y)\|_{F}\geq M}\textnormal{D}(\nu,y)$

$\lim_{n\to\infty}\textnormal{D}(\nu,y_{n})$

$\sup_{y\in\mathcal{M}}|\textnormal{D}(\nu_{n},y)-\textnormal{D}(\nu,y)|$

$D_{\alpha}(\zeta)$

$\textnormal{ZD}_{\mathbb{R}^{d}}(\zeta,y)$

$\textnormal{ZD}_{\mathcal{M}}(\nu,y)$

$D_{\alpha}^{\mathcal{M}}(\nu)=\Big\{y\in\mathcal{M}\,\Big|\,y=\textnormal{Exp}_{y}\Big(\int_{\mathcal{M}}\textnormal{Log}_{y}(x)w(x)\>\nu(dx)\Big),\ w:\mathcal{M}\to[0,1/\alpha],\ \int_{\mathcal{M}}w(x)\>\nu(dx)=1\Big\}$

$\lim_{n\to\infty}\textnormal{ZD}_{\mathcal{M}}(\nu,y_{n})$

$\sup_{y\in\textnormal{rint}(\textnormal{conv}(\nu))}|\textnormal{ZD}_{\mathcal{M}}(\nu_{n},y)-\textnormal{ZD}_{\mathcal{M}}(\nu,y)|$

$\textnormal{DR}_{1-\alpha}$

$\textnormal{iZD}_{\mathcal{M}}(\nu,y):=\int_{\mathcal{I}}\textnormal{ZD}_{\mathcal{M}}(\nu(t),y(t))\>dt=\int_{\mathcal{I}}\sup\{\alpha\,:\,0_{d\times d}\in D_{\alpha}(\zeta_{y}(t))\}\>dt$

$\lim_{n\to\infty}\textnormal{iZD}_{\mathcal{M}}(\nu,y_{n})$

$\sup_{y\in\textnormal{rint}(\textnormal{conv}(\nu))}|\textnormal{iZD}_{\mathcal{M}}(\nu_{n},y)-\textnormal{iZD}_{\mathcal{M}}(\nu,y)|$

$\textnormal{GDD}(\nu,y)$


Code & Models

Repositories

JorisChau/pdSpecEst

Full text

Intrinsic data depth for Hermitian positive definite matrices

Joris Chau (1), Hernando Ombao (2) and Rainer von Sachs (3)

(1) Corresponding author, [email protected], Institute of Statistics, Biostatistics, and Actuarial Sciences, Université catholique de Louvain, Voie du Roman Pays 20, B-1348, Louvain-la-Neuve, Belgium.

(2) Department of Statistics, University of California at Irvine, Bren Hall 2206, Irvine, CA, 92697, United States. Department of Applied Mathematics and Computational Science, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia.

(3) Institute of Statistics, Biostatistics, and Actuarial Sciences, Université catholique de Louvain, Voie du Roman Pays 20, B-1348, Louvain-la-Neuve, Belgium.

Abstract

Nondegenerate covariance, correlation and spectral density matrices are necessarily symmetric or Hermitian and positive definite. The main contribution of this paper is the development of statistical data depths for collections of Hermitian positive definite matrices by exploiting the geometric structure of the space as a Riemannian manifold. The depth functions allow one to naturally characterize most central or outlying matrices, but also provide a practical framework for inference in the context of samples of positive definite matrices. First, the desired properties of an intrinsic data depth function acting on the space of Hermitian positive definite matrices are presented. Second, we propose two computationally fast pointwise and integrated data depth functions that satisfy each of these requirements and investigate several robustness and efficiency aspects. As an application, we construct depth-based confidence regions for the intrinsic mean of a sample of positive definite matrices, which is applied to the exploratory analysis of a collection of covariance matrices associated to a multicenter research trial.

Keywords: Data depth, Hermitian positive definite matrices, Riemannian manifold, Confidence regions, Affine-invariant metric, Covariance matrices.

1 Introduction

In numerous applications in multivariate statistics, we are interested not only in the first-order behavior (mean) of a sample of random vectors, but also in the second-order behavior or variability of the sample. In fact, our primary interest is often precisely the analysis of covariance or correlation structures between components of the random vectors. In many areas of statistical research, such as neuroscience, biomedical science, environmental science, demographics or finance, it is increasingly common to encounter covariance or correlation matrices across a large number of temporal or spatial locations, or across a large number of replicated subjects or trials in an experiment. In this work, our aim is to develop data exploration and inference tools for large collections or samples of such matrices.

The data objects of interest, nondegenerate covariance or correlation matrices, are necessarily elements of the space of Hermitian positive definite (HPD) matrices, which includes the space of symmetric positive definite (SPD) matrices in the real-valued case. The space of HPD matrices, although very well-structured, is inherently non-Euclidean and standard Euclidean-based statistical procedures (e.g., regression, clustering or inference procedures) may be unstable or break down due to the geometric constraints of the space. For this reason, it is necessary to generalize statistical procedures for data in the space of symmetric or Hermitian PD matrices, taking into account the non-Euclidean geometry of the space. Recent works addressing this issue include Smith (2000), Pennec et al. (2006), Fletcher et al. (2009), Zhu et al. (2009), Dryden et al. (2009), Fletcher et al. (2011), Yuan et al. (2012), Said et al. (2015), Holbrook et al. (2016) and Chau and von Sachs (2017), among others. The main contribution of this paper is the generalization of notions of data depth for samples of HPD matrices to provide a center-to-outward ordering of positive definite matrix-valued objects.

Data depth is a useful tool for data exploration to identify most central or outlying data observations (as in Liu et al. (1999) or Sun and Genton (2012) in a Euclidean context); or as a means of inference, by way of rank-based hypothesis testing (as in Liu and Singh (1993), Chenouri and Small (2012), or (Mosler, 2002, Chapter 5)), classification (see e.g., Li et al. (2012)) or the construction of confidence regions (see e.g., Yeh and Singh (1997)) among other applications. Although many different depth functions have been proposed and studied in the literature over the years, most data depth functions are constructed in the first place for vector-valued observations in the Euclidean space $\mathbb{R}^{d}$ . Exceptions include Liu and Singh (1992), where the authors consider depth functions for directional data on circles or spheres; Hu et al. (2011), on projection depth for tensor objects; or the recent work, Paindaveine and Van Bever (2017), on halfspace depths for scatter, concentration and shape matrices. For an overview of various Euclidean data depth functions and their specific properties, we refer the reader to e.g., Liu et al. (1999), Zuo and Serfling (2000), or Mosler (2002).

The space of $(d\times d)$ -dimensional Hermitian (not necessarily PD) matrices $(\mathbb{H}_{d\times d},+,\cdot_{S})$ together with matrix addition and matrix scalar multiplication is a real vector space, and each Hermitian matrix bijectively maps to a vector in $\mathbb{R}^{d^{2}}$ by expanding the matrix with respect to some basis. To calculate data depth values for a sample of Hermitian matrices, it suffices to apply any ordinary Euclidean data depth function to the basis component vectors of the Hermitian matrices, provided that the computed depth values do not depend on the chosen basis. In contrast, due to the nonlinear positive definite constraints, the space of HPD matrices $(\mathbb{P}_{d\times d},+,\cdot_{S})$ is not a vector space. Moreover, the cone of HPD matrices embedded in a Euclidean space endowed with the Euclidean metric is not a complete metric space. As a consequence, Euclidean data depth applied to a sample of HPD matrices violates the basic properties of a proper depth function. To illustrate, according to Zuo and Serfling (2000), a proper depth function should be monotonically non-increasing moving outwards from a well-defined center. Moving away from a central point along a straight line is not always well-defined in the cone of HPD matrices, as the boundary of the space lies at a finite distance. Also, pointwise or uniform continuity properties of the data depth functions fail to hold due to the incompleteness of the metric space.

Instead of embedding the space of HPD matrices in an ambient Euclidean space, we exploit the geometric structure of the space of HPD matrices as a curved Riemannian manifold equipped with the affine-invariant (Pennec et al. (2006)) –also natural invariant (Smith (2000)), canonical (Holbrook et al. (2016)), trace (Yuan et al. (2012)), Rao-Fisher (Said et al. (2015))– Riemannian metric, or simply the Riemannian metric (Bhatia (2009), Dryden et al. (2009)). The affine-invariant metric plays an important role in estimation problems in the space of symmetric or Hermitian PD matrices for several reasons: (i) the space of HPD matrices equipped with the affine-invariant metric is a complete metric space, (ii) the affine-invariant metric is invariant under congruence transformation by any invertible matrix, see Section 2, and (iii) there is no swelling effect as with the Euclidean metric, where interpolating two HPD matrices may yield a matrix with a determinant larger than either of the original matrices, which may lead to computational instability, (Pasternak et al. (2010)). The first property allows us to construct proper data depth functions in the space of HPD matrices satisfying all of the intrinsic versions of the axiomatic properties in Zuo and Serfling (2000). The second property is important to ensure that the depth functions are general linear congruence invariant, which in practice means that the depth values do not non-trivially depend on the chosen coordinate system of the data. In Dryden et al. (2009), the authors list several additional metrics for estimation problems in the space of HPD matrices, such as the Log-Euclidean metric, also studied in Arsigny et al. (2006). The Log-Euclidean metric transforms the space of HPD matrices into a complete metric space and is invariant under congruence transformations by the unitary group, but not by the general linear group, as is true for the affine-invariant metric.

In the preliminary Section 2, we introduce the necessary geometric tools to develop data depths acting directly on the space of HPD matrices as a geodesically complete manifold. In Section 3, we present the desired properties an intrinsic depth function should satisfy, and we propose two data depth functions that satisfy each of these requirements. In addition, we consider integrated depth functions that act on curves of HPD matrices, such as spectral density matrices. In Section 4, we compare the two depth functions in terms of robustness and efficiency aspects. In Section 5, as an application of the depth functions, we construct depth-based confidence regions for the intrinsic mean of a sample of HPD matrices, and in Section 6 we apply the intrinsic depth functions to explore a collection of covariance matrices from a multicenter clinical trial. The technical proofs and additional figures can be found in the supplementary material. The accompanying R-code, containing the necessary tools to compute the intrinsic data depths and to perform rank-based hypothesis testing for samples of HPD matrices, is publicly available in the R-package pdSpecEst on CRAN, (Chau (2017)).

2 Preliminaries

2.1 Geometry of HPD matrices

In order to develop data depths for observations in the space of HPD matrices, we study the space as a Riemannian manifold as in Pennec et al. (2006), (Bhatia, 2009, Chapter 6), or Smith (2000) among others. Denote $\mathcal{M}:=\mathbb{P}_{d\times d}$ for the space of $(d\times d)$ HPD matrices. $\mathcal{M}$ is an open subset of the space of $(d\times d)$ Hermitian matrices $\mathcal{H}:=\mathbb{H}_{d\times d}$ , and as such a smooth manifold. The tangent space $T_{p}(\mathcal{M})$ at a point (i.e., a matrix) $p\in\mathcal{M}$ can be identified with the Hermitian space $\mathcal{H}$ , and the Frobenius inner product on $\mathcal{H}$ induces the affine-invariant Riemannian metric $g_{R}$ on the manifold $\mathcal{M}$ given by the smooth family of inner products:

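$$\langle h_{1},h_{2}\rangle_{p}=\textnormal{Tr}\big((p^{-1/2}\ast h_{1})(p^{-1/2}\ast h_{2})\big),\quad\forall\,p\in\mathcal{M},$$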

with $h_{1},h_{2}\in T_{p}(\mathcal{M})$ . Here and throughout this paper, $y^{1/2}$ always denotes the Hermitian square root matrix of $y\in\mathcal{M}$ , and we write $y\ast x:=y^{*}xy$ for matrix congruence transformation, where ∗ denotes the conjugate transpose of a matrix. The Riemannian distance $\delta_{R}$ on $\mathcal{M}$ derived from the affine-invariant Riemannian metric is given by:

$$\delta_{R}(p_{1},p_{2})=\|\textnormal{Log}(p_{1}^{-1/2}\ast p_{2})\|_{F},$$

where $\|\cdot\|_{F}$ denotes the matrix Frobenius norm and $\textnormal{Log}(\cdot)$ is the matrix logarithm. Denote the general linear group by $\textnormal{GL}(d,\mathbb{C}):=\{a\in\mathbb{C}^{d\times d}\,:\,\det(a)\neq 0\}$ . The mapping $x\mapsto a\ast x$ is an isometry for each invertible matrix $a\in\textnormal{GL}(d,\mathbb{C})$ , i.e., it is distance-preserving:

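$$\delta_{R}(p_{1},p_{2})=\delta_{R}(a\ast p_{1},a\ast p_{2}),\quad\forall\,a\in\textnormal{GL}(d,\mathbb{C}).$$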
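As a numerical illustration of the isometry property, the following minimal Python sketch (assuming NumPy is available) evaluates the affine-invariant distance through its standard closed form, the Frobenius norm of $\textnormal{Log}(p_{1}^{-1/2}\ast p_{2})$, and compares the distance before and after a congruence transformation. The helper names `herm_fun`, `congruence`, `riem_dist` and `random_hpd`, as well as the test matrices, are hypothetical and are not part of the pdSpecEst package.

```python
import numpy as np

def herm_fun(p, fun):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(p)
    return (v * fun(w)) @ v.conj().T

def congruence(a, x):
    """Matrix congruence a * x := a^* x a, with a^* the conjugate transpose."""
    return a.conj().T @ x @ a

def riem_dist(p1, p2):
    """Affine-invariant distance ||Log(p1^{-1/2} * p2)||_F between HPD matrices."""
    p1_isqrt = herm_fun(p1, lambda w: 1.0 / np.sqrt(w))
    m = congruence(p1_isqrt, p2)
    m = (m + m.conj().T) / 2  # remove round-off asymmetry
    # For Hermitian m, ||Log(m)||_F is the Euclidean norm of the log-eigenvalues.
    return float(np.linalg.norm(np.log(np.linalg.eigvalsh(m))))

def random_hpd(d, rng):
    """Hypothetical test data: a well-conditioned random HPD matrix."""
    z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return z @ z.conj().T + d * np.eye(d)

rng = np.random.default_rng(0)
p1, p2 = random_hpd(3, rng), random_hpd(3, rng)

# An arbitrary invertible complex matrix (its determinant is 4 - 3i).
a = np.array([[2, 1j, 0],
              [1, 3, 1 - 1j],
              [0, 1j, 1]], dtype=complex)

d_before = riem_dist(p1, p2)
d_after = riem_dist(congruence(a, p1), congruence(a, p2))
print(d_before, d_after)  # the two values should agree up to round-off
```

The eigenvalue route avoids a general-purpose matrix logarithm: only Hermitian eigendecompositions are needed, which keeps the sketch short and numerically stable for well-conditioned inputs.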

By (Bhatia, 2009, Theorem 6.1.6 and Prop. 6.2.2), the Riemannian manifold $(\mathcal{M},g_{R})$ is geodesically complete. By the Hopf-Rinow Theorem this implies that there exists a unique geodesic segment joining any two points $p_{1},p_{2}\in\mathcal{M}$ and every geodesic can be extended indefinitely. The Hopf-Rinow Theorem also implies that for every $p\in\mathcal{M}$ the exponential map $\textnormal{Exp}_{p}$ and the logarithmic (i.e., inverse exponential) map $\textnormal{Log}_{p}$ are global diffeomorphisms with domains $T_{p}(\mathcal{M})$ and $\mathcal{M}$ respectively. By (Pennec et al. (2006)), the exponential $\textnormal{Exp}_{p}:T_{p}(\mathcal{M})\to\mathcal{M}$ and logarithmic $\textnormal{Log}_{p}:\mathcal{M}\to T_{p}(\mathcal{M})$ maps are given by,

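$$\textnormal{Exp}_{p}(h)=p^{1/2}\ast\textnormal{Exp}(p^{-1/2}\ast h),\qquad\textnormal{Log}_{p}(q)=p^{1/2}\ast\textnormal{Log}(p^{-1/2}\ast q),$$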

where $\textnormal{Exp}(\cdot)$ denotes the matrix exponential. The Riemannian distance may now also be expressed in terms of the logarithmic map as:

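$$\delta_{R}(p_{1},p_{2})=\|\textnormal{Log}_{p_{1}}(p_{2})\|_{p_{1}}=\|\textnormal{Log}_{p_{2}}(p_{1})\|_{p_{2}},\quad\forall\,p_{1},p_{2}\in\mathcal{M},$$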

where throughout this paper $\|h\|_{p}:=\langle h,h\rangle_{p}^{1/2}$ denotes the norm of $h\in T_{p}(\mathcal{M})$ induced by the affine-invariant metric.
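The exponential and logarithmic maps can be sketched numerically as follows. This is a minimal illustration, assuming NumPy and the standard closed forms $\textnormal{Exp}_{p}(h)=p^{1/2}\ast\textnormal{Exp}(p^{-1/2}\ast h)$ and $\textnormal{Log}_{p}(q)=p^{1/2}\ast\textnormal{Log}(p^{-1/2}\ast q)$ for the affine-invariant metric, with the norm taken as the square root of $\langle h,h\rangle_{p}$. The function names and the two test matrices are hypothetical.

```python
import numpy as np

def herm_fun(p, fun):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(p)
    return (v * fun(w)) @ v.conj().T

def sym(x):
    """Hermitian part of x, used to remove round-off asymmetry."""
    return (x + x.conj().T) / 2

def exp_map(p, h):
    """Exp_p(h) = p^{1/2} Exp(p^{-1/2} h p^{-1/2}) p^{1/2}."""
    s = herm_fun(p, np.sqrt)
    si = herm_fun(p, lambda w: 1.0 / np.sqrt(w))
    return sym(s @ herm_fun(sym(si @ h @ si), np.exp) @ s)

def log_map(p, q):
    """Log_p(q) = p^{1/2} Log(p^{-1/2} q p^{-1/2}) p^{1/2}."""
    s = herm_fun(p, np.sqrt)
    si = herm_fun(p, lambda w: 1.0 / np.sqrt(w))
    return sym(s @ herm_fun(sym(si @ q @ si), np.log) @ s)

def norm_at(p, h):
    """||h||_p = ||p^{-1/2} h p^{-1/2}||_F, the norm induced at the point p."""
    si = herm_fun(p, lambda w: 1.0 / np.sqrt(w))
    return float(np.linalg.norm(si @ h @ si))

# Two hypothetical HPD matrices (both have positive determinant and trace).
p = np.array([[2.0, 0.5j], [-0.5j, 1.0]])
q = np.array([[1.0, 0.3], [0.3, 3.0]], dtype=complex)

h = log_map(p, q)              # tangent vector at p pointing towards q
q_back = exp_map(p, h)         # should recover q
dist_pq = norm_at(p, h)        # ||Log_p(q)||_p
dist_qp = norm_at(q, log_map(q, p))  # ||Log_q(p)||_q
print(dist_pq, dist_qp)        # both equal the Riemannian distance
```

Because the two maps are mutual inverses at a fixed base point, the round trip through `log_map` and `exp_map` returns the original matrix up to floating-point error, and the tangent norm is the same from either endpoint.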

As there exists a unique geodesic curve connecting any two points $p_{1},p_{2}\in\mathcal{M}$ , geodesically convex sets are well-defined. A subset $\mathcal{K}\subseteq\mathcal{M}$ is said to be convex or geodesically convex if for each pair of points $p_{1},p_{2}\in\mathcal{K}$ , the geodesic segment $[p_{1},p_{2}]$ is contained entirely in $\mathcal{K}$ . If $\mathcal{S}\subseteq\mathcal{M}$ , then the convex hull of $\mathcal{S}$ , denoted by $\textnormal{conv}(\mathcal{S})$ , is the smallest convex set containing $\mathcal{S}$ . This set is conveniently expressed as,

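$$\textnormal{conv}(\mathcal{S}):=\Big\{p\in\mathcal{S}\,:\,p=\textnormal{Exp}_{p}\Big(\int_{\mathcal{S}}\textnormal{Log}_{p}(x)w(x)\>\lambda(dx)\Big),\ w:\mathcal{S}\to[0,1],\ \int_{\mathcal{S}}w(x)\>\lambda(dx)=1\Big\},$$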

where $\lambda$ is the Lebesgue measure on the finite-dimensional metric space $(\mathcal{M},\delta_{R})$ and $w$ is a measurable function. For more details on the construction of (approximate) convex hulls on the manifold $\mathcal{M}$ , we refer to Fletcher et al. (2011).

2.2 Probability distributions and random variables

A random variable $X:\Omega\to\mathcal{M}$ on the Riemannian manifold $(\mathcal{M},g_{R})$ is a measurable function from some probability space $(\Omega,\mathcal{A},\nu)$ to the measurable space $(\mathcal{M},\mathcal{B}(\mathcal{M}))$ , where $\mathcal{B}(\mathcal{M})$ is the Borel algebra, i.e., the smallest $\sigma$ -algebra containing all open sets in $(\mathcal{M},g_{R})$ . In the following, we always work directly with the induced probability on $\mathcal{M}$ , $\nu(B)=\nu(\{\omega\in\Omega:X(\omega)\in B\})$ . By $P(\mathcal{M})$ , we denote the set of all probability measures on $(\mathcal{M},\mathcal{B}(\mathcal{M}))$ and $P_{p}(\mathcal{M})$ denotes the subset of probability measures in $P(\mathcal{M})$ that have finite moments of order $p$ with respect to the Riemannian distance, i.e., the $L^{p}$ -Wasserstein space (Villani, 2009, Definition 6.4):

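$$P_{p}(\mathcal{M}):=\Big\{\nu\in P(\mathcal{M})\,:\,\int_{\mathcal{M}}\delta_{R}(y_{0},x)^{p}\>\nu(dx)<\infty\ \text{for some}\ y_{0}\in\mathcal{M}\Big\}.$$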

Note that if $\int_{\mathcal{M}}\delta_{R}(y_{0},x)^{p}\>\nu(dx)<\infty$ for some $y_{0}\in\mathcal{M}$ and $1\leq p<\infty$ , this is true for any $y\in\mathcal{M}$ . This follows by the triangle inequality and the fact that $\delta_{R}(p_{1},p_{2})<\infty$ for any $p_{1},p_{2}\in\mathcal{M}$ , as $\int_{\mathcal{M}}\delta_{R}(y,x)^{p}\>\nu(dx)\leq 2^{p}\left(\delta_{R}(y,y_{0})^{p}+\int_{\mathcal{M}}\delta_{R}(y_{0},x)^{p}\>\nu(dx)\right)<\infty$ . For a sequence of probability measures $(\nu_{n})_{n\in\mathbb{N}}$ in $P(\mathcal{M})$ , $\nu_{n}\overset{w}{\to}\nu$ denotes weak convergence to the probability measure $\nu$ in the usual sense, i.e., $\int_{\mathcal{M}}\phi(x)\>\nu_{n}(dx)\to\int_{\mathcal{M}}\phi(x)\>\nu(dx)$ for every continuous and bounded function $\phi:\mathcal{M}\to\mathbb{R}$ , and a sequence $(\nu_{n})_{n\in\mathbb{N}}$ is said to be uniformly integrable if $\lim_{K\to\infty}\sup_{n\in\mathbb{N}}\int_{\mathcal{M}}\delta_{R}(y_{0},x)\boldsymbol{1}_{\{\delta_{R}(y_{0},x)>K\}}\>\nu_{n}(dx)=0$ for some $y_{0}\in\mathcal{M}$ . Note that if $(\nu_{n})_{n\in\mathbb{N}}$ is uniformly integrable for some $y_{0}\in\mathcal{M}$ , then the sequence is uniformly integrable for any $y\in\mathcal{M}$ . Finally, we use the notation $\textnormal{conv}(\nu):=\textnormal{conv}(\textnormal{supp}(\nu))$ for the convex hull of the support of the measure $\nu$ on $\mathcal{M}$ , and $\textnormal{rint}(\textnormal{conv}(\nu))$ and $\textnormal{r}\delta(\textnormal{conv}(\nu))$ for its relative interior and relative boundary.
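The moment bound used above can be spelled out in two steps. The following short derivation is a sketch that relies only on the triangle inequality for $\delta_{R}$, the elementary inequality $(a+b)^{p}\leq 2^{p}(a^{p}+b^{p})$ for $a,b\geq 0$, and the fact that $\nu$ is a probability measure:

```latex
\begin{align*}
\delta_{R}(y,x)^{p}
  &\leq \big(\delta_{R}(y,y_{0})+\delta_{R}(y_{0},x)\big)^{p}
   \leq \big(2\max\{\delta_{R}(y,y_{0}),\,\delta_{R}(y_{0},x)\}\big)^{p} \\
  &\leq 2^{p}\big(\delta_{R}(y,y_{0})^{p}+\delta_{R}(y_{0},x)^{p}\big), \\
\int_{\mathcal{M}}\delta_{R}(y,x)^{p}\>\nu(dx)
  &\leq 2^{p}\Big(\delta_{R}(y,y_{0})^{p}
     +\int_{\mathcal{M}}\delta_{R}(y_{0},x)^{p}\>\nu(dx)\Big)<\infty .
\end{align*}
```

The second display follows by integrating the first against $\nu$ and using $\nu(\mathcal{M})=1$ for the constant term.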

2.3 Measures of centrality

Intrinsic mean.

To characterize the center of a random variable $X$ with probability measure $\nu$ , one important measure of centrality is the Karcher or Fréchet mean, which is also referred to as the intrinsic mean as it is intrinsic to the Riemannian distance measure on the manifold. The intrinsic mean turns out to be the point of maximum depth in the intrinsic zonoid depth introduced in Section 3.2. The set of intrinsic means consists of the points that minimize the second moment with respect to the Riemannian distance,

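$$\mu=\boldsymbol{E}_{\nu}[X]:=\arg\min_{y\in\textnormal{supp}(\nu)}\int_{\mathcal{M}}\delta_{R}(y,x)^{2}\>\nu(dx).$$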

If $\nu\in P_{2}(\mathcal{M})$ , then at least one intrinsic mean exists as the above expectation is finite for $y\in\mathcal{M}$ . Moreover, since the manifold $\mathcal{M}$ is a geodesically complete manifold of non-positive curvature (see Pennec et al. (2006) or Skovgaard (1984)), by (Le, 1995, Proposition 1) the intrinsic mean $\mu$ is unique for any distribution $\nu\in P_{2}(\mathcal{M})$ . By (Pennec, 2006, Corollary 1), the intrinsic mean is also represented by the point $\mu\in\mathcal{M}$ that satisfies,

$$\boldsymbol{E}_{\nu}[\textnormal{Log}_{\mu}(X)]=\boldsymbol{0},$$

where $\boldsymbol{0}$ is the zero matrix. The sample intrinsic mean of a set of manifold-valued observations minimizes a sum of squared Riemannian distances and can be computed efficiently through a gradient descent algorithm as in Pennec (2006).
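The gradient descent scheme for the sample intrinsic mean can be sketched as below. This is a minimal illustration, assuming NumPy and a fixed unit step size, so that each iteration moves the current estimate along the geodesic in the direction of the average log-mapped observation. It is not the implementation behind pdMean() in pdSpecEst; all function names, the tolerance and the simulated sample are hypothetical.

```python
import numpy as np

def herm_fun(p, fun):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(p)
    return (v * fun(w)) @ v.conj().T

def sym(x):
    return (x + x.conj().T) / 2

def log_map(p, q):
    s, si = herm_fun(p, np.sqrt), herm_fun(p, lambda w: 1.0 / np.sqrt(w))
    return sym(s @ herm_fun(sym(si @ q @ si), np.log) @ s)

def exp_map(p, h):
    s, si = herm_fun(p, np.sqrt), herm_fun(p, lambda w: 1.0 / np.sqrt(w))
    return sym(s @ herm_fun(sym(si @ h @ si), np.exp) @ s)

def norm_at(p, h):
    si = herm_fun(p, lambda w: 1.0 / np.sqrt(w))
    return float(np.linalg.norm(si @ h @ si))

def intrinsic_mean(xs, tol=1e-12, max_iter=200):
    """Fixed-point iteration mu <- Exp_mu(mean_i Log_mu(x_i)), started at xs[0]."""
    mu = xs[0]
    for _ in range(max_iter):
        grad = sum(log_map(mu, x) for x in xs) / len(xs)
        if norm_at(mu, grad) < tol:  # first-order condition E[Log_mu(X)] = 0
            break
        mu = exp_map(mu, grad)
    return mu

# Commuting case: the intrinsic mean is the entrywise geometric mean.
mean_diag = intrinsic_mean([np.diag([1.0, 4.0]), np.diag([4.0, 1.0])])

# Simulated sample concentrated around the identity (hypothetical data).
rng = np.random.default_rng(2)
sample = [herm_fun(0.3 * sym(rng.standard_normal((3, 3))), np.exp)
          for _ in range(20)]
mu_hat = intrinsic_mean(sample)
residual = norm_at(mu_hat, sum(log_map(mu_hat, x) for x in sample) / len(sample))
print(mean_diag, residual)
```

The stopping rule checks the sample analog of the first-order condition that the average log-mapped observation vanishes at the mean. A unit step is a common default for moderately dispersed samples; for widely spread data a smaller step size may be the safer choice.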

Intrinsic median.

A second measure of centrality of primary interest is the intrinsic median as in Fletcher et al. (2009), which is the point of maximum depth in the geodesic distance depth defined in Section 3.4. The set of intrinsic medians minimizes the first moment with respect to the Riemannian distance,

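$$m=\textnormal{GM}_{\nu}(X):=\arg\min_{y\in\textnormal{supp}(\nu)}\int_{\mathcal{M}}\delta_{R}(y,x)\>d\nu(x).$$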

On $(\mathcal{M},\delta_{R})$ , a geodesically complete manifold with non-positive curvature, the intrinsic median exists and is unique for any distribution $\nu\in P_{1}(\mathcal{M})$ . This follows by the proof of (Fletcher et al., 2009, Theorem 1) combined with an application of Leibniz’s integral rule. Furthermore, the intrinsic median is uniquely characterized by the point $m\in\mathcal{M}$ that satisfies,

$$\boldsymbol{E}_{\nu}\left[\frac{\textnormal{Log}_{m}(X)}{\delta_{R}(m,X)}\right]=\boldsymbol{0}.$$

Remark.

If the distribution $\nu$ of a random variable $X$ is centrally symmetric around $\mu\in\mathcal{M}$ in the sense that $\textnormal{Log}_{\mu}(X)\overset{d}{=}-\textnormal{Log}_{\mu}(X)$ , then the intrinsic mean and median coincide and are equal to $\mu$ . Here, equality in distribution ( $\overset{d}{=}$ ) is read as equality in terms of the joint distribution of all matrix components. The claim for the intrinsic mean follows by the fact that $\boldsymbol{E}_{\nu}[\textnormal{Log}_{\mu}(X)]=\boldsymbol{0}$ , which implies that $\mu$ is the intrinsic mean of the random variable $X$ . For the intrinsic median, if $X$ is centrally symmetric around $\mu$ , then $X$ is also angularly symmetric around $\mu$ in the sense that $\textnormal{Log}_{\mu}(X)/\|\textnormal{Log}_{\mu}(X)\|_{\mu}\overset{d}{=}-\textnormal{Log}_{\mu}(X)/\|\textnormal{Log}_{\mu}(X)\|_{\mu}$ . Substituting $\|\textnormal{Log}_{\mu}(X)\|_{\mu}=\delta_{R}(\mu,X)$ , we observe that $\boldsymbol{E}_{\nu}[\textnormal{Log}_{\mu}(X)/\delta_{R}(\mu,X)]=\boldsymbol{0}$ , which implies that $\mu$ is also the intrinsic median of the random variable $X$ .

3 Data depth for HPD matrices

Before introducing the manifold data depth functions, we present the desired properties a proper intrinsic data depth function –acting directly on the space of HPD matrices– should satisfy. These requirements are the natural intrinsic generalizations of the properties in Zuo and Serfling (2000) for depth functions acting on vectors in a Euclidean space $\mathbb{R}^{d}$ . We also consider integrated analogs for depth functions acting on curves of HPD matrices $y(t)\in\mathcal{M}$ with $t\in\mathcal{I}\subset\mathbb{R}$ , such as spectral density matrices in the Fourier domain.

3.1 Depth properties

Below, we denote $\textnormal{D}(\nu,y)$ for the depth of a matrix $y\in\mathcal{M}$ with respect to a distribution $\nu\in P(\mathcal{M})$ ; or $\textnormal{iD}(\nu,y)$ for the integrated depth of a matrix curve $y:=(y(t))_{t\in\mathcal{I}}$ with respect to a curve of marginal measures $\nu:=(\nu(t))_{t\in\mathcal{I}}$ , such that $\nu(t)\in P(\mathcal{M})$ for each $t\in\mathcal{I}$ . If a nonnegative bounded function $\textnormal{D}(\cdot,\cdot)$ or $\textnormal{iD}(\cdot,\cdot)$ satisfies the pointwise (resp. integrated) properties P.1 to P.4, we say that it is a proper data depth function on the Riemannian manifold $(\mathcal{M},g_{R})$ .

P.1 (Congruence invariance) The depth function should be invariant under matrix congruence transformation of the form $x\mapsto a\ast x$ , with $a\in\textnormal{GL}(d,\mathbb{C})$ . That is, for each $a\in\textnormal{GL}(d,\mathbb{C})$ ,

$$\textnormal{D}(\nu,y)=\textnormal{D}(\nu_{a},a\ast y),\quad\forall\,y\in\mathcal{M},$$

where $\nu_{a}$ is the distribution of the transformed random variable $a\ast X$ , such that $X$ is distributed according to $\nu$ . Generalizing this property for an integrated depth function $\textnormal{iD}(\nu,y)$ , we require that the same property holds pointwise for each $t\in\mathcal{I}$ . In this case, $a:=(a(t))_{t\in\mathcal{I}}$ is a curve of invertible matrices, with $a(t)\in\textnormal{GL}(d,\mathbb{C})$ for each $t\in\mathcal{I}$ .

In a standard Euclidean context, for a depth function acting on vectors in the Euclidean space $\mathbb{R}^{d}$ , it is desirable that the depth is affine-invariant $\textnormal{D}(\nu,y)=\textnormal{D}(\nu_{a,b},ay+b)$ for each $y\in\mathbb{R}^{d}$ , where $\nu_{a,b}$ is the distribution of the random vector $aX+b$ , with $a\in\textnormal{GL}(d,\mathbb{R})$ , $b\in\mathbb{R}^{d}$ and $X$ distributed according to $\nu$ . In the current setup, we are concerned with covariance or correlation matrices, corresponding to the second-order behavior of a random vector. For a random vector $X$ with covariance matrix $\Sigma$ , the covariance matrix of the affine transformation $aX+b$ is given by $a^{T}\ast\Sigma=a\Sigma a^{T}$ . A natural requirement for the depth functions acting on symmetric or Hermitian PD matrices is therefore invariance under congruence transformations of the data. Another way to view this is that a depth function acting on the covariance matrix of a data vector $X$ should be invariant under a change of basis in the data space of $X$ .
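The link between affine transformations of the data and congruence transformations of the covariance matrix can be checked directly on a sample. The following minimal sketch, assuming NumPy and using simulated data with an arbitrary (hypothetical) choice of $a$ and $b$, verifies that the sample covariance of $aX+b$ equals $a\Sigma a^{T}$ with $\Sigma$ the sample covariance of $X$:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((500, 3))          # rows are observations of X

# Hypothetical invertible matrix a and shift vector b.
a = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, -1.0],
              [1.0, 0.0, 3.0]])
b = np.array([1.0, -2.0, 0.5])

y = x @ a.T + b                            # rows are observations of aX + b

cov_x = np.cov(x, rowvar=False)            # sample covariance of X
cov_y = np.cov(y, rowvar=False)            # sample covariance of aX + b

# The shift b drops out and the covariance transforms by congruence.
print(np.max(np.abs(cov_y - a @ cov_x @ a.T)))
```

The identity holds exactly for the sample covariance, not only in expectation, so the printed discrepancy is pure floating-point error.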

P.2 (Maximality at center) The depth function should attain its maximum value, i.e., deepest point, at a well-defined unique center of the distribution, such as the intrinsic mean or median, which are characterized as the points of central and angular symmetry respectively. Let $\mu\in\mathcal{M}$ be a unique central point of the distribution $\nu$ , then,

$$\textnormal{D}(\nu,\mu)=\sup_{y\in\mathcal{M}}\textnormal{D}(\nu,y).$$

Similarly, for an integrated depth function, the maximum value should be attained at a well-defined unique central curve $\mu(t)$ with $t\in\mathcal{I}$ , such as the curve of intrinsic means or medians.

P.3 (Monotonicity relative to center) As $y\in\mathcal{M}$ moves away from the deepest point $\mu$ along a geodesic curve emanating from $\mu$ , the depth of the point $y$ with respect to the distribution $\nu$ should be monotonically non-increasing. Let $\textnormal{Exp}_{\mu}(th)$ , $t\geq 0$ , be the geodesic emanating from $\mu$ with unit tangent vector $h$ . Then,

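$$\textnormal{D}(\nu,\textnormal{Exp}_{\mu}(t_{1}h))\geq\textnormal{D}(\nu,\textnormal{Exp}_{\mu}(t_{2}h)),\quad\text{for all}\ 0\leq t_{1}\leq t_{2}.$$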

For an integrated depth function, let $s_{1}(t),s_{2}(t)$ be real-valued curves over $\mathcal{I}$ , such that $0\leq s_{1}(t)\leq s_{2}(t)$ for each $t\in\mathcal{I}$ . Denote $y_{1}(t):=\textnormal{Exp}_{\mu(t)}(s_{1}(t)h(t))$ and $y_{2}(t):=\textnormal{Exp}_{\mu(t)}(s_{2}(t)h(t))$ , where $h(t)\in T_{\mu(t)}(\mathcal{M})$ is a curve of unit tangent vectors. Then,

$$\textnormal{iD}(\nu,y_{1})\geq\textnormal{iD}(\nu,y_{2}).$$

P.4 (Vanishing at infinity) The depth of a point $y\in\mathcal{M}$ should approach zero as the point $y$ converges to a singular matrix, i.e., a matrix with zero or infinite eigenvalues,

$$\lim_{M\to\infty}\ \sup_{\|\textnormal{Log}(y)\|_{F}\geq M}\textnormal{D}(\nu,y)=0.$$

Similarly, for an integrated depth function, if the curve $y(t)$ converges to a curve of singular matrices for each $t\in\mathcal{I}$ , then the integrated depth should approach zero.

Below, we give two additional continuity properties, which although not strictly required are nonetheless useful to derive asymptotic results in subsequent applications, such as rank-based hypothesis testing or the construction of depth-based confidence sets as in Section 5.

(P.5) (Continuity in $y$ ) Let $(y_{n})_{n\in\mathbb{N}}$ be a convergent sequence with $y_{n}\in\mathcal{M}$ for each $n\in\mathbb{N}$ , such that $\delta_{R}(y_{n},y)\to 0$ . Then the depth function is continuous in $y$ in the sense that,

$$\lim_{n\to\infty}\textnormal{D}(\nu,y_{n})=\textnormal{D}(\nu,y).$$

(P.6) (Uniform continuity in $\nu$ ) The depth function is uniformly continuous in terms of the probability measure $\nu$ in the sense that if $(\nu_{n})_{n\in\mathbb{N}}$ is a uniformly integrable sequence of probability measures such that $\nu_{n}\overset{w}{\to}\nu$ , then,

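$$\sup_{y\in\mathcal{M}}|\textnormal{D}(\nu_{n},y)-\textnormal{D}(\nu,y)|\to 0,\quad\text{as}\ n\to\infty.$$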

3.2 Intrinsic zonoid depth

As geodesic convex hulls are well-defined on the Riemannian manifold $(\mathcal{M},g_{R})$ , there exist natural manifold generalizations of the simplicial depth or convex hull peeling depth (Liu et al. (1999)) for Euclidean vectors. However, the simplicial depth requires the computation of possibly many convex hulls, which quickly becomes computationally infeasible, especially for higher-dimensional matrices. Instead, we propose a straightforward manifold generalization of another depth measure based on trimmed convex depth regions, the zonoid depth (e.g., Mosler (2002)). The intrinsic manifold zonoid depth can be computed with the same tools as the standard zonoid depth for Euclidean vectors and its computation remains efficient, also for higher-dimensional HPD matrices.

In a Euclidean context, let $\zeta$ be a probability measure on $(\mathbb{R}^{d},\mathcal{B}^{d})$ with finite first moment, then the zonoid $\alpha$ -trimmed region, with $0<\alpha\leq 1$ , is defined as the set,

[TABLE]

If $\alpha=0$ , we set $D_{0}(\zeta)=\mathbb{R}^{d}$ . By (Mosler, 2002, Chapter 3), $D_{\alpha}(\zeta)$ is convex and monotone decreasing in $\alpha$ , creating a nested sequence of convex sets for decreasing values $\alpha_{1}\geq\ldots\geq\alpha_{n}$ . If $\alpha=1$ , $D_{\alpha}(\zeta)$ consists of the single point $\boldsymbol{E}_{\zeta}[X]$ , the Euclidean mean of the distribution $\zeta$ . The Euclidean zonoid depth of a point $y\in\mathbb{R}^{d}$ with respect to a distribution $\zeta$ is characterized by the smallest $\alpha$ -trimmed region still containing $y$ ,

$\textnormal{ZD}_{\mathbb{R}^{d}}(\zeta,y)=\sup\{\alpha\in[0,1]\,:\,y\in D_{\alpha}(\zeta)\}.$

The zonoid data depth is extended to the Riemannian manifold as follows.

Definition 3.1.

(Intrinsic zonoid depth) Let $\nu\in P_{2}(\mathcal{M})$ and let $\zeta_{y}$ be the probability measure on $(\mathbb{R}^{d^{2}},\mathcal{B}(\mathbb{R}^{d^{2}}))$ of the random variable $\textnormal{Log}_{y}(X)\in T_{y}(\mathcal{M})\cong\mathbb{R}^{d^{2}}$ as a $d^{2}$ -dimensional random real basis component vector, where $X$ has probability measure $\nu$ . The intrinsic zonoid depth of a point $y\in\mathcal{M}$ with respect to the distribution $\nu$ is defined as:

$\textnormal{ZD}_{\mathcal{M}}(\nu,y)=\sup\{\alpha\in[0,1]\,:\,\vec{0}\in D_{\alpha}(\zeta_{y})\},$

where $\vec{0}$ is a $d^{2}$ -dimensional zero vector, and $D_{\alpha}(\zeta_{y})$ is the Euclidean zonoid $\alpha$ -trimmed region of the distribution of the normal coordinate vector $\zeta_{y}$ on $(\mathbb{R}^{d^{2}},\mathcal{B}(\mathbb{R}^{d^{2}}))$ . Equivalently, the intrinsic zonoid depth can be written as,

$\textnormal{ZD}_{\mathcal{M}}(\nu,y)=\sup\{\alpha\in[0,1]\,:\,y\in D^{\mathcal{M}}_{\alpha}(\nu)\},$

where $D^{\mathcal{M}}_{\alpha}(\nu)$ is the intrinsic zonoid $\alpha$ -trimmed region defined as,

[TABLE]

with $w$ a measurable function.

Remark.

Computation of the intrinsic zonoid depth is straightforward via the definition $\textnormal{ZD}_{\mathcal{M}}(\nu,y)=\textnormal{ZD}_{\mathbb{R}^{d^{2}}}(\zeta_{y},0)$ and can be calculated directly by the Euclidean zonoid depth as in (Mosler, 2002, Chapter 4). Note that if $(e_{1},\ldots,e_{d^{2}})$ is an orthonormal basis of the vector space $(\mathcal{H},\langle\cdot,\cdot\rangle_{F})$ , then an orthonormal basis of $(T_{y}(\mathcal{M}),\langle\cdot,\cdot\rangle_{y})$ is simply $(y^{1/2}\ast e_{1},\ldots,y^{1/2}\ast e_{d^{2}})$ . In fact, the basis components of $\textnormal{Log}_{y}(x)\in T_{y}(\mathcal{M})$ can be computed directly using only an orthonormal basis of $(\mathcal{H},\langle\cdot,\cdot\rangle_{F})$ , since $\langle\textnormal{Log}_{y}(x),y^{1/2}\ast e_{i}\rangle_{y}=\langle\textnormal{Log}(y^{-1/2}\ast x),e_{i}\rangle_{F}$ .
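The identity in the remark above can be sketched numerically. The following is a minimal illustration, not the implementation in pdSpecEst: the function names `hermitian_basis` and `log_coordinates` are hypothetical, and the particular orthonormal basis of $(\mathcal{H},\langle\cdot,\cdot\rangle_{F})$ is one possible choice. It computes the $d^{2}$ real basis components of $\textnormal{Log}_{y}(x)$ as $\langle\textnormal{Log}(y^{-1/2}\ast x),e_{i}\rangle_{F}$ , where $a\ast x=axa^{*}$ .

```python
import numpy as np


def hermitian_basis(d):
    """One orthonormal basis of the d x d Hermitian matrices w.r.t. <A, B>_F = Re tr(A^H B)."""
    basis = []
    for i in range(d):
        e = np.zeros((d, d), dtype=complex)
        e[i, i] = 1.0
        basis.append(e)
    for i in range(d):
        for j in range(i + 1, d):
            re = np.zeros((d, d), dtype=complex)
            re[i, j] = re[j, i] = 1.0 / np.sqrt(2.0)
            im = np.zeros((d, d), dtype=complex)
            im[i, j], im[j, i] = 1j / np.sqrt(2.0), -1j / np.sqrt(2.0)
            basis.extend([re, im])
    return basis


def log_coordinates(y, x):
    """Real d^2-vector of basis components of Log_y(x) (illustrative helper)."""
    w, v = np.linalg.eigh(y)
    y_inv_sqrt = (v / np.sqrt(w)) @ v.conj().T      # y^{-1/2}
    m = y_inv_sqrt @ x @ y_inv_sqrt                 # y^{-1/2} * x
    m = (m + m.conj().T) / 2.0                      # remove rounding asymmetry
    lam, u = np.linalg.eigh(m)
    h = (u * np.log(lam)) @ u.conj().T              # Log(y^{-1/2} * x)
    # <h, e>_F = tr(h e) is real for Hermitian h and e
    return np.array([np.real(np.trace(h @ e)) for e in hermitian_basis(len(y))])
```

Applying any Euclidean zonoid depth routine to these coordinate vectors, evaluated at the origin, then gives the intrinsic zonoid depth; the Euclidean norm of the coordinate vector equals $\delta_{R}(y,x)$ .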

Theorem 3.1.

The intrinsic zonoid depth is a proper data depth function in the sense of Section 3.1, satisfying properties P.1–P.4 for distributions in $P_{2}(\mathcal{M})$ . The unique point of maximum depth coincides with the intrinsic mean of the distribution.

In order to show that the continuity properties P.5 and P.6 also hold for the intrinsic zonoid depth, we need the following lemma.

Lemma 3.2.

Let $\nu\in P_{2}(\mathcal{M})$ . Then, $\bigcup_{0<\alpha\leq 1}D_{\alpha}^{\mathcal{M}}(\nu)\>=\>\textnormal{conv}(\nu)$ . In particular, for each $y\in\textnormal{conv}(\nu)$ , $\textnormal{ZD}_{\mathcal{M}}(\nu,y)>0$ by definition of the intrinsic zonoid depth.

Theorem 3.3.

The intrinsic zonoid depth is continuous in $y$ as in P.5 for $y\in\textnormal{conv}(\nu)$ and $\nu\in P_{2}(\mathcal{M})$ , i.e., if $\delta_{R}(y_{n},y)\to 0$ with $y_{n}\in\mathcal{M}$ for all $n\in\mathbb{N}$ , then,

[TABLE]

The intrinsic zonoid depth is uniformly continuous in $\nu$ as in P.6 for $y\in\textnormal{rint}(\textnormal{conv}(\nu))$ and $(\nu_{n})_{n\in\mathbb{N}}$ in $P_{2}(\mathcal{M})$ uniformly integrable. If $\nu_{n}\overset{w}{\to}\nu$ , then,

[TABLE]

Example 3.1.

In Figure 1, we display several $100(1-\alpha)\%$ central intrinsic zonoid depth regions for generated i.i.d. samples of $(2\times 2)$ -dimensional SPD matrices $x_{1},\ldots,x_{500}$ from a distribution $\nu_{\mu}\in P_{2}(\mathcal{M})$ with intrinsic mean $\mu$ . Denoting $\nu_{500}$ for the empirical distribution of $x_{1},\ldots,x_{500}$ , the $100(1-\alpha)\%$ central depth-region $\textnormal{DR}_{1-\alpha}$ is given by the set of SPD matrices:

[TABLE]

In the left-hand image, data matrices are sampled from a Riemannian log-normal distribution $\nu_{\mu}$ as in e.g., Yuan et al. (2012), with intrinsic mean $\mu$ equal to the identity matrix. That is, $X_{i}\overset{d}{=}\textnormal{Exp}(\sum_{k}Z_{ki}e^{k})$ , with $(Z_{ki})_{k}\overset{\textnormal{iid}}{\sim}N(0,1/2)$ , where $(e^{1},\ldots,e^{4})\in\mathbb{H}_{2\times 2}^{4}$ is an orthonormal basis of ( $\mathbb{H}_{2\times 2},\langle\cdot,\cdot\rangle_{F}$ ). In the right-hand image, $\nu_{\mu}$ is a rescaled Wishart distribution with intrinsic mean $\mu=\left(\begin{smallmatrix}0.5&0.25\\ 0.25&0.5\end{smallmatrix}\right)$ , such that $X_{i}\overset{d}{=}e^{-c(2,8)}W$ , with $W\sim W_{2}^{c}(8,\mu/8)$ a complex Wishart distribution with $8$ degrees of freedom and $c(d,B)=-\log(B)+\frac{1}{d}\sum_{i=1}^{d}\psi(B-(d-i))$ the intrinsic bias-correction in (Chau and von Sachs, 2017, Theorem 5.1). The $(x,y,z)$ -axes in Figure 1 correspond to the three independent components in the symmetric matrix $\left(\begin{smallmatrix}x&z\\ z&y\end{smallmatrix}\right)$ .
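As a rough sketch of the log-normal sampling scheme in the left-hand image, the construction $X_{i}\overset{d}{=}\textnormal{Exp}(\sum_{k}Z_{ki}e^{k})$ can be written as follows. The function names are hypothetical, the orthonormal basis is one possible choice, and $N(0,1/2)$ is read here as a normal distribution with variance $1/2$ , which is an assumption about the parametrization.

```python
import numpy as np


def hermitian_basis(d):
    """One orthonormal basis of the d x d Hermitian matrices under the Frobenius inner product."""
    basis = []
    for i in range(d):
        e = np.zeros((d, d), dtype=complex)
        e[i, i] = 1.0
        basis.append(e)
    for i in range(d):
        for j in range(i + 1, d):
            re = np.zeros((d, d), dtype=complex)
            re[i, j] = re[j, i] = 1.0 / np.sqrt(2.0)
            im = np.zeros((d, d), dtype=complex)
            im[i, j], im[j, i] = 1j / np.sqrt(2.0), -1j / np.sqrt(2.0)
            basis.extend([re, im])
    return basis


def sample_lognormal_hpd(n, d, var=0.5, seed=None):
    """Draw n HPD matrices X = Exp(sum_k Z_k e^k) with Z_k i.i.d. N(0, var)."""
    rng = np.random.default_rng(seed)
    basis = hermitian_basis(d)
    samples = []
    for _ in range(n):
        z = rng.normal(0.0, np.sqrt(var), size=d * d)
        h = sum(zk * ek for zk, ek in zip(z, basis))   # Hermitian tangent vector at Id
        w, v = np.linalg.eigh(h)
        samples.append((v * np.exp(w)) @ v.conj().T)   # matrix exponential via eigendecomposition
    return samples
```

Because the coefficients are symmetric around zero in the tangent space at the identity, the intrinsic mean of this distribution is the identity matrix, as stated in the example.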

3.3 Integrated intrinsic zonoid depth

A straightforward generalization of the pointwise intrinsic zonoid depth in Definition 3.1 to compute the depth of a curve $y(t)\in\mathcal{M}$ with respect to a collection of marginal measures $\nu(t)$ for $t\in\mathcal{I}\subset\mathbb{R}$ is to consider the integrated intrinsic zonoid depth given by,

[TABLE]

where $\zeta_{y}(t)$ is the probability measure of the basis components of the random variable $\textnormal{Log}_{y(t)}(X(t))\in T_{y(t)}(\mathcal{M})\cong\mathbb{R}^{d^{2}}$ , such that $X(t)$ has probability measure $\nu(t)$ . This is similar to the construction of the modified band depth (MBD) in a functional data context, where the pointwise Euclidean simplicial depths of $y(t)$ are integrated over a functional domain $t\in\mathcal{I}$ (López-Pintado and Romo (2009) or Sun and Genton (2012)). The integrated versions of the properties P.1 to P.6 continue to hold for the integrated intrinsic zonoid depth and are straightforward generalizations of their pointwise analogs.

Theorem 3.4.

The integrated intrinsic zonoid depth is a proper integrated depth function in the sense of Section 3.1, satisfying the integrated versions of properties P.1–P.4 for collections of marginal distributions $\nu(t)\in P_{2}(\mathcal{M})$ for $t\in\mathcal{I}$ . The unique curve of maximum depth coincides with the curve of pointwise intrinsic means of the marginal distributions.

Proposition 3.5.

Let $y(t)\in\textnormal{conv}(\nu(t))$ , $\nu(t)\in P_{2}(\mathcal{M})$ and $y_{n}(t)\in\mathcal{M}$ for each $t\in\mathcal{I}$ , such that $y_{n}(t)\to y(t)$ uniformly in $t$ , i.e., $\sup_{t\in\mathcal{I}}\delta_{R}(y_{n}(t),y(t))\to 0$ . Then the integrated manifold zonoid depth is continuous in $y$ as in P.5 in the sense that,

[TABLE]

If $y(t)\in\textnormal{rint}(\textnormal{conv}(\nu))$ , $(\nu_{n}(t))_{n\in\mathbb{N}}$ in $P_{2}(\mathcal{M})$ is a uniformly integrable sequence of measures uniform in $t$ , and $\nu_{n}(t)\overset{w}{\to}\nu(t)$ uniformly in $t$ , then,

[TABLE]

Here, $y\in\textnormal{rint}(\textnormal{conv}(\nu))$ means that $y(t)\in\textnormal{rint}(\textnormal{conv}(\nu(t)))$ for each $t\in\mathcal{I}$ , and the uniform weak convergence $\nu_{n}(t)\overset{w}{\to}\nu(t)$ is read as $\sup_{t\in\mathcal{I}}|\mathbb{E}_{\nu_{n}(t)}[\phi(X)]-\mathbb{E}_{\nu(t)}[\phi(X)]|\to 0$ for every continuous and bounded function $\phi:\mathcal{M}\to\mathbb{R}$ .

3.4 Geodesic distance depth

As a second notion of data depth on the geodesically complete manifold $(\mathcal{M},g_{R})$ , we consider the geodesic distance depth, the natural analog on the metric space $(\mathcal{M},\delta_{R})$ of the arc distance depth in Liu and Singh (1992) for data observations on circles and spheres. The geodesic distance depth is straightforward to calculate, also for high-dimensional matrices, as the only required operation is the computation of Riemannian distances between HPD matrices.

Definition 3.2.

(Geodesic distance depth) Let $\nu\in P_{1}(\mathcal{M})$ , then the geodesic distance depth of a point $y\in\mathcal{M}$ with respect to the distribution $\nu$ is defined as:

[TABLE]

Theorem 3.6.

The geodesic distance depth is a proper data depth function in the sense of Section 3.1, satisfying P.1–P.4 for distributions in $P_{1}(\mathcal{M})$ . The unique point of maximum depth coincides with the intrinsic median of the distribution.

Theorem 3.7.

The geodesic distance depth is continuous in $y$ as in P.5 for $y\in\textnormal{cl}(\mathcal{M})$ , the closure of $\mathcal{M}$ , and $\nu\in P_{1}(\mathcal{M})$ . That is, if $\delta_{R}(y_{n},y)\to 0$ with $y_{n}\in\mathcal{M}$ for all $n\in\mathbb{N}$ , then,

[TABLE]

The geodesic distance depth is uniformly continuous in $\nu$ as in P.6 for $y\in\mathcal{M}$ and $(\nu_{n})_{n\in\mathbb{N}}$ uniformly integrable. If $\nu_{n}\overset{w}{\to}\nu$ , then,

[TABLE]

Remark.

In order to compute the empirical depth $\textnormal{GDD}(\nu_{n},y)$ of each observation in a sample $y\in\{x_{1},\ldots,x_{n}\}$ with respect to the empirical distribution $\nu_{n}$ of the sample $\{x_{1},\ldots,x_{n}\}$ , it suffices to compute the $(n\times n)$ -dimensional distance matrix with $(i,j)$ -th entry $\delta_{R}(x_{i},x_{j})$ . This matrix is fully determined by $n(n-1)/2$ components, as the diagonal entries are zero and $\delta_{R}(x_{i},x_{j})=\delta_{R}(x_{j},x_{i})$ . In particular, in online applications where the depths need to be updated each time a new observation enters the database, we simply add one extra column and row to the distance matrix and update the depth values.
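The distance-matrix computation described in this remark can be sketched as follows. This is an illustration only: the function names are hypothetical, and the depth is assumed to take the form $\exp(-\boldsymbol{E}_{\nu}[\delta_{R}(y,X)])$ , i.e., a decreasing transform of the expected geodesic distance, since the display formula of Definition 3.2 is not reproduced in the text above.

```python
import numpy as np


def riem_dist(x, y):
    """Affine-invariant distance: Frobenius norm of Log(x^{-1/2} y x^{-1/2})."""
    w, v = np.linalg.eigh(x)
    s = (v / np.sqrt(w)) @ v.conj().T               # x^{-1/2}
    m = s @ y @ s
    ev = np.linalg.eigvalsh((m + m.conj().T) / 2.0)
    return float(np.sqrt(np.sum(np.log(ev) ** 2)))


def distance_matrix(xs):
    """Symmetric (n x n) matrix of pairwise distances, using n(n-1)/2 evaluations."""
    n = len(xs)
    dm = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dm[i, j] = dm[j, i] = riem_dist(xs[i], xs[j])
    return dm


def add_observation(dm, xs, x_new):
    """Online update: append one row and column for a new observation."""
    col = np.array([riem_dist(x, x_new) for x in xs])
    return np.block([[dm, col[:, None]], [col[None, :], np.zeros((1, 1))]])


def gdd_from_distances(dm):
    """Empirical depth of each observation (assumed form: exp of minus the mean distance)."""
    return np.exp(-dm.mean(axis=1))
```

Only the ranking of the depth values matters for center-to-outward orderings, so any strictly decreasing transform of the row means would produce the same depth ranks.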

Remark.

A third notion of data depth on the Riemannian manifold $(\mathcal{M},g_{R})$ , closely related to the geodesic distance depth, is the intrinsic spatial depth. This is the natural manifold generalization of the spatial depth in Vardi and Zhang (2000) or Serfling (2002). For a distribution $\nu\in P_{1}(\mathcal{M})$ and a point $y\in\mathcal{M}$ , the intrinsic spatial depth is given by:

[TABLE]

The intrinsic spatial depth attains its maximum value $\textnormal{SD}(\nu,m)=1$ at the intrinsic median, since $\boldsymbol{E}_{\nu}\left[\frac{\textnormal{Log}_{m}(x)}{\delta_{R}(m,x)}\right]=\boldsymbol{0}$ by definition of the intrinsic median, and the depth is lower bounded by zero, which is a direct consequence of the triangle inequality combined with the fact that $\|\textnormal{Log}_{y}(x)\|_{y}=\delta_{R}(y,x)$ . The intrinsic spatial depth is closely associated to the geodesic distance depth in the sense that it is based on the gradient of the distance function, i.e., the gradient of $f_{x}(y)=\delta_{R}(y,x)$ for fixed $x$ is given by $\textnormal{grad}f_{x}(y)=\frac{\textnormal{Log}_{y}(x)}{\delta_{R}(y,x)}$ , see Fletcher et al. (2009).
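A sample version of the intrinsic spatial depth can be sketched under the reading $\textnormal{SD}(\nu,y)=1-\|\boldsymbol{E}_{\nu}[\textnormal{Log}_{y}(X)/\delta_{R}(y,X)]\|_{y}$ , which is inferred from the properties stated in this remark rather than copied from the display formula. The function name is hypothetical. The code uses $\|\textnormal{Log}_{y}(x)\|_{y}=\|\textnormal{Log}(y^{-1/2}\ast x)\|_{F}$ , and skipping observations equal to $y$ is a convention adopted here.

```python
import numpy as np


def spatial_depth(y, xs):
    """Sample intrinsic spatial depth of y w.r.t. the HPD matrices in xs (illustrative)."""
    w, v = np.linalg.eigh(y)
    s = (v / np.sqrt(w)) @ v.conj().T                # y^{-1/2}
    total = np.zeros_like(y, dtype=complex)
    for x in xs:
        m = s @ x @ s
        lam, u = np.linalg.eigh((m + m.conj().T) / 2.0)
        h = (u * np.log(lam)) @ u.conj().T           # Log(y^{-1/2} * x)
        norm = np.linalg.norm(h)                     # Frobenius norm, equal to delta_R(y, x)
        if norm > 1e-12:                             # skip observations equal to y
            total += h / norm
    # norm of the averaged unit tangent vectors, in whitened coordinates
    return 1.0 - np.linalg.norm(total / len(xs))
```

For a sample that is symmetric around $y$ the averaged unit vectors cancel and the depth equals one, whereas for a point far outside the sample all unit vectors align and the depth approaches zero.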

3.5 Integrated geodesic distance depth

In order to generalize the pointwise geodesic distance depth to the depth of a curve $y(t)\in\mathcal{M}$ , with respect to a collection of marginal measures $\nu_{t}=\nu(t)$ for $t\in\mathcal{I}\subset\mathbb{R}$ , we replace the pointwise expected distance in Definition 3.2 by an integrated expected distance as:

[TABLE]

The integrated versions of the properties P.1 to P.6 continue to hold for the integrated geodesic distance depth and are straightforward generalizations of their pointwise analogs as in the case of the integrated intrinsic zonoid depth.

Theorem 3.8.

The integrated geodesic distance depth is a proper integrated depth function in the sense of Section 3.1, satisfying the integrated versions of properties P.1–P.4 for collections of marginal distributions $\nu(t)\in P_{1}(\mathcal{M})$ for $t\in\mathcal{I}$ . The unique curve of maximum depth coincides with the curve of pointwise intrinsic medians of the marginal distributions.

Proposition 3.9.

Let $y(t)\in\textnormal{cl}(\mathcal{M})$ and $\nu(t)\in P_{1}(\mathcal{M})$ for each $t\in\mathcal{I}$ , such that $y_{n}(t)\to y(t)$ uniformly in $t$ , i.e., $\sup_{t\in\mathcal{I}}\delta_{R}(y_{n}(t),y(t))\to 0$ . Then the integrated geodesic distance depth is continuous in $y$ as in P.5 in the sense that,

[TABLE]

If $y(t)\in\mathcal{M}$ , $(\nu_{n}(t))_{n\in\mathbb{N}}$ in $P_{1}(\mathcal{M})$ is a uniformly integrable sequence of measures uniform in $t$ , and $\nu_{n}(t)\overset{w}{\to}\nu(t)$ uniformly in $t$ , then,

[TABLE]

where $y\in\mathcal{M}$ is read as $y(t)\in\mathcal{M}$ for each $t\in\mathcal{I}$ .

4 Aspects of robustness and efficiency

Depth-median breakdown.

An intuitive measure of robustness of the intrinsic depth functions is given by their breakdown points according to Hampel et al. (1986). In order to assess the robustness of the depth functions, a first step is to compute the breakdown point of the location estimator that maximizes the depth, i.e., the depth-median, as in Donoho and Gasko (1992) or Liu and Singh (1992), which we explain as follows. Let $X^{(n)}=\{x_{1},\ldots,x_{n}\}\in\mathcal{M}^{n}$ be an initial set of HPD observations and let $Y^{(m)}=\{y_{1},\ldots,y_{m}\}\in\mathcal{M}^{m}$ be a set of contaminating HPD observations. Denote $Z^{(n,m)}=X^{(n)}\cup Y^{(m)}$ and consider the –not necessarily in-sample– depth-median $T_{D}(Z^{(n,m)})=\arg\max_{y\in\mathcal{M}}D(y,\nu_{n,m})$ , with $\nu_{n,m}$ the empirical distribution of $Z^{(n,m)}$ . The breakdown point of the depth-median is the smallest fraction of arbitrarily large contaminating observations that breaks down the estimator:

[TABLE]

Note that $\|\textnormal{Log}(x)\|_{F}=\delta_{R}(x,\textnormal{Id})$ , such that $\|\textnormal{Log}(x)\|_{F}<\infty$ for all $x\in\mathcal{M}$ , and $\|\textnormal{Log}(x)\|_{F}=\infty$ if $x$ is a singular matrix lying on the boundary of the metric space $(\mathcal{M},\delta_{R})$ . The breakdown point of the depth-median for the intrinsic zonoid depth is $\epsilon_{1}(X)=1/(n+1)$ as the depth-median coincides with the sample intrinsic mean and it requires only a single large contaminating observation to make the sample intrinsic mean arbitrarily large. The intrinsic zonoid depth-median is therefore not robust against outlying observations in terms of the depth-median breakdown point, analogous to the Euclidean case, as discussed in Mosler (2002). For the geodesic distance depth, the depth-median coincides with the intrinsic median and the intrinsic median in a geodesically complete manifold is known to have maximum breakdown point $\epsilon_{1}(X)=1/2$ , as shown in (Fletcher et al., 2011, Theorem 2).

Simultaneous depth-rank breakdown.

The above definition of the breakdown point gives us an intuitive measure of robustness for the depth-median. However, it does not tell us how robust the depth function is with respect to the depth-ranked observations in the sample itself. As a more informative robustness measure, we study the breakdown point simultaneously over all the depth-ranked observations in an initial sample of size $n$ . Let us write $z^{(n,m)}_{[i]}$ for the $i$ -th center-to-outward order statistic (or $i$ -th depth-ranked observation) with respect to a given depth measure. The simultaneous breakdown point is the smallest fraction of arbitrarily large contaminating observations that breaks down at least one of the first $n$ depth-ranked observations:

[TABLE]

For the intrinsic zonoid depth, if we break ties by assigning the same rank to each observation with equal depth, the simultaneous breakdown point is $\epsilon_{2}(X)=1/(n+1)$ . If we break ties by assigning increasing ranks based on increasing Frobenius norms $\|\textnormal{Log}(z_{i}^{(n,m)})\|_{F}$ , then the simultaneous breakdown point is $\epsilon_{2}(X)=2/(n+2)$ . This is illustrated as follows. Let $y_{1}$ be a first contaminating observation with $\|\textnormal{Log}(y_{1})\|_{F}>N_{M}$ , such that $\|\textnormal{Log}(\bar{Z}^{(n,1)})\|_{F}>M$ , where $\bar{Z}^{(n,1)}$ denotes the intrinsic mean of the contaminated sample $Z^{(n,1)}$ . Assuming without loss of generality that $\|\textnormal{Log}(x_{i})\|_{F}\ll N_{M}$ for each $i=1,\ldots,n$ , the contaminating observation $y_{1}$ will be assigned depth-rank $n+1$ and the first $n$ depth-ranked observations do not break down. Let $y_{2}=\bar{Z}^{(n,1)}$ be a second contaminating observation, then $y_{2}$ has maximum depth by Theorem 3.1, and thus $z^{(n,2)}_{[1]}=y_{2}$ . Since we can choose $N_{M}>0$ , such that $\|\textnormal{Log}(y_{2})\|_{F}>M$ for any $M>0$ , it follows that $\epsilon_{2}(X)=2/(n+2)$ .

Proposition 4.1.

For the geodesic distance depth, the depth-ranked observations have maximum simultaneous breakdown point $\epsilon_{2}(X)=1/2$ equal to the median breakdown point $\epsilon_{1}(X)$ .

The above proposition asserts that if we observe a number of (large) contaminating observations $m$ smaller than the initial sample size $n$ , the geodesic distance depth will assign the contaminating observations to the ranks $n+1,\ldots,n+m$ . The depth-rankings with respect to the geodesic distance depth are therefore highly robust against arbitrarily large contaminating observations, in contrast to the intrinsic zonoid depth-rankings, also illustrated in Figure 4.

Example 4.1.

The above depth measures share the same robustness properties in terms of their depth-median and simultaneous depth-rank breakdown point. In general, this does not have to be the case. For instance, consider the simplicial or convex hull peeling depth on the real line (e.g., Liu et al. (1999)), which are highly robust in terms of their depth-median breakdown point $\epsilon_{1}(X)=1/2$ , as argued in Chen (1995) for the simplicial depth. In contrast, both data depths have simultaneous breakdown points $\epsilon_{2}(X)\leq 2/(n+2)$ as two well-placed large contaminating observations $y_{1},y_{2}\in\mathbb{R}$ can ensure that $\|z_{[n]}^{(n,m)}\|>M$ for any $M>0$ .

Remark.

The definitions of the depth-median and simultaneous breakdown points for the integrated depth functions are straightforward generalizations of the pointwise definitions above and it is easily verified that the breakdown points for the integrated depth functions coincide with their pointwise analogs.

Depth-median efficiency.

The robustness of the depth functions may result in a loss of efficiency of the depth-median as an intrinsic location estimator on the Riemannian manifold $(\mathcal{M},g_{R})$ . Figure 5 displays the relative efficiency of the geodesic distance depth-median $\hat{\mu}_{\textnormal{GDD}}$ , (i.e., the intrinsic median), relative to the intrinsic zonoid depth-median $\hat{\mu}_{\textnormal{ZD}}$ , (i.e., the intrinsic mean), in terms of the Riemannian mean squared error. That is,

[TABLE]

The depth-medians are computed from simulated samples $\boldsymbol{X}=X_{1},\ldots,X_{n}\overset{\textnormal{iid}}{\sim}\nu^{p}_{\textnormal{Id}}$ , where $\nu^{p}_{\textnormal{Id}}\in P_{2}(\mathcal{M})$ is a centrally symmetric distribution, such that the intrinsic mean and median coincide and are equal to the identity matrix. In particular, $X_{i}\overset{d}{=}\textnormal{Exp}(\sum_{k}Z_{k}e^{k})$ , where $(e^{1},\ldots,e^{d^{2}})\in\mathbb{H}_{d\times d}^{d^{2}}$ is an orthonormal basis of ( $\mathbb{H}_{d\times d},\langle\cdot,\cdot\rangle_{F}$ ), and $(Z_{k})_{k}$ are i.i.d. random variables from a $p$ -generalized normal distribution (Sinz et al. (2009)), with mean zero and standard deviation $\sigma_{p}=p^{1/p}\sqrt{\Gamma(3/p)/\Gamma(1/p)}$ , such that $\sigma_{2}=1$ . The family of $p$ -generalized normal distributions ( $p$ -GNDs) allows us to generate tail behavior that is either heavier $(p<2)$ or lighter $(p>2)$ than the normal distribution. For $p=2$ , the $p$ -GND coincides with the normal distribution. As shown in Figure 5, for random variables generated from a light-tailed $p$ -GND ( $p=5$ and $p=2$ and in particular small dimensions $d$ ), the intrinsic zonoid depth regions are better centered around the population mean of the generating distributions than the geodesic distance depth regions; for a heavier-tailed $p$ -GND ( $p=1.5$ ), the efficiency gain of the intrinsic zonoid depth-median relative to the geodesic distance depth-median diminishes.
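The standard deviation formula for the $p$ -GND above is easy to check numerically. The short sketch below, with the hypothetical helper name `gnd_sd`, evaluates $\sigma_{p}=p^{1/p}\sqrt{\Gamma(3/p)/\Gamma(1/p)}$ using only the Python standard library.

```python
import math


def gnd_sd(p):
    """Standard deviation sigma_p = p^(1/p) * sqrt(Gamma(3/p) / Gamma(1/p)) of the p-GND."""
    return p ** (1.0 / p) * math.sqrt(math.gamma(3.0 / p) / math.gamma(1.0 / p))
```

For $p=2$ the formula reduces to $\sqrt{2}\sqrt{\Gamma(3/2)/\Gamma(1/2)}=\sqrt{2}\sqrt{1/2}=1$ , matching the standard normal case, and for $p=1$ it gives $\sqrt{\Gamma(3)/\Gamma(1)}=\sqrt{2}$ .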

4.1 Computational effort

To demonstrate the computational effort of the depth calculations in practice, Figure 6 displays median computation times in milliseconds (single-core Intel Xeon E5-2650, 2.40 GHz) of the intrinsic depths of a single $(d\times d)$ -dimensional HPD matrix with respect to a sample of $n$ HPD matrices calculated with the function pdDepth() in the accompanying R-package pdSpecEst (including the intrinsic spatial depth computation times). On the left, the sample size is fixed at $n=500$ , and on the right the matrix-dimensions are fixed at $d=6$ . The displayed times are the median computation times of $100$ depth calculations for $100$ simulated samples, i.e., a total of $10^{4}$ depth calculations per scenario. The intrinsic zonoid depth requires that $d^{2}<n$ and for this reason there are several missing values in the left-hand image. Changing the default affine-invariant metric in the intrinsic depth computations to e.g., the Log-Euclidean, Cholesky, root-Euclidean or Euclidean metric –all are available in the function pdDepth()– the depth computation times are either similar to or faster than the times displayed in Figure 6.

5 Application: Confidence sets for HPD matrices

As an illustrating application of the intrinsic depth functions, we construct intrinsic matrix confidence regions in the space of HPD or SPD matrices, such as confidence regions for estimated covariance or spectral density matrices. In the context of spectral density matrix estimation, a common approach is to construct asymptotic or bootstrapped confidence regions individually for each element of the spectral matrix, as demonstrated in Dai and Guo (2004) or Fiecas and von Sachs (2014) among others. Although this is a suitable approach to assess the variability of the estimator in each of the individual matrix components, this does not allow for the construction of simultaneous confidence regions across matrix elements, as the combined elementwise confidence intervals do not take the positive definite constraints of the full matrix object into account. In contrast, the intrinsic depth regions provide a natural way to construct joint matrix confidence regions taking into account the geometric constraints of the target space. This is illustrated by the construction of depth-based confidence regions for the intrinsic mean of a sample of i.i.d. HPD random matrices.

Consider $X_{1},\ldots,X_{n}\overset{\textnormal{iid}}{\sim}\nu_{\mu}$ , with $\nu_{\mu}\in P_{2}(\mathcal{M})$ centered around a population intrinsic mean $\mu\in\mathcal{M}$ . Denote $\bar{m}$ for the sample intrinsic mean, i.e., $\bar{m}:=\arg\min_{y}\sum_{i=1}^{n}\delta_{R}(y,X_{i})^{2}$ , then the intrinsic central limit theorem in (Said et al., 2015, Proposition 11) tells us that,

[TABLE]

where $Z$ is a random Hermitian matrix, such that $Z\overset{d}{=}\sum_{i}z_{i}e^{i}$ , with $(z_{1},\ldots,z_{d^{2}})^{\prime}\sim N_{d^{2}}(\boldsymbol{0},\Lambda)$ and $(e^{1},\ldots,e^{d^{2}})$ an orthonormal basis of $T_{\mu}(\mathcal{M})$ equipped with the associated metric $\langle\cdot,\cdot\rangle_{\mu}$ .

To cast this into a standard Euclidean framework, the Euclidean logarithmic map is given by $\textnormal{Log}_{\mu}(\bar{m})=\bar{m}-\mu$ . If $\sqrt{n}(\bar{m}-\mu)=Z$ for some fixed matrices $\bar{m},\mu,Z$ , then $\mu=\bar{m}-\frac{1}{\sqrt{n}}Z$ , and in the random setting, the construction of asymptotic confidence sets for $\mu$ is straightforward based on an estimate $\bar{m}$ and knowledge of the limiting distribution of $Z$ . In a curved Riemannian manifold, if $\sqrt{n}\textnormal{Log}_{\mu}(\bar{m})=Z$ , with $\bar{m},\mu,Z$ fixed, then in general $\mu\neq\textnormal{Exp}_{\bar{m}}\big{(}-\frac{1}{\sqrt{n}}Z\big{)}$ . Instead, $\mu=\textnormal{Exp}_{\bar{m}}\big{(}-\frac{1}{\sqrt{n}}\widetilde{Z}_{\mu}\big{)}$ , where $\widetilde{Z}_{\mu}$ is the parallel transport of the matrix $Z$ from the tangent space $T_{\mu}(\mathcal{M})$ at $\mu$ to the tangent space $T_{\bar{m}}(\mathcal{M})$ at $\bar{m}$ . In the Euclidean setting $\widetilde{Z}_{\mu}=Z$ , as the parallel transport in a Euclidean or flat space equals the identity map, but on the Riemannian manifold $(\mathcal{M},g_{R})$ the parallel transport is nontrivial due to the nonzero curvature of the space and it depends on the unknown population mean $\mu$ . One working solution is to approximate the parallel transport using a plug-in estimator for $\mu$ , such as $\bar{m}$ , in which case the parallel transport is approximated by the identity map. Another approach, considered here, is to construct approximate confidence sets for the intrinsic mean through resampling, which does not require knowledge of the population mean $\mu$ . That is, (i) generate bootstrap intrinsic sample means $\bar{m}^{*}_{1},\ldots,\bar{m}^{*}_{B}$ by resampling with replacement from $X_{1},\ldots,X_{n}$ , (ii) define a percentile $100(1-\alpha)\%$ confidence region for $\mu$ in the same fashion as Yeh and Singh (1997) or Wei and Lee (2012) through the trimmed depth-region:

[TABLE]

where $\bar{\nu}^{*}_{B}$ is the empirical distribution of $\bar{m}_{1}^{*},\ldots,\bar{m}^{*}_{B}$ . First-order convergence of the percentile confidence regions to the asymptotically correct confidence regions, as $n$ and $B$ tend to infinity, follows in the same fashion as in Yeh and Singh (1997). The proof relies on the uniform continuity property P.6, satisfied by both the intrinsic zonoid and geodesic distance depth.
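The two resampling steps (i) and (ii) can be sketched as follows. This is a simplified illustration and not the pdSpecEst implementation: all function names are hypothetical, the intrinsic mean is approximated by a basic fixed-point (gradient descent) iteration, the geodesic distance depth is assumed to take the form $\exp(-\boldsymbol{E}[\delta_{R}])$ , and the lower depth bound $\beta_{*}$ is taken to be the empirical $\alpha$ -quantile of the bootstrap depth values, which is an assumption about the trimming rule.

```python
import numpy as np


def _eig_fun(h, f):
    """Apply the scalar function f to the eigenvalues of a Hermitian matrix."""
    w, v = np.linalg.eigh((h + h.conj().T) / 2.0)
    return (v * f(w)) @ v.conj().T


def riem_dist(x, y):
    s = _eig_fun(x, lambda w: w ** -0.5)
    m = s @ y @ s
    ev = np.linalg.eigvalsh((m + m.conj().T) / 2.0)
    return float(np.sqrt(np.sum(np.log(ev) ** 2)))


def intrinsic_mean(xs, iters=50, tol=1e-10):
    """Fixed-point iteration m <- Exp_m(mean_i Log_m(x_i)) for the intrinsic mean."""
    m = sum(xs) / len(xs)                         # Euclidean mean as HPD starting point
    for _ in range(iters):
        r = _eig_fun(m, np.sqrt)                  # m^{1/2}
        ri = _eig_fun(m, lambda w: w ** -0.5)     # m^{-1/2}
        g = sum(_eig_fun(ri @ x @ ri, np.log) for x in xs) / len(xs)
        m = r @ _eig_fun(g, np.exp) @ r
        m = (m + m.conj().T) / 2.0
        if np.linalg.norm(g) < tol:               # whitened gradient is numerically zero
            break
    return m


def gdd(y, sample):
    """Geodesic distance depth of y w.r.t. a sample (assumed form exp(-mean distance))."""
    return float(np.exp(-np.mean([riem_dist(y, x) for x in sample])))


def bootstrap_region(xs, B=200, alpha=0.05, seed=None):
    """Return bootstrap intrinsic means and the lower depth bound beta_*."""
    rng = np.random.default_rng(seed)
    n = len(xs)
    means = [intrinsic_mean([xs[i] for i in rng.integers(0, n, size=n)])
             for _ in range(B)]
    depths = np.array([gdd(m, means) for m in means])
    return means, float(np.quantile(depths, alpha))


def in_region(y, means, beta):
    """Membership check: y lies in the region iff its depth is at least beta_*."""
    return gdd(y, means) >= beta
```

The pairwise depth evaluation costs on the order of $B^{2}$ distance computations, so for the values of $B$ used in Table 1 a precomputed distance matrix would be preferable to the direct loop shown here.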

Remark.

Note that the depth-based confidence regions are equivariant under matrix congruence transformations of the sample $a\ast\boldsymbol{X}=\{a\ast X_{1},\ldots,a\ast X_{n}\}$ , with $a\in\textnormal{GL}(d,\mathbb{C})$ , in the sense that $\textnormal{CR}_{1-\alpha}(a\ast\boldsymbol{X})=\{a\ast x\,:\,x\in\textnormal{CR}_{1-\alpha}(\boldsymbol{X})\}$ . This is an immediate consequence of property P.1 and the fact that the intrinsic mean is general linear congruence equivariant, i.e., $\mathbb{E}_{\nu}[a\ast X]=a\ast\mathbb{E}_{\nu}[X]$ .
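The invariance underlying this remark can be checked numerically. The sketch below, with hypothetical function names, verifies that the Riemannian distance satisfies $\delta_{R}(a\ast x,a\ast y)=\delta_{R}(x,y)$ for an invertible matrix $a$ , with $a\ast x=axa^{*}$ . Any depth that depends on the data only through such distances inherits the same invariance.

```python
import numpy as np


def riem_dist(x, y):
    """Affine-invariant distance: Frobenius norm of Log(x^{-1/2} y x^{-1/2})."""
    w, v = np.linalg.eigh((x + x.conj().T) / 2.0)
    s = (v / np.sqrt(w)) @ v.conj().T
    m = s @ y @ s
    ev = np.linalg.eigvalsh((m + m.conj().T) / 2.0)
    return float(np.sqrt(np.sum(np.log(ev) ** 2)))


def congruence(a, x):
    """Congruence transformation a * x = a x a^H."""
    return a @ x @ a.conj().T


def check_invariance(d=3, seed=0):
    """Compare distances before and after a random congruence transformation."""
    rng = np.random.default_rng(seed)
    b1 = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    b2 = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    x = b1 @ b1.conj().T + np.eye(d)              # random HPD matrices
    y = b2 @ b2.conj().T + np.eye(d)
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)) + 3.0 * np.eye(d)
    return riem_dist(x, y), riem_dist(congruence(a, x), congruence(a, y))
```

The two returned distances agree up to floating-point error whenever $a$ is invertible, which holds almost surely for the random draw used here.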

Table 1 displays the empirical coverage of the percentile bootstrap confidence regions for simulated samples $X_{1},\ldots,X_{n}\overset{\textnormal{iid}}{\sim}\nu^{p}_{\textnormal{Id}}$ , with $\nu^{p}_{\textnormal{Id}}\in P_{2}(\mathcal{M})$ a centrally symmetric distribution around the identity matrix simulated from a $p$ -generalized normal distribution ( $p$ -GND) equivalent to the data generating processes in Figure 5. The column Ave.- $\beta_{*}$ displays the average lower depth confidence bounds, using the notation for $\beta_{*}$ as in eq.(5.1). The column Ave.-Size displays the distance of the center of the confidence ball to the furthest boundary, i.e., $\max_{\{i:D(\bar{m}_{i}^{*},\bar{\nu}_{B}^{*})\geq\beta_{*}\}}\delta_{R}(\bar{m},\bar{m}_{i}^{*})$ , averaged across simulations, and the coverage is the proportion of times the identity matrix has a depth value larger than or equal to the lower depth bound $\beta_{*}$ .

6 Analysis of multicenter clinical trial data

The intrinsic data depth functions provide a fast and intuitive procedure to explore samples of covariance matrices by identifying most central or most outlying covariance matrices, based on the Riemannian geometry of the space. This is illustrated by the exploratory analysis of a collection of sample covariance matrices obtained from 246 clinical centers (C1-C246), which have been anonymized for reasons of confidentiality. For each clinical center, medical analysts have recorded the height (ht), weight (wt), systolic blood pressure (systol) and diastolic blood pressure (diastol) for a number of clinical center patients. As part of a broader analysis, we explore the variability among clinical centers in terms of the second-order behavior, i.e., the variance-covariance structure, of the measured variables. On the one hand, we wish to identify outlying clinical centers to be flagged for further inspection or removal in subsequent data analysis. On the other hand, we are interested in the average or mean behavior of the sample covariance matrices across clinical centers.

Addressing the first objective, the left image in Figure 7 displays the 15 most central depth-ranked clinical centers (from left to right, with most central clinic C107) based on the geodesic distance depth applied to the collection of 246 $(4\times 4)$ -dimensional symmetric positive definite covariance matrices. The bottom rows display the six symmetric cross-correlations ht-wt, systol-ht, diastol-ht, systol-wt, diastol-wt and diastol-systol. In addition, the top rows display the four variances ht-ht, wt-wt, systol-systol and diastol-diastol, providing information about the scale of the covariance matrices. The right image in Figure 7 displays the 15 most outlying depth-ranked clinical centers (from right to left, with most outlying clinic C224) based on the geodesic distance depth in the same fashion. The center-to-outward orderings obtained via the intrinsic zonoid depth are comparable and can be found in the supplementary material. We point out that the data depth functions capture clinical centers that are outlying primarily in terms of the correlation- or covariance-structure (e.g., center C191), primarily in terms of the variance-structure (e.g., center C170), or both (e.g., center C71). Regarding the second objective, to assess the average behavior across covariance matrices, we display in Figure 8 the intrinsic sample mean of the set of 246 sample covariance matrices across clinical centers, including a $95\%$ intrinsic geodesic distance depth percentile bootstrap confidence region. Here, the left-hand image displays the four variances and the right-hand image displays the six cross-correlations analogous to the decomposition in Figure 7. The grey confidence region displays the bootstrapped sample means contained in the confidence region $\textnormal{CR}_{0.95}(\boldsymbol{X})$ . In particular, a covariance matrix $y\in\mathbb{P}_{4\times 4}$ is included in the confidence region $\textnormal{CR}_{0.95}(\boldsymbol{X})$ if and only if $\textnormal{GDD}(y,\bar{\nu}^{*}_{B})\geq\beta_{*}$ , where $\bar{\nu}^{*}_{B}$ is the empirical distribution of the bootstrapped sample means and $\beta_{*}$ denotes the lower depth-bound as in Section 5.

7 Concluding remarks

In this paper, we studied intrinsic data depth measures acting on the Riemannian manifold of symmetric or Hermitian PD matrices. The primary focus of this work is on the Riemannian manifold equipped with the affine-invariant metric, as this is the only metric that is invariant under congruence transformation of the data as described in property P.1 in Section 3.1. However, the construction of the depth functions does not fundamentally rely on the affine-invariant metric and the equivalent notions of properties P.2 to P.6 are expected to hold for any Riemannian metric that constitutes a geodesically complete manifold, such as the Log-Euclidean metric as discussed in Arsigny et al. (2006) among others. For each of the proposed intrinsic depth functions, (including the intrinsic spatial depth), the sample data depth values are straightforward to compute and remain computationally efficient also for relatively high-dimensional matrices, with implementations directly available in the R-package pdSpecEst, Chau (2017). As such, the data depths serve as an easy-to-use data exploration tool, but also provide a practical framework for inference in the context of random samples of HPD matrices, as illustrated in Section 5 through the construction of depth-based confidence regions.

Additional material available in the package pdSpecEst includes implementations of several intrinsic rank-based hypothesis tests, replacing the ordinary ranks by the depth-induced ranks analogous to Liu and Singh (1993), Chenouri and Small (2012), or (Mosler, 2002, Chapter 5) for samples of Euclidean vectors. Another interesting application of the intrinsic data depth is depth-based classification or clustering for groups or samples of covariance matrices analogous to e.g., Li et al. (2012). To conclude, Hermitian or symmetric positive definite matrices play an important role in many different fields of statistical research, see Pennec et al. (2006), and it is of interest to apply the intrinsic data depths in other contexts than demonstrated in this paper. For instance, applied to diffusion tensor imaging, the depth functions show potential for fast detection of anomalies or artifacts in large collections of SPD diffusion tensors.

Acknowledgements

The authors gratefully acknowledge the financial support from the following agencies and projects: the Belgian Fund for Scientific Research FRIA/FRS-FNRS (J. Chau), the contract ‘Projet d’Actions de Recherche Concertées’ (ARC) No. 12/17-045 of the ‘Communauté française de Belgique’ (J. Chau and R. von Sachs), IAP research network P7/06 of the Belgian government (R. von Sachs), the US National Science Foundation and KAUST (H. Ombao). We thank Lieven Desmet and the SMCS/UCL for providing access to the anonymized clinical trial data. Computational resources have been provided by the supercomputing facilities of the CISM/UCL and the CÉCI funded by the FRS-FNRS under convention 2.5020.11.

8 Appendix I: Proofs

8.1 Proof of Theorem 3.1

Proof.

P.1 This is a direct consequence of the claim that the following two events are equivalent:

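$$\{\boldsymbol{0}_{d\times d}\in D_{\alpha}(\zeta_{y})\}\quad\Longleftrightarrow\quad\{\boldsymbol{0}_{d\times d}\in D_{\alpha}(\zeta_{a,y})\},\qquad\alpha\in[0,1],\tag{8.1}$$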

with $\zeta_{y}$ the probability measure of $\textnormal{Log}_{y}(X)$ and $\zeta_{a,y}$ the probability measure of $\textnormal{Log}_{a\ast y}(a\ast X)$ , where $X$ has probability measure $\nu$ . Here, the Euclidean zonoid trimmed region $D_{\alpha}(\zeta_{y})$ is represented as a set of $(d\times d)$ -dimensional Hermitian matrices, instead of an equivalent set of $d^{2}$ -dimensional real basis component vectors, as in Section 3.2, and $\boldsymbol{0}_{d\times d}$ is the zero matrix. For $\alpha=0$ , the equivalence in eq.(8.1) is true by definition, since $D_{0}(\zeta_{y})=D_{0}(\zeta_{a,y})=\mathbb{R}^{d\times d}$ .

Suppose that $\boldsymbol{0}_{d\times d}\in D_{\alpha}(\zeta_{y})$ for some $0<\alpha\leq 1$ . Noting that $T_{y}(\mathcal{M})$ can be identified with the real vector space of Hermitian matrices $\mathcal{H}$ for each $y\in\mathcal{M}$ , by definition of the zonoid $\alpha$ -trimmed region, there exists a measurable function $\tilde{g}:\mathcal{H}\to[0,\frac{1}{\alpha}]$ , such that,

[TABLE]

It is straightforward to verify that for each $a\in\textnormal{GL}(d,\mathbb{C})$ and $x,y\in\mathcal{M}$ , $\textnormal{Log}_{a\ast y}(a\ast x)=a\ast\textnormal{Log}_{y}(x)$ . Define $g(z)=\tilde{g}(a^{-1}\ast z)$ , then $g:\mathcal{H}\to[0,\frac{1}{\alpha}]$ is a measurable function such that,

[TABLE]

and,

[TABLE]

Therefore $\boldsymbol{0}_{d\times d}\in D_{\alpha}(\zeta_{a,y})$ . The other direction follows by a similar argument, using that $a\neq\boldsymbol{0}_{d\times d}$ .
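The equivariance of the logarithmic map under congruence, on which the argument for P.1 rests, can be verified numerically. The sketch assumes the standard expression $\textnormal{Log}_{y}(x)=y^{1/2}\ast\textnormal{Log}(y^{-1/2}\ast x)$ for the affine-invariant metric; the function names and the example matrices are hypothetical.

```python
import numpy as np

def sym_fun(x, f):
    """Apply the scalar function f to the eigenvalues of a Hermitian matrix."""
    w, v = np.linalg.eigh(x)
    return (v * f(w)) @ v.conj().T

def congruence(a, x):
    """Congruence transformation a * x := a x a^*."""
    return a @ x @ a.conj().T

def log_map(y, x):
    """Log_y(x) = y^{1/2} * Log(y^{-1/2} * x)."""
    r = sym_fun(y, np.sqrt)
    r_inv = sym_fun(y, lambda w: 1.0 / np.sqrt(w))
    return congruence(r, sym_fun(congruence(r_inv, x), np.log))

a = np.array([[1.0, 1.0j], [0.0, 2.0]])                 # element of GL(2, C), det = 2
y = np.array([[2.0, 1.0j], [-1.0j, 2.0]])               # Hermitian PD, eigenvalues 1 and 3
x = np.array([[3.0, 1.0], [1.0, 1.0]], dtype=complex)   # Hermitian PD, det = 2

lhs = log_map(congruence(a, y), congruence(a, x))        # Log_{a*y}(a*x)
rhs = congruence(a, log_map(y, x))                       # a * Log_y(x)
```

Both sides agree up to floating-point error, and the result is again a Hermitian matrix, i.e., an element of the tangent space at $a\ast y$.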

P.2 The zonoid trimmed region $D_{1}(\zeta_{y})$ contains the single point $\boldsymbol{E}_{\nu}[\textnormal{Log}_{y}(X)]$ by construction. The deepest point $y\in\mathcal{M}$ is therefore characterized by the point that satisfies $\boldsymbol{E}_{\nu}[\textnormal{Log}_{y}(X)]=\boldsymbol{0}_{d\times d}$ . By eq.(2.4) in the main document, on the Riemannian manifold $\mathcal{M}$ with $\nu\in P_{2}(\mathcal{M})$ , this point is the uniquely existing geometric expectation of the distribution $\nu$ .
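The characterization $\boldsymbol{E}_{\nu}[\textnormal{Log}_{y}(X)]=\boldsymbol{0}_{d\times d}$ of the deepest point suggests a simple fixed-point scheme for an empirical distribution. The following sketch uses a plain gradient iteration with unit step size, which is an assumption for illustration and not necessarily the algorithm behind `pdMean()`; the function names and the four sample matrices are hypothetical.

```python
import numpy as np

def sym_fun(x, f):
    """Apply the scalar function f to the eigenvalues of a symmetric/Hermitian matrix."""
    w, v = np.linalg.eigh(x)
    return (v * f(w)) @ v.conj().T

def congruence(a, x):
    """Congruence transformation a * x := a x a^*."""
    return a @ x @ a.conj().T

def log_map(y, x):
    """Log_y(x) = y^{1/2} * Log(y^{-1/2} * x)."""
    r = sym_fun(y, np.sqrt)
    r_inv = sym_fun(y, lambda w: 1.0 / np.sqrt(w))
    return congruence(r, sym_fun(congruence(r_inv, x), np.log))

def exp_map(y, h):
    """Exp_y(h) = y^{1/2} * Exp(y^{-1/2} * h)."""
    r = sym_fun(y, np.sqrt)
    r_inv = sym_fun(y, lambda w: 1.0 / np.sqrt(w))
    return congruence(r, sym_fun(congruence(r_inv, h), np.exp))

def intrinsic_mean(sample, n_iter=100):
    """Iterate y <- Exp_y(average of Log_y(x_i)) starting from the first observation."""
    y = sample[0]
    for _ in range(n_iter):
        grad = sum(log_map(y, x) for x in sample) / len(sample)
        y = exp_map(y, grad)
    return y

sample = [
    np.array([[2.0, 0.5], [0.5, 1.0]]),
    np.array([[1.0, -0.3], [-0.3, 3.0]]),
    np.array([[1.5, 0.2], [0.2, 0.5]]),
    np.eye(2),
]
mu = intrinsic_mean(sample)
mean_log = sum(log_map(mu, x) for x in sample) / len(sample)
residual = float(np.linalg.norm(mean_log))   # close to zero at the intrinsic mean
```

At the fixed point the average of the tangent vectors $\textnormal{Log}_{\mu}(X_{i})$ vanishes, which is the empirical version of the condition used in P.2.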

P.3 Using the equivalent definition $\textnormal{ZD}_{\mathcal{M}}(\nu,y)=\sup\{\alpha:y\in D_{\alpha}^{\mathcal{M}}(\nu)\}$ , by construction $D_{\alpha}^{\mathcal{M}}(\nu)$ is a geodesically convex set that contains the geometric mean $\mu:=\mathbb{E}_{\nu}[X]$ for each $\alpha\in[0,1]$ . Also, $D_{\alpha_{1}}^{\mathcal{M}}(\nu)\subseteq D_{\alpha_{2}}^{\mathcal{M}}(\nu)$ for each $1\geq\alpha_{1}\geq\alpha_{2}\geq 0$ . Combining the above arguments, it follows that a geodesic curve $\textnormal{Exp}_{\mu}(th)$ , with $t\geq 0$ increasing, has monotone non-increasing depth as it moves further away from the center $\mu$ .

P.4 With the same notation as above, for $\alpha\in(0,1]$ we claim that the sets $D_{\alpha}^{\mathcal{M}}(\nu)$ are closed and bounded, and therefore also compact by the Hopf-Rinow theorem. The fact that the sets are closed follows directly from the definition of $D_{\alpha}^{\mathcal{M}}(\nu)$ . The fact that they are bounded is seen as follows; for $\alpha>0$ , by construction $D_{\alpha}^{\mathcal{M}}(\nu)\subset\mathcal{M}$ . Therefore, if $y\in D_{\alpha}^{\mathcal{M}}(\nu)$ , necessarily $\delta_{R}(\textnormal{Id},y)<\infty$ , which follows by the fact that both Id and $y$ are elements of $\mathcal{M}$ , combined with (Bhatia, 2009, Theorem 6.1.6). Let $(y_{n})_{n\in\mathbb{N}}$ be an unbounded sequence, such that $\|\textnormal{Log}(y_{n})\|_{F}\to\infty$ as $n\to\infty$ . The divergence $\|\textnormal{Log}(y_{n})\|_{F}\to\infty$ implies in particular also that $\delta_{R}(\textnormal{Id},y_{n})\to\infty$ , which violates the boundedness (or compactness) of $D_{\alpha}^{\mathcal{M}}(\nu)$ for $\alpha\in(0,1]$ , and therefore we must have $\lim_{n\to\infty}\textnormal{ZD}_{\mathcal{M}}(\nu,y_{n})=\lim_{n\to\infty}\sup\{\alpha:y_{n}\in D^{\mathcal{M}}_{\alpha}(\nu)\}=0$ . ∎

8.2 Proof of Lemma 3.2

Proof.

By definition, the intrinsic zonoid trimmed regions are given by $D_{\alpha}^{\mathcal{M}}(\nu)=\{y\in\mathcal{M}:\boldsymbol{0}_{d\times d}\in D_{\alpha}(\zeta_{y})\}$ , with $D_{\alpha}(\zeta_{y})$ as in eq.(8.1). The distribution $\zeta_{y}$ has finite first moment with respect to the Riemannian metric in $T_{y}(\mathcal{M})$ , since

[TABLE]

using eq.(2.3) in the main document and the fact that $\nu\in P_{2}(\mathcal{M})\subset P_{1}(\mathcal{M})$ . By (Mosler, 2002, Theorem 3.13) for a probability measure $\zeta_{y}$ defined on $T_{y}(\mathcal{M})\cong\mathbb{R}^{d^{2}}$ with finite first moments,

[TABLE]

where $\textnormal{conv}_{T_{y}(\mathcal{M})}(\zeta_{y})$ denotes the convex hull of the support of $\zeta_{y}$ in $T_{y}(\mathcal{M})\cong\mathbb{R}^{d^{2}}$ , based on the Riemannian metric on $T_{y}(\mathcal{M})$ , i.e., a rescaled Euclidean metric. Using the above result, we write out:

[TABLE]

where the last step follows by definition of $\textnormal{conv}(\nu)$ as the geodesic convex hull of the support of $\nu$ on the manifold. ∎

8.3 Proof of Theorem 3.3

8.3.1 Continuity in $y$ (P.5)

Proof.

We argue that the map $y\mapsto\textnormal{ZD}_{\mathcal{M}}(\nu,y)$ is both upper- and lower-semicontinuous for $y\in\textnormal{conv}(\nu)$ .

Upper-semicontinuity: the map is upper-semicontinuous if and only if for each $\alpha\in[0,1]$ the sets $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)<\alpha\}$ are open in $\textnormal{conv}(\nu)$ or equivalently the sets $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)\geq\alpha\}$ are closed in $\textnormal{conv}(\nu)$ . If $\alpha=0$ , $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)\geq\alpha\}=\textnormal{conv}(\nu)$ , and $\textnormal{conv}(\nu)$ is closed in itself. If $\alpha>0$ , note that we can rewrite $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)\geq\alpha\}=\{y\in\textnormal{conv}(\nu):y\in D^{\mathcal{M}}_{\alpha}(\nu)\}$ , since on the one hand, if $y\in D^{\mathcal{M}}_{\alpha}(\nu)$ , then $\textnormal{ZD}_{\mathcal{M}}(\nu,y)=\sup\{\beta:y\in D^{\mathcal{M}}_{\beta}(\nu)\}\geq\alpha$ , and on the other hand, if $\textnormal{ZD}_{\mathcal{M}}(\nu,y)=\beta\geq\alpha$ , then $y\in D^{\mathcal{M}}_{\beta}(\nu)\subseteq D^{\mathcal{M}}_{\alpha}(\nu)$ by nestedness of the intrinsic zonoid trimmed regions. For each $\alpha>0$ , by construction $D_{\alpha}^{\mathcal{M}}(\nu)$ is closed, therefore $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)\geq\alpha\}$ is also closed.

Lower-semicontinuity: the map is lower-semicontinuous if and only if for each $\alpha\in[0,1]$ the sets $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)\leq\alpha\}$ are closed in $\textnormal{conv}(\nu)$ or equivalently the sets $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)>\alpha\}$ are open in $\textnormal{conv}(\nu)$ . If $\alpha=1$ , $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)>\alpha\}=\emptyset$ , and the empty set is open in $\textnormal{conv}(\nu)$ . If $\alpha=0$ , $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)>\alpha\}=\textnormal{conv}(\nu)$ by Lemma 3.2, and $\textnormal{conv}(\nu)$ is open in itself. If $0<\alpha<1$ , note that we can rewrite $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)>\alpha\}=\{y\in\textnormal{conv}(\nu):y\in D^{\mathcal{M}}_{\alpha+}(\nu)\}$ , where,

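$$D^{\mathcal{M}}_{\alpha+}(\nu):=\left\{y\in\mathcal{M}\,:\,\boldsymbol{0}_{d\times d}=\int_{\mathcal{H}}z\,g(z)\>d\zeta_{y}(z),\ \ g:\mathcal{H}\to[0,\tfrac{1}{\alpha}),\ \ \int_{\mathcal{H}}g(z)\>d\zeta_{y}(z)=1\right\},$$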

with $g$ measurable. To see that the set-equivalence is true: on the one hand, if $y\in D^{\mathcal{M}}_{\alpha+}(\nu)$ , then $\textnormal{ZD}_{\mathcal{M}}(\nu,y)=\sup\{\beta:y\in D_{\beta}^{\mathcal{M}}(\nu)\}>\alpha$ , since $[0,1/\alpha)\subset[0,1/\alpha]$ . On the other hand, if $\textnormal{ZD}_{\mathcal{M}}(\nu,y)=\beta>\alpha$ , take $\epsilon>0$ sufficiently small such that $\beta>\beta-\epsilon>\alpha$ , then $[0,\frac{1}{\beta}]\subset[0,\frac{1}{\beta-\epsilon})\subset[0,\frac{1}{\alpha})$ . As a consequence, $y\in D^{\mathcal{M}}_{\beta}(\nu)\subseteq D^{\mathcal{M}}_{\alpha+}(\nu)$ by nestedness of the intrinsic zonoid trimmed regions. For $0<\alpha<1$ , distinguish between two cases: (i) $D^{\mathcal{M}}_{\alpha+}(\nu)=\textnormal{conv}(\nu)$ , then the set is open as $\textnormal{conv}(\nu)$ is open in itself, (ii) $D^{\mathcal{M}}_{\alpha+}(\nu)\subset\textnormal{conv}(\nu)$ . In this case, writing $r\partial D^{\mathcal{M}}_{\alpha+}(\nu)$ for the relative boundary of the geodesic convex set $D_{\alpha+}^{\mathcal{M}}(\nu)$ in $\textnormal{conv}(\nu)$ , we note that $r\partial D^{\mathcal{M}}_{\alpha+}(\nu)=r\partial D^{\mathcal{M}}_{\alpha}(\nu)$ . Here, the relative boundary of $D_{\alpha}^{\mathcal{M}}(\nu)$ is characterized by those points in $D_{\alpha}^{\mathcal{M}}(\nu)$ for which the weighting function $g$ attains the maximum value $\frac{1}{\alpha}$ . Since $D^{\mathcal{M}}_{\alpha+}(\nu)\cap r\partial D^{\mathcal{M}}_{\alpha+}(\nu)=D^{\mathcal{M}}_{\alpha+}(\nu)\cap r\partial D^{\mathcal{M}}_{\alpha}(\nu)=\emptyset$ , it follows that $D^{\mathcal{M}}_{\alpha+}(\nu)$ is open. By combining the above arguments, we conclude that $\{y\in\textnormal{conv}(\nu):\textnormal{ZD}_{\mathcal{M}}(\nu,y)>\alpha\}$ is open for each $\alpha\in[0,1]$ .

Since the map $y\mapsto\textnormal{ZD}_{\mathcal{M}}(\nu,y)$ is both upper- and lower-semicontinuous on $\textnormal{conv}(\nu)$ it is also continuous on $\textnormal{conv}(\nu)$ . ∎

8.3.2 Uniform continuity in $\nu$ (P.6)

Proof.

Pointwise convergence of depths: first, we show pointwise convergence of $\textnormal{ZD}_{\mathcal{M}}(\nu_{n},y)$ to $\textnormal{ZD}_{\mathcal{M}}(\nu,y)$ for each $y\in\textnormal{rint}(\textnormal{conv}(\nu))$ , where $\textnormal{rint}(\textnormal{conv}(\nu))$ denotes the relative interior of the geodesic convex set $\textnormal{conv}(\nu)$ . We note that $y\in\textnormal{rint}(\textnormal{conv}(\nu))$ if and only if $\boldsymbol{0}_{d\times d}\in\textnormal{rint}(\textnormal{conv}_{T_{y}(\mathcal{M})}(\zeta_{y}))$ , where $\textnormal{conv}_{T_{y}(\mathcal{M})}(\zeta_{y})$ is the convex hull of the support of $\zeta_{y}$ in $T_{y}(\mathcal{M})$ as in the proof of Lemma 3.2. This is seen as follows: by Lemma 3.2, $y\in\textnormal{conv}(\nu)$ if and only if $\exists\>\alpha>0$ , such that $y\in D_{\alpha}^{\mathcal{M}}(\nu)$ , but this is equivalent to $\boldsymbol{0}_{d\times d}\in D_{\alpha}(\zeta_{y})$ which holds if and only if $\boldsymbol{0}_{d\times d}\in\textnormal{conv}_{T_{y}(\mathcal{M})}(\zeta_{y})$ by (Mosler, 2002, Theorem 3.13). Since the sets $\{y:y\in\textnormal{conv}(\nu)\}$ and $\{y:\boldsymbol{0}_{d\times d}\in\textnormal{conv}_{T_{y}(\mathcal{M})}(\zeta_{y})\}$ are equivalent their relative interiors are equivalent as well. By Definition 3.1, $\textnormal{ZD}_{\mathcal{M}}(\nu_{n},y)=\textnormal{ZD}_{\mathbb{R}^{d^{2}}}(\zeta^{n}_{y},\vec{0})$ , where $\zeta^{n}_{y}$ is the distribution of $\textnormal{Log}_{y}(X)$ as a $d^{2}$ -dimensional real basis component vector, with $X\sim\nu_{n}$ , such that $\zeta_{y}^{n}\overset{w}{\to}\zeta_{y}$ . Similarly, $\textnormal{ZD}_{\mathcal{M}}(\nu,y)=\textnormal{ZD}_{\mathbb{R}^{d^{2}}}(\zeta_{y},\vec{0})$ . 
By the same argument as in the proof of Lemma 3.2, we know that $\zeta^{n}_{y},\zeta_{y}\in P_{1}(T_{y}(\mathcal{M}))$ for each $n\in\mathbb{N}$ , where $P_{1}(T_{y}(\mathcal{M}))$ denotes the set of probability measures on $T_{y}(\mathcal{M})$ with finite first moment, i.e., if $\zeta\in P_{1}(T_{y}(\mathcal{M}))$ then $\int_{T_{y}(\mathcal{M})}\|z\|_{y}\>d\zeta_{y}(z)<\infty$ . Furthermore, the sequence of measures $(\zeta^{n}_{y})_{n\in\mathbb{N}}$ is uniformly integrable with respect to the Riemannian metric in $T_{y}(\mathcal{M})$ , since for any $y\in\mathcal{M}$ ,

[TABLE]

By (Mosler, 2002, Theorem 4.6), under these conditions, for $y\in\textnormal{rint}(\textnormal{conv}(\nu))$ or equivalently $\boldsymbol{0}_{d\times d}\in\textnormal{rint}(\textnormal{conv}_{T_{y}(\mathcal{M})}(\zeta_{y}))$ , it follows that,

[TABLE]

Uniform convergence of depths: uniform depth convergence now follows from the pointwise depth convergence above by a generalized version of the proof of (Dyckerhoff, 2016, Theorem 4.8) for the complete metric space $(\mathcal{M},\delta_{R})$ , using Lemma 3.2 and the fact that $\textnormal{ZD}_{\mathcal{M}}(\nu,y)$ is a normed geodesically convex depth, continuous in $y$ by the first part of Theorem 3.3. Since the proof is completely analogous to the proof of (Dyckerhoff, 2016, Theorem 4.8), we omit the details here. Note that the only required modification is to replace the Euclidean metric space by the complete metric space $(\mathcal{M},\delta_{R})$ . In particular, Euclidean open balls, convex sets and convergence are replaced by geodesic open balls, geodesic convex sets and convergence in the Riemannian distance function respectively.

By the generalized proof of (Dyckerhoff, 2016, Theorem 4.8), the depths $(\textnormal{ZD}_{\mathcal{M}}(\nu_{n},y_{0}))_{n\in\mathbb{N}}$ are continuously convergent for $y_{0}\in\textnormal{rint}(\textnormal{conv}(\nu))$ . That is, for $y_{n}\to y_{0}$ in the sense that $\delta_{R}(y_{n},y_{0})\to 0$ , also $\lim_{n\to\infty}\textnormal{ZD}_{\mathcal{M}}(\nu_{n},y_{n})=\textnormal{ZD}_{\mathcal{M}}(\nu,y_{0})$ . By (Dyckerhoff, 2016, Proposition A.1), since $\mathcal{M}$ is a metric space, continuous convergence of the depths implies compact convergence, i.e., for every compact set $M\subseteq\textnormal{rint}(\textnormal{conv}(\nu))$ ,

[TABLE]

Consequently, by (Dyckerhoff, 2016, Theorem 4.4), compact convergence implies uniform convergence, since the arguments in the proof of (Dyckerhoff, 2016, Theorem 4.4) continue to hold for the intrinsic zonoid depth defined on the complete metric space $\mathcal{M}$ , where closed and bounded subsets are compact. ∎

8.4 Proof of Theorem 3.4 and Proposition 3.5

Proof.

Properties P.1–P.4 follow directly by Theorem 3.1, using the definition of the depth as the integrated pointwise zonoid depth (integrated over $t\in\mathcal{I}$ ).

For the first part (P.5) of Proposition 3.5: using that $\sup_{t\in\mathcal{I}}\delta_{R}(y_{n}(t),y(t))\to 0$ , by the first part of Theorem 3.3, $\textnormal{ZD}_{\mathcal{M}}(\nu(t),y_{n}(t))\to\textnormal{ZD}_{\mathcal{M}}(\nu(t),y(t))$ uniformly over $t\in\mathcal{I}$ . By definition of the integrated intrinsic zonoid depth also,

[TABLE]

by the pointwise convergence and the fact that the depth function $\textnormal{ZD}_{\mathcal{M}}(\cdot,\cdot)\in[0,1]$ is bounded.

For the second part (P.6) in Proposition 3.5: under the given assumptions, by the second part of Theorem 3.3,

[TABLE]

and similarly as above,

[TABLE]

using the pointwise convergence and the fact that the depth function $\textnormal{ZD}_{\mathcal{M}}(\cdot,\cdot)\in[0,1]$ is bounded. ∎

8.5 Proof of Theorem 3.6

Proof.

P.1 This follows directly from the definition of the depth by the fact that the map $x\mapsto a\ast x$ with $a\in\textnormal{GL}(d,\mathbb{C})$ is distance preserving, i.e., $\delta_{R}(a\ast x,a\ast y)=\delta_{R}(x,y)$ for each $x,y\in\mathcal{M}$ .

P.2 Since $\int_{\mathcal{M}}\delta_{R}(y,x)\>\nu(dx)\geq 0$ and $\exp(-z)$ is strictly decreasing in $z\geq 0$ , the point of maximum depth is attained at $y=\arg\min_{z\in\textnormal{supp}(\nu)}\int_{\mathcal{M}}\delta_{R}(z,x)\>\nu(dx)$ . By eq.(2.5) in the main document, on the Riemannian manifold $\mathcal{M}$ with $\nu\in P_{1}(\mathcal{M})$ , this point is the uniquely existing geometric median of the distribution $\nu$ .

P.3 By the proof of (Fletcher et al., 2009, Theorem 1) and an application of Leibniz’s integral rule, $y\mapsto\boldsymbol{E}_{\nu}[\delta_{R}(y,X)]$ is a (strictly) convex function, and by P.2 it attains its unique minimum at $m:=\textnormal{GM}_{\nu}(X)$ . This implies that $\boldsymbol{E}_{\nu}[\delta_{R}(\textnormal{Exp}_{m}(th),X)]$ is a nondecreasing function for $t\geq 0$ , where $\textnormal{Exp}_{m}(th)$ is a geodesic curve emanating from $m$ with unit tangent vector $h$ . As a consequence $\textnormal{GDD}(\nu,\textnormal{Exp}_{m}(th))=\exp\left(-\boldsymbol{E}_{\nu}[\delta_{R}(\textnormal{Exp}_{m}(th),X)]\right)$ is monotone non-increasing for $t\geq 0$ .
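Properties P.2 and P.3 can be explored numerically for an empirical distribution. The sketch below approximates the intrinsic median by a Weiszfeld-type iteration in the spirit of Fletcher et al. (2009), with a backtracking safeguard added here as an assumption, and then evaluates the depth along one geodesic leaving the median. The function names, the sample, and the direction `u` are hypothetical.

```python
import numpy as np

def sym_fun(x, f):
    """Apply the scalar function f to the eigenvalues of a symmetric/Hermitian matrix."""
    w, v = np.linalg.eigh(x)
    return (v * f(w)) @ v.conj().T

def congruence(a, x):
    """Congruence transformation a * x := a x a^*."""
    return a @ x @ a.conj().T

def roots(y):
    """Return y^{1/2} and y^{-1/2}."""
    return sym_fun(y, np.sqrt), sym_fun(y, lambda w: 1.0 / np.sqrt(w))

def log_map(y, x):
    r, r_inv = roots(y)
    return congruence(r, sym_fun(congruence(r_inv, x), np.log))

def exp_map(y, h):
    r, r_inv = roots(y)
    return congruence(r, sym_fun(congruence(r_inv, h), np.exp))

def riem_dist(x, y):
    _, r_inv = roots(x)
    lam = np.linalg.eigvalsh(congruence(r_inv, y))
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def mean_dist(y, sample):
    """Empirical version of E[delta_R(y, X)]."""
    return float(np.mean([riem_dist(y, x) for x in sample]))

def intrinsic_median(sample, n_iter=200):
    """Weiszfeld-type iteration; the step is halved until the objective does not increase."""
    y = sum(sample) / len(sample)            # Euclidean average, a PD starting point
    for _ in range(n_iter):
        pairs = [(log_map(y, x), riem_dist(y, x)) for x in sample]
        pairs = [(h, d) for h, d in pairs if d > 1e-12]
        step = sum(h / d for h, d in pairs) / sum(1.0 / d for _, d in pairs)
        t = 1.0
        while t > 1e-8 and mean_dist(exp_map(y, t * step), sample) > mean_dist(y, sample):
            t *= 0.5
        y = exp_map(y, t * step)
    return y

sample = [
    np.array([[2.0, 0.5], [0.5, 1.0]]),
    np.array([[1.0, -0.3], [-0.3, 3.0]]),
    np.array([[1.5, 0.2], [0.2, 0.5]]),
    np.eye(2),
    np.array([[0.8, 0.1], [0.1, 1.2]]),
]
m = intrinsic_median(sample)

u = np.array([[1.0, 0.5], [0.5, -1.0]])
u = u / np.linalg.norm(u)
r, _ = roots(m)
h = congruence(r, u)                         # unit tangent vector at m
ts = np.linspace(0.0, 3.0, 13)
depth_along = np.array([np.exp(-mean_dist(exp_map(m, t * h), sample)) for t in ts])
```

Up to the accuracy of the approximate median, the values in `depth_along` should not increase with $t$, in line with P.3.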

P.4 Let $(y_{n})_{n\in\mathbb{N}}$ be an unbounded sequence such that $\|\textnormal{Log}(y_{n})\|_{F}\to\infty$ as $n\to\infty$ , then also $\delta_{R}(y_{n},x)=\|\textnormal{Log}(x^{-1/2}\ast y_{n})\|_{F}\to\infty$ for each $x\in\mathcal{M}$ , and as a consequence $\textnormal{GDD}(\nu,y_{n})=\exp(-\boldsymbol{E}_{\nu}[\delta_{R}(y_{n},X)])\to 0$ . ∎
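The passage from the pointwise divergence of $\delta_{R}(y_{n},x)$ to the divergence of its expectation can be made explicit with the reverse triangle inequality. The following bound is a sketch added for illustration, under the standing assumption $\nu\in P_{1}(\mathcal{M})$ , and is not taken from the original proof:

$$\boldsymbol{E}_{\nu}[\delta_{R}(y_{n},X)]\ \geq\ \delta_{R}(y_{n},\textnormal{Id})-\boldsymbol{E}_{\nu}[\delta_{R}(\textnormal{Id},X)]\ =\ \|\textnormal{Log}(y_{n})\|_{F}-\boldsymbol{E}_{\nu}[\delta_{R}(\textnormal{Id},X)]\ \longrightarrow\ \infty.$$

Since $\boldsymbol{E}_{\nu}[\delta_{R}(\textnormal{Id},X)]<\infty$ for $\nu\in P_{1}(\mathcal{M})$ , this gives $\textnormal{GDD}(\nu,y_{n})\leq\exp\left(\boldsymbol{E}_{\nu}[\delta_{R}(\textnormal{Id},X)]-\|\textnormal{Log}(y_{n})\|_{F}\right)\to 0$ .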

8.6 Proof of Theorem 3.7

8.6.1 Continuity in $y$ (P.5)

Proof.

First, suppose that $(y_{n})_{n\in\mathbb{N}}$ is an unbounded sequence with $\|\textnormal{Log}(y_{n})\|_{F}\to\infty$ as $n\to\infty$ , i.e., $y_{n}\to y$ , where $y$ is a singular matrix. Since $\textnormal{GDD}(\nu,y)=0$ , by P.4 in Theorem 3.6, $\lim_{n\to\infty}\textnormal{GDD}(\nu,y_{n})=\textnormal{GDD}(\nu,y)$ . Second, suppose that $(y_{n})_{n\in\mathbb{N}}$ is a bounded sequence, i.e., $\sup_{n\in\mathbb{N}}\|\textnormal{Log}(y_{n})\|_{F}=\sup_{n\in\mathbb{N}}\delta_{R}(y_{n},\textnormal{Id})<\infty$ . Since $\nu\in P_{1}(\mathcal{M})$ , there exists a $y_{0}\in\mathcal{M}$ such that $\int_{\mathcal{M}}\delta_{R}(y_{0},x)\>\nu(dx)<\infty$ . By the triangle inequality,

[TABLE]

using that $\delta_{R}(y_{0},\textnormal{Id})<\infty$ as both Id and $y_{0}$ are elements of $\mathcal{M}$ , (see (Bhatia, 2009, Theorem 6.1.6)). We show continuity directly from the definition of the geodesic distance depth. The function $z\mapsto\exp(-z)$ is continuous in $z$ , also the function $z\mapsto\delta_{R}(z,x)$ is continuous in $z$ , since $\delta_{R}(z,x)=\|\textnormal{Log}(x^{-1/2}\ast z)\|_{F}$ is a composition of continuous functions. Furthermore, by the dominated convergence theorem, $\lim_{n\to\infty}\int_{\mathcal{M}}\delta_{R}(y_{n},x)\>\nu(dx)=\int_{\mathcal{M}}\lim_{n\to\infty}\delta_{R}(y_{n},x)\>\nu(dx)$ , since $\int_{\mathcal{M}}\sup_{n\in\mathbb{N}}\delta_{R}(y_{n},x)\>\nu(dx)<\infty$ . Combining these arguments, $\lim_{n\to\infty}\textnormal{GDD}(\nu,y_{n})=\textnormal{GDD}(\nu,y)$ . ∎

8.6.2 Uniform continuity in $\nu$ (P.6)

Proof.

We start by noting that the uniform integrability condition implies in particular that $\nu_{n}\in P_{1}(\mathcal{M})$ for each $n\in\mathbb{N}$ . Also, since $z\mapsto\delta_{R}(y,z)$ is continuous in $z$ , by the continuous mapping theorem $\delta_{R}(y,X_{n})\overset{d}{\to}\delta_{R}(y,X)$ , with $X_{n}\sim\nu_{n}$ and $X\sim\nu$ , and by Vitali’s convergence theorem $\int_{\mathcal{M}}\delta_{R}(y,x)\>\nu_{n}(dx)\to\int_{\mathcal{M}}\delta_{R}(y,x)\>\nu(dx)$ for any $y\in\mathcal{M}$ . Note that the convergence implies in particular also that $\nu\in P_{1}(\mathcal{M})$ . For two measures $\mu,\nu\in P_{1}(\mathcal{M})$ define their $L^{1}$ -Wasserstein distance as:

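$$W_{1}(\mu,\nu):=\inf_{\gamma\in\Gamma(\mu,\nu)}\int_{\mathcal{M}\times\mathcal{M}}\delta_{R}(x,y)\>\gamma(dx,dy),$$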

where $\Gamma(\mu,\nu)$ denotes the collection of all probability measures on $\mathcal{M}\times\mathcal{M}$ with marginal measures $\mu$ and $\nu$ . Substituting $\mu=\delta_{y}$ , the point measure in $y$ , it follows that $W_{1}(\delta_{y},\nu)=\int_{\mathcal{M}}\delta_{R}(y,x)\ \nu(dx)$ . Therefore, a sufficient condition to ensure uniform convergence in $y\in\mathcal{M}$ of $\int_{\mathcal{M}}\delta_{R}(y,x)\>\nu_{n}(dx)$ to $\int_{\mathcal{M}}\delta_{R}(y,x)\>\nu(dx)$ , is $W_{1}(\nu_{n},\nu)\to 0$ , since

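$$\sup_{y\in\mathcal{M}}\left|\int_{\mathcal{M}}\delta_{R}(y,x)\>\nu_{n}(dx)-\int_{\mathcal{M}}\delta_{R}(y,x)\>\nu(dx)\right|=\sup_{y\in\mathcal{M}}\left|W_{1}(\delta_{y},\nu_{n})-W_{1}(\delta_{y},\nu)\right|\leq W_{1}(\nu_{n},\nu),$$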

where the last step follows by the reverse triangle inequality for the $L^{1}$ -Wasserstein distance. The manifold $\mathcal{M}$ is a complete separable metric space, and therefore by (Villani, 2009, Theorem 6.9) a necessary and sufficient condition for $W_{1}(\nu_{n},\nu)\to 0$ is that the sequence of probability measures $\nu_{n}$ converges weakly in $P_{1}(\mathcal{M})$ to $\nu$ , i.e., (i) $\nu_{n}\overset{w}{\to}\nu$ and (ii) $\int_{\mathcal{M}}\delta_{R}(y,x)\ \nu_{n}(dx)\to\int_{\mathcal{M}}\delta_{R}(y,x)\ \nu(dx)$ for any $y\in\mathcal{M}$ . Condition (i) holds by assumption, and condition (ii) has already been shown above.

The function $z\mapsto\exp(-z)$ is uniformly continuous for $z\geq 0$ , therefore the uniform convergence of the geodesic distance depth follows as well since,

[TABLE]

∎
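For two empirical measures of equal size, the uniform bound behind P.6 can be checked directly. Since $z\mapsto\exp(-z)$ is 1-Lipschitz on $z\geq 0$ , the depth gap at any $y$ is bounded by $W_{1}(\nu_{n},\nu)$ , which in turn is bounded by the average distance under any pairing of the two samples. The sketch below uses the identity pairing, which only gives an upper bound on $W_{1}$ and not its exact value; the function names, the perturbation scheme, and the toy data are hypothetical.

```python
import numpy as np

def spd_power(x, p):
    """Power x**p of a symmetric PD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(x)
    return (v * w**p) @ v.conj().T

def riem_dist(x, y):
    """Affine-invariant distance ||Log(x^{-1/2} * y)||_F."""
    s = spd_power(x, -0.5)
    lam = np.linalg.eigvalsh(s @ y @ s)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def gdd(y, sample):
    """Sample geodesic distance depth."""
    return float(np.exp(-np.mean([riem_dist(y, x) for x in sample])))

def pairing_cost(sample_a, sample_b):
    """Average distance under the identity pairing: an upper bound on W_1."""
    return float(np.mean([riem_dist(a, b) for a, b in zip(sample_a, sample_b)]))

def random_spd(rng, d):
    a = rng.standard_normal((d, 5 * d))
    return a @ a.T / (5 * d)

def perturb(rng, x, eps):
    """Congruence with a matrix close to the identity, which keeps x positive definite."""
    g = np.eye(x.shape[0]) + eps * rng.standard_normal(x.shape)
    return g @ x @ g.T

rng = np.random.default_rng(1)
nu = [random_spd(rng, 3) for _ in range(20)]
nu_n = [perturb(rng, x, 0.05) for x in nu]      # a nearby empirical measure

bound = pairing_cost(nu, nu_n)
test_points = [random_spd(rng, 3) for _ in range(10)]
gaps = [abs(gdd(y, nu_n) - gdd(y, nu)) for y in test_points]
```

Every entry of `gaps` is at most `bound`, regardless of where the evaluation points are placed, which is the uniformity in $y$ used in the proof.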

8.7 Proof of Theorem 3.8 and Proposition 3.9

Proof.

Properties P.1–P.4 follow directly by the pointwise depth properties in Theorem 3.6, using the definition of the depth in terms of the integrated Riemannian distance (integrated over $t\in\mathcal{I}$ ).

For the first part (P.5) of Proposition 3.9: using that $\sup_{t\in\mathcal{I}}\delta_{R}(y_{n}(t),y(t))\to 0$ , by the first part of the proof of Theorem 3.7 also,

[TABLE]

and as a direct consequence $\lim_{n\to\infty}\int_{\mathcal{I}}\boldsymbol{E}_{\nu(t)}[\delta_{R}(y_{n}(t),X)]\>dt=\int_{\mathcal{I}}\boldsymbol{E}_{\nu(t)}[\delta_{R}(y(t),X)]\>dt$ . Using again that $z\mapsto\exp(-z)$ is continuous in $z$ , the composition converges as well and we conclude that $\lim_{n\to\infty}\textnormal{iGDD}(\nu,y_{n})=\textnormal{iGDD}(\nu,y)$ .

For the second part (P.6) of Proposition 3.9, denote by $\xi_{n,y}(t)$ and $\xi_{y}(t)$ respectively the distributions of $\delta_{R}(y(t),X_{n}(t))$ and $\delta_{R}(y(t),X(t))$ , such that $X_{n}(t)\sim\nu_{n}(t)$ and $X(t)\sim\nu(t)$ . Let $\phi:\mathbb{R}\to\mathbb{R}$ be a continuous and bounded function and write $y\in\mathcal{M}$ for a curve with $y(t)\in\mathcal{M}$ for each $t\in\mathcal{I}$ . Then for any curve $y\in\mathcal{M}$ ,

[TABLE]

where the last step follows by the fact that for each $t\in\mathcal{I}$ the composition $x\mapsto\phi(\delta_{R}(y(t),x))$ is again a continuous and bounded function, and the fact that $\nu_{n}(t)\overset{w}{\to}\nu(t)$ uniformly in $t$ . Thus, for any curve $y\in\mathcal{M}$ , the weak convergence $\xi_{n,y}(t)\overset{w}{\to}\xi_{y}(t)$ holds as well uniformly in $t$ . By the uniform integrability of $(\nu_{n}(t))_{n\in\mathbb{N}}$ uniformly in $t$ , combined with Vitali’s convergence theorem, it follows that for each curve $y\in\mathcal{M}$ ,

[TABLE]

By the same argument as in the second part of the proof of Theorem 3.7, a sufficient condition for uniform convergence in $y\in\mathcal{M}$ of $\int_{\mathcal{I}}\boldsymbol{E}_{\nu_{n}(t)}[\delta_{R}(y(t),X)]\>dt$ to $\int_{\mathcal{I}}\boldsymbol{E}_{\nu(t)}[\delta_{R}(y(t),X)]\>dt$ is the condition $\sup_{t\in\mathcal{I}}W_{1}(\nu_{n}(t),\nu(t))\to 0$ . Again by (Villani, 2009, Theorem 6.9), the convergence $\sup_{t\in\mathcal{I}}W_{1}(\nu_{n}(t),\nu(t))\to 0$ is implied by the conditions (i) $\nu_{n}(t)\overset{w}{\to}\nu(t)$ uniformly in $t$ , which holds by assumption and (ii) the convergence in eq.(8.5) pointwise in $y\in\mathcal{M}$ .

The function $z\mapsto\exp(-z)$ is uniformly continuous for $z\geq 0$ , therefore the uniform convergence of the integrated geodesic distance depth follows as well,

[TABLE]

∎

8.8 Proof of Proposition 4.1

Proof.

First, we verify that $\epsilon_{2}(X)\leq 1/2$ .

Let $Y_{1}=\ldots=Y_{n}=p\in\mathcal{M}$ be $n$ contaminating observations, such that $\|\textnormal{Log}(p)\|_{F}\geq N$ for some $N>0$ . Denote $\nu_{n,n}$ for the empirical distribution of the contaminated sample $Z^{(n,n)}=\{X_{1},\ldots,X_{n}\}\cup\{Y_{1},\ldots,Y_{n}\}$ . For each $x\in\{X_{1},\ldots,X_{n}\}$ ,

[TABLE]

using the triangle inequality $\delta_{R}(p,X_{i})\leq\delta_{R}(p,x)+\delta_{R}(x,X_{i})$ for each $i=1,\ldots,n$ . Since $Y_{1}=\ldots=Y_{n}$ , $D(Y_{1},\nu_{n,n})=\ldots=D(Y_{n},\nu_{n,n})\geq D(x,\nu_{n,n})$ for each $x\in\{X_{1},\ldots,X_{n}\}$ . Therefore, $\|\textnormal{Log}(Z^{(n,n)}_{[1]})\|_{F}=\ldots=\|\textnormal{Log}(Z^{(n,n)}_{[n]})\|_{F}=\|\textnormal{Log}(p)\|_{F}\geq N$ , with $Z_{[i]}^{(n,n)}$ the $i$ -th depth ranked observation in the sample $Z^{(n,n)}$ . As we can choose $p\in\mathcal{M}$ , such that $\|\textnormal{Log}(p)\|_{F}\geq N$ for $N$ arbitrarily large, $\|\textnormal{Log}(Z^{(n,n)}_{[i]})\|_{F}$ with $1\leq i\leq n$ can be made arbitrarily large by adding $n$ contaminating observations. This implies that $\epsilon_{2}(X)\leq n/(2n)=1/2$ .

Second, we verify that $\epsilon_{2}(X)\geq 1/2$ .

Consider the contaminated sample $Z^{(n,m)}=\{X_{1},\ldots,X_{n}\}\cup\{Y_{1},\ldots,Y_{m}\}$ , with $m<n$ . If we can show that $D(y,\nu_{n,m})<D(x,\nu_{n,m})$ for each $y\in\{Y_{1},\ldots,Y_{m}\}$ and each $x\in\{X_{1},\ldots,X_{n}\}$ , then $\forall\,i\in\{1,\ldots,n\}$ , $\exists j\in\{1,\ldots,n\}$ , such that $Z^{(n,m)}_{[i]}=X_{j}$ and consequently $\max_{i}\|\textnormal{Log}(Z_{[i]}^{(n,m)})\|_{F}\leq M$ , denoting $M:=\max_{i}\|\textnormal{Log}(X_{i})\|_{F}$ . The latter implies that it takes at least $m\geq n$ contaminating observations to make $\|\textnormal{Log}(Z^{(n,m)}_{[i]})\|_{F}$ arbitrarily large for $1\leq i\leq n$ , i.e., $\epsilon_{2}(X)\geq 1/2$ . It remains to show that $D(y,\nu_{n,m})<D(x,\nu_{n,m})$ for each $y\in\{Y_{1},\ldots,Y_{m}\}$ and each $x\in\{X_{1},\ldots,X_{n}\}$ .

Let $y\in\{Y_{1},\ldots,Y_{m}\}$ and $x\in\{X_{1},\ldots,X_{n}\}$ be arbitrary, then:

[TABLE]

Let us denote $R:=\max_{i}\delta_{R}(x,X_{i})$ , $B:=\{p\in\mathcal{M}\,:\,\delta_{R}(p,x)\leq 2R\}$ and $\rho=\inf_{p\in B}\delta_{R}(p,y)$ .

First, by the triangle inequality $\delta_{R}(x,y)\leq 2R+\rho$ . Therefore, by the reverse triangle inequality, $\forall\,i=1,\ldots,m$ ,

[TABLE]

Also, by definition of $R$ and $\rho$ , $\forall\,i=1,\ldots,n$ ,

[TABLE]

Without loss of generality, assume that $\min_{i}\|\textnormal{Log}(Y_{i})\|_{F}\geq N$ , where $N\geq 2(n+1)R+M$ . Denoting Id for the identity matrix, it follows that,

[TABLE]

Here, we used two triangle inequalities and the fact that $\|\textnormal{Log}(z)\|_{F}=\delta_{R}(z,\textnormal{Id})$ by definition of the Riemannian distance. Combining eq.(8.7-8.9) above yields:

[TABLE]

where we used that $-2mR+(n-m)\rho>-2nR+\rho\geq 0$ by the fact that $m<n$ and $\rho\geq 2nR$ by eq.(8.9). Returning to eq.(8.8), it follows that $D(y,\nu_{n,m})<D(x,\nu_{n,m})$ . As this result holds for any $y\in\{Y_{1},\ldots,Y_{m}\}$ and $x\in\{X_{1},\ldots,X_{n}\}$ , we conclude that $\epsilon_{2}(X)\geq 1/2$ . Since also $\epsilon_{2}(X)\leq 1/2$ , it follows that $\epsilon_{2}(X)=1/2$ , which finishes the proof. ∎
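Both halves of the argument can be illustrated on simulated data with the geodesic distance depth as $D$ . In the sketch below, the contaminating observations are multiples of the identity with $\|\textnormal{Log}(\cdot)\|_{F}$ set to ten times the threshold $2(n+1)R+M$ that appears in the proof; this multiple, the use of the maximal pairwise distance for $R$ , the function names, and the toy sample are all assumptions made for illustration.

```python
import numpy as np

def spd_power(x, p):
    """Power x**p of a symmetric PD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(x)
    return (v * w**p) @ v.conj().T

def riem_dist(x, y):
    """Affine-invariant distance ||Log(x^{-1/2} * y)||_F."""
    s = spd_power(x, -0.5)
    lam = np.linalg.eigvalsh(s @ y @ s)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def gdd(y, sample):
    """Sample geodesic distance depth."""
    return float(np.exp(-np.mean([riem_dist(y, x) for x in sample])))

def log_norm(x):
    """||Log(x)||_F, i.e. the distance from x to the identity."""
    return float(np.sqrt(np.sum(np.log(np.linalg.eigvalsh(x)) ** 2)))

def far_point(norm, d=3):
    """Multiple of the identity with ||Log(.)||_F equal to `norm`."""
    return np.exp(norm / np.sqrt(d)) * np.eye(d)

rng = np.random.default_rng(2)
n = 6
X = []
for _ in range(n):
    a = rng.standard_normal((3, 15))
    X.append(a @ a.T / 15)

R = max(riem_dist(x, z) for x in X for z in X)   # maximal pairwise distance in the sample
M = max(log_norm(x) for x in X)
N = 10 * (2 * (n + 1) * R + M)                   # comfortably above the proof's threshold

# Case m < n: every contaminating point is ranked below every original observation.
m = n - 1
Y = [far_point(N + 0.1 * j) for j in range(m)]
Z = X + Y
min_depth_X = min(gdd(x, Z) for x in X)
max_depth_Y = max(gdd(y, Z) for y in Y)

# Case m = n with identical contaminants: the contaminant is at least as deep as any X_i.
p = far_point(N)
Z_eq = X + [p] * n
depth_p = gdd(p, Z_eq)
max_depth_X_eq = max(gdd(x, Z_eq) for x in X)
```

With fewer than $n$ contaminating points, the $n$ deepest observations are all original observations, whereas $n$ identical contaminants already occupy the top depth ranks, matching the breakdown value $1/2$ .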

9 Appendix II: Additional figures

Bibliography


Arsigny, V., P. Fillard, X. Pennec, and N. Ayache (2006). Log-Euclidean metrics for fast and simple calculus on diffusion tensors. Magnetic Resonance in Medicine 56(2), 411–421.
Bhatia, R. (2009). Positive Definite Matrices. Princeton University Press.
Chau, J. (2017). pdSpecEst: An Analysis Toolbox for Hermitian Positive Definite Matrices (v1.2.1).
Chau, J. and R. von Sachs (2017). Intrinsic wavelet regression for curves of Hermitian positive definite matrices. arXiv preprint 1701.03314.
Chen, Z. (1995). Bounds for the breakdown point of the simplicial median. Journal of Multivariate Analysis 55(1), 1–13.
Chenouri, S. and C. Small (2012). A nonparametric multivariate multisample test based on data depth. Electronic Journal of Statistics 6, 760–782.
Dai, M. and W. Guo (2004). Multivariate spectral analysis using Cholesky decomposition. Biometrika 91(3), 629–643.
Donoho, D. and M. Gasko (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. The Annals of Statistics, 1803–1827.
