Geometrical Smeariness -- A new Phenomenon of Fréchet Means

Benjamin Eltzner

arXiv:1908.04233·math.ST·October 8, 2020

TL;DR

This paper introduces the concept of geometrical smeariness, a phenomenon where the fluctuation scale of Fréchet means on spheres depends on geometry rather than density, revealing new asymptotic behaviors.

Contribution

It demonstrates that smeariness on higher-dimensional spheres depends solely on curvature, not density, and extends the concept to deformed manifolds, with applications to shape and geomagnetic data.

Findings

01 Smeariness depends on curvature, not density.

02 Deformed manifolds can exhibit smeariness similar to spheres.

03 Empirical evidence from shape data and geomagnetic positions.

Tables

Table 1: Age of the magnetized mineral, number of magnetic samples and source for all data sets which display smeariness. The upper part of the table lists data sets with strong finite sample smeariness, where $n\,\textnormal{Var}[\widehat{\mu}_{n}]$ increases up to large bootstrap sample size $k$, and the lower part lists data sets with less pronounced finite sample smeariness, mostly for small $k$.

Number	Age [Ma]	n	Source
006	0.78	107	Clement and Kent (1987)
012	0.78	40	Kawai et al. (1973)
044	1.77	145	Clement and Kent (1985)
062	1.77	154	Holt and Kirschvink (1995)
063	3.11	84	Linssen (1991)
107	33.7	50	Wellman et al. (1969)
122	1.07	239	Rolph (1993)
141	8.07	60	Gurariy (1988)
004	0.78	35	Clement et al. (1982)
005	0.78	44	Clement et al. (1982)
043	1.77	181	Clement and Kent (1987)
061	3.04	193	Van Hoof and Langereis (1992)
077	4.8	177	Linssen (1988)
109	180	16	van Zijl et al. (1962)
129	33.7	31	Wellman et al. (1969)
133	4.18	193	Clement et al. (1996)
138	9.74	170	Gurariy (1988)

Equations

$\mathbb{S}^{m}:=\{x\in\mathbb{R}^{m+1}:\|x\|=1\}$

$\sup_{x\in\mathcal{V},\,\|x\|<\delta}|F(x)-F(0)|$

$0=\mbox{\rm Hess\,}_{\mathcal{V}}F(0)=\underbrace{\mbox{\rm Hess\,}_{\mathcal{V}}G(0)}_{>0}+\underbrace{\mbox{\rm Hess\,}_{\mathcal{V}}(F-G)(0)}_{<0}$

$\mathbb{L}_{m,\beta}:=\{q\in\mathbb{S}^{m}:\arccos\langle q,\mu\rangle\in[\pi/2,\pi-\beta]\}.$

$\frac{\partial^{4}F}{\partial\psi^{4}}(\alpha_{\beta},\beta,\psi)\geq\frac{\partial^{4}F}{\partial\psi^{4}}(\alpha_{0},0,\psi)-L_{4}\beta\geq\frac{c_{m}}{2}-L_{4}\beta.$

$\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{0},0,\psi)-\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{\beta},\beta,\psi)$

$\beta<\beta_{0}=\min\left(\frac{c_{m}}{2L_{4}},\frac{1}{L_{2}}\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{0},0,\pi/3)\right)$

$\theta>\frac{\pi}{2}+\frac{16}{\pi(m-3)}\geq\theta_{m,4}\geq\theta_{m,2}.$

$\beta<\frac{\pi}{2}-\frac{6(6+\pi)}{\pi(m-3)}\leq\beta_{m,4}\leq\beta_{m,2}$

$n\,\textnormal{Var}[\mu_{n}]>\textnormal{Var}[X]$

$S_{fs}=\max_{n\in\mathbb{N}}\frac{n\,\textnormal{Var}[\mu_{n}]}{\textnormal{Var}[X]}>1,$

$S_{fs}\geq\lim_{n\to\infty}\frac{n\,\textnormal{Var}[\mu_{n}]}{\textnormal{Var}[X]}>K,$

$dP(\theta):=(1-\alpha)\,1_{[0,0.05\pi]}(\theta)\,d\theta+\alpha\,d\delta_{0.95\pi}(\theta).$

$F:\mathbb{S}^{m}\to[0,\infty),\quad p\mapsto\int_{\mathbb{S}^{m}}d_{\mathbb{S}^{m}}^{2}(p,q)\,dP^{X}(q),$

$F(p)=F(\arccos\langle p,\mu\rangle).$

$u:\Theta^{m-1}\to[0,1],\quad\theta\mapsto\prod_{j=1}^{m-1}\cos^{m-j}\theta_{j},\qquad v(\theta)=\prod_{j=1}^{m-1}\cos\theta_{j},$

$V_{m}=\mbox{\rm vol}(\mathbb{S}^{m})=\frac{2\pi^{\frac{m+1}{2}}}{\Gamma\left(\frac{m+1}{2}\right)}.$

Full text

Geometrical Smeariness – A new Phenomenon of Fréchet Means

Benjamin Eltzner (Felix-Bernstein-Institut für Mathematische Statistik in den Biowissenschaften, Georg-August-Universität Göttingen)

Abstract

In the past decades, the central limit theorem (CLT) has been generalized to non-Euclidean data spaces. Some years ago, it was found that for some random variables on the circle, the sample Fréchet mean fluctuates around the population mean asymptotically at a scale $n^{-\tau}$ with exponent $\tau<1/2$ with a non-normal distribution if the probability density at the antipodal point of the mean is $\frac{1}{2\pi}$ . The author and his collaborator recently discovered that $\tau=1/6$ for some random variables on higher dimensional spheres. In this article we show that, even more surprisingly, the phenomenon on spheres of higher dimension is qualitatively different from that on the circle, as it depends purely on geometrical properties of the space, namely its curvature, and not on the density at the antipodal point. This gives rise to the new concept of geometrical smeariness. In consequence, the sphere can be deformed, say, by removing a neighborhood of the antipodal point of the mean and gluing a flat space there, with a smooth transition piece. This yields smeariness on a manifold, which is diffeomorphic to Euclidean space. We give an example family of random variables with 2-smeary mean, i.e. with $\tau=1/6$ , whose range has a hole containing the cut locus of the mean. The hole size exhibits a curse of dimensionality as it can increase with dimension, converging to the whole hemisphere opposite a local Fréchet mean. We observe smeariness in simulated landmark shapes on Kendall pre-shape space and in real data of geomagnetic north pole positions on the two-dimensional sphere.

1 Introduction

The central limit theorem is a cornerstone of statistics. Building on this fundamental theorem for real random variables, asymptotic theory has been developed to encompass random variables in a wide variety of data spaces including vector spaces (presented in many textbooks, e.g. Mardia et al. (1979)) and spaces like manifolds, e.g. Bhattacharya and Patrangenaru (2003, 2005); Bhattacharya and Bhattacharya (2012), and stratified spaces, e.g. Barden et al. (2013); Hotz et al. (2013); Huckemann et al. (2015); Barden et al. (2018). The last decades have especially seen the development of asymptotic theory for the Fréchet mean (also called barycenter) and also for more general data descriptors on non-Euclidean data spaces Bhattacharya and Patrangenaru (2003, 2005); Bhattacharya and Bhattacharya (2008); Huckemann (2011b, a); Bhattacharya and Bhattacharya (2012). Determining necessary and sufficient conditions for standard asymptotic rates of the mean on non-Euclidean spaces is an ongoing endeavor considered by several recent publications, e.g. Bhattacharya and Lin (2017); Schötz (2019); Ahidar-Coutrix et al. (2019); Le Gouic et al. (2019); Eltzner et al. (2019).

The seminal work by Sturm (2003) showed that the Fréchet mean is unique on metric spaces which are non-positively curved in the sense of Alexandrov. Afsari (2009, Theorem 2.4.1) showed that in simply connected spaces of non-positive curvature the Hessian of the squared geodesic distance, i.e. the squared length of a shortest geodesic between two points, is strictly positive definite. This leads to a CLT with rate $n^{-1/2}$ under some technical assumptions which are fairly straightforward in finite dimension. On the other hand, research into asymptotics on positively curved spaces has led to the discovery of the phenomenon called “smeariness” by Hotz and Huckemann (2015), where the asymptotic rate of the mean on the circle is lower than $n^{-1/2}$. It is clear that this is an obstacle to hypothesis testing, firstly because table quantiles based on asymptotic considerations, as in the $T^{2}$-test, cannot be used and secondly because much larger sample sizes are required to improve the power of hypothesis tests.

Lower rates of convergence than $n^{-1/2}$ are known for many estimators. A simple example is the center of an interval of fixed length containing the largest possible fraction of data points, as described by van der Vaart (2000) and some recent examples include Liu and Yang (2012); Li and Ma (2015); Chen and Christensen (2015). These rates are usually independent of the random variables within a given class and are a property of the estimated descriptor and the data and descriptor spaces.

Smeariness of the mean, however, depends on the random variable: the asymptotic rate differs for different random variables, as shown for the circle by Hotz and Huckemann (2015); Hundrieser (2017) and for the sphere by Eltzner and Huckemann (2019). The dependence on the random variable exacerbates the problem of hypothesis testing, since it is not known in advance for a certain data set if smeariness may play a complicating role or not.

Smeariness on the circle occurs, assuming a probability density in a neighborhood of the antipodal, if and only if the value of the probability density at the antipodal point of the mean corresponds to that of the uniform distribution, i.e. $\frac{1}{2\pi}$ if the circle is parametrized in arc length. A general framework for such phenomena has been derived by Eltzner and Huckemann (2019) using empirical process theory. The examples for arbitrary-dimensional spheres given by Eltzner and Huckemann (2019) also feature a non-zero probability density at the antipodal point of the mean, but it was not shown whether this feature is necessary for smeariness. Here we identify fundamental differences between smeariness on the circle and on spheres of dimension $m\geq 2$ . The difference can be traced back to the question whether geodesics can circumvent the cut locus. If they can, we call this geometrical smeariness, otherwise cut locus smeariness.

We show

• that cut locus smeariness occurs on the circle and the torus and can therefore be understood exhaustively by studying the properties of the random variable in a neighborhood of the cut locus.

• that geometrical smeariness occurs on every $m$-dimensional sphere $\mathbb{S}^{m}:=\{x\in\mathbb{R}^{m+1}:\|x\|=1\}$ with $m\geq 2$, which is a novel phenomenon.

• in Theorem 2.12 that in contrast to cut locus smeariness, a unique mean with standard $n^{-1/2}$ asymptotic rate is compatible with arbitrarily high probability density at the cut locus.

• in Theorem 3.1 explicit examples of random variables for $m\geq 5$, whose ranges feature a finite sized spherical hole around the antipodal point of the mean, which exhibit a smeary asymptotic rate of $n^{-1/6}$.

• in a corollary of Theorem 3.1 that the sphere can be deformed on the hemisphere opposite of the mean to eliminate the cut locus of the mean altogether, for example by cutting out a ball around the cut locus and gluing the resulting boundary to a Euclidean space with a smooth transition, as illustrated in Figure 1. The resulting manifold is then diffeomorphic to Euclidean space.

• in Theorem 3.3 a curse of dimensionality due to the increase with dimension $m$ of the maximal hole radius $\frac{\pi}{2}-\frac{K}{m}$ which still allows for an $n^{-1/6}$ rate, where $K$ is a constant: in the limit of infinite dimensional spheres, random variables with range barely larger than a hemisphere feature smeariness.

In Section 2, we give precise definitions of cut locus and geometrical smeariness. In Section 3 we give an example of random variables on spheres of dimension $m\geq 5$ whose range has a hole containing the cut locus of the mean, and whose mean displays geometrical smeariness. In Sections 2 and 3 we state conjectures beyond this paper. In Section 4 we investigate simulated landmark shapes to illustrate some implications of smeariness in that setting. Furthermore, we analyze 151 real data sets of geomagnetic pole orientations on $\mathbb{S}^{2}$ and find smeariness in 17 of the data sets.

2 Cut Locus Smeariness and Geometrical Smeariness

We start with geometrical context and previous results on smeariness and move on to new definitions.

2.1 Basic Notions

First, we introduce some basic notions, which will be used throughout the text. Let $\Omega$ be a probability space and let $Q$ be a Riemannian manifold called the data space with the corresponding geodesic distance $d_{Q}:Q\times Q\to\mathbb{R}$. Let $X:\Omega\to Q$ be a $Q$-valued random variable and $X_{1},\dots,X_{n}\stackrel{\text{i.i.d.}}{\sim}X$. In order to formulate a CLT in terms of random vectors, we map to the tangent space $T_{q_{0}}Q$ of some point $q_{0}\in Q$ using the exponential map.

Definition 2.1.

Consider a point in a Riemannian manifold $p\in Q$ . The cut locus $\textnormal{Cut}(p)$ of $p$ is the closure of the set of all points $q\in Q$ such that there is more than one shortest geodesic from $p$ to $q$ .

Definition 2.2.

For a Riemannian manifold $Q$ and a point $q\in Q$ we define the exponential map $\exp_{q}:T_{q}Q\to Q$ . This is the unique map with $\exp_{q}(0)=q$ and for any $v\in T_{q}Q$ , considering the arc length parametrized geodesic $\gamma$ with $\gamma(0)=q$ and $\gamma^{\prime}(0)=\frac{v}{|v|}$ we have $\exp_{q}(v)=\gamma(|v|)$ . The inverse of the exponential map $\exp_{q}$ , which exists outside of $\textnormal{Cut}(q)$ , is called the logarithm map and is denoted by $\log_{q}$ . It maps the point $p\in Q\setminus\textnormal{Cut}(q)$ to a vector in the tangent space $T_{q}Q$ whose length is the same as the geodesic distance between $q$ and $p$ .

For the circle and spheres of arbitrary dimension, the cut locus of a point $p$ is simply its antipodal point. In general, the cut locus of a manifold of dimension $m$ has dimension at most $m-1$ .
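For the unit sphere, both maps of Definition 2.2 have closed forms. The following is a minimal sketch, assuming points of $\mathbb{S}^{m}$ are stored as unit vectors in $\mathbb{R}^{m+1}$ and tangent vectors as vectors orthogonal to the base point; the function names `sphere_exp` and `sphere_log` are illustrative and do not appear in the paper.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sphere_exp(q, v):
    """Exponential map on the unit sphere: follow the great circle
    starting at q in direction v/|v| for arc length |v|."""
    r = norm(v)
    if r == 0.0:
        return list(q)
    return [math.cos(r) * qi + math.sin(r) * vi / r for qi, vi in zip(q, v)]

def sphere_log(q, p):
    """Logarithm map on the unit sphere: tangent vector at q pointing
    towards p whose length equals the geodesic distance d(q, p).
    It is undefined on Cut(q), which is the antipode of q."""
    c = max(-1.0, min(1.0, dot(q, p)))
    theta = math.acos(c)  # geodesic distance between q and p
    if theta == 0.0:
        return [0.0] * len(q)
    if math.pi - theta < 1e-12:
        raise ValueError("p lies in the cut locus of q")
    # Component of p orthogonal to q, rescaled to length theta.
    w = [pi_ - c * qi for pi_, qi in zip(p, q)]
    wn = norm(w)
    return [theta * wi / wn for wi in w]

north = [0.0, 0.0, 1.0]
v = [0.3, -0.4, 0.0]          # tangent vector at the north pole, |v| = 0.5
p = sphere_exp(north, v)
v_back = sphere_log(north, p)
```

The round trip `sphere_log(north, sphere_exp(north, v))` recovers `v` as long as $|v|<\pi$, and the logarithm refuses the antipode, which is exactly the cut locus described above.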

Definition 2.3 (Population Fréchet mean).

The set of population Fréchet means of the random variable $X$ in $Q$ is defined as

$$E=\operatorname*{argmin}_{q\in Q}\mathbb{E}[d^{2}_{Q}(q,X)].$$

Definition 2.4 (Local and Global Fréchet mean).

Local minima of the function $q\mapsto\mathbb{E}[d^{2}_{Q}(q,X)]$ are called local Fréchet means. For clearer distinction, the Fréchet means as defined in Definition 2.3 will sometimes be called global Fréchet means in the following.
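The distinction between local and global Fréchet means can be made concrete on the circle. The sketch below is an illustration only: it uses the empirical version of the Fréchet function for a small hypothetical data set of angles (the values in `data` are made up for this example) and locates minima by a plain grid search, which is an assumption of convenience rather than a method from the paper.

```python
import math

def circle_dist(x, t):
    """Arc length distance between two angles on the unit circle."""
    d = abs(x - t) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def sample_frechet_function(x, data):
    """Mean squared geodesic distance from x to the data points."""
    return sum(circle_dist(x, t) ** 2 for t in data) / len(data)

def local_and_global_minima(data, n_grid=3600):
    """Grid search: return all grid points that are lower than both
    cyclic neighbours, and the grid point with the smallest value."""
    step = 2.0 * math.pi / n_grid
    grid = [-math.pi + i * step for i in range(n_grid)]
    values = [sample_frechet_function(x, data) for x in grid]
    local = [grid[i] for i in range(n_grid)
             if values[i] < values[i - 1] and values[i] < values[(i + 1) % n_grid]]
    best = min(range(n_grid), key=values.__getitem__)
    return local, grid[best]

# Hypothetical data: one cluster near 0 and two points close to the antipode.
data = [-0.2, 0.0, 0.2, 2.9, 3.1]
local_means, global_mean = local_and_global_minima(data)
print("local Fréchet means (grid):", local_means)
print("global Fréchet mean (grid):", global_mean)
```

Between the antipodes of two neighbouring data points the function is a parabola, so each such arc contributes at most one local minimum; only the lowest of them is a global Fréchet mean in the sense of Definition 2.3.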

For readers from different fields it seems in order to give some historical context of the terminology used here. Kendall (1990) has introduced the term Karcher mean for local minima of the Fréchet function $q\mapsto\mathbb{E}[d^{2}_{Q}(q,X)]$ , a fact criticized by the inadvertent name patron in Karcher (2014), who points out that the term Riemannian Center of Mass is common in differential geometry. Karcher (2014) also criticizes the use of the term Fréchet mean for global minima, but since it is established and widely used in statistical literature, the present article will continue using it and use the term local Fréchet mean for local minima.

Assumption 2.5.

In all of the following, we assume the random variables $X$ to have

(i) a unique population Fréchet mean $E=\{\mu\}$.

(ii) a density in a neighborhood of $\textnormal{Cut}(\mu)$.

Providing conditions for uniqueness of the Fréchet mean is a difficult and ongoing issue not further discussed here, cf. Karcher (1977); Kendall (1990); Le (2001); Groisser (2005); Afsari (2011); Arnaudon and Miclo (2014); Hotz and Huckemann (2015).

Definition 2.6 (Fréchet function in exponential chart).

Consider a neighborhood $\widetilde{U}$ of $\mu$ , $m\in\mathbb{N}$ , such that with a neighborhood $P\subset T_{\mu}Q$ of the origin in $T_{\mu}Q\cong\mathbb{R}^{m}$ the exponential map $\exp_{\mu}:P\to\widetilde{U}$ , $\exp_{\mu}(0)=\mu$ , is a diffeomorphism. We set for $q\in Q$ , $x\in P$ ,

[TABLE]

The function $F$ is called population Fréchet function in the exponential chart at $\mu$ . Since $\log_{\mu}(\mu)=0$ , $F$ has a global minimum at $x=0$ .

Notation 2.7.

For a point $q\in Q$ and $\varepsilon>0$ let $B_{\varepsilon}(q)=\{q^{\prime}\in Q:d_{Q}(q,q^{\prime})<\varepsilon\}$ .

2.2 Smeariness

A definition of smeariness was given in Eltzner and Huckemann (2019). Here, we provide a definition which highlights the point that smeariness is dependent on the random variable. Different asymptotic rates for different random variables due to different leading orders of the Fréchet function at the population mean can be understood in the context underlying van der Vaart (2000) Theorem 5.52.

Definition 2.8 (Smeariness of Random Variables).

Consider a random variable $X$ on $Q$ with Fréchet function $F$ and Fréchet mean $\mu$ . Assume that there is $\zeta>0$ such that for every $x\in B_{\zeta}(0)\setminus\{0\}$ one has $F(x)>F(0)$ . Suppose that for fixed constants $C_{X}>0$ and $2<\kappa\in\mathbb{R}$ and a linear subspace $\mathcal{V}\subseteq T_{\mu}Q$ we have for every sufficiently small $\delta>0$

[TABLE]

Then we say that the Fréchet mean of $X$ is smeary on the linear subspace $\mathcal{V}$ and that $Q$ admits smeariness. If $\mathcal{V}=T_{\mu}Q$ , we simply say that $X$ is smeary.

Note that we do not require the Fréchet function to be analytic. The smeary asymptotic theory relies on the stricter Assumption 2.6 of Eltzner and Huckemann (2019) in order to get a closed form for the asymptotic distribution. However, it is applicable e.g. to the Fréchet function $F(x)=|x|^{4}+\exp(-|x|^{-1})$, which is smooth but non-analytic.
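For the example $F(x)=|x|^{4}+\exp(-|x|^{-1})$, the quantity $\sup_{|x|<\delta}|F(x)-F(0)|$ can be checked numerically to behave like $\delta^{4}$, because the non-analytic term decays faster than any power. This is a small sketch under the assumption $F(0)=0$ (the continuous extension); the helper names are illustrative.

```python
import math

def F(r):
    """F(x) = |x|^4 + exp(-1/|x|) as a function of r = |x|, with F(0) = 0."""
    if r == 0.0:
        return 0.0
    return r ** 4 + math.exp(-1.0 / r)

def growth_ratio(delta):
    """sup_{|x| < delta} |F(x) - F(0)| divided by delta^4.
    F is increasing in |x|, so the supremum is the boundary value F(delta)."""
    return F(delta) / delta ** 4

for delta in (0.2, 0.1, 0.05, 0.01):
    print(delta, growth_ratio(delta))
```

The ratio decreases towards 1 as $\delta\to 0$, so the leading order at the minimum is the quartic term, i.e. an exponent $\kappa=4>2$ in the sense of Definition 2.8.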

For the definition of geometrical and cut locus smeariness, we introduce some auxiliary notation. Let $P=Q\setminus\textnormal{Cut}(\mu)$ and $d_{P}$ the geodesic distance on $P$ . Then we write

[TABLE]

Here, $d^{2}_{P}(q,p)\geq d^{2}_{Q}(q,p)$ is given by the infimum over the length of all curves in $P$ connecting $q$ and $p$ . Using this, we can define

Definition 2.9 (Cut Locus Smeariness and Geometrical Smeariness).

Assume that $Q$ admits smeariness and $X$ is a random variable with smeary Fréchet mean $\mu\in Q$ on the linear subspace $\mathcal{V}\subseteq T_{\mu}Q$ . The restriction of the Hesse matrix to $\mathcal{V}$ is denoted by $\mbox{\rm Hess\,}_{\mathcal{V}}$ .

(i) If for every linear subspace neighborhood $U\subset\mathcal{V}$ of $0$, there is an $x\in U$ such that $F(x)\neq G(x)$ and $\mbox{\rm Hess\,}_{\mathcal{V}}(F-G)(0)<0$, then the mean of $X$ is called cut locus smeary on the linear subspace $\mathcal{V}$ and we say that $Q$ admits cut locus smeariness.

(ii) If there is a linear subspace neighborhood $U\subset\mathcal{V}$ of $0$, such that for every $x\in U$ one has $F(x)=G(x)$ or if $\mbox{\rm Hess\,}_{\mathcal{V}}(F-G)(0)\geq 0$, the mean of $X$ is called geometrically smeary on the linear subspace $\mathcal{V}$ and we say that $Q$ admits geometrical smeariness.

To motivate the term cut locus smeariness, first note that smeariness always implies that $\mbox{\rm Hess\,}_{\mathcal{V}}F(0)=0$. On Euclidean space, the Hesse matrix of the Fréchet function is always positive definite. In reference to this, smeariness depends on a negative contribution to the Hessian that leads to a vanishing Hessian overall. The terms cut locus smeariness and geometrical smeariness point to the origin of this negative contribution to the Hessian. Note that one can write

$$0=\mbox{\rm Hess\,}_{\mathcal{V}}F(0)=\underbrace{\mbox{\rm Hess\,}_{\mathcal{V}}G(0)}_{>0}+\underbrace{\mbox{\rm Hess\,}_{\mathcal{V}}(F-G)(0)}_{<0}$$

to illustrate that cut locus smeariness crucially hinges on $F\neq G$ . Since the only difference between these two functions is that in $G$ the geodesics crossing $\textnormal{Cut}(\mu)$ are excluded, the negative term $\mbox{\rm Hess\,}(F-G)(0)$ can be understood as the contribution of the cut locus.

The author is not aware of any random variables on any space where $\mbox{\rm Hess\,}(F-G)(0)>0$. If such random variables exist and their means exhibit smeariness, we would classify their means as geometrically smeary, since the defining property of random variables with a smeary mean is a negative contribution to the Hessian. In such a case, however, the cut locus contribution to the Hessian would be positive, so the negative contribution to the Hessian which causes smeariness would come from a different source.

Definition 2.9 does not exclude a space admitting both geometrical and cut locus smeariness, whether on orthogonal subspaces or on a common linear subspace. However, Theorem 2.10 shows that the circle and the spheres $\mathbb{S}^{m}$ with $m\geq 2$ do not admit both. There is no known example of a space admitting geometrical and cut locus smeariness on a common linear subspace. Geometrical and cut locus smeariness on orthogonal subspaces can be realized on product spaces like $\mathbb{S}^{1}\times\mathbb{S}^{5}$.

Theorem 2.10.

(i) The circle only admits cut locus smeariness.

(ii) The spheres $\mathbb{S}^{m}$ with $m\geq 2$ only admit geometrical smeariness.

Proof.

(i) On the circle, the cut locus of the mean contains only one point, namely its antipode, which we denote by $\overline{\mu}$. In Hotz and Huckemann (2015) it was shown that $\mbox{\rm Hess\,}G(0)=2$ and $\mbox{\rm Hess\,}(F-G)(0)=-4\pi f(\overline{\mu})$ where $f$ is the probability density that exists in the neighborhood of $\overline{\mu}$. Thus, only cut locus smeariness exists, namely if $f(\overline{\mu})=\frac{1}{2\pi}$.

(ii) The set of geodesic segments connecting two points of $P$ consists of all geodesic segments of $Q$ which do not cross $\textnormal{Cut}(\mu)$. Thus, $d_{P}(q,p)=d_{Q}(q,p)$ unless the shortest geodesic segment connecting $q$ and $p$ crosses $\overline{\mu}$. In that case, there is no shortest curve connecting $q$ and $p$, but curves can get arbitrarily close in length to the geodesic segment in $Q$, thus also for these points $d_{P}(q,p)=d_{Q}(q,p)$. In consequence, there is a neighborhood $U$ of $0$, such that for every $x\in U$ one has $F(x)=G(x)$ and thus only geometrical smeariness occurs. ∎
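The circle identity used in part (i), $\mbox{\rm Hess\,}F(0)=2-4\pi f(\overline{\mu})$, can be checked numerically. The sketch below assumes a toy family that is not taken from the paper: a point mass at angle $0$ plus a constant density $f$ on an arc of half-width $a$ around the antipode $\pi$. The quadrature (midpoint rule) and the central finite difference are choices made for this illustration. In this toy family the Fréchet function is exactly quadratic near $0$, so it only illustrates the Hessian formula and does not by itself produce a random variable satisfying Assumption 2.5 at the critical density.

```python
import math

def circle_dist(x, t):
    """Arc length distance between two angles on the unit circle."""
    d = abs(x - t) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def frechet_function(x, f, a=0.5, n_grid=4000):
    """F(x) for a point mass at 0 plus constant density f on the arc
    of half-width a around the antipode pi (midpoint rule)."""
    point_mass = 1.0 - 2.0 * a * f
    total = point_mass * circle_dist(x, 0.0) ** 2
    dt = a / n_grid
    for i in range(n_grid):
        t = math.pi - a + (i + 0.5) * dt
        total += f * dt * circle_dist(x, t) ** 2    # arc just before pi
        total += f * dt * circle_dist(x, -t) ** 2   # mirrored arc just after -pi
    return total

def hessian_at_mean(f, h=0.05):
    """Central second difference of F at x = 0."""
    return (frechet_function(h, f) - 2.0 * frechet_function(0.0, f)
            + frechet_function(-h, f)) / h ** 2

for f in (0.0, 0.1, 1.0 / (2.0 * math.pi)):
    print(f, hessian_at_mean(f), 2.0 - 4.0 * math.pi * f)
```

The second derivative drops linearly in the antipodal density and vanishes at $f(\overline{\mu})=\frac{1}{2\pi}$, the value of the uniform distribution.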

Note that the argument in part (ii) of the above proof can be generalized to show that $d_{P}(q,p)=d_{Q}(q,p)$ for all $p,q\in Q$ whenever the dimension of the cut locus of the mean is smaller than $m-1$ , since then the “infinitesimal circumvention argument” employed in the proof is applicable. In consequence, cut locus smeariness is only possible if the cut locus has dimension $m-1$ .

Conversely, one may expect cut locus smeariness to be possible if the cut locus dimension is $m-1$ , observing the following heuristic argument. Let $\gamma$ be the point set of a geodesic segment from $\mu$ to an internal point of $\textnormal{Cut}(\mu)$ , then for all points $q\in B_{\varepsilon}(\mu)\cap\gamma\setminus\{\mu\}$ there are points $p$ such that $d_{P}(q,p)>d_{Q}(q,p)$ and in consequence $\mbox{\rm Hess\,}_{\textnormal{span}\{\dot{\gamma}(\mu)\}}(F-G)(0)<0$ for random variables $X$ with a non-vanishing density in a neighborhood of $\textnormal{Cut}(\mu)$ , using a variant of the argument used on the circle. This leads to the following conjecture.

Conjecture 2.11.

In the real projective spaces $\mathbb{R}P^{m}$ with $m\geq 2$, for every neighborhood $U$ of $0$ there is an $x\in U$ such that $F(x)\neq G(x)$. This indicates that the $\mathbb{R}P^{m}$ admit cut locus smeariness.

On the circle, a probability density of $f(\overline{\mu})=\frac{1}{2\pi}$ at $\overline{\mu}$ is necessary for smeariness. Theorem 2.10 shows that this is closely tied to the fact that smeariness on the circle is cut locus smeariness. One may therefore suspect that on $\mathbb{S}^{m}$ with $m\geq 2$ there is no such value for the probability density at $\overline{\mu}$ which leads to smeariness. In fact, we show the following:

Theorem 2.12.

For every real $\rho\geq 0$ and every integer $m\geq 2$ one can define a random variable $X_{\rho}$ on $\mathbb{S}^{m}$ which has a probability density with value $\rho$ at the south pole and a non-smeary mean at the north pole.

Proof.

See supplement A.2.1. ∎

3 Geometrical Smeariness on Spheres

In Theorem 2.12 it was shown that the value of the probability density at $\textnormal{Cut}(\mu)$ is not sufficient to achieve smeariness on $\mathbb{S}^{m}$ for $m\geq 2$, in contrast to the situation on $\mathbb{S}^{1}$. In this Section, we address the converse question, whether the probability density which the random variables presented by Eltzner and Huckemann (2019) exhibit at $\textnormal{Cut}(\mu)$ is necessary for smeariness. Investigating this question aims to shed light on the assumption that the Hessian of the Fréchet function be non-singular, which is still a staple of efforts such as Bhattacharya and Lin (2017); Eltzner et al. (2019) to generalize prerequisites for the CLT. Let $\mu:=e_{m+1}$ be the north pole of $\mathbb{S}^{m}$ and write the spherical annulus as

$$\mathbb{L}_{m,\beta}:=\{q\in\mathbb{S}^{m}:\arccos\langle q,\mu\rangle\in[\pi/2,\pi-\beta]\}.$$

This set is the southern hemisphere with a hole of radius $\beta$ around the south pole cut out.

Theorem 3.1.

Consider a random variable $X$ on the $m$ -dimensional unit sphere $\mathbb{S}^{m}$ ( $m\geq 5$ ) that is uniformly distributed on $\mathbb{L}_{m,\beta}$ with total mass $0<\alpha<1$ and assuming $\mu$ with probability $1-\alpha$ . Then there is a radius $\beta_{0}>0$ of a spherical hole such that the random variable has a unique 2-smeary Fréchet mean, i.e. with asymptotic rate $n^{-1/6}$ , at the north pole for any $\beta\leq\beta_{0}$ .

Proof.

Due to rotation symmetry, the Fréchet function only depends on the angle $\psi$ between $\mu$ and $\exp_{\mu}(x)$ . Including the parameters $\alpha$ and $\beta$ we write $F(\alpha,\beta,\psi)$ . For given $\beta\leq\frac{\pi}{2}$ we define $\alpha_{\beta}$ as the value of $\alpha$ such that $\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{\beta},\beta,0)=0$ .

The proof proceeds in two steps. First, we show that there is a $\beta_{m,4}>0$ such that the Hessian of the Fréchet function vanishes at the north pole but the fourth derivative is positive, such that we have a local Fréchet mean. This is established by Lemmas A.3 and A.5 in the supplement and further elaborated in Theorem 3.3 below.

Secondly, we show that there is a $\beta_{0}>0$ such that for all $\beta\leq\beta_{0}$ the local Fréchet mean at the north pole is the unique Fréchet mean. For this we will utilize Lemmas A.1 and A.8 from the supplement. From Eltzner and Huckemann (2019) we recall $\frac{\partial^{4}F}{\partial\psi^{4}}(\alpha_{0},0,0)=\frac{\alpha_{0}v_{m+1}}{v_{m}}\,\frac{m-1}{m+2}=c_{m}>0$ . Furthermore, from Lemma A.1, one gets $\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{0},0,\psi)>0$ for $\psi\neq 0,\pi$ . Hence we infer that $\frac{\partial F}{\partial\psi}(\alpha_{0},0,\psi)$ is strictly increasing in $\psi$ from $\frac{\partial F}{\partial\psi}(\alpha_{0},0,0)=0$ , yielding that there is no stationary point for $F$ other than $p=\mu$ .

Due to Lemmas A.1 and A.8 we know for all $\psi\leq\pi/3$

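$$\frac{\partial^{4}F}{\partial\psi^{4}}(\alpha_{\beta},\beta,\psi)\geq\frac{\partial^{4}F}{\partial\psi^{4}}(\alpha_{0},0,\psi)-L_{4}\beta\geq\frac{c_{m}}{2}-L_{4}\beta.$$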

Thus we can pick $\beta\leq\frac{c_{m}}{2L_{4}}$ to get $\frac{\partial^{4}F}{\partial\psi^{4}}(\alpha_{\beta},\beta,\psi)\geq 0$ for all $\psi\leq\pi/3$ . Since $\frac{\partial^{3}F}{\partial\psi^{3}}(\alpha_{\beta},\beta,0)=0$ and $\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{\beta},\beta,0)=0$ it follows that $\frac{\partial^{3}F}{\partial\psi^{3}}(\alpha_{\beta},\beta,\psi)>0$ and $\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{\beta},\beta,\psi)>0$ for all $0<\psi\leq\pi/3$ .

From Lemma A.8 we note that

[TABLE]

Thus we can pick $\beta<\frac{1}{L_{2}}\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{0},0,\pi/3)$ to achieve $\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{\beta},\beta,\pi/3)\geq\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{0},0,\pi/3)-L_{2}\beta>0$. Since $\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{0},0,\psi)$ is monotonically increasing in $\psi$, it follows that $\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{\beta},\beta,\psi)>0$ for all $\psi\geq\pi/3$. Thus for all

$$\beta<\beta_{0}=\min\left(\frac{c_{m}}{2L_{4}},\frac{1}{L_{2}}\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{0},0,\pi/3)\right)$$

the minimum at $\psi=0$ is unique. ∎
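To make the random variable of Theorem 3.1 concrete, the following sketch draws samples from it: with probability $1-\alpha$ the north pole, otherwise a point uniform on the annulus between the equator and the hole of radius $\beta$ around the south pole. Rejection sampling from normalized Gaussian vectors is an implementation choice for this illustration, and the parameter values `alpha=0.4`, `beta=0.2` are arbitrary; in particular they are not the critical value $\alpha_{\beta}$ from the proof.

```python
import math
import random

def sample_theorem_3_1(m, alpha, beta, rng):
    """Draw one point of S^m (as a unit vector in R^(m+1)):
    the north pole e_{m+1} with probability 1 - alpha, otherwise
    uniform on the annulus with polar angle in [pi/2, pi - beta]."""
    if rng.random() >= alpha:
        return [0.0] * m + [1.0]
    while True:
        # A normalized standard Gaussian vector is uniform on the sphere.
        g = [rng.gauss(0.0, 1.0) for _ in range(m + 1)]
        r = math.sqrt(sum(c * c for c in g))
        q = [c / r for c in g]
        # Polar angle: geodesic distance to the north pole.
        angle = math.acos(max(-1.0, min(1.0, q[-1])))
        if math.pi / 2 <= angle <= math.pi - beta:
            return q

rng = random.Random(0)
m, alpha, beta = 5, 0.4, 0.2
draws = [sample_theorem_3_1(m, alpha, beta, rng) for _ in range(2000)]
north = [0.0] * m + [1.0]
share_at_pole = sum(1 for q in draws if q == north) / len(draws)
print("share of samples at the north pole:", share_at_pole)
```

No sample falls within distance $\beta$ of the south pole, so the range of the random variable excludes a neighborhood of $\textnormal{Cut}(\mu)$.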

In the case of geometrical smeariness, it is possible to have smeary means even for random variables whose range excludes a neighborhood of the cut locus of the mean. This illustrates an especially drastic consequence of geometrical smeariness which distinguishes it from cut locus smeariness.

Corollary 3.2.

There are manifolds which are diffeomorphic to $\mathbb{R}^{m}$ ( $m\geq 5$ ) on which the mean can be smeary.

Proof.

Consider the random variable on $\mathbb{S}^{m}$ used in Theorem 3.1, cut out a ball of radius $r<\beta_{0}$ around the south pole and glue the resulting boundary to a Euclidean space with a smooth connection. The resulting manifold is then diffeomorphic to Euclidean space. ∎

This result points out particularly clearly that smeariness can be completely independent of the behavior of the random variable at the cut locus of the mean because smeariness can even occur in cases where the mean does not have a cut locus and the manifold is topologically trivial.

For the statement of the following Theorem, note that Definitions 2.8 and 2.9 can be extended to smeary local Fréchet means.

Theorem 3.3 (Curse of dimensionality).

For every radius $\frac{\pi}{2}<r<\pi$ there is a dimension $m$ such that one can construct a random variable $X$ on $\mathbb{S}^{m}$ whose range has radius $r$ and which has a smeary local Fréchet mean. For $m\to\infty$ the minimal radius of the range of $X$ sufficient for smeariness of local Fréchet means approaches $\frac{\pi}{2}$ with a rate of $m^{-1}$ . For arbitrarily small $\epsilon>0$ ,

(i) it is possible to have random variables on $\mathbb{S}^{m}$ whose range is restricted to $\theta\in\left[0,\frac{\pi}{2}+\frac{16+\epsilon}{\pi(m-3)}\right]$ and which exhibit a smeary local Fréchet mean at the north pole.

(ii) for the random variables used in Theorem 3.1 for hole sizes $\beta\leq\frac{\pi}{2}-\frac{6(6+\pi)+\epsilon}{\pi(m-3)}$ there are parameter values $\alpha=\alpha_{\beta}$ which lead to local smeary means at the north pole.

Proof.

(i)

From supplement A.3 we see that $\frac{\partial^{2}F_{\theta}}{\partial\psi^{2}}|_{\psi=0}<0$ if and only if $\theta>\theta_{m,2}$ and $\frac{\partial^{4}F_{\theta}}{\partial\psi^{4}}|_{\psi=0}>0$ if and only if $\theta>\theta_{m,4}$ . From Lemma A.4 we can thus derive the sufficient condition

[TABLE]

Adding a point mass at the north pole, one can thus achieve $\frac{\partial^{2}F}{\partial\psi^{2}}|_{\psi=0}=0$ and $\frac{\partial^{4}F}{\partial\psi^{4}}|_{\psi=0}>0$ , which yields a smeary local Fréchet mean. The claim follows.

(ii)

From Lemma A.7 we can derive the sufficient condition

[TABLE]

for $\frac{\partial^{2}F}{\partial\psi^{2}}|_{\psi=0}=0$ and $\frac{\partial^{4}F}{\partial\psi^{4}}|_{\psi=0}>0$ , which yields a smeary local Fréchet mean. The claim follows.

∎

Conjecture 3.4.

Numerical calculations show that the local Fréchet means of Theorem 3.3 are actually global Fréchet means. This indicates that Theorem 3.3 holds also for global Fréchet means.

Summarizing the results, we find that there is neither a necessary nor sufficient probability density of a random variable $X$ at $\textnormal{Cut}(\mu)$ for smeariness of $\mu$ on $\mathbb{S}^{5}$ . In fact, for very large dimension, smeariness of the mean of $X$ is still possible when the support of $X$ only minimally exceeds a hemisphere.
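As a rough numerical illustration of how quickly the range bound in Theorem 3.3 (i) approaches a hemisphere, the following sketch evaluates $\frac{\pi}{2}+\frac{16+\epsilon}{\pi(m-3)}$ for a few dimensions. The function name `range_bound` is a hypothetical helper introduced here, not notation from the article.

```python
import math

def range_bound(m: int, eps: float = 0.0) -> float:
    """Upper end of the polar angle range in Theorem 3.3 (i):
    pi/2 + (16 + eps) / (pi * (m - 3))."""
    if m <= 3:
        raise ValueError("the bound is only defined for m > 3")
    return math.pi / 2 + (16 + eps) / (math.pi * (m - 3))

# Excess of the admissible range over a hemisphere for growing dimension.
for m in (10, 100, 1000, 10000):
    excess = range_bound(m) - math.pi / 2
    print(f"m = {m:5d}: range exceeds the hemisphere by {excess:.5f} rad")
```

With $\epsilon\to 0$ the bound lies below $\pi$, and is therefore informative, only from $m=7$ on; beyond that the excess over $\frac{\pi}{2}$ decays like $m^{-1}$.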

4 Consequences for Applications

Smeariness is a serious problem for statistics on spheres because if a data set is sampled from a population with smeary mean, this greatly reduces the power of hypothesis tests and makes the sample Fréchet mean less reliable. Smeariness of the underlying population leads to finite sample smeariness of samples, as elaborated in Hundrieser et al. (2020). We provide a definition for finite sample smeariness and show that it can occur under fairly general conditions and consider two applications. The first is a set of simulated oriented landmark shapes as an example of a smeary distribution. The second application concerns real data, namely north pole positions on the earth during periods of pole reversal which have been determined from magnetite rock samples.

In the following we use sample Fréchet means, which are defined as follows.

Definition 4.1 (Sample Fréchet mean).

The set of sample Fréchet means of the random variable $X$ in $Q$ is defined as

[TABLE]

where the argument $\omega$ will be suppressed in the following.

4.1 Finite Sample Smeariness

As illustrated in Eltzner and Huckemann (2019), random variables which satisfy a CLT with standard $n^{-1/2}$ asymptotic rate can still exhibit higher than expected variance of sample means for finite sample sizes. This phenomenon is called finite sample smeariness and is discussed in depth on the circle and the torus in Hundrieser et al. (2020). For any random variable $Y$ , denote $\textnormal{Var}[Y]:=\mathbb{E}\left[d(Y,\mu)^{2}\right]$ .

Definition 4.2.

Given constants $C_{+},C_{-},K>0$ , $0<r_{-}<r_{+}<1$ and integers $1<n_{-}<n_{+}<n_{0}$ satisfying $C_{+}n_{-}^{r_{+}}\leq C_{-}n_{+}^{r_{-}}$ , the Fréchet mean $\mu$ of a random variable $X$ is called finite sample smeary if the following holds for the Fréchet sample mean $\widehat{\mu}_{n}$

(i)

$\forall n\in(n_{-},n_{+}]\cap\mathbb{N}\,:\quad\textnormal{Var}[X]\leq C_{-}n^{r_{-}}\leq n\textnormal{Var}[\widehat{\mu}_{n}]\leq C_{+}n^{r_{+}}$ .

(ii)

$\forall n\in(n_{0},\infty)\cap\mathbb{N}\,:\quad\textnormal{Var}[\widehat{\mu}_{n}]\leq Kn^{-1}$ .
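The conditions of Definition 4.2 can be transcribed into a small check. The sketch below assumes that estimates of $\textnormal{Var}[\widehat{\mu}_{n}]$ are available for finitely many $n$ (for instance from simulation), so condition (ii) is only tested on the supplied sample sizes and not for all $n>n_{0}$. The function name and argument names are hypothetical.

```python
def satisfies_definition_4_2(var_x, var_mean, c_minus, c_plus, r_minus, r_plus,
                             n_minus, n_plus, n_zero, k_const):
    """Check Definition 4.2 on the sample sizes present in `var_mean`.

    var_x    -- Var[X]
    var_mean -- dict mapping n to an estimate of Var[mu_hat_n]
    """
    # Constraints on the constants stated at the start of the definition.
    if not (c_minus > 0 and c_plus > 0 and k_const > 0):
        return False
    if not (0 < r_minus < r_plus < 1 and 1 < n_minus < n_plus < n_zero):
        return False
    if c_plus * n_minus ** r_plus > c_minus * n_plus ** r_minus:
        return False
    for n, v in var_mean.items():
        # Condition (i): slower than n^{-1} decay on (n_minus, n_plus].
        if n_minus < n <= n_plus:
            lower = c_minus * n ** r_minus
            upper = c_plus * n ** r_plus
            if not (var_x <= lower <= n * v <= upper):
                return False
        # Condition (ii): standard rate beyond n_zero (checked where data exist).
        if n > n_zero and v > k_const / n:
            return False
    return True
```

A variance sequence that decays like $n^{-1/2}$ on the intermediate range and like $n^{-1}$ beyond $n_{0}$ passes the check for suitable constants, whereas a sequence with $n\textnormal{Var}[\widehat{\mu}_{n}]=\textnormal{Var}[X]$ throughout fails condition (i).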

Since finite sample smeariness is a non-asymptotic property, which is strictly weaker than smeariness, alternative terminology like lethargic means has been proposed for clearer distinction. Here, we decide to stick with the terminology by Hundrieser et al. (2020). On the circle, finite sample smeariness occurs whenever the range of the random variable exceeds a half circle. For positively curved spaces, already the results by Bhattacharya and Patrangenaru (2005); Afsari (2009) suggest that finite sample smeariness is a generic feature. Recent results by Pennec (2019) on small samples from highly concentrated measures show that in positively curved spaces

[TABLE]

holds for all random variables except point masses. However, for concentrated random variables, the magnitude of finite sample smeariness, which we define as

[TABLE]

is mostly close to $1$ . Here, we give a sufficient condition for $S_{\text{fs}}$ to be arbitrarily high.

Theorem 4.3.

For $\mathbb{S}^{m}$ with $m\geq 4$ , and for any $K>1$ and any $\theta_{*}>\theta_{m,4}$ there is a random variable $X$ , whose range is restricted to $\theta\in[0,\theta_{*}]$ , which has a local Fréchet mean at the north pole and for which

[TABLE]

if the north pole is the unique global Fréchet mean.

Proof.

See supplement A.5. ∎

In consequence of this theorem, one can expect far-reaching consequences for data which are moderately spread out over the sphere. Due to the curse of dimensionality for $\theta_{m,4}$ , arbitrarily high magnitude of finite sample smeariness is possible for random variables whose range is only slightly larger than a hemisphere in high dimension.

4.2 Landmark Pre-Shapes

In this example, we will look at contours described by 4 or 6 landmarks in $\mathbb{R}^{2}$ . The contours are considered oriented with respect to some reference, as for example image elements in a computer image, which are oriented with respect to the frame, and the orientation is considered significant information. Therefore, we will factor out translations and scalings, thus mapping shapes with $k$ landmarks onto spheres $\mathbb{R}^{2k}\to\mathbb{S}^{2k-3}$ . Here, we consider $k=4$ and $k=6$ and thus $\mathbb{S}^{5}$ and $\mathbb{S}^{9}$ . The shapes considered here are quadrangles, so 4 landmarks are sufficient to characterize them, as illustrated by the example shapes in Figure 2.
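The map from landmark configurations to the pre-shape sphere can be sketched as follows: translation is removed by centering and scale by normalizing to unit length. The resulting unit vector in $\mathbb{R}^{2k}$ lies in the $(2k-2)$-dimensional subspace of centered configurations and thus on a sphere of dimension $2k-3$. The function name `preshape` and the coordinate ordering are assumptions made for illustration only.

```python
import math

def preshape(landmarks):
    """Map k planar landmarks (x, y) to a unit vector in R^(2k).

    Centering removes translation, normalizing removes scale;
    rotation is deliberately kept, since orientation is informative here.
    """
    k = len(landmarks)
    cx = sum(x for x, _ in landmarks) / k
    cy = sum(y for _, y in landmarks) / k
    centered = [(x - cx, y - cy) for x, y in landmarks]
    size = math.sqrt(sum(x * x + y * y for x, y in centered))
    if size == 0.0:
        raise ValueError("all landmarks coincide; the pre-shape is undefined")
    # Flatten to (x1, y1, x2, y2, ...) and rescale to unit norm.
    return [c / size for point in centered for c in point]

# A quadrangle with k = 4 landmarks gives a point on S^5 inside R^8.
z = preshape([(0, 0), (2, 0), (2, 1), (0, 1)])
print(len(z), sum(v * v for v in z))
```

Shifting or rescaling the input landmarks leaves the output unchanged, while rotating them does not.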

The distribution on the 4 landmark $\mathbb{S}^{5}$ is rotation symmetric and given by

[TABLE]

where the factor $\alpha$ is chosen such that the Hessian at the mean is slightly positive.

Looking at variances of means, it is apparent from Figure 3 that the shapes with 6 landmarks exhibit a higher magnitude of finite sample smeariness and appear even compatible with smeariness. Looking at the 4-landmark shapes for increasing sample size, one can see that the variance of the mean scales with close to $n^{-1/3}$ , the asymptotic rate of 2-smeariness, for $n\leq 1000$ . For larger $n$ , the scaling settles into the standard $n^{-1}$ rate. One might intuitively think that adding two additional landmarks, thus obtaining the 6-landmark shapes, could improve the scaling behavior. However, Figure 3 shows that the opposite is true in this case, as the variance scaling for 6-landmark shapes remains slower than $n^{-1}$ for much larger sample sizes.

4.3 Examples of Smeariness in Magnetic Pole Transitions

The earth’s magnetic field is dynamic, and the positions of its north and south poles on the surface move over time. As the magnetic field is generated by the earth’s rotation, the poles are close to the rotation axis at most times. In the history of the earth, the positions of the magnetic north and south poles have switched on the scale of a few to several $100,000$ years, as evidenced by the magnetization of certain minerals. Research in the field of paleomagnetism aims to discern magnetic polarization changes as well as past continental drift from the mineral record. We use a database of 151 data sets of reconstructed geomagnetic north pole positions (VGP, meaning “Virtual Geomagnetic Pole”) around periods of polar transition presented in McElhinny and Lock (1996) and downloadable from ftp://ftp.ngdc.noaa.gov/geomag/Paleomag/access/ver3.5.

For each of these data sets we performed a $k$-out-of-$n$ bootstrap for $1\leq k\leq 1000$ with $B=1000$ bootstrap replicates to get bootstrap sample means. From these we determined the variances of the means and looked at their power law behavior. The expected behavior, if a standard CLT holds, would be $\mathrm{Var}[\mu^{*}_{k}]\propto k^{-1}$ . Of the 151 data sets, 8 very clearly depart from this scaling behavior, exhibiting a clearly reduced variance of bootstrap sample means even for $k>n$ . Another 9 data sets exhibit slower scaling at least for small $k$ . A rough geological classification by age, sample sizes and sources for these 17 data sets are given in Table 1. Example data sets are displayed in Figure 4 and the bootstrap scaling behavior is shown in Figure 5.

For samples of finite size, one cannot determine with certainty whether the underlying random variable has a finite density at the cut locus of the mean or not. As shown by Le and Barden (2014), the cut locus of the mean cannot contain a data point, which means that a neighborhood of the cut locus is always free of probability mass. In consequence, smeariness with or without a hole at the cut locus cannot be distinguished in the present data.

In summary, the Fréchet mean exhibits finite sample smeariness of considerable magnitude $S_{\text{fs}}\gg 10$ for more than $5\%$ of the data sets and at least moderate finite sample smeariness of magnitude $S_{\text{fs}}>3$ for more than $10\%$ of the data sets. This shows that finite sample smeariness of the spherical mean is a phenomenon which is abundant in data sets describing geomagnetic north pole positions around pole transition periods. In such situations, asymptotic inference on the mean has to be treated with great care.

Acknowledgments

The author gratefully acknowledges funding by DFG SFB 755, project B8, DFG SFB 803, project Z2, DFG HU 1575/7 and DFG GK 2088. I am very grateful to Stephan Huckemann for many helpful discussions and detailed comments on the manuscript, Andrew Wood for an inspiring discussion and John Kent and Kanti Mardia for helpful pointers in terms of data application. I would like to thank the anonymous reviewers, whose suggestions helped improve the manuscript.

Appendix A Smeariness with Holes

In this supplement, we present the details of all calculations needed to prove the results presented in the article. In all cases discussed here, the random variable is invariant under rotation around the polar axis and the mean will always be at the north pole $\mu:=e_{m+1}$ . In consequence, the Fréchet function

[TABLE]

involving the squared spherical distance $d^{2}_{\mathbb{S}^{m}}(p,q)=\left(\arccos\langle p,q\rangle\right)^{2}$ based on the standard inner product $\langle\cdot,\cdot\rangle$ of $\mathbb{R}^{m+1}$ , only depends on the polar angle $\psi:=\arccos\left<p,\mu\right>\in[0,\pi]$ between the point $p$ and the north pole $\mu$ , such that $\psi=0$ corresponds to $\mu$ . This means that there is a function $F:[0,\pi]\to[0,\infty)$ such that

[TABLE]

Furthermore, due to the rotation symmetry around $\mu$ derivatives of the Fréchet function are always diagonal tensors where all diagonal entries are equal. It is therefore sufficient to calculate derivatives $\frac{d^{k}F}{d\psi^{k}}$ .

In the following, we write the spherical annulus

[TABLE]

This set is the southern hemisphere with a hole of radius $\beta$ around the south pole cut out.

A.1 The Basic Hemisphere Model

We now recall the key calculation results from Eltzner and Huckemann (2019). Consider a random variable $X$ distributed on the $m$ -dimensional unit sphere $\mathbb{S}^{m}$ ( $m\geq 2$ ) that is uniformly distributed on the lower half sphere $\mathbb{L}_{m,0}$ with total mass $0<\alpha<1$ and assuming the north pole $\mu=e_{m+1}$ with probability $1-\alpha$ .
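For simulation purposes, the model just described can be sampled directly. The following sketch draws the north pole with probability $1-\alpha$ and otherwise a uniform point on the lower half sphere, obtained by normalizing a Gaussian vector and reflecting it. The optional parameter `beta` rejects points within angle $\beta$ of the south pole, which corresponds to the annulus with a hole described above. The function name and signature are hypothetical.

```python
import math
import random

def sample_hemisphere_model(m, alpha, beta=0.0, rng=None):
    """Draw one point on S^m (as a list of m+1 coordinates).

    With probability 1 - alpha the north pole e_{m+1} is returned,
    otherwise a uniform point with polar angle in [pi/2, pi - beta].
    """
    if rng is None:
        rng = random.Random()
    if rng.random() >= alpha:
        return [0.0] * m + [1.0]
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(m + 1)]
        norm = math.sqrt(sum(a * a for a in g))
        x = [a / norm for a in g]      # uniform on the full sphere
        x[m] = -abs(x[m])              # reflect into the lower half sphere
        # Reject points inside the hole of radius beta around the south pole.
        if math.acos(max(-1.0, x[m])) <= math.pi - beta:
            return x

rng = random.Random(0)
sample = [sample_hemisphere_model(4, 0.6, beta=0.3, rng=rng) for _ in range(5)]
```

Rejection sampling keeps the distribution uniform on the remaining annulus; it becomes inefficient only if the hole removes most of the hemisphere's volume.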

Setting $\Theta=[-\pi/2,\pi/2]$ and $\Theta^{m-1}\ni\theta=(\theta_{1},\ldots,\theta_{m-1})$ , defining the functions

[TABLE]

we have the spherical volume element $u(\theta)\,d\theta\,d\phi$ . The volume of the full $\mathbb{S}^{m}$ is given by

[TABLE]

Using suitable coordinates, Eltzner and Huckemann (2019) show that

[TABLE]

Consider the derivatives

[TABLE]

By direct calculation, we can conclude

Lemma A.1.

[TABLE]

We recall $F^{(4)}(0)=\frac{\alpha V_{m+1}}{V_{m}}\,\frac{m-1}{m+2}=c_{m}>0$ and that the inequality in (2) is strict for $\psi\neq 0$ , due to (3). Hence we infer that $F^{\prime}(\psi)$ is strictly increasing in $\psi$ from $F^{\prime}(0)=0$ , yielding that there is no minimum of $F$ other than $\psi=0$ , which corresponds to $p=\mu$ .

A.2 Rotation Symmetric Random Variables

We use coordinates

[TABLE]

where $\mu=e_{m+1}=(0,\ldots,0)$ in these coordinates. Note that we exploit rotation symmetry here to define $p$ in an especially simple way that eliminates all $\theta_{k}$ with $k\geq 3$ from all following calculations.

Now we have introduced the necessary notation to prove Theorem 2.12.

A.2.1 Proof of Theorem 2.12

Consider a probability measure with a point mass at the north pole with weight $1-\alpha$ and a uniform distribution with weight $\alpha$ in a ball of radius $\delta$ around the south pole. Then, using notation $h(\psi,\theta,\phi):=\cos\psi\cos\theta+\sin\psi\sin\theta\cos\phi$ and $I_{m}:=\int_{0}^{\pi}\sin^{m}\phi\,d\phi$ the Fréchet function is

[TABLE]

Now note that due to convexity of the function $x\mapsto(\arccos x)^{2}$ ,

[TABLE]

Thus, using $\int_{0}^{\pi}\cos\phi\sin^{m-2}\phi\,d\phi=0$ we get

[TABLE]

This leads to the lower bound

[TABLE]

It is clear that $F(\alpha,\delta,\psi)$ takes its global minimum at $\psi=0$ if

[TABLE]

We pick $\alpha=\frac{\sin\delta}{4\pi}$ , which satisfies the above inequality. Therefore, we have a non-smeary Fréchet mean at the north pole for any $\delta$ , if we choose this $\alpha$ . The value of the probability density at the south pole can be lower bounded for $\delta\leq\pi/2$ by

[TABLE]

As $\delta\to 0$ the density diverges, thus any arbitrarily high probability density can be achieved at the south pole by choosing a suitable $\delta$ and $\alpha=\frac{\sin\delta}{4\pi}$ . Note that this result holds for any $m\geq 2$ , since we do not need to differentiate under the integrals here because of the simplification achieved by the shown lower bound. ∎

A.3 Derivatives of the Fréchet Function

To keep the calculations readable, we introduce some shorthand notation

[TABLE]

where we will suppress the arguments in the following, and we note

[TABLE]

In the following calculations we use

[TABLE]

With the notation $I_{m}:=\int_{0}^{\pi}\sin^{m}\phi\,d\phi$ , we have $I_{m-2}=\frac{m}{m-1}I_{m}$ .
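The recursion $I_{m-2}=\frac{m}{m-1}I_{m}$ follows from integration by parts and can be confirmed numerically. The sketch below uses a simple midpoint rule; the function name and the number of quadrature steps are arbitrary choices for illustration.

```python
import math

def sine_power_integral(m, steps=20000):
    """Midpoint-rule approximation of I_m = integral of sin(phi)^m over [0, pi]."""
    h = math.pi / steps
    return h * sum(math.sin((i + 0.5) * h) ** m for i in range(steps))

# Compare both sides of I_{m-2} = m / (m - 1) * I_m for a few dimensions.
for m in range(2, 7):
    lhs = sine_power_integral(m - 2)
    rhs = m / (m - 1) * sine_power_integral(m)
    print(m, lhs, rhs)
```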

Since we restrict attention to rotation invariant random variables, we first consider a uniform distribution on the subsphere $\mathbb{S}^{m-1}_{\psi}$ given by fixed polar angle $\psi$ . The Fréchet function of a random variable with this distribution can be written as

[TABLE]

and we can calculate derivatives, writing $f_{j}(\theta,\psi):=\frac{1}{2g(\theta)}\frac{\partial^{j}F_{\theta}}{\partial\psi^{j}}$

[TABLE]

Above, we differentiate under the integral. In Lemma A.2, we show that for sufficiently high dimension the derivatives with respect to $\psi$ can be interchanged with the integrals over $\theta$ and $\phi$ .

Lemma A.2.

We can differentiate under the integral in the following sense, for arbitrary integral bounds of the $\theta$ -integral in $[0,\pi]$

[TABLE]

If for an arbitrarily small $\varepsilon>0$ , one restricts to $\theta\in[0,\pi-2\varepsilon]$ and $\psi\in[0,\varepsilon]$ , the bounds on the dimension $m$ in these equations can be lowered by one to $m\geq 2$ , $m\geq 3$ and $m\geq 4$ , respectively.

Proof.

For the assertion to hold, it suffices to show that the $f_{j}(\theta,\psi)$ are integrable for the respective values of $m$ . Since the numerators can all be easily bounded, the only problem is to bound the denominators under the integrals. Recall that $1-h^{2}=(h^{\prime})^{2}+s^{2}$ and use

[TABLE]

thus we get

[TABLE]

We see that these bounds are finite for the required dimensions.

To see that the final claim holds, note that the function $x\mapsto\arccos^{2}(x)$ is $C^{\infty}$ on $(-1,1]$ and $h$ is $C^{\infty}$ in the whole domain. Since $\lim_{x\to 1}\frac{\arccos(x)}{\sqrt{1-x^{2}}}=1$ , the poles at $h=1$ in the integrals are one order lower than the poles at $h=-1$ . Since $h=-1$ only holds if $\theta=\pi-\psi$ and $\phi=\pi$ , these poles can be excluded by restricting to $\theta+\psi\leq\pi-\varepsilon$ for some $\varepsilon>0$ . Then, $K(\varepsilon):=\frac{\arccos h(\psi,\pi-\psi-\varepsilon,-\pi)}{\sqrt{1-h^{2}(\psi,\pi-\psi-\varepsilon,-\pi)}}$ replaces $\pi$ on the right hand side of the inequalities above and the powers of $s$ are higher by one. ∎

Since we will be interested in random variables which exhibit a local minimum of the Fréchet function at the north pole $\theta=0$ , we first consider $f_{j}(\theta,0)$ . We note that

[TABLE]

so we can restrict attention to $f_{2}(\theta,0)$ and $f_{4}(\theta,0)$ .

[TABLE]

The function $\theta\mapsto f_{2}(\theta,0)$ is plotted for several $m$ in Figure 6 and $\theta\mapsto f_{4}(\theta,0)$ is given in Figure 7. From equation (5) we can see that $f_{2}(\theta,0)$ starts out positive at $\theta$ near $0$ and has exactly one sign change at $\frac{1}{m-1}\sin\theta+\theta\,\cos\theta=0$ . We denote the position of the zero by $\theta_{m,2}$ . From equation (6) we can see that $f_{4}(\theta,0)$ starts out negative at $\theta$ near $0$ and has exactly one sign change at a point we call $\theta_{m,4}$ . Furthermore, the Figures show that the contributions to both the second and fourth derivative are increasingly pronounced close to the equator with increasing dimension. Note that $f_{4}(\pi/2,0)=-4$ independent of dimension, as can be seen in Figure 7, so we immediately see $\theta_{m,4}>\frac{\pi}{2}$ .
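The sign change condition $\frac{1}{m-1}\sin\theta+\theta\cos\theta=0$ has no closed-form solution, but $\theta_{m,2}$ is easy to obtain numerically. The sketch below uses bisection on $[\frac{\pi}{2},\pi]$, where the left hand side is $\frac{1}{m-1}>0$ at the lower end and $-\pi<0$ at the upper end. The function name `theta_m2` is a hypothetical helper.

```python
import math

def theta_m2(m, tol=1e-12):
    """Zero of sin(theta)/(m-1) + theta*cos(theta) in (pi/2, pi) by bisection."""
    def g(theta):
        return math.sin(theta) / (m - 1) + theta * math.cos(theta)

    lo, hi = math.pi / 2, math.pi   # g(lo) > 0 > g(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Distance of the zero from the equator for increasing dimension.
for m in (2, 5, 10, 100, 1000):
    delta = theta_m2(m) - math.pi / 2
    print(f"m = {m:4d}: theta_m2 - pi/2 = {delta:.6f}, (m-1)*delta = {(m - 1) * delta:.4f}")
```

The printed values of $(m-1)(\theta_{m,2}-\frac{\pi}{2})$ stay bounded, in line with the zero approaching the equator at rate $m^{-1}$.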

Lemma A.3.

The position $\theta_{m,2}$ of the zero of $\theta\mapsto f_{2}(\theta,0)$ is bounded by

[TABLE]

Proof.

This is positive while $\frac{1}{m-1}\sin\theta+\theta\,\cos\theta>0$ . The point where it switches sign is determined by $\theta=-\frac{1}{m-1}\tan\theta$ , where $\theta>\pi/2$ . Using $\theta=\pi/2+\delta$ and $1-\delta^{2}/2\leq\cos\delta\leq 1-\delta^{2}/3$ and $\delta/2\leq\sin\delta\leq\delta$ , which hold on $[0,\pi/2]$

[TABLE]

Furthermore, observe

[TABLE]

∎
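The elementary bounds $1-\delta^{2}/2\leq\cos\delta\leq 1-\delta^{2}/3$ and $\delta/2\leq\sin\delta\leq\delta$ used in the proof above are stated for $[0,\pi/2]$. A quick grid check, sketched below with a hypothetical helper and a small floating point slack, confirms them on that interval and shows that the upper cosine bound is not valid for much larger arguments.

```python
import math

def elementary_bounds_hold(delta, slack=1e-12):
    """Check 1 - d^2/2 <= cos d <= 1 - d^2/3 and d/2 <= sin d <= d at d = delta."""
    cos_ok = 1 - delta ** 2 / 2 - slack <= math.cos(delta) <= 1 - delta ** 2 / 3 + slack
    sin_ok = delta / 2 - slack <= math.sin(delta) <= delta + slack
    return cos_ok and sin_ok

# Evaluate on an equidistant grid of [0, pi/2].
grid = [i * (math.pi / 2) / 1000 for i in range(1001)]
print(all(elementary_bounds_hold(d) for d in grid))
# Outside the interval the bounds can fail, e.g. at delta = 3.
print(elementary_bounds_hold(3.0))
```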

To prove similar bounds for $\theta_{m,4}$ we need a technical auxiliary result.

Lemma A.4.

We have the following bound

[TABLE]

Proof.

For $m=2$ , the inequality reads

[TABLE]

Inserting $\delta=0$ the inequality holds. Now consider the derivative

[TABLE]

which is obviously true and thus proves the claim for $m=2$ .

For the induction step, we must show

[TABLE]

To this end, we will show the following sequence of inequalities

[TABLE]

For the first inequality, note that

[TABLE]

where the left inequalities reflect the values for $\delta=0$ and we perform derivatives from row to row. This proves the first inequality. Next we note

[TABLE]

which proves the second inequality. This last estimate has the important property that its second derivative is monotonically growing on $[0,\pi/2]$ . Therefore, once $3\cos^{2}\delta+\frac{1}{2}\sin^{2}\delta>3-\delta^{2}$ for some $\delta$ , it would also hold for all larger $\delta$ , particularly $\delta=\pi/2$ . Since

[TABLE]

we have finally shown $g_{0}(\delta)\leq 3-\delta^{2}$ as desired. ∎

Lemma A.5.

The position $\theta_{m,4}$ of the zero of $\theta\mapsto f_{4}(\theta,0)$ is bounded by

[TABLE]

Proof.

To get a lower bound on the region of positive fourth derivative, we write again $\theta=\pi/2+\delta$ and define

[TABLE]

We can immediately see

[TABLE]

Using Lemma A.4 we can write

[TABLE]

and plugging in $\theta_{m,4}-\pi/2=\frac{16}{\pi(m-3)}$ , we get

[TABLE]

Next, we aim to prove $\theta_{m,2}\leq\theta_{m,4}$ . Using

[TABLE]

it suffices to show for $\delta_{m,2}:=\theta_{m,2}-\frac{\pi}{2}$

[TABLE]

which is equivalent to

[TABLE]

This is satisfied if

[TABLE]

Since we have shown the upper bound $\delta_{m,2}\leq\frac{1}{m-1}$ , this always holds which proves the claim. ∎

A.4 Hemisphere Model with Hole at the Cut Locus

The Fréchet function of any rotation symmetric random variable on $\mathbb{S}^{m}$ can be expressed as

[TABLE]

by using a probability measure $d\mathbb{P}(\theta)$ supported on $[0,\pi]$ , satisfying $\int g(\theta)\,d\mathbb{P}(\theta)=1$ . The results of the previous subsection can then be used to calculate the second and fourth derivative of the Fréchet function at the north pole.

We now present a family of random variables which exhibit a local Fréchet mean at the north pole while the range of the random variable has a hole containing the south pole. Consider a random variable $X$ distributed on the $m$ -dimensional unit sphere $\mathbb{S}^{m}$ ( $m\geq 4$ ) that is uniformly distributed on $\mathbb{L}_{m,\beta}$ with total mass $0<\alpha<1$ and assuming $\mu$ with probability $1-\alpha$ . Then we have the Fréchet function

[TABLE]

In order to see that this family of random variables can exhibit smeariness for suitably chosen values of $\alpha$ and $\beta$ , we now calculate its second and fourth derivative at the north pole. The second derivative is

[TABLE]

The fourth derivative is

[TABLE]

Recall that due to Lemma A.2 the result for the second derivative holds for $m\geq 2$ if $\beta>0$ and for $m\geq 3$ otherwise. The result for the fourth derivative holds for $m\geq 4$ if $\beta>0$ and for $m\geq 5$ otherwise.

In the following, for a fixed value of $\beta$ the value of $\alpha$ for which the second derivative at $\psi=0$ vanishes will be denoted by $\alpha_{\beta}$ .

Lemma A.6.

For every $\beta<\beta_{m,2}$ there is an $0\leq\alpha_{\beta}\leq 1$ such that $\frac{\partial^{2}F}{\partial\psi^{2}}(\alpha_{\beta},\beta,0)=0$ . This $\beta_{m,2}$ satisfies the bounds

[TABLE]

Proof.

From the calculated second derivative we get

[TABLE]

To have a valid random variable, we need $0\leq\alpha_{\beta}\leq 1$ . Defining the function

[TABLE]

the condition $0\leq\alpha_{\beta}\leq 1$ is equivalent to $b_{m,2}(\beta)\geq 0$ . Let $\beta_{m,2}$ be the first zero of $b_{m,2}$ . To see that $\beta_{m,2}<\pi/2$ note that

[TABLE]

Plugging $\beta=\frac{\pi}{2}-\frac{1}{2(m-1)}$ into the right hand side, we get for $m\geq 2$

[TABLE]

Furthermore, note that using $\delta=\frac{\pi}{2}-\beta$ ,

[TABLE]

and therefore

[TABLE]

∎

Lemma A.7.

There is a $\beta_{m,4}$ such that every $\beta<\beta_{m,4}$ satisfies $\frac{\partial^{4}F}{\partial\psi^{4}}(\alpha_{\beta},\beta,0)>0$ . This $\beta_{m,4}$ satisfies the bounds

[TABLE]

Proof.

Using

[TABLE]

we can give the necessary condition $b_{m,4}(\beta)>0$ for a local minimum of the Fréchet function at $\psi=0$ . From this relation we can determine a minimal $\beta_{m,4}$ such that $b_{m,4}(\beta_{m,4})\geq 0$ for every dimension $m\geq 4$ giving a dimension dependent maximal hole size. Note that

[TABLE]

which implies $\beta_{m,4}\leq\beta_{m,2}$ .

Let $\delta=\frac{\pi}{2}-\beta$ and use $1-\delta^{2}/2\leq\cos\delta\leq 1-\delta^{2}/3$ and $\delta/2\leq\sin\delta\leq\delta$ , which hold on $[0,\pi/2]$ , then, assuming $m\geq 3$

[TABLE]

Now, plugging in $\delta=\frac{1}{2(m-3)}$ we get for $m\geq 4$

[TABLE]

From this, we get the upper bound

[TABLE]

Analogously, we show the lower bound, by first noting

[TABLE]

and then calculating

[TABLE]

which establishes the lower bound

[TABLE]

∎

Numerically determined values of $\beta_{m,2}$ and $\beta_{m,4}$ along with the lower bounds from Lemmas A.6 and A.7 are displayed in Figure 9.

An $\alpha_{\beta}$ , such that the Hessian vanishes, exists for ${\left(\pi/2-(\pi-\beta)\,\sin^{m-1}\beta\right)>0}$ , which can indeed be satisfied, as evidenced by the lower bound $\beta_{m,2}$ determined here. Note that this result is valid for all $m\geq 2$ , but we only get a positive fourth derivative and thus a local minimum in the case of vanishing Hessian for dimension $m\geq 4$ .

To show that the local minimum at $\psi=0$ is indeed a global minimum at least for some $\beta>0$ , we use a Lipschitz argument as follows. Recall that for $\beta=0$

[TABLE]

To show that we have a global minimum at $\psi=0$ , we need for every $\psi\in(0,\pi]$ that $\frac{\partial^{2}F}{\partial\psi^{2}}>0$ . In order to show this, we prove the following Lipschitz conditions, where $\alpha_{i}$ denotes the $\alpha_{\beta}$ corresponding to $\beta_{i}$ .

Lemma A.8.

There are dimension dependent constants $L_{2}$ , $L_{3}$ and $L_{4}$ such that

[TABLE]

Proof.

Note that

[TABLE]

are valid Lipschitz constants. Thus we note for $j=2,3,4$

[TABLE]

We know $|\alpha|=\alpha<1$ and $|g|=g\leq g(0)$ . $\left|\frac{\partial\alpha}{\partial\beta}\right|$ and $\left|\frac{\partial g}{\partial\beta}\right|$ can also trivially be bounded, since $\beta_{m,4}<\pi/2$ . So only $f_{j}(\theta,\psi)$ and their $\theta$ integrals remain to be bounded. Since the numerators can all be easily bounded, the only problem is to bound the denominators under the integrals. Using the boundedness shown in Lemma A.2 we see that we need $m\geq 5$ for these bounds to be finite.

Collecting all the estimates, we get the desired Lipschitz constants. ∎

Using the Lipschitz constants, we can now show that the local minima are global for suitably small $\beta>0$ . Since we need these bounds to hold on the full range $\psi\in[0,\pi]$ , we cannot improve dimension as simply as in the case of the derivatives at $\psi=0$ . Note that the estimates in Equation (4) which these calculations rely on are very generous and might be improved by a more careful treatment.

A.5 Proof of Theorem 4.3

Consider a probability measure with a point mass at the north pole with weight $1-\alpha$ and a uniform distribution with weight $\alpha$ on the $\mathbb{S}^{m-1}$ at $\theta=\theta_{*}$ . Since the contribution of the second term to the second derivative of the Fréchet function at the north pole is negative and the contribution to the fourth derivative is positive, there is an $\alpha_{0}>0$ such that the Hessian of the Fréchet function at the north pole vanishes and the fourth derivative is positive, such that there is a local Fréchet mean at the north pole. For general $\alpha\in[0,1]$ the Hessian of the Fréchet function at the north pole is given by

[TABLE]

Since $\textnormal{Cov}[\mbox{\rm grad}\rho(0,X)]=\frac{4\alpha}{m}\theta_{*}^{2}\textnormal{Id}_{m}$ , we get

[TABLE]

Since $\alpha<\alpha_{0}$ can be freely chosen, the claim follows.

Bibliography

Afsari, B. (2009). Means and averaging on Riemannian manifolds. University of Maryland.
Afsari, B. (2011). Riemannian $L^{p}$ center of mass: existence, uniqueness, and convexity. Proceedings of the American Mathematical Society 139, 655–773.
Ahidar-Coutrix, A., T. Le Gouic, and Q. Paris (2019). Convergence rates for empirical barycenters in metric spaces: curvature, convexity and extendable geodesics. Probability Theory and Related Fields 177, 323–368.
Arnaudon, M. and L. Miclo (2014). Means in complete manifolds: uniqueness and approximation. ESAIM: Probability and Statistics 18, 185–206.
Barden, D., H. Le, and M. Owen (2013). Central limit theorems for Fréchet means in the space of phylogenetic trees. Electronic Journal of Probability 18 (25), 1–25.
Barden, D., H. Le, and M. Owen (2018). Limiting behaviour of Fréchet means in the space of phylogenetic trees. Annals of the Institute of Statistical Mathematics 70 (1), 99–129.
Bhattacharya, A. and R. Bhattacharya (2008). Statistics on Riemannian manifolds: asymptotic distribution and curvature. Proceedings of the American Mathematical Society 136, 2959–2967.
Bhattacharya, A. and R. Bhattacharya (2012). Nonparametric Inference on Manifolds. Cambridge University Press.
