The Logarithm Map, its Limits and Fréchet Means in Orthant Spaces

Dennis Barden; Huiling Le

arXiv:1703.07081·math.PR·June 27, 2018

TL;DR

This paper studies the logarithm map on orthant spaces, a simple class of stratified spaces, and uses its properties to characterize Fréchet means of probability measures on such spaces and to derive the limiting distribution of sample Fréchet means.

Contribution

It gives an explicit expression for the logarithm map at any point of an orthant space, including points in strata of co-dimension at least two, and characterizes Fréchet means together with the asymptotic behavior of sample Fréchet means.

Findings

01. Derived explicit expressions for the logarithm map in orthant spaces.

02. Characterized the Fréchet means of probability measures on orthant spaces.

03. Established the limiting distribution of sample Fréchet means.

Abstract

The first part of the paper studies the expression for, and the properties of, the logarithm map on an orthant space, which is a simple stratified space, with the aim of analysing Fréchet means of probability measures on such a space. In the second part, we use these results to characterise Fréchet means and to derive various of their properties, including the limiting distribution of sample Fréchet means.

Equations

\frac{1}{2}\int_{\boldsymbol{M}} d(\boldsymbol{x},\boldsymbol{x}^{\prime})^{2}\,d\mu(\boldsymbol{x}^{\prime})

\mathcal{O}(E)=\left\{\sum_{i=1}^{m}\lambda_{i}u_{l_{i}}\;\middle|\;\lambda_{i}>0\right\}.

\angle_{\boldsymbol{x}}(\gamma_{1},\gamma_{2})=\lim_{t\to 0}\overline{\angle}_{\boldsymbol{x}}(\gamma_{1}(t),\gamma_{2}(t)),

\ll w_{1},w_{2}\gg\,=\|w_{1}\|\,\|w_{2}\|\cos\angle_{\boldsymbol{x}}(\gamma_{1},\gamma_{2}),

\rho_{\boldsymbol{x}}(w_{1},w_{2})=\left\{\|w_{1}\|^{2}+\|w_{2}\|^{2}-2\ll w_{1},w_{2}\gg\right\}^{1/2}

\begin{array}{rcl}E(\boldsymbol{x}_{1},\boldsymbol{x}_{2})&=&\left\{E(\boldsymbol{x}_{1})\cap E(\boldsymbol{x}_{2})\right\}\\ &&\bigcup\,\{e\in E(\boldsymbol{x}_{1})\mid e\text{ is compatible with }E(\boldsymbol{x}_{2})\}\\ &&\bigcup\,\{e\in E(\boldsymbol{x}_{2})\mid e\text{ is compatible with }E(\boldsymbol{x}_{1})\},\end{array}

\mathcal{O}(B_{0}\cup B_{1}\cup\cdots\cup B_{i-1}\cup A_{i+1}\cup\cdots\cup A_{k}).

A=(A_{0},A_{1},\cdots,A_{k})\text{ and }B=(B_{0},B_{1},\cdots,B_{k}),

\mathcal{O}_{i}=\mathcal{O}(A_{0}\cup B_{1}\cup\cdots\cup B_{i}\cup A_{i+1}\cup\cdots\cup A_{k}),\quad i=0,1,\cdots,k,

\mathcal{O}_{0},\ \mathcal{O}_{1},\ \mathcal{O}_{2},\ \mathcal{O}_{3},\ \mathcal{O}_{4}

\frac{\|P_{A_{i}}(\boldsymbol{x}_{1})\|}{\|P_{B_{i}}(\boldsymbol{x}_{2})\|}<\frac{\|P_{A_{i+1}}(\boldsymbol{x}_{1})\|}{\|P_{B_{i+1}}(\boldsymbol{x}_{2})\|};

\mathcal{O}^{\prime}=\mathcal{O}(B_{0}\cup B_{1}\cup\cdots\cup B_{i-1}\cup D_{i1}\cup C_{i2}\cup A_{i+1}\cup\cdots\cup A_{k})

\frac{\|P_{C_{i1}}(\boldsymbol{x}_{1})\|}{\|P_{D_{i1}}(\boldsymbol{x}_{2})\|}\geqslant\frac{\|P_{C_{i2}}(\boldsymbol{x}_{1})\|}{\|P_{D_{i2}}(\boldsymbol{x}_{2})\|}.

\Phi(\boldsymbol{x};\boldsymbol{x}^{*})=\log_{\boldsymbol{x}^{*}}(\boldsymbol{x})+\boldsymbol{x}^{*}

\Phi(\boldsymbol{x};\boldsymbol{x}^{*})=\jmath\left(P_{B_{0}}(\boldsymbol{x}),-\frac{\|P_{B_{1}}(\boldsymbol{x})\|}{\|P_{A_{1}}(\boldsymbol{x}^{*})\|}P_{A_{1}}(\boldsymbol{x}^{*}),\cdots,-\frac{\|P_{B_{k}}(\boldsymbol{x})\|}{\|P_{A_{k}}(\boldsymbol{x}^{*})\|}P_{A_{k}}(\boldsymbol{x}^{*})\right),

v_{0}=P_{B_{0}}(\boldsymbol{x})-P_{A_{0}}(\boldsymbol{x}^{*})\in\mathbb{R}(A_{0}).

v_{i}=-\frac{\|P_{A_{i}}(\boldsymbol{x}^{*})\|+\|P_{B_{i}}(\boldsymbol{x})\|}{\|P_{A_{i}}(\boldsymbol{x}^{*})\|}P_{A_{i}}(\boldsymbol{x}^{*}).

\log_{\boldsymbol{x}^{*}}:\boldsymbol{x}\mapsto(v_{0},v_{1},\cdots,v_{k}),

\Phi(\boldsymbol{x};\boldsymbol{x}^{*})=(-x_{3},x_{2},0,0,0);

\Phi(\boldsymbol{x},\boldsymbol{x}^{*})=(-x_{3},-x_{4},0,0,0);

\Phi(\boldsymbol{x},\boldsymbol{x}^{*})=-\frac{\|\boldsymbol{x}\|}{\|\boldsymbol{x}^{*}\|}(x_{1}^{*},x_{2}^{*},0,0,0).

A^{\prime}=(A_{0},A_{1},\cdots,A_{i_{0}-1},A_{i_{0}}\cup A_{i_{0}+1},A_{i_{0}+2},\cdots,A_{k})

\mathcal{O}^{\prime\prime}=\mathcal{O}(B_{0}\cup B_{1}\cup\cdots\cup B_{i_{0}-1}\cup B_{i_{0}+1}\cup A_{i_{0}}\cup A_{i_{0}+2}\cup\cdots\cup A_{k})

A^{\prime\prime}=(A_{0},A_{1},\cdots,A_{i_{0}-1},A_{i_{0}+1},A_{i_{0}},A_{i_{0}+2},\cdots,A_{k})

\mathcal{O}=\mathcal{O}(B_{0}\cup\cdots\cup B_{i_{0}-1}\cup E_{1}\cup F_{1}\cup C_{2}\cup D_{2}\cup A_{i_{0}+2}\cup\cdots\cup A_{k})

\frac{\|P_{C_{1}\cup D_{1}}(\boldsymbol{x}^{*})\|}{\|P_{E_{1}\cup F_{1}}(\boldsymbol{x}_{2})\|}<\frac{\|P_{C_{2}\cup D_{2}}(\boldsymbol{x}^{*})\|}{\|P_{E_{2}\cup F_{2}}(\boldsymbol{x}_{2})\|}.

\frac{\|P_{C_{1}\cup D_{1}}(\boldsymbol{x}^{*})\|}{\|P_{E_{1}\cup F_{1}}(\boldsymbol{x}_{2})\|}<\frac{\|P_{A_{i_{0}}\cup A_{i_{0}+1}}(\boldsymbol{x}^{*})\|}{\|P_{B_{i_{0}}\cup B_{i_{0}+1}}(\boldsymbol{x}_{2})\|}<\frac{\|P_{C_{2}\cup D_{2}}(\boldsymbol{x}^{*})\|}{\|P_{E_{2}\cup F_{2}}(\boldsymbol{x}_{2})\|}

\frac{\|P_{C_{1}\cup D_{1}}(\boldsymbol{x}^{*})\|}{\|P_{E_{1}\cup F_{1}}(\boldsymbol{x}_{1})\|}\leqslant\frac{\|P_{A_{i_{0}}\cup A_{i_{0}+1}}(\boldsymbol{x}^{*})\|}{\|P_{B_{i_{0}}\cup B_{i_{0}+1}}(\boldsymbol{x}_{1})\|}\leqslant\frac{\|P_{C_{2}\cup D_{2}}(\boldsymbol{x}^{*})\|}{\|P_{E_{2}\cup F_{2}}(\boldsymbol{x}_{1})\|}.

Full text

The Logarithm Map, its Limits and Fréchet Means in Orthant Spaces

D. Barden, Girton College, University of Cambridge, Cambridge, CB3 0JG, UK.

H. Le, School of Mathematical Sciences, University of Nottingham, Nottingham, NG7 2RD, UK.

Abstract

The first part of the paper studies the expression for, and the properties of, the logarithm map on an orthant space, which is a simple stratified space, with the aim of analysing Fréchet means of probability measures on such a space. In the second part, we use these results to characterise Fréchet means and to derive various of their properties, including the limiting distribution of sample Fréchet means.

Keywords: Fréchet mean; limiting distribution of sample Fréchet means; logarithm map; stratified space.

AMS MSC 2010: 60B05; 60B10.

1 Introduction

Several papers have recently appeared concerning probabilistic and statistical analysis of data on certain stratified spaces (cf. [5], [2], [10], [1] and [11]). One such example is the analysis of phylogenetic trees on the BHV space introduced in [5] (cf. [9], [19], [17], [3], [12], [15] and [18]). The BHV space $\mathop{\boldsymbol{T}}_{\!m+2}\nolimits$ of metric trees with $m+2$ leaves is a stratified CAT(0)-space with each stratum being isometric with a positive Euclidean orthant that is at most $m$-dimensional. It is already clear from these preliminary results that some fundamental statistics exhibit strikingly different features from the corresponding ones on Euclidean spaces or on manifolds and that one faces significant challenges in developing novel tools to analyse them, on account of the non-trivial topological structure of these spaces. It also becomes apparent that, although the topological and geometrical properties of stratified spaces have been extensively studied and are mostly well understood, many of the properties required for probabilistic and statistical analysis of data on these spaces have not been addressed.

This paper concentrates on orthant spaces introduced in [15], a relatively simple type of stratified space but more general than the space $\mathop{\boldsymbol{T}}_{\!m+2}\nolimits$ of phylogenetic trees. The latter has $(2m+1)!!$ $m$-dimensional strata, together with their bounding strata, selected from among the $\binom{M}{m}$ positive orthants in $\mathbb{R}^{M}$, where $M=2^{m+2}-m-4$. In particular, each co-dimension one stratum bounds exactly three top-dimensional strata. Thus not only are the relevant dimensions sparse, but the percentage of the positive orthants occupied by the tree space of each dimension declines exponentially. These constraints, such as the restrictions on the dimension and the number of orthants involved in the space, no longer hold in a general orthant space, although we do have to make one restriction to ensure that it is a CAT(0)-space. We shall recall, in the next section, the concept of an orthant space, introducing the subsidiary concepts and definitions we use to describe the structure of such spaces and, in particular, of their tangent cones at the various points.
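As a quick numerical check of the counts quoted above, the following sketch tabulates $M=2^{m+2}-m-4$, the number $(2m+1)!!$ of top-dimensional strata of tree space, and the number $\binom{M}{m}$ of candidate $m$-dimensional positive orthants for small $m$. The function names are ours and purely illustrative; nothing here is taken from the paper beyond the three formulas.

```python
from math import comb


def double_factorial(n: int) -> int:
    """Return n!! = n * (n - 2) * (n - 4) * ... down to 1 or 2."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def tree_space_counts(m: int):
    """Return (M, number of top strata, number of m-subsets of the M axes)."""
    big_m = 2 ** (m + 2) - m - 4          # ambient dimension M
    top_strata = double_factorial(2 * m + 1)
    candidate_orthants = comb(big_m, m)   # all m-dimensional positive orthants
    return big_m, top_strata, candidate_orthants


# Tabulate the share of m-dimensional orthants occupied by tree space.
for m in range(1, 6):
    big_m, top, total = tree_space_counts(m)
    print(f"m={m}: M={big_m}, strata={top}, orthants={total}, share={top / total:.3g}")
```

The printed share falls quickly with $m$, which is the sparsity the paragraph refers to.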

A fundamental concept for statistical analysis of non-Euclidean data is that of the Fréchet mean, which generalises the concept of the mean of Euclidean data. A point $\mathop{\boldsymbol{x}}\nolimits_{0}$ in a metric space $\mathop{\boldsymbol{M}}\nolimits$ is a Fréchet mean of a probability measure $\mu$ on $\mathop{\boldsymbol{M}}\nolimits$ if, at $\mathop{\boldsymbol{x}}\nolimits_{0}$, the Fréchet function of $\mu$ defined by

$$\frac{1}{2}\int_{\mathop{\boldsymbol{M}}\nolimits}d(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{x}}\nolimits^{\prime})^{2}\,d\mu(\mathop{\boldsymbol{x}}\nolimits^{\prime})$$

attains its global minimum. In order to characterise and locate Fréchet means, we need to take directional derivatives of the Fréchet function and hence, implicitly, of the distance function. The latter involves the logarithm map $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ which, analogous to the inverse of the exponential map on manifolds, is the initial tangent vector to the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ . This logarithm map is globally well-defined on CAT(0)-spaces and has been studied, for example, in [14] and [16]. However, these results do not cover all the properties required for our analysis, although naturally we do rely on some of their results. On the other hand, an algorithm for finding the geodesic between any two given trees in the tree space $\mathop{\boldsymbol{T}}_{\!m+2}\nolimits$ was given in [19] and, using the analysis behind that algorithm, the expression for the logarithm map $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ was obtained in [3] when $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a top-dimensional stratum. Although this expression for $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ could be extended to more general orthant spaces, it is noted in [3] that these results are not adequate to provide a tool for analysing Fréchet means when they lie in any stratum of co-dimension at least two. The latter requires a better understanding of the behaviour of the logarithm map as the end points of the geodesics move within and between strata. To this end, we first re-examine geodesics directly from first principles in Section 3, in particular avoiding the implicit assumption that $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a top-dimensional stratum. This leads, in Section 4, to an explicit expression, given in Theorem 1, for a version of the logarithm map that we shall use, valid for any point in an orthant space. 
Since the form of this expression is determined by the carrier of the geodesic, we analyse possible changes in that carrier, focussing on the set, specified in Definition 11, of points $\mathop{\boldsymbol{x}}\nolimits$ at which significant changes occur. This allows us, in Section 5, to derive the directional limits of the logarithm map as the reference point $\mathop{\boldsymbol{x}}\nolimits^{\prime}$ approaches $\mathop{\boldsymbol{x}}\nolimits^{*}$ from a co-bounding stratum. We also study the projections of these limits, and the limits of the projections, onto the various strata related to the stratum in which $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies. This enables us to prove the existence of, and to identify, certain of their derivatives and directional derivatives.
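To make the definition of the Fréchet function concrete, here is a minimal sketch for an empirical measure (equal weights on a finite sample) on the real line with $d(x,x^{\prime})=|x-x^{\prime}|$, where the minimiser is the ordinary arithmetic mean. The names `frechet_function` and `euclid`, the sample, and the crude grid search are illustrative assumptions and not part of the paper, which works on orthant spaces rather than on $\mathbb{R}$.

```python
def frechet_function(x, sample, dist):
    """Empirical Frechet function: half the mean squared distance from x."""
    return 0.5 * sum(dist(x, p) ** 2 for p in sample) / len(sample)


def euclid(a, b):
    """Distance on the real line (stand-in for a general metric d)."""
    return abs(a - b)


sample = [0.0, 1.0, 5.0]
mean = sum(sample) / len(sample)

# Crude grid search for the minimiser on [0, 5] with step 0.01.
grid = [i / 100 for i in range(0, 501)]
best = min(grid, key=lambda x: frechet_function(x, sample, euclid))

print(mean, best)
```

Passing a different `dist` function is all that is needed to evaluate the same quantity in another metric space.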

With this understanding of the logarithm map, the second part of the paper turns its attention to the analysis of Fréchet means. In Section 6 we obtain, in Theorem 3, the necessary and sufficient conditions for a point $\mathop{\boldsymbol{x}}\nolimits^{*}$ to be the Fréchet mean of a probability measure on the orthant space $\mathop{\boldsymbol{X}}\nolimits^{m}$ . Two special sets arise in this analysis. Firstly, one of the criteria in Theorem 3 involves an inequality and the set, specified in Definition 12, of vectors in the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at the Fréchet mean for which that is an equality is significant. Secondly, there is the set given by Definition 13. This is related to a limit of the logarithm map and, in a certain sense, encapsulates the ‘departure’ of this limit from the analogous behaviour of the logarithm map on a Euclidean space. Both of these sets are related to the limiting distribution of sample Fréchet means, which we establish in the final Section 7. There, in particular, we relate the limiting distribution with Euclidean Gaussian random variables. The covariance matrices of these random variables are related to the derivative of the projection of the logarithm map and to projections of the limits of the logarithm map.

Although we do not make it explicit, in view of our previous results for $\mathop{\boldsymbol{T}}_{\!m+2}\nolimits$ and the comments in [15], our interest in this paper is primarily in the case that $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a stratum of local co-dimension at least two. The results, when restricted to a locally top-dimensional or co-dimension one stratum, do generalise those for tree spaces in [3] although the approach here is necessarily more complex in order to encompass all cases.

2 Orthant spaces

Throughout this paper, we shall use the term ‘positive’ to mean strictly positive. By an open positive orthant in the Euclidean space $\mathbb{R}^{M}$ we shall mean, for some subset $E=\left(u^{\phantom{A}}_{l_{1}},\cdots,u^{\phantom{A}}_{l_{m}}\right)$ of the standard ordered orthonormal basis $U=\left(u_{1},\cdots,u^{\phantom{A}}_{M}\right)$ of $\mathbb{R}^{M}$, the relatively open set

$$\mathop{\mathcal{O}}\nolimits(E)=\left\{\sum_{i=1}^{m}\lambda_{i}u^{\phantom{A}}_{l_{i}}\;\middle|\;\lambda_{i}>0\right\}.$$

We denote by $\mathbb{R}(E)$ the subspace spanned by $E$ , and we shall refer to the $u^{\phantom{A}}_{l_{i}}\in E$ as the axes of $\mathbb{R}(E)$ or of $\mathop{\mathcal{O}}\nolimits(E)$ . Then, an orthant space is a union of open positive orthants in a common Euclidean space with certain natural constraints, as specified in the following definition, that ensure, for example, that such spaces are also CAT(0). Orthant spaces were first introduced in [15] as a generalisation of the tree spaces of [5].
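A small sketch of how a point with non-negative coordinates determines the open positive orthant that contains it: the axes of that orthant are exactly the basis vectors with strictly positive coefficient. The helpers `support` and `in_open_orthant` are hypothetical names chosen for this illustration, and axes are labelled by their 1-based index in $U$.

```python
def support(x):
    """Return the 1-based indices i with x_i > 0.

    The point x then lies in the open positive orthant O(E) whose axes
    are the basis vectors u_i with i in the returned tuple.
    """
    if any(c < 0 for c in x):
        raise ValueError("coordinates must be non-negative")
    return tuple(i for i, c in enumerate(x, start=1) if c > 0)


def in_open_orthant(x, axes):
    """True if x lies in the open orthant spanned by the given axes."""
    return support(x) == tuple(sorted(axes))


point = (0.0, 2.5, 0.0, 1.0)
print(support(point))                  # axes u_2 and u_4
print(in_open_orthant(point, (4, 2)))  # order of the axes does not matter
```

A point on the boundary of an orthant, such as `(0.0, 2.5, 0.0, 0.0)`, belongs to a lower-dimensional orthant and not to the open one it bounds.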

Definition 1.

For two given integers $M\geqslant m$, an orthant space $\mathop{\boldsymbol{X}}\nolimits^{m}$ of dimension $m$ is a subspace of the Euclidean space $\mathbb{R}^{M}$ that is a union of open positive orthants, whose maximum dimension is $m$, and has the intrinsic metric induced from the Euclidean metric on $\mathbb{R}^{M}$. It satisfies the following conditions:

$(i)$ for every orthant $\sigma$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$, the orthants in the closure $\overline{\sigma}$ of $\sigma$ are also included in $\mathop{\boldsymbol{X}}\nolimits^{m}$;

$(ii)$ if, for any positive orthant $\sigma$ in $\mathbb{R}^{M}$, all the $2$-dimensional orthants in its closure are in $\mathop{\boldsymbol{X}}\nolimits^{m}$, then $\sigma$ itself is in $\mathop{\boldsymbol{X}}\nolimits^{m}$.

The intrinsic metric on $\mathop{\boldsymbol{X}}\nolimits^{m}$ is the length metric as defined in [6]. It is the metric $d$ for which, for any two points $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the distance $d(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})$ is the infimum of the lengths of piecewise linear paths in $\mathop{\boldsymbol{X}}\nolimits^{m}$ joining $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ . In particular, a geodesic will also be piecewise linear and linear within each stratum.
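To see how the intrinsic metric can differ from the ambient Euclidean one, consider the simplest case $m=1$: the union of the $M$ positive coordinate rays together with the origin. Any path between points on different rays must pass through the origin, so the intrinsic distance is the sum of the two norms. The function `spider_distance` below is a sketch for this special case only, under that assumption; it is not the general geodesic algorithm of the paper.

```python
import math


def spider_distance(x, y):
    """Intrinsic distance on the union of the non-negative coordinate rays.

    Each point must have at most one positive coordinate (it lies on one
    ray, or is the origin).
    """
    for p in (x, y):
        if any(c < 0 for c in p) or sum(1 for c in p if c > 0) > 1:
            raise ValueError("each point must lie on a single non-negative axis")
    ray_x = [i for i, c in enumerate(x) if c > 0]
    ray_y = [i for i, c in enumerate(y) if c > 0]
    if not ray_x or not ray_y or ray_x == ray_y:
        # Same ray, or one point is the origin: a single linear segment.
        return abs(sum(x) - sum(y))
    # Different rays: the path is two segments meeting at the origin.
    return sum(x) + sum(y)


a, b = (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)
print(spider_distance(a, b))  # intrinsic distance through the origin
print(math.dist(a, b))        # straight-line distance in R^3, not attained in the space
```

The shortest path here is piecewise linear, with one linear piece in each ray, as the paragraph above describes.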

Note that there is no loss of generality in restricting $\mathop{\boldsymbol{X}}\nolimits^{m}$ to contain only positive orthants: given two orthants that differ only in having positive or negative coordinates with respect to one particular axis, the intrinsic metric will be the same as it would be if we replace, say the negative axis, by an axis orthogonal to $\mathbb{R}^{M}$ . Thus, rather than considering $\mathop{\boldsymbol{X}}\nolimits^{m}$ to be a union of arbitrary orthants in $\mathbb{R}^{M}$ , we could consider it to be a union of positive orthants in $\mathbb{R}^{2M}$ . Henceforth, we shall assume all our orthants to be open and positive, mentioning their closure explicitly where that is relevant.
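The replacement of signed axes by additional positive ones can be sketched as a coordinate map from $\mathbb{R}^{M}$ to $\mathbb{R}^{2M}$ that sends each coordinate to its positive and negative parts. The helper `to_positive` is a hypothetical illustration of this idea; it preserves Euclidean distances between points lying in the same closed orthant of $\mathbb{R}^{M}$, which is what the intrinsic metric is built from.

```python
import math


def to_positive(x):
    """Map x in R^M to R^(2M): positive parts first, then negative parts."""
    positive_parts = tuple(max(c, 0.0) for c in x)
    negative_parts = tuple(max(-c, 0.0) for c in x)
    return positive_parts + negative_parts


# Two points in the same orthant of R^2 (first coordinate > 0, second < 0).
p, q = (1.0, -2.0), (3.0, -1.0)
print(to_positive(p), to_positive(q))
print(math.dist(p, q), math.dist(to_positive(p), to_positive(q)))
```

Every image point has non-negative coordinates, so an arbitrary orthant of $\mathbb{R}^{M}$ is carried onto a positive orthant of $\mathbb{R}^{2M}$.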

The first condition in the above definition correlates with the constraints used in the definition for orthant space in [15] and the second one restricts attention to the ‘non-positively curved’ orthant spaces in [15] (Proposition 6.10). These two conditions were first used by the authors of [5] to ensure the CAT(0)-property for tree spaces.
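The two conditions of Definition 1 are combinatorial and can be checked directly on a finite list of orthants, each recorded as its set of axis indices. The sketch below is a brute-force check for small $M$; the name `check_conditions` and the encoding of orthants as tuples of 1-based indices are assumptions of this illustration. Condition $(ii)$ is tested only for axis sets of size at least three, since for a $2$-dimensional orthant it holds trivially.

```python
from itertools import combinations


def check_conditions(orthants, M):
    """Return (closed, flag) for a collection of orthants in R^M.

    closed: every face of a listed orthant is listed   (condition (i)).
    flag:   any axis set of size >= 3 all of whose pairs are listed
            is itself listed                            (condition (ii)).
    """
    X = {frozenset(s) for s in orthants}
    closed = all(
        frozenset(face) in X
        for s in X
        for r in range(len(s) + 1)
        for face in combinations(s, r)
    )
    flag = all(
        frozenset(cand) in X
        for r in range(3, M + 1)
        for cand in combinations(range(1, M + 1), r)
        if all(frozenset(pair) in X for pair in combinations(cand, 2))
    )
    return closed, flag


# Three quadrants forming the boundary of the positive octant in R^3.
hollow = [(), (1,), (2,), (3,), (1, 2), (2, 3), (1, 3)]
print(check_conditions(hollow, 3))                 # closed, but not flag
print(check_conditions(hollow + [(1, 2, 3)], 3))   # filling in the octant fixes it
```

The hollow octant boundary is the standard example excluded by condition $(ii)$: all three quadrants are present but the octant they bound is not.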

Throughout the rest of the paper, $\mathop{\boldsymbol{X}}\nolimits^{m}$ will denote an orthant space of fixed dimension $m$ viewed as comprising strata that are orthants of a fixed Euclidean space $\mathbb{R}^{M}$ , where $M$ is not necessarily $2^{m+2}-m-4$ as it would be for tree space. Also, whenever we specify an orthant by a union of subsets of the standard orthonormal basis $U$ of $\mathbb{R}^{M}$ , that will always be intended as a union of mutually disjoint subsets.

The orthant space $\mathop{\boldsymbol{X}}\nolimits^{m}$ so defined is a Whitney stratified set in the sense of Thom, [21], the strata being the various orthants that comprise $\mathop{\boldsymbol{X}}\nolimits^{m}$ . Note that, since $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a union of orthants in a fixed Euclidean space $\mathbb{R}^{M}$ , the number of strata in $\mathop{\boldsymbol{X}}\nolimits^{m}$ is always finite. $\mathop{\boldsymbol{X}}\nolimits^{m}$ has the structure of a cone with vertex, or ‘cone point’, the origin $o$ in $\mathbb{R}^{M}$ , since each orthant is such a cone without its vertex, but that vertex, the origin, is necessarily included in $\mathop{\boldsymbol{X}}\nolimits^{m}$ . In particular, $\{o\}$ is the unique zero-dimensional stratum in $\mathop{\boldsymbol{X}}\nolimits^{m}$ . Note however that our relatively open strata differ from those in [6].

The CAT(0)-property of the orthant space $\mathop{\boldsymbol{X}}\nolimits^{m}$ results as follows, where all the references are to [6]. The intersection $L$ of $\mathop{\boldsymbol{X}}\nolimits^{m}$ with the unit sphere in $\mathbb{R}^{M}$ is a simplicial complex on account of condition $(i)$ and, since the axes in $\mathbb{R}^{M}$ are orthogonal, it is an ‘all-right spherical complex’ (Section 7A.10) which, on account of condition $(ii)$, is a ‘flag complex’. Then, by a theorem of Gromov (Theorem 5.18), $L$ is a CAT(1)-space. The metric on $\mathop{\boldsymbol{X}}\nolimits^{m}$ implied by describing it as the $0$-cone over $L$ (Definition 5.6) is the intrinsic metric so that, by the theorem of Berestowski (Theorem 3.14), $\mathop{\boldsymbol{X}}\nolimits^{m}$ is CAT(0).

In particular, by the Cartan-Hadamard theorem (cf. [6], p.193), there is a unique geodesic between any two points of the orthant space $\mathop{\boldsymbol{X}}\nolimits^{m}$ . It follows that each stratum is totally geodesic in the strong sense that, if a geodesic contains two points of a stratum, it must include the entire linear segment in that stratum determined by those two points. On the other hand, although the distance metric for the CAT(0)-structure is induced from the Euclidean metric, the angles along and between curves may differ for the two contexts. For example, a geodesic, defined as a shortest path between its endpoints in either context, will be a piecewise linear curve in $\mathbb{R}^{M}$ , linear in each stratum, with angle $\pi/2$ in the Euclidean subspace metric where it changes stratum. However, for the CAT(0)-structure, that angle is defined to be $\pi$ .

The properties of an orthant space are largely determined by the incidence relations between its various strata. The following definitions capture two such relationships that will be used frequently in the paper.

Definition 2.

For subsets $E$ and $F$ of the standard orthonormal basis $U=(u_{1},\cdots,u^{\phantom{A}}_{M})$ of $\mathbb{R}^{M}$ , if $E\subseteq F$ , then the orthant ${\cal O}(E)$ is said to bound ${\cal O}(F)$ and ${\cal O}(F)$ to co-bound ${\cal O}(E)$ .

Note that, unlike the case for tree spaces, strata of lower dimension than $m$ need not bound any higher dimensional strata; in particular, they need not bound $m$-dimensional strata.

Definition 3.

An orthant $\sigma$ of dimension $k$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ is said to have co-dimension $m-k$ and, if $m^{\prime}(\leqslant m)$ is the maximum dimension of orthants that co-bound $\sigma$, then $\sigma$ is said to have local co-dimension $m^{\prime}-k$.
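The difference between co-dimension and local co-dimension can be illustrated on a small example, reading $m^{\prime}$ as the largest dimension among the orthants whose axis set contains that of $\sigma$ (with $\sigma$ itself allowed). The helpers `faces` and `codimensions` are hypothetical, and orthants are again encoded by their axis indices.

```python
from itertools import combinations


def faces(axes):
    """All orthants in the closure of O(axes), including the origin ()."""
    return {frozenset(f) for r in range(len(axes) + 1) for f in combinations(axes, r)}


def codimensions(sigma, orthants):
    """Return (co-dimension, local co-dimension) of the orthant sigma."""
    X = {frozenset(s) for s in orthants}
    sigma = frozenset(sigma)
    m = max(len(s) for s in X)                        # dimension of the space
    m_local = max(len(t) for t in X if sigma <= t)    # largest orthant containing sigma's axes
    return m - len(sigma), m_local - len(sigma)


# A 3-dimensional octant on axes 1, 2, 3 plus an isolated ray along axis 4.
X = faces((1, 2, 3)) | faces((4,))
print(codimensions((4,), X))    # isolated ray: locally top-dimensional
print(codimensions((1,), X))    # ray in the boundary of the octant
```

The isolated ray has co-dimension two in the whole space but local co-dimension zero, which is the situation the remark before Definition 3 allows for.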

The tangent cone

It is natural for our purposes to follow [6] and to define the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at a point $\mathop{\boldsymbol{x}}\nolimits$ to consist of all initial tangent vectors to smooth curves starting from $\mathop{\boldsymbol{x}}\nolimits$ , the smoothness possibly only being one-sided at $\mathop{\boldsymbol{x}}\nolimits$ . Note, however, that this is not the same as the generalised tangent space of [8]. To describe the tangent cone in more detail we work in $\mathbb{R}^{M}$ . Then, when $\mathop{\boldsymbol{x}}\nolimits$ lies in a top-dimensional, or locally top-dimensional, stratum $\sigma$ of dimension $m^{\prime}(\leqslant m)$ , the orthant space $\mathop{\boldsymbol{X}}\nolimits^{m}$ is locally an $m^{\prime}$ -dimensional manifold so that a smooth curve can be extended on both sides of $\mathop{\boldsymbol{x}}\nolimits$ . Thus, the tangent cone will be the usual tangent space, a subspace of $\mathbb{R}^{M}$ isometric with $\mathbb{R}^{m^{\prime}}$ and tangent to $\sigma$ . However, if $\mathop{\boldsymbol{x}}\nolimits$ lies in a stratum of locally positive co-dimension, then the orthant space $\mathop{\boldsymbol{X}}\nolimits^{m}$ is no longer locally a manifold. Consequently, the tangent cone at $\mathop{\boldsymbol{x}}\nolimits$ is no longer a Euclidean space. For example, if the stratum $\sigma$ has co-dimension one and bounds top-dimensional strata, the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits$ is an open book: it has a closed half space $\mathbb{H}^{m}$ for each top-dimensional stratum $\tau$ co-bounding $\sigma$ , with all the boundary $(m-1)$ -dimensional faces identified with each other and with the tangent space to $\sigma$ at $\mathop{\boldsymbol{x}}\nolimits$ .

More generally, the tangent cone at a point $\mathop{\boldsymbol{x}}\nolimits$, in a stratum $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ of co-dimension $l(\geqslant 1)$, has a topology and stratification imitating that of $\mathop{\boldsymbol{X}}\nolimits^{m}$ itself in the neighbourhood of $\mathop{\boldsymbol{x}}\nolimits$: for each stratum $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ of co-dimension $l^{\prime}<l$ that co-bounds $\sigma$, so that $F$ comprises the basis vectors that have positive coordinates in $\tau$ but zero coordinates in $\sigma$, there is the closed stratum $\mathbb{R}(E)\times\overline{\mathop{\mathcal{O}}\nolimits(F)}$ in the tangent cone. Then, the tangent cone at $\mathop{\boldsymbol{x}}\nolimits$ has its stratification determined by identifying the various $\mathbb{R}(E)\times\{\bf 0\}$ with each other as well as identifying any tangent axes shared by pairs of strata that co-bound $\sigma$. In particular, when no strata co-bound $\sigma$, the tangent cone is simply the Euclidean space $\mathbb{R}(E)$.
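The strata of the tangent cone at a point of $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ are indexed by the sets $F$ of extra axes of the strata that strictly co-bound $\sigma$. A minimal sketch, assuming orthants are encoded by their axis indices and using the hypothetical helper names `faces` and `cobounding_directions`, lists these sets for an open book with three pages.

```python
from itertools import combinations


def faces(axes):
    """All orthants in the closure of O(axes), including the origin ()."""
    return {frozenset(f) for r in range(len(axes) + 1) for f in combinations(axes, r)}


def cobounding_directions(E, orthants):
    """Return the sets F such that O(E ∪ F) is a stratum strictly co-bounding O(E).

    Each F labels one closed stratum R(E) x closure(O(F)) of the tangent cone.
    """
    E = frozenset(E)
    X = {frozenset(s) for s in orthants}
    return sorted(tuple(sorted(t - E)) for t in X if E < t)


# Three quadrants sharing the ray along axis 1.
X = faces((1, 2)) | faces((1, 3)) | faces((1, 4))
print(cobounding_directions((1,), X))    # one extra axis per page of the open book
print(cobounding_directions((1, 2), X))  # top-dimensional: nothing co-bounds it
```

At a point of the shared ray the tangent cone therefore consists of three closed half-planes glued along the line $\mathbb{R}(E)$, while at a point of a quadrant it is the ordinary tangent plane.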

Definition 4.

Let $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ and $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ be two strata in $\mathop{\boldsymbol{X}}\nolimits^{m}$ with co-dimensions $l$ and $l^{\prime}<l$, respectively. The component $\mathbb{R}(E)$ common to all the strata in the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits\in\sigma$ is referred to as the tangent space to $\sigma$ at $\mathop{\boldsymbol{x}}\nolimits$. Vectors in the (open) stratum $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ of the tangent cone at $\mathop{\boldsymbol{x}}\nolimits\in\sigma$ with non-zero second component are referred to as vectors tangent to $\tau$ at $\mathop{\boldsymbol{x}}\nolimits$.

The set of unit vectors in $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ is denoted by $\mathop{\mathcal{S}}\nolimits^{m-l^{\prime}}_{\tau,\sigma}$ and the subset of those in $\{{\bf 0}\}\times\mathop{\mathcal{O}}\nolimits(F)$ by $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ .

The sets $\mathop{\mathcal{S}}\nolimits^{m-l^{\prime}}_{\tau,\sigma}$ and $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ are open spherical segments of dimensions $m-l^{\prime}-1$ and $l-l^{\prime}-1$ respectively, the latter lying in the space $\mathbb{R}(F)$ orthogonal to $\mathbb{R}(E)$ .

Note that the basis vectors in $E$ do not generally precede those of $F$ in the standard ordered basis $U$ , and so writing the stratum as $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ implies an appropriate permutation of the coordinates.

Definition 5.

For any subset $E$ of the standard ordered orthonormal basis $U$ of $\mathbb{R}^{M}$ , where $E$ does not necessarily inherit its order from $U$ , we denote by $\jmath:\mathbb{R}(E)\rightarrow\mathbb{R}^{M}$ the linear transformation permuting coordinates and positioning them appropriately as coordinates, with respect to $U$ , of a vector in $\mathbb{R}^{M}$ .

We are mainly interested in the restriction of $\jmath$ to subspaces of $\mathbb{R}(E)$ . For example, if $E=(u_{1},u_{4})$ and $F=(u_{2},u_{6})$ , then a point $(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{y}}\nolimits)$ in $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ with coordinates $((x_{1},x_{2}),(y_{1},y_{2}))$ would have $\jmath(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{y}}\nolimits)=(x_{1},y_{1},0,x_{2},0,y_{2},0,\cdots,0)$ in $\mathbb{R}^{M}$ .
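The coordinate placement performed by $\jmath$ in the example above can be sketched as follows. The function name `jmath`, the representation of each block as a pair (axis indices, coordinates), and the choice $M=7$ are assumptions made for this illustration.

```python
def jmath(blocks, M):
    """Place block coordinates at their positions in R^M.

    blocks: list of (axes, coords) pairs, where axes are 1-based indices
            into the standard basis U and coords are the matching values.
    """
    out = [0.0] * M
    for axes, coords in blocks:
        for axis, value in zip(axes, coords):
            out[axis - 1] = value
    return tuple(out)


E, F = (1, 4), (2, 6)          # E = (u_1, u_4), F = (u_2, u_6)
x, y = (10.0, 20.0), (1.0, 2.0)
print(jmath([(E, x), (F, y)], M=7))   # (x_1, y_1, 0, x_2, 0, y_2, 0)
```

The output places $x_{1},x_{2}$ in positions 1 and 4 and $y_{1},y_{2}$ in positions 2 and 6, matching the pattern $(x_{1},y_{1},0,x_{2},0,y_{2},0,\cdots,0)$ given in the text.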

Inherited from the CAT(0)-structure of $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits$ , since it is metrically complete, also has a CAT(0)-structure (cf. [6], Theorem 3.19). While the CAT(0)-metric on $\mathop{\boldsymbol{X}}\nolimits^{m}$ is, by definition, the intrinsic metric, the CAT(0)-metric on the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits$ is defined in terms of the Alexandrov angle. Recall that, for any three points $\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2}$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the comparison triangle of the geodesic triangle $\Delta(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ formed by $\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2}$ is the triangle $\bar{\Delta}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})$ in the Euclidean plane with vertices $\bar{\mathop{\boldsymbol{x}}\nolimits}$ , $\bar{\mathop{\boldsymbol{x}}\nolimits}_{1}$ , $\bar{\mathop{\boldsymbol{x}}\nolimits}_{2}$ such that the Euclidean distances $d(\bar{\mathop{\boldsymbol{x}}\nolimits},\bar{\mathop{\boldsymbol{x}}\nolimits}_{1})$ etc. match the intrinsic distances $d(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{x}}\nolimits_{1})$ etc. in $\mathop{\boldsymbol{X}}\nolimits^{m}$ . Then, the Alexandrov angle $\angle_{\mathop{\boldsymbol{x}}\nolimits}(\gamma_{1},\gamma_{2})$ between the geodesics $\gamma_{1}$ and $\gamma_{2}$ starting from $\mathop{\boldsymbol{x}}\nolimits$ is defined to be

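$$\angle_{\mathop{\boldsymbol{x}}\nolimits}(\gamma_{1},\gamma_{2})=\lim_{t\to 0}\overline{\angle}_{\mathop{\boldsymbol{x}}\nolimits}(\gamma_{1}(t),\gamma_{2}(t)),$$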

where $\overline{\angle}_{\mathop{\boldsymbol{x}}\nolimits}(\gamma_{1}(t),\gamma_{2}(t))$ is the Euclidean angle at $\bar{\mathop{\boldsymbol{x}}}\nolimits$ of the comparison Euclidean triangle $\bar{\Delta}(\mathop{\boldsymbol{x}}\nolimits,\gamma_{1}(t),\gamma_{2}(t))$ (cf. [6], Section 1.12). Note that, since geodesics in $\mathop{\boldsymbol{X}}\nolimits^{m}$ are piecewise linear, the above limit is well-defined. Then, the inner product on the tangent cone of $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits$ is defined by

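$$\ll\mathop{\boldsymbol{w}}\nolimits_{1},\mathop{\boldsymbol{w}}\nolimits_{2}\gg\,=\|\mathop{\boldsymbol{w}}\nolimits_{1}\|\,\|\mathop{\boldsymbol{w}}\nolimits_{2}\|\cos\angle_{\mathop{\boldsymbol{x}}\nolimits}(\gamma_{1},\gamma_{2}),$$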

where $\mathop{\boldsymbol{w}}\nolimits_{1}$ and $\mathop{\boldsymbol{w}}\nolimits_{2}$ are the initial tangent vectors of $\gamma_{1}$ and $\gamma_{2}$ . By analogy with vectors in the tangent space to a manifold, the distance $\rho_{\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits_{1},\mathop{\boldsymbol{w}}\nolimits_{2})$ between vectors $\mathop{\boldsymbol{w}}\nolimits_{1}$ and $\mathop{\boldsymbol{w}}\nolimits_{2}$ in the tangent cone at $\mathop{\boldsymbol{x}}\nolimits$ is defined to be

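$$\rho_{\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits_{1},\mathop{\boldsymbol{w}}\nolimits_{2})=\left\{\|\mathop{\boldsymbol{w}}\nolimits_{1}\|^{2}+\|\mathop{\boldsymbol{w}}\nolimits_{2}\|^{2}-2\ll\mathop{\boldsymbol{w}}\nolimits_{1},\mathop{\boldsymbol{w}}\nolimits_{2}\gg\right\}^{1/2}$$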

(cf. [16], p144). Note that, although in general $\ll\,\,,\,\,\gg$ differs from the usual Euclidean inner product $\langle\,\,,\,\,\rangle$ , a geodesic triangle contained in the closure of a stratum of $\mathop{\boldsymbol{X}}\nolimits^{m}$ is in fact a Euclidean geodesic triangle and its angles are the Euclidean ones. In particular, $\ll\mathop{\boldsymbol{w}}\nolimits_{1},\mathop{\boldsymbol{w}}\nolimits_{2}\gg=\langle\mathop{\boldsymbol{w}}\nolimits_{1},\mathop{\boldsymbol{w}}\nolimits_{2}\rangle$ for any $\mathop{\boldsymbol{w}}\nolimits_{1},\mathop{\boldsymbol{w}}\nolimits_{2}$ in the closure of $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ and then $\rho_{\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits_{1},\mathop{\boldsymbol{w}}\nolimits_{2})=\|\mathop{\boldsymbol{w}}\nolimits_{1}-\mathop{\boldsymbol{w}}\nolimits_{2}\|$ .
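The comparison angle and the cone distance $\rho_{\mathop{\boldsymbol{x}}\nolimits}$ are both instances of the Euclidean law of cosines, which the following minimal Python sketch makes explicit. The function names `comparison_angle` and `rho` are hypothetical, and the inputs are side lengths and an angle rather than points of $\mathop{\boldsymbol{X}}\nolimits^{m}$, so the sketch does not compute geodesics.

```python
import math


def comparison_angle(a: float, b: float, c: float) -> float:
    """Euclidean angle between the sides of lengths a and b of a triangle
    whose third side has length c (law of cosines)."""
    cos_angle = (a * a + b * b - c * c) / (2.0 * a * b)
    # Clamp to [-1, 1] to guard against rounding error.
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def rho(norm1: float, norm2: float, angle: float) -> float:
    """Distance between two tangent-cone vectors with the given norms and Alexandrov angle."""
    squared = norm1 ** 2 + norm2 ** 2 - 2.0 * norm1 * norm2 * math.cos(angle)
    return math.sqrt(max(0.0, squared))


# Two vectors in the closure of one stratum, e.g. (3, 0) and (0, 4):
# the triangle is Euclidean, so the angle is a right angle and rho is 5.
angle = comparison_angle(3.0, 4.0, 5.0)
same_stratum_distance = rho(3.0, 4.0, angle)

# When the angle is pi, rho is the sum of the norms.
opposite_distance = rho(1.0, 2.0, math.pi)
```

In the first case `rho` agrees with $\|\mathop{\boldsymbol{w}}\nolimits_{1}-\mathop{\boldsymbol{w}}\nolimits_{2}\|$ , as stated above for vectors in the closure of a single stratum.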

3 The carriers and supports of geodesics

In order to analyse the logarithm map, we first need to understand the geodesics. The intersection of a geodesic with a stratum, a Euclidean orthant, will be either a single point or a complete intersection of a Euclidean line with that orthant.

Definition 6.

The carrier of a geodesic is the sequence of strata each of whose intersection with the geodesic is a Euclidean line segment of positive length.

This is essentially the terminology that was introduced in [22] in the context of tree spaces. The case of a single point intersection arises between successive strata of the carrier: between the (open) linear segment in one stratum and that in the next, there will be one point in the common bounding stratum of those two strata. This intermediate stratum is not listed in the carrier; it is in fact specified by the adjacent strata as the stratum of highest dimension in the intersection of their closures. Similarly, when a geodesic starts, or ends, in a stratum of positive co-dimension and does not remain in that stratum, but passes immediately to a co-bounding stratum, then the latter will be the first, or last, stratum in the carrier. In such a situation, we shall regard the point in the bounding stratum as having the same set of axes as the co-bounding stratum, albeit with the relevant coordinates zero. That is, we regard it as a point of the closure of the co-bounding stratum.

To describe the carrier of the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ in more detail, as well as for later analysis, we require the following terminology.

Definition 7.

$(i)$ The subsets $E$ and $F$ of $U$ are said to be compatible in the orthant space $\mathop{\boldsymbol{X}}\nolimits^{m}$ if the orthant ${\cal O}(E\cup F)$ is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ .

$(ii)$ For a subset $E$ of the standard orthonormal basis $U$ of $\mathbb{R}^{M}$ , we denote the number of vectors in $E$ by $|E|$ .

We first identify the set of axes common to the points along a geodesic where, for any $\mathop{\boldsymbol{x}}\nolimits\in\mathop{\boldsymbol{X}}\nolimits^{m}$ , $E(\mathop{\boldsymbol{x}}\nolimits)$ denotes the set of axes in $U$ with respect to which $\mathop{\boldsymbol{x}}\nolimits$ has positive coordinates.

Proposition 1.

For any $\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2}\in\mathop{\boldsymbol{X}}\nolimits^{m}$ , the set $E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})$ , defined by

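$$\begin{array}{rcl}E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})&=&\left\{E(\mathop{\boldsymbol{x}}\nolimits_{1})\cap E(\mathop{\boldsymbol{x}}\nolimits_{2})\right\}\\ &&\bigcup\,\{e\in E(\mathop{\boldsymbol{x}}\nolimits_{1})\mid e\hbox{ is compatible with }E(\mathop{\boldsymbol{x}}\nolimits_{2})\}\\ &&\bigcup\,\{e\in E(\mathop{\boldsymbol{x}}\nolimits_{2})\mid e\hbox{ is compatible with }E(\mathop{\boldsymbol{x}}\nolimits_{1})\},\end{array}$$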

forms the set of axes common to all strata along the geodesic between $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ .

Proof.

Observe that, for a geodesic, each coordinate function must be linearly interpolated between any two values that are non-zero. It follows that once a particular coordinate, having been positive along the geodesic, becomes zero it must remain so or, having started at zero, once it becomes positive, it must continue monotonically to its final value. In particular, the only basis vectors that can occur with positive coordinate at any point along the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ are those that belong to $E(\mathop{\boldsymbol{x}}\nolimits_{1})$ or $E(\mathop{\boldsymbol{x}}\nolimits_{2})$ or to both. Moreover, when the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ passes immediately to a co-bounding stratum, each new axis in that stratum must have coordinate zero at $\mathop{\boldsymbol{x}}\nolimits_{1}$ , increasing linearly along the geodesic to its value at $\mathop{\boldsymbol{x}}\nolimits_{2}$ . Any such additional axis $e$ of the co-bounding stratum is in $E(\mathop{\boldsymbol{x}}\nolimits_{2})$ and is compatible with all the axes in $E(\mathop{\boldsymbol{x}}\nolimits_{1})$ ; and any such $e$ must occur in this way. Thus, the set of axes common to all strata along the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ is precisely the given set $E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})$ . ∎

Note that, at one extreme, if $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ both lie in the closure of an orthant $\mathop{\mathcal{O}}\nolimits(E)$ and not both in the same boundary component, then $E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})=E$ . At the other extreme, if $\overline{\mathop{\mathcal{O}}\nolimits(E(\mathop{\boldsymbol{x}}\nolimits_{1}))}\cap\overline{\mathop{\mathcal{O}}\nolimits(E(\mathop{\boldsymbol{x}}\nolimits_{2}))}=\emptyset$ , then $E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})=\emptyset$ . In general, $E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})$ depends only on the orthants in which $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ lie, and is independent of their positions in those orthants.
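The set $E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})$ of Proposition 1 depends only on the positive axes of the two points and on compatibility, so it is easy to compute. The Python sketch below assumes that the orthant space is described by a list of its maximal orthants, each a set of 1-based axis indices, and that $\mathop{\mathcal{O}}\nolimits(S)$ is contained in the space exactly when $S$ is a subset of one of them. The names `in_space`, `positive_axes` and `common_axes`, and the five-quadrant space used as data, are hypothetical.

```python
def in_space(axes, maximal):
    """True if the orthant spanned by `axes` lies in the space (assumed criterion)."""
    return any(axes <= top for top in maximal)


def positive_axes(x):
    """E(x): the 1-based indices of the strictly positive coordinates of x."""
    return frozenset(i + 1 for i, value in enumerate(x) if value > 0)


def common_axes(x1, x2, maximal):
    """E(x1, x2) as in Proposition 1."""
    e1, e2 = positive_axes(x1), positive_axes(x2)
    shared = e1 & e2
    from_1 = {e for e in e1 if in_space(e2 | {e}, maximal)}  # e compatible with E(x2)
    from_2 = {e for e in e2 if in_space(e1 | {e}, maximal)}  # e compatible with E(x1)
    return frozenset(shared | from_1 | from_2)


# Hypothetical space: five quadrants glued cyclically along their axes.
quadrants = [frozenset(q) for q in ({1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 1})]

adjacent = common_axes((3, 4, 0, 0, 0), (0, 2, 1, 0, 0), quadrants)      # {2}
disjoint = common_axes((3, 4, 0, 0, 0), (0, 0, 1, 2, 0), quadrants)      # empty
same_closure = common_axes((3, 0, 0, 0, 0), (0, 4, 0, 0, 0), quadrants)  # {1, 2}
```

The last call illustrates the first extreme noted above: two points on different boundary components of $\mathop{\mathcal{O}}\nolimits(u_{1},u_{2})$ give $E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})=\{u_{1},u_{2}\}$ .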

The number $k+1$ of orthants in the carrier $\mathcal{C}=(\mathcal{O}_{0},\mathcal{O}_{1},\cdots,\mathcal{O}_{k})$ of the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ will, naturally, depend on both $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ . If $\mathop{\boldsymbol{x}}\nolimits_{1}$ lies in a top dimensional stratum it will have $m$ strictly positive coordinates, all of which, assuming that none are also positive in $\mathop{\boldsymbol{x}}\nolimits_{2}$ , must become zero somewhere along the geodesic and at least one must become zero on each change of stratum as they cannot vanish within a stratum of the carrier. Thus, there will be $m+1$ strata in the carrier, that is $k=m$ , if and only if they vanish one at a time. So, $k<m$ if and only if somewhere along the geodesic at least two coordinates become zero on passing from $\mathcal{O}_{i}$ to $\mathcal{O}_{i+1}$ . When $|E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})|=k_{0}$ , the maximum value of $k$ would now be $k^{\prime}=m-k_{0}$ . Similarly, if $\mathop{\boldsymbol{x}}\nolimits_{1}$ were in a stratum of dimension $m_{0}$ , this maximum would be $m_{0}-k_{0}$ .

From now on, for given $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ , we shall denote the set $E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})$ by both $A_{0}$ and $B_{0}$ to accord with the following notation. It follows from Proposition 1 that each member of the sequence of strata $\mathcal{C}=(\mathcal{O}_{0},\mathcal{O}_{1},\cdots,\mathcal{O}_{k})$ that comprise the carrier of the geodesic $\gamma$ from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ has $\mathcal{O}(A_{0})=\mathcal{O}(B_{0})$ as a factor. The carrier of $\gamma$ determines further subsets of axes forming two sequences $(A_{1},\cdots,A_{k})$ and $(B_{1},\cdots,B_{k})$ , where $A_{i}$ is the set of all the axes whose coordinates become zero and $B_{i}$ the set of all those whose coordinates become positive as the geodesic passes from $\mathcal{O}_{i-1}$ to $\mathcal{O}_{i}$ . Thus, the stratum $\mathcal{O}_{i-1}$ is $\mathop{\mathcal{O}}\nolimits(B_{0}\cup B_{1}\cup\cdots\cup B_{i-1}\cup A_{i}\cup\cdots\cup A_{k})$ and $\mathcal{O}_{i}=\mathop{\mathcal{O}}\nolimits(B_{0}\cup B_{1}\cup\cdots\cup B_{i}\cup A_{i+1}\cup\cdots\cup A_{k})$ , with $\mathcal{O}_{0}$ determined by $A_{0}\cup A_{1}\cup\cdots\cup A_{k}$ . Clearly, the intermediate stratum between $\mathop{\mathcal{O}}\nolimits_{i-1}$ and $\mathop{\mathcal{O}}\nolimits_{i}$ , their common boundary component, is

$$\mathop{\mathcal{O}}\nolimits(B_{0}\cup B_{1}\cup\cdots\cup B_{i-1}\cup A_{i+1}\cup\cdots\cup A_{k}).$$

Thus, in particular,

$(a)$ the sets $B_{i}$ and $A_{j}$ of axes are non-empty for all positive $i$ and $j$ , and compatible in $\mathop{\boldsymbol{X}}\nolimits^{m}$ for $0\leqslant i<j$ ;

$(b)$ $\gamma$ passes successively with positive length through the orthants $\mathop{\mathcal{O}}\nolimits_{i}$ except that it may meet at most one of $\mathop{\mathcal{O}}\nolimits_{0}$ and $\mathop{\mathcal{O}}\nolimits_{k}$ in a single point;

$(c)$ $A_{i}\cap A_{j}=\emptyset$ and $B_{i}\cap B_{j}=\emptyset$ for all $i\not=j$ .

The property $(c)$ follows from the facts that $A_{1}\cup\cdots\cup A_{k}$ is disjoint from $B_{1}\cup\cdots\cup B_{k}$ and that an axis once removed cannot be removed again, or once introduced cannot be introduced again.

Definition 8.

For any two points $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the support of the geodesic $\gamma$ from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ is defined to be the pair $(\mathcal{A},\mathcal{B})$ of sequences of sets of axes,

$$\mathcal{A}=(A_{0},A_{1},\cdots,A_{k})\quad\hbox{and}\quad\mathcal{B}=(B_{0},B_{1},\cdots,B_{k}),$$

where $\gamma$ passes successively through the orthants

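$$\mathop{\mathcal{O}}\nolimits_{i}=\mathop{\mathcal{O}}\nolimits(B_{0}\cup B_{1}\cup\cdots\cup B_{i}\cup A_{i+1}\cup\cdots\cup A_{k}),\qquad i=0,1,\cdots,k,\tag{7}$$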

that form the carrier of $\gamma$ .

In the context of tree spaces, the definition of the support of a geodesic given here is equivalent to that of the minimal support given in [15].

Example 1.

For a geodesic passing successively through the orthants

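$$\begin{array}{l}\mathop{\mathcal{O}}\nolimits(e_{0},e_{1},e_{2},e_{3},e_{4},e_{5},e_{6}),\\ \mathop{\mathcal{O}}\nolimits(e_{0},e_{1},f_{2},e_{3},e_{4},e_{5},e_{6}),\\ \mathop{\mathcal{O}}\nolimits(e_{0},e_{1},f_{2},f_{3},e_{5},e_{6}),\\ \mathop{\mathcal{O}}\nolimits(e_{0},e_{1},f_{2},f_{3},f_{4},e_{6}),\\ \mathop{\mathcal{O}}\nolimits(e_{0},e_{1},f_{2},f_{3},f_{4},f_{5},f_{6}),\end{array}$$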

the relevant sequences $\mathcal{A}=(A_{0},A_{1},\cdots,A_{4})$ and $\mathcal{B}=(B_{0},B_{1},\cdots,B_{4})$ forming the support would have members $A_{0}=B_{0}=\{e_{0},e_{1}\}$ , the basis vectors common to all five orthants; $A_{1}=\{e_{2}\}$ , $B_{1}=\{f_{2}\}$ ; $A_{2}=\{e_{3},e_{4}\}$ , $B_{2}=\{f_{3}\}$ ; $A_{3}=\{e_{5}\}$ , $B_{3}=\{f_{4}\}$ ; $A_{4}=\{e_{6}\}$ and $B_{4}=\{f_{5},f_{6}\}$ .
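The passage from a support to its carrier is mechanical, and the following Python sketch spells it out for the sets listed in this example. It uses the description $\mathcal{O}_{i}=\mathop{\mathcal{O}}\nolimits(B_{0}\cup\cdots\cup B_{i}\cup A_{i+1}\cup\cdots\cup A_{k})$ given before Definition 8. The function name `carrier` and the string labels for the axes are choices made here for illustration.

```python
def carrier(A, B):
    """Axis sets of the orthants O_0, ..., O_k determined by the support (A, B)."""
    if len(A) != len(B):
        raise ValueError("A and B must have the same length")
    k = len(A) - 1
    orthants = []
    for i in range(k + 1):
        # O_i keeps B_0, ..., B_i and the not yet removed A_{i+1}, ..., A_k.
        axes = set().union(*B[: i + 1], *A[i + 1:])
        orthants.append(frozenset(axes))
    return orthants


# The support of Example 1.
A = [{"e0", "e1"}, {"e2"}, {"e3", "e4"}, {"e5"}, {"e6"}]
B = [{"e0", "e1"}, {"f2"}, {"f3"}, {"f4"}, {"f5", "f6"}]

strata = carrier(A, B)
dimensions = [len(orthant) for orthant in strata]

# Axes of the intermediate stratum between O_1 and O_2: those common to both.
boundary_1_2 = strata[1] & strata[2]
```

Since $|A_{2}|=2$ and $|B_{2}|=1$ , the orthants of the carrier need not all have the same dimension; here the dimensions are 7, 7, 6, 6, 7.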

If both $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ lie in the closure of the same orthant, then the geodesic between them is clearly the Euclidean line segment. To understand geodesics in general and, later, to describe and analyse various properties of the logarithm map, we require the orthogonal projections onto the various strata of $\mathop{\boldsymbol{X}}\nolimits^{m}$ , where the orthogonality is with respect to the Euclidean inner product on $\mathbb{R}^{M}$ .

Definition 9.

For $\mathop{\boldsymbol{x}}\nolimits\in\mathop{\boldsymbol{X}}\nolimits^{m}$ and $E\subset U$ such that the orthant $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , $P_{E}(\mathop{\boldsymbol{x}}\nolimits)$ denotes the orthogonal projection of $\mathop{\boldsymbol{x}}\nolimits$ onto $\mathop{\mathcal{O}}\nolimits(E)$ , that is the vector, or when relevant its coordinate vector, formed by the components of $\mathop{\boldsymbol{x}}\nolimits$ in the directions of the unit vectors in $E$ .

In terms of projections, we have the following characterisation of the supports of geodesics when $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ do not lie in the closure of the same orthant.

Proposition 2.

Let $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ be two given points in $\mathop{\boldsymbol{X}}\nolimits^{m}$ . Suppose that $\mathcal{A}=(A_{0},A_{1},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},B_{1},\cdots,B_{k})$ are two sequences of sets of axes such that the $\mathop{\mathcal{O}}\nolimits_{i}$ defined by (7) are all contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , where $k>0$ and where all subsets $A_{i}$ and $B_{j}$ are mutually disjoint and non-empty, except for $A_{0}=B_{0}=E(\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2})$ which may be empty. Then, $(\mathcal{A},\mathcal{B})$ is the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ if and only if

$(i)$

for $k>1$ and for all $0<i<k$ ,

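$$\frac{\|P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits_{1})\|}{\|P_{B_{i}}(\mathop{\boldsymbol{x}}\nolimits_{2})\|}<\frac{\|P_{A_{i+1}}(\mathop{\boldsymbol{x}}\nolimits_{1})\|}{\|P_{B_{i+1}}(\mathop{\boldsymbol{x}}\nolimits_{2})\|};\tag{8}$$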

$(ii)$

for all $0<i\leqslant k$ and all non-trivial partitions $C_{i1}\cup C_{i2}$ for $A_{i}$ and $D_{i1}\cup D_{i2}$ for $B_{i}$ , if the orthant

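$$\mathop{\mathcal{O}}\nolimits^{\prime}=\mathop{\mathcal{O}}\nolimits(B_{0}\cup\cdots\cup B_{i-1}\cup D_{i1}\cup C_{i2}\cup A_{i+1}\cup\cdots\cup A_{k})\tag{9}$$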

is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , then

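$$\frac{\|P_{C_{i1}}(\mathop{\boldsymbol{x}}\nolimits_{1})\|}{\|P_{D_{i1}}(\mathop{\boldsymbol{x}}\nolimits_{2})\|}\geqslant\frac{\|P_{C_{i2}}(\mathop{\boldsymbol{x}}\nolimits_{1})\|}{\|P_{D_{i2}}(\mathop{\boldsymbol{x}}\nolimits_{2})\|}.\tag{10}$$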

Compared with the result in [20] (Theorem 2.5) in the case of tree spaces, this result confirms the claim in Section 6 of [15] that the results on tree spaces also hold for orthant spaces. However, the condition $(ii)$ above is necessarily stronger than that there. This is due to the fact that, in general orthant spaces, the condition that $C_{i2}$ is compatible with $D_{i1}$ does not necessarily guarantee that the orthant $\mathop{\mathcal{O}}\nolimits^{\prime}$ given by (9) is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ .

Proof.

Assuming that $(\mathcal{A},\mathcal{B})$ is the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ , we focus on three consecutive strata of the carrier, $\mathcal{O}_{i-1},\mathcal{O}_{i},\mathcal{O}_{i+1}$ , where $\mathop{\mathcal{O}}\nolimits_{j}$ are as defined in (7), projecting the geodesic onto the subspace $\mathbb{R}(A_{i}\cup A_{i+1}\cup B_{i}\cup B_{i+1})$ . As the geodesic passes from $\mathcal{O}_{i-1}$ to $\mathcal{O}_{i}$ , the coordinates along the axes in $A_{i}$ become zero and those in $B_{i}$ start to grow. Then, on passing from $\mathcal{O}_{i}$ to $\mathcal{O}_{i+1}$ , the coordinates of axes in $A_{i+1}$ become zero and those in $B_{i+1}$ grow. Consider the projection of the geodesic onto the three planar quadrants $\Pi_{i-1}$ determined by the vectors $P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits_{1})$ and $P_{A_{i+1}}(\mathop{\boldsymbol{x}}\nolimits_{1})$ , $\Pi_{i}$ determined by $P_{A_{i+1}}(\mathop{\boldsymbol{x}}\nolimits_{1})$ and $P_{B_{i}}(\mathop{\boldsymbol{x}}\nolimits_{2})$ and $\Pi_{i+1}$ determined by $P_{B_{i}}(\mathop{\boldsymbol{x}}\nolimits_{2})$ and $P_{B_{i+1}}(\mathop{\boldsymbol{x}}\nolimits_{2})$ as in Figure 1.

This is an isometric representation of the relevant quadrants except that, in $\mathbb{R}^{M}$ , all four vectors are mutually orthogonal. Then, $\mathop{\mathcal{O}}\nolimits_{i}$ is in the carrier if and only if the projection of the geodesic passes through the interior of $\Pi_{i}$ . That is if, and only if, the angle $\theta$ that the vector $p(\mathop{\boldsymbol{x}}\nolimits_{1})=P_{A_{i}\cup A_{i+1}}(\mathop{\boldsymbol{x}}\nolimits_{1})$ makes with $P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits_{1})$ in $\Pi_{i-1}$ is greater than the angle $\phi$ that $p(\mathop{\boldsymbol{x}}\nolimits_{2})=P_{B_{i}\cup B_{i+1}}(\mathop{\boldsymbol{x}}\nolimits_{2})$ makes with $P_{B_{i}}(\mathop{\boldsymbol{x}}\nolimits_{2})$ in $\Pi_{i+1}$ , as expressed by (8).

Similarly, if $\mathop{\mathcal{O}}\nolimits^{\prime}$ is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the failure of (10) would ensure that the geodesic passed through $\mathop{\mathcal{O}}\nolimits^{\prime}$ , with positive length, between $\mathop{\mathcal{O}}\nolimits_{i-1}$ and $\mathop{\mathcal{O}}\nolimits_{i}$ .

To show that conditions $(i)$ and $(ii)$ determine the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ , we first note that, as seen above, $(i)$ ensures that the geodesic must pass through the orthant $\mathop{\mathcal{O}}\nolimits_{i}$ between $\mathop{\mathcal{O}}\nolimits_{i-1}$ and $\mathop{\mathcal{O}}\nolimits_{i+1}$ . Since $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a cone, it is simply connected and any piecewise linear path from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ can be transformed by homotopy to a geodesic by a sequence of ‘simple moves’ whereby, for each move, two consecutive linear segments of the path are replaced by a single linear segment. Since the geodesic is linear within orthants, that can only occur between consecutive orthants and condition $(ii)$ guarantees that there is no extra orthant in the carrier between $\mathop{\mathcal{O}}\nolimits_{i-1}$ and $\mathop{\mathcal{O}}\nolimits_{i}$ . ∎

As noted previously, if $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ both lie in the closure of an orthant, then $k=0$ and the geodesic between $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ is always a Euclidean segment. Then, when $\mathop{\boldsymbol{x}}\nolimits_{2}$ varies within the orthant in which it lies, the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ remains the same. However, in general, the support may change. The above characterisation of the support of a geodesic implies the following sufficient condition for the support to remain locally constant.

Corollary 1.

Suppose that the hypotheses of Proposition 2 hold. If, for all $0<i\leqslant k$ and for all relevant partitions of $A_{i}$ and $B_{i}$ as in $(ii)$ of that proposition, the inequality (10) is strict then, for all $\mathop{\boldsymbol{x}}\nolimits$ in a sufficiently small neighbourhood of $\mathop{\boldsymbol{x}}\nolimits_{2}$ in its stratum, $(\mathcal{A},\mathcal{B})$ remains the support for the geodesic from $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits$ .

Proof.

Since $\mathop{\boldsymbol{x}}\nolimits$ varies within the stratum in which $\mathop{\boldsymbol{x}}\nolimits_{2}$ lies, the set $A_{0}=B_{0}$ remains unchanged. For the other sets in the support, by continuity, the strict inequalities (8) and, we are assuming, (10) continue to hold for $\mathop{\boldsymbol{x}}\nolimits$ in a sufficiently small neighbourhood of $\mathop{\boldsymbol{x}}\nolimits_{2}$ within its stratum. Hence, the required result follows from Proposition 2. ∎
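The angle comparison used in the proof of Proposition 2 can be checked numerically. The Python sketch below computes the angle $\theta$ that $P_{A_{i}\cup A_{i+1}}(\mathop{\boldsymbol{x}}\nolimits_{1})$ makes with $P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits_{1})$ and the angle $\phi$ that $P_{B_{i}\cup B_{i+1}}(\mathop{\boldsymbol{x}}\nolimits_{2})$ makes with $P_{B_{i}}(\mathop{\boldsymbol{x}}\nolimits_{2})$ , and tests whether $\theta>\phi$ . The function names, the 1-based axis indices and the sample points are assumptions made for illustration. The data assume a space in which $\mathop{\mathcal{O}}\nolimits(u_{1},u_{2})$ , $\mathop{\mathcal{O}}\nolimits(u_{2},u_{3})$ and $\mathop{\mathcal{O}}\nolimits(u_{3},u_{4})$ are consecutive quadrants.

```python
import math


def proj_norm(x, axes):
    """Euclidean norm of the projection P_axes(x); axes are 1-based indices."""
    return math.sqrt(sum(x[a - 1] ** 2 for a in axes))


def passes_through_middle(x1, x2, A_i, A_next, B_i, B_next):
    """theta > phi test from the proof of Proposition 2 for the orthant O_i."""
    theta = math.atan2(proj_norm(x1, A_next), proj_norm(x1, A_i))
    phi = math.atan2(proj_norm(x2, B_next), proj_norm(x2, B_i))
    return theta > phi


x1 = (3.0, 4.0, 0.0, 0.0, 0.0)  # in O(u_1, u_2)
A_1, A_2, B_1, B_2 = {1}, {2}, {3}, {4}

# tan(theta) = 4/3 and tan(phi) = 1/3: the geodesic crosses O(u_2, u_3).
crosses = passes_through_middle(x1, (0.0, 0.0, 3.0, 1.0, 0.0), A_1, A_2, B_1, B_2)

# tan(theta) = 4/3 and tan(phi) = 2: the candidate support is rejected.
rejected = passes_through_middle(x1, (0.0, 0.0, 1.0, 2.0, 0.0), A_1, A_2, B_1, B_2)
```

Both angles depend continuously on the two points, which is the mechanism behind the local constancy of the support in Corollary 1 when the inequalities are strict.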

4 The logarithm map

Analogous to an inverse of the exponential map on a Riemannian manifold, the logarithm map on $\mathop{\boldsymbol{X}}\nolimits^{m}$ is defined as follows.

Definition 10.

The logarithm map at $\mathop{\boldsymbol{x}}\nolimits^{*}\in\mathop{\boldsymbol{X}}\nolimits^{m}$ is the map $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ from $\mathop{\boldsymbol{X}}\nolimits^{m}$ to the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ , the image of $\mathop{\boldsymbol{x}}\nolimits$ being the initial tangent vector, with norm $d(\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits)$ , to the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ .

The logarithm map is globally well-defined since, as already mentioned, the Cartan-Hadamard theorem implies that there is a unique geodesic between any two points $\mathop{\boldsymbol{x}}\nolimits^{*}$ and $\mathop{\boldsymbol{x}}\nolimits$ of $\mathop{\boldsymbol{X}}\nolimits^{m}$ . If that geodesic has an initial segment in a stratum containing $\mathop{\boldsymbol{x}}\nolimits^{*}$ it will certainly have an initial tangent vector. If it has only $\mathop{\boldsymbol{x}}\nolimits^{*}$ in the initial stratum, it must then have an open segment $\gamma(0,\epsilon)$ , with $\gamma(0)=\mathop{\boldsymbol{x}}\nolimits^{*}$ , in a co-bounding stratum. Then it will still have a one-sided derivative at $\mathop{\boldsymbol{x}}\nolimits^{*}$ which suffices to define the logarithm map.

With the description of the carrier, as well as the results on the support, of a geodesic in the previous section, we are now in a position to derive and analyse its initial tangent vector, or equivalently $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ . As in [2] and [3] for the space of trees, our analysis will mainly involve a modified version of the logarithm map. For this, since the tangent cones at various points in $\sigma$ are all parallel, we may parallel translate them to the cone point $o$ , the origin in $\mathbb{R}^{M}$ , to produce a common isometric copy $\mathop{\boldsymbol{C}}\nolimits_{\sigma}$ . Then, since the coordinate vector of the point $\mathop{\boldsymbol{x}}\nolimits^{*}$ , which we also denote by $\mathop{\boldsymbol{x}}\nolimits^{*}$ , lies in the common factor $\mathbb{R}(E)$ of all the strata of $\mathop{\boldsymbol{C}}\nolimits_{\sigma}$ , it makes sense to add it to $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ and the result

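$$\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})=\mathop{\boldsymbol{x}}\nolimits^{*}+\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$$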

will also lie in $\mathop{\boldsymbol{C}}\nolimits_{\sigma}$ . We shall refer to $\Phi$ as the translated logarithm map to distinguish it from the logarithm map itself. All the vectors $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ being in the same space implies that the translated logarithm maps are directly comparable as $\mathop{\boldsymbol{x}}\nolimits^{*}$ varies within an orthant and such comparability will be necessary later. Moreover, the difference between the two maps is such that all our analysis of $\Phi$ can easily be translated to that of the logarithm map itself.

Note that, although the origin corresponds to the cone point $o$ of the orthant space $\mathop{\boldsymbol{X}}\nolimits^{m}$ , $\mathop{\boldsymbol{C}}\nolimits_{\sigma}$ is not the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $o$ , neither being contained in the other, unless $\sigma=\{o\}$ . Note also that, when $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a tree space and $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a top-dimensional stratum, $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ was called the modified logarithm map and was denoted by $\Phi_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ in [3], and the permutation map $\pi$ there corresponds to the linear transformation $\jmath$ given by Definition 5.

The next theorem gives the expression for the translated logarithm map $\Phi(\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ in terms of the projections, specified in Definition 9, onto various sets of axes appearing in the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ .

Theorem 1.

For any two points $\mathop{\boldsymbol{x}}\nolimits^{*}$ and $\mathop{\boldsymbol{x}}\nolimits$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , let the sequences $\mathcal{A}=(A_{0},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},\cdots,B_{k})$ of sets of axes form the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ . Then, the translated logarithm map $\Phi(\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ is given by

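$$\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})=\jmath\left(P_{A_{0}}(\mathop{\boldsymbol{x}}\nolimits),\,-\frac{\|P_{B_{1}}(\mathop{\boldsymbol{x}}\nolimits)\|}{\|P_{A_{1}}(\mathop{\boldsymbol{x}}\nolimits^{*})\|}P_{A_{1}}(\mathop{\boldsymbol{x}}\nolimits^{*}),\,\cdots,\,-\frac{\|P_{B_{k}}(\mathop{\boldsymbol{x}}\nolimits)\|}{\|P_{A_{k}}(\mathop{\boldsymbol{x}}\nolimits^{*})\|}P_{A_{k}}(\mathop{\boldsymbol{x}}\nolimits^{*})\right),\tag{11}$$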

where $\jmath$ is the linear transformation given by Definition 5.

In particular, $\Phi(\,\cdot\,;\lambda\mathop{\boldsymbol{x}}\nolimits^{*})=\Phi(\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ for any constant $\lambda>0$ .

Recall that, if $k=0$ , then $\mathop{\boldsymbol{x}}\nolimits$ and $\mathop{\boldsymbol{x}}\nolimits^{*}$ lie in the closure of an orthant and the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ is a line segment in $\mathbb{R}^{M}$ . In this case, $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})=\jmath(\mathop{\boldsymbol{x}}\nolimits)$ . If $k>0$ and if $|A_{i}|=|B_{i}|=1$ for $1\leqslant i\leqslant k$ , the form of the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ is also similar to that of the corresponding translated logarithm map in a Euclidean space, after changing the axes $B_{i}$ to $-A_{i}$ .

Proof.

The orthogonal projection of $\gamma$ onto $\mathcal{O}(A_{0})$ determines the component of the initial tangent vector to $\gamma$ that is tangent to $\mathcal{O}(A_{0})$ , namely

$$v_{0}=P_{A_{0}}(\mathop{\boldsymbol{x}}\nolimits)-P_{A_{0}}(\mathop{\boldsymbol{x}}\nolimits^{*}).$$

For the remaining coordinates, since the sets $A_{i}$ and $B_{j}$ above are all mutually disjoint, it follows that, for each $i$ , the subspace $\mathbb{R}(A_{i}\cup B_{i})$ is orthogonal to all $\mathbb{R}(A_{j})$ and $\mathbb{R}(B_{j})$ for $j\not=i$ , so that the coordinates of the geodesic $\gamma$ that are positive with respect to the axes in $\mathbb{R}(A_{i}\cup B_{i})$ are just those of the projection $\gamma_{i}$ of $\gamma$ onto that subspace. If $s_{i}$ is the parameter such that $\gamma(s_{i})\in\mathcal{O}_{i-1}\cap\mathcal{O}_{i}$ , then $P_{A_{i}}(\gamma(s))\in\mathop{\mathcal{O}}\nolimits(A_{i})$ declines linearly from $P_{A_{i}}(\gamma(0))=P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*})$ to $P_{A_{i}}(\gamma(s_{i}))={\bf 0}$ . Then, the coordinates $P_{B_{i}}(\gamma(s))\in\mathop{\mathcal{O}}\nolimits(B_{i})$ increase linearly from zero at $\gamma(s_{i})$ to $P_{B_{i}}(\gamma(1))=P_{B_{i}}(\mathop{\boldsymbol{x}}\nolimits)$ . Thus, the projected geodesic $\gamma_{i}$ lies in the union of the orthogonal orthants $\mathop{\mathcal{O}}\nolimits(A_{i})$ and $\mathop{\mathcal{O}}\nolimits(B_{i})$ and hence has length $\|P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*})\|+\|P_{B_{i}}(\mathop{\boldsymbol{x}}\nolimits)\|$ . The initial tangent vector to $\gamma_{i}$ is parallel to $-P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*})$ and so is

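$$v_{i}=-\frac{\|P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*})\|+\|P_{B_{i}}(\mathop{\boldsymbol{x}}\nolimits)\|}{\|P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*})\|}\,P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*}),\qquad 1\leqslant i\leqslant k.$$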

Hence, the initial tangent vector to $\gamma$ with norm $d(\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits)$ is represented by $(v_{0},v_{1},\cdots,v_{k})$ . However, this ordering of the coordinates, with those in $\mathbb{R}(A_{i})$ preceding those of $\mathbb{R}(A_{i+1})$ for each $i$ , requires the linear transformation $\jmath$ to obtain its representation with respect to the standard basis in $\mathbb{R}^{M}$ . Then, the logarithm map at $\mathop{\boldsymbol{x}}\nolimits^{*}$ will be

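$$\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)=\jmath(v_{0},v_{1},\cdots,v_{k}),$$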

so that equation (11) follows from the coordinates $v_{i}$ since the coordinates of $\mathop{\boldsymbol{x}}\nolimits^{*}$ are $\jmath\left(P_{A_{0}}(\mathop{\boldsymbol{x}}\nolimits^{*}),P_{A_{1}}(\mathop{\boldsymbol{x}}\nolimits^{*}),\cdots,P_{A_{k}}(\mathop{\boldsymbol{x}}\nolimits^{*})\right)$ . ∎
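The construction in this proof translates directly into a short computation once the support is known. The Python sketch below is written under stated assumptions: the support $(\mathcal{A},\mathcal{B})$ is supplied by the caller rather than computed, axes are 1-based indices into the standard basis (so $\jmath$ is implicit), and each component for $i\geqslant 1$ is read from the proof as the vector $-P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*})$ rescaled to the length $\|P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*})\|+\|P_{B_{i}}(\mathop{\boldsymbol{x}}\nolimits)\|$ of $\gamma_{i}$ . The function names and the sample data are hypothetical.

```python
import math


def log_map(x_star, x, A, B):
    """Initial tangent vector at x_star of the geodesic to x, for a given support (A, B)."""
    v = [0.0] * len(x_star)
    for a in A[0]:  # common axes: ordinary Euclidean difference
        v[a - 1] = x[a - 1] - x_star[a - 1]
    for A_i, B_i in zip(A[1:], B[1:]):
        norm_a = math.sqrt(sum(x_star[a - 1] ** 2 for a in A_i))
        norm_b = math.sqrt(sum(x[b - 1] ** 2 for b in B_i))
        scale = (norm_a + norm_b) / norm_a  # length of gamma_i over ||P_{A_i}(x*)||
        for a in A_i:
            v[a - 1] = -scale * x_star[a - 1]
    return v


def translated_log_map(x_star, x, A, B):
    """Phi(x; x_star) = x_star + log_{x_star}(x)."""
    return [s + w for s, w in zip(x_star, log_map(x_star, x, A, B))]


x_star = (3.0, 4.0, 0.0, 0.0, 0.0)

# Support with A_0 = B_0 = {u_2}, A_1 = {u_1}, B_1 = {u_3} (assumed for this data).
phi_adjacent = translated_log_map(x_star, (0.0, 2.0, 1.0, 0.0, 0.0),
                                  [{2}, {1}], [{2}, {3}])

# Support with empty A_0, A_1 = {u_1, u_2}, B_1 = {u_3, u_4}: geodesic through the cone point.
x_far = (0.0, 0.0, 1.0, 2.0, 0.0)
log_far = log_map(x_star, x_far, [set(), {1, 2}], [set(), {3, 4}])
phi_far = translated_log_map(x_star, x_far, [set(), {1, 2}], [set(), {3, 4}])
length_far = math.sqrt(sum(c * c for c in log_far))  # equals ||x_star|| + ||x_far||
```

In the second case the norm of the logarithm is $5+\sqrt{5}$ , the length of the path through the cone point, and $\Phi$ is a negative multiple of $\mathop{\boldsymbol{x}}\nolimits^{*}$ .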

In the following, when we say that the expression (11) for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{2};\mathop{\boldsymbol{x}}\nolimits^{*})$ takes the same form as the corresponding expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*})$ , we mean that the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{2};\mathop{\boldsymbol{x}}\nolimits^{*})$ can be obtained by replacing $\mathop{\boldsymbol{x}}\nolimits_{1}$ by $\mathop{\boldsymbol{x}}\nolimits_{2}$ in the expression (11) for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*})$ . Clearly, the form of the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ will depend on the support $(\mathcal{A},\mathcal{B})$ of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ , noting that the roles that $\mathcal{A}$ and $\mathcal{B}$ play are not symmetric. The following example illustrates this feature where, although $\mathop{\boldsymbol{x}}\nolimits$ lies in the same orthant in the second and third cases, the forms for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ , as a function of $\mathop{\boldsymbol{x}}\nolimits$ , differ in the two cases. However, along the boundary between the light and dark grey regions, the two forms give the same result.

Example 2.

Consider $\mathop{\boldsymbol{X}}\nolimits^{2}$ in $\mathbb{R}^{5}$ , which was called $Q_{5}$ in [2], consisting of five orthants as shown in Figure 2, where all five axes are mutually orthogonal.

The tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{2}$ at $\mathop{\boldsymbol{x}}\nolimits^{*}=(x_{1}^{*},x_{2}^{*},0,0,0)$ indicated in Figure 2 is the $(u_{1},u_{2})$ -plane and that at the cone point $o$ is $\mathop{\boldsymbol{X}}\nolimits^{2}$ itself. While $\Phi(\mathop{\boldsymbol{x}}\nolimits;o)=\mathop{\boldsymbol{x}}\nolimits$ for all $\mathop{\boldsymbol{x}}\nolimits$ , the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ takes different forms depending on the position of $\mathop{\boldsymbol{x}}\nolimits$ . For example, for any $\mathop{\boldsymbol{x}}\nolimits=(0,x_{2},x_{3},0,0)$ in the orthant $\mathop{\mathcal{O}}\nolimits(u_{2},u_{3})$ ,

[TABLE]

for $\mathop{\boldsymbol{x}}\nolimits=(0,0,x_{3},x_{4},0)$ in the dark grey region of $\mathop{\mathcal{O}}\nolimits(u_{3},u_{4})$ , i.e. if the coordinates of $\mathop{\boldsymbol{x}}\nolimits$ satisfy $x_{4}/x_{3}<\tan(\alpha)=x^{*}_{2}/x^{*}_{1}$ , then

[TABLE]

However, if $\mathop{\boldsymbol{x}}\nolimits=(0,0,x_{3},x_{4},0)$ lies in the light grey region of $\mathop{\mathcal{O}}\nolimits(u_{3},u_{4})$ , i.e. if the coordinates of $\mathop{\boldsymbol{x}}\nolimits$ satisfy $x_{4}/x_{3}>\tan(\alpha)=x^{*}_{2}/x^{*}_{1}$ , then

[TABLE]

In particular, for all $\mathop{\boldsymbol{x}}\nolimits$ in the light grey region of $\mathop{\boldsymbol{X}}\nolimits^{2}$ , the vectors $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ have the same direction $-\frac{1}{\|\mathop{\boldsymbol{x}}\nolimits^{*}\|}(x^{*}_{1},x^{*}_{2},0,0,0)$ and the only difference between them lies in the length of this vector.
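The case distinction in Example 2 can be checked numerically with a minimal sketch based on planar unfolding. It rests on assumptions not stated in this passage: that the five quadrants of $Q_{5}$ are glued cyclically in the order $\mathop{\mathcal{O}}\nolimits(u_{1},u_{2}),\mathop{\mathcal{O}}\nolimits(u_{2},u_{3}),\cdots,\mathop{\mathcal{O}}\nolimits(u_{5},u_{1})$ , so that the total angle at the cone point is $5\pi/2$ , and that the translated logarithm map is $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})=\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)+\mathop{\boldsymbol{x}}\nolimits^{*}$ , which is consistent with $\Phi(\mathop{\boldsymbol{x}}\nolimits;o)=\mathop{\boldsymbol{x}}\nolimits$ . The function names `to_polar` and `phi` are hypothetical, and the code is an illustration rather than the authors' expression (11).

```python
import math

# Assumption: five quadrants glued cyclically around the cone point of Q_5.
TOTAL_ANGLE = 5 * math.pi / 2


def to_polar(x):
    """Polar coordinates (r, theta) of a point of Q_5 in the cyclic unfolding.

    x is a 5-tuple of non-negative coordinates that are non-zero only on two
    cyclically adjacent axes; the quadrant spanned by axes j and j+1 (0-based)
    is assigned the angular range [j*pi/2, (j+1)*pi/2].
    """
    r = math.sqrt(sum(c * c for c in x))
    for j in range(5):
        k = (j + 1) % 5
        if all(x[i] == 0 for i in range(5) if i not in (j, k)):
            return r, j * math.pi / 2 + math.atan2(x[k], x[j])
    raise ValueError("x does not lie in a single quadrant of Q_5")


def phi(x, x_star):
    """(u1, u2)-coordinates of Phi(x; x*) for x* in the open quadrant O(u1, u2).

    Assumes Phi(x; x*) = log_{x*}(x) + x*, i.e. the unfolded position of x.
    """
    r, theta = to_polar(x)
    r_star, theta_star = to_polar(x_star)
    # Signed angle from x* to x, taking the shorter way round the cone point.
    delta = (theta - theta_star) % TOTAL_ANGLE
    if delta > TOTAL_ANGLE / 2:
        delta -= TOTAL_ANGLE
    if abs(delta) < math.pi:
        # The geodesic is a straight segment in the unfolded quadrants.
        angle = theta_star + delta
        return (r * math.cos(angle), r * math.sin(angle))
    # Otherwise the geodesic passes through the cone point o.
    return (-r * x_star[0] / r_star, -r * x_star[1] / r_star)


x_star = (1.0, 1.0, 0.0, 0.0, 0.0)               # tan(alpha) = 1
print(phi((0.0, 2.0, 3.0, 0.0, 0.0), x_star))    # x in O(u2, u3)
print(phi((0.0, 0.0, 2.0, 1.0, 0.0), x_star))    # x4/x3 < tan(alpha): dark grey
print(phi((0.0, 0.0, 1.0, 2.0, 0.0), x_star))    # x4/x3 > tan(alpha): light grey
```

Under these assumptions the three calls give approximately $(-3,2)$ , $(-2,-1)$ and $-\sqrt{5/2}\,(1,1)$ . The last vector has the direction $-\mathop{\boldsymbol{x}}\nolimits^{*}/\|\mathop{\boldsymbol{x}}\nolimits^{*}\|$ stated for the light grey region, and on the boundary $x_{4}/x_{3}=\tan(\alpha)$ the two branches agree.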

The potential variation of the form of the expression (11) for the translated logarithm map, arising from the changes in the supports of the geodesics, is one of the main obstructions to generalising the theory for manifolds to orthant spaces, or more general stratified spaces. To study this variation, we first note the following result, which is a direct consequence of Corollary 1.

Corollary 2.

If the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ satisfies the conditions of Corollary 1, then there is a neighbourhood $\mathcal{N}$ of $\mathop{\boldsymbol{x}}\nolimits$ within its stratum such that, for any $\mathop{\boldsymbol{x}}\nolimits^{\prime}\in\mathcal{N}$ , the expression (11) for $\Phi(\mathop{\boldsymbol{x}}\nolimits^{\prime};\mathop{\boldsymbol{x}}\nolimits^{*})$ takes the same form as that for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ .

We now characterise, in terms of the two conditions on the support of a geodesic given in Proposition 2, changes in the form of the expression (11) for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ when $\mathop{\boldsymbol{x}}\nolimits$ varies locally. Although the roles played by these two conditions in determining the support of a geodesic are different, they play, to some extent, a similar role in the change of the form of that expression. Replacing the inequality (8) or (10) by equality determines a quadratic co-dimension one hyper-surface. When two or more such hyper-surfaces meet, their normals are linearly independent so they intersect in surfaces of co-dimension at least two. Thus, it suffices to consider a point lying in a single such hyper-surface. Then, points on either side of the hyper-surface will have different supports for their geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}$ , but that will not always result in a change in the form of the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ .

Proposition 3.

Let $\mathop{\boldsymbol{x}}\nolimits^{*}$ and $\mathop{\boldsymbol{x}}\nolimits_{0}$ be two given points in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , and let $(\mathcal{A},\mathcal{B})$ be the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits_{0}$ , where $\mathcal{A}=(A_{0},A_{1},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},B_{1},\cdots,B_{k})$ , and where $k>1$ . Assume that $\mathop{\boldsymbol{x}}\nolimits$ moves from $\mathop{\boldsymbol{x}}\nolimits_{0}$ , within its stratum, to a first point $\mathop{\boldsymbol{x}}\nolimits_{1}$ such that, for $i=i_{0}>0$ , the inequality (8), with $\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2}$ replaced by $\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits_{1}$ respectively, becomes an equality while all the other inequalities (8) and (10) remain strict. Then, the support $(\mathcal{A}^{\prime},\mathcal{B}^{\prime})$ of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits_{1}$ has

[TABLE]

and similarly for $\mathcal{B}^{\prime}$ .

If the orthant

[TABLE]

is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , then there is a neighbourhood $\mathcal{N}$ of $\mathop{\boldsymbol{x}}\nolimits_{1}$ within its stratum such that, for all $\mathop{\boldsymbol{x}}\nolimits\in\mathcal{N}$ , the form of the expression (11) for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ is identical with that for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{0};\mathop{\boldsymbol{x}}\nolimits^{*})$ .

If $\mathop{\mathcal{O}}\nolimits^{\prime\prime}$ is not an orthant of $\mathop{\boldsymbol{X}}\nolimits^{m}$ then, in any neighbourhood $\mathcal{N}$ of $\mathop{\boldsymbol{x}}\nolimits_{1}$ within its stratum, there are $\mathop{\boldsymbol{x}}\nolimits^{\prime}$ and $\mathop{\boldsymbol{x}}\nolimits^{\prime\prime}$ such that the form for $\Phi(\mathop{\boldsymbol{x}}\nolimits^{\prime};\mathop{\boldsymbol{x}}\nolimits^{*})$ is the same as that for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{0};\mathop{\boldsymbol{x}}\nolimits^{*})$ and that for $\Phi(\mathop{\boldsymbol{x}}\nolimits^{\prime\prime};\mathop{\boldsymbol{x}}\nolimits^{*})$ is determined by the support $(\mathcal{A}^{\prime},\mathcal{B}^{\prime})$ . When $\mathcal{N}$ is sufficiently small, there are no other possibilities.

Proof.

By Corollary 2, the form of the expression (11) will remain constant, as long as the inequalities (8) and (10) remain strict. However, for $\mathop{\boldsymbol{x}}\nolimits=\mathop{\boldsymbol{x}}\nolimits_{1}$ , on account of the equality (8) for $i_{0}$ at $\mathop{\boldsymbol{x}}\nolimits=\mathop{\boldsymbol{x}}\nolimits_{1}$ , the angles $\theta$ and $\phi$ , in the projected diagram of Figure 3, will be equal where the projections are as specified in the proof of Proposition 2.

Consequently at $\mathop{\boldsymbol{x}}\nolimits_{1}$ , $\mathop{\mathcal{O}}\nolimits_{i_{0}}$ will drop out of the carrier, where $\mathop{\mathcal{O}}\nolimits_{i}$ is defined by (7), and, by the continuity of geodesics, the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits_{1}$ will be $(\mathcal{A}^{\prime},\mathcal{B}^{\prime})$ .

Now, let $\mathop{\boldsymbol{x}}\nolimits$ continue to move past $\mathop{\boldsymbol{x}}\nolimits_{1}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ , remaining sufficiently close to $\mathop{\boldsymbol{x}}\nolimits_{1}$ and having projection $p(\mathop{\boldsymbol{x}}\nolimits_{2})=P_{B_{i_{0}}\cup B_{i_{0}+1}}(\mathop{\boldsymbol{x}}\nolimits_{2})$ in Figure 3 lying on the opposite side to $p(\mathop{\boldsymbol{x}}\nolimits_{0})=P_{B_{i_{0}}\cup B_{i_{0}+1}}(\mathop{\boldsymbol{x}}\nolimits_{0})$ of the ray from the origin to $p(\mathop{\boldsymbol{x}}\nolimits_{1})=P_{B_{i_{0}}\cup B_{i_{0}+1}}(\mathop{\boldsymbol{x}}\nolimits_{1})$ . If $\mathop{\mathcal{O}}\nolimits^{\prime\prime}$ is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the projection of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ would be, as in Figure 3 $(a)$ , the ‘straight’ line from $p(\mathop{\boldsymbol{x}}\nolimits^{*})=P_{A_{i_{0}}\cup A_{i_{0}+1}}(\mathop{\boldsymbol{x}}\nolimits^{*})$ to $p(\mathop{\boldsymbol{x}}\nolimits_{2})$ passing through the planar quadrant $\Pi_{0}$ determined by $P_{A_{i_{0}}}(\mathop{\boldsymbol{x}}\nolimits^{*})$ and $P_{B_{i_{0}}}(\mathop{\boldsymbol{x}}\nolimits_{0})$ . This would imply replacing $\mathop{\mathcal{O}}\nolimits_{i_{0}}$ in the carrier by $\mathop{\mathcal{O}}\nolimits^{\prime\prime}$ with the resulting support for the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ being $(\mathcal{A}^{\prime\prime},\mathcal{B}^{\prime\prime})$ , where

[TABLE]

and similarly for $\mathcal{B}^{\prime\prime}$ . In this case, the application of the linear transformation $\jmath$ in the expression for $\Phi(\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ implies that, for such $\mathop{\boldsymbol{x}}\nolimits_{2}$ , the form of the expression (11) for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{2};\mathop{\boldsymbol{x}}\nolimits^{*})$ is identical with that for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{0};\mathop{\boldsymbol{x}}\nolimits^{*})$ .

Assume now that $\mathop{\mathcal{O}}\nolimits^{\prime\prime}$ is not an orthant of $\mathop{\boldsymbol{X}}\nolimits^{m}$ . There might still be an intermediate orthant between $\mathop{\mathcal{O}}\nolimits_{i_{0}-1}$ and $\mathop{\mathcal{O}}\nolimits_{i_{0}+1}$ arrived at by non-trivial partitions $A_{i_{0}}=C_{1}\cup C_{2}$ , $A_{i_{0}+1}=D_{1}\cup D_{2}$ , $B_{i_{0}}=E_{1}\cup E_{2}$ and $B_{i_{0}+1}=F_{1}\cup F_{2}$ such that the orthant

[TABLE]

is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ and provides a shorter path between $\mathop{\mathcal{O}}\nolimits_{i_{0}-1}$ and $\mathop{\mathcal{O}}\nolimits_{i_{0}+1}$ . In which case, by Proposition 2 $(i)$ , we must have

[TABLE]

This would result in

[TABLE]

and, taking the limit as $\mathop{\boldsymbol{x}}\nolimits_{2}\rightarrow\mathop{\boldsymbol{x}}\nolimits_{1}$ ,

[TABLE]

On the other hand, the closures of the orthants $\mathop{\mathcal{O}}\nolimits_{i_{0}-1}$ and $\widetilde{\mathop{\mathcal{O}}\nolimits}$ being in $\mathop{\boldsymbol{X}}\nolimits^{m}$ ensure that all 2-dimensional orthants in the closure of

[TABLE]

are in $\mathop{\boldsymbol{X}}\nolimits^{m}$ and hence, by Definition 1, so too is $\widetilde{\mathop{\mathcal{O}}\nolimits}^{*}$ itself. Then, by the assumption of uniqueness of the equality at $\mathop{\boldsymbol{x}}\nolimits_{1}$ of the proposition, we must have by Proposition 2 $(ii)$ that

[TABLE]

Similarly, by considering the orthant $\mathop{\mathcal{O}}\nolimits(B_{0}\cup\cdots\cup B_{i_{0}}\cup F_{1}\cup D_{2}\cup A_{i_{0}+2}\cup\cdots\cup A_{k})$ , we get

[TABLE]

Since, by assumption, $\mathop{\mathcal{O}}\nolimits_{i_{0}}$ drops out of the carrier at $\mathop{\boldsymbol{x}}\nolimits_{1}$ , we also have

[TABLE]

so that, combining (14) and (15), we have

[TABLE]

contradicting (13).

Thus, if $\mathop{\mathcal{O}}\nolimits^{\prime\prime}$ is not contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the projection of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits_{2}$ continues to pass through the origin, as shown in Figure 3 $(b)$ , and the carrier remains as it was for $\mathop{\boldsymbol{x}}\nolimits_{1}$ , where the support is $(\mathcal{A}^{\prime},\mathcal{B}^{\prime})$ given above. In this case, the form of the expression (11) for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{2};\mathop{\boldsymbol{x}}\nolimits^{*})$ clearly differs from that for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{0};\mathop{\boldsymbol{x}}\nolimits^{*})$ . ∎

Note that the equality (8) for $i=i_{0}$ at $\mathop{\boldsymbol{x}}\nolimits_{1}$ and the mutual orthogonality of all the axes together imply that

[TABLE]

This confirms that the form of the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{0};\mathop{\boldsymbol{x}}\nolimits^{*})$ is still valid for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*})$ , as expected by the continuity of geodesics. Similarly, the form of the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{2};\mathop{\boldsymbol{x}}\nolimits^{*})$ is still valid for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*})$ whether or not the orthant $\mathop{\mathcal{O}}\nolimits_{i_{0}}$ has been replaced by $\mathop{\mathcal{O}}\nolimits^{\prime\prime}$ .

A similar argument to that for the proof of Proposition 3 gives the following complementary result.

Proposition 4.

Let $\mathop{\boldsymbol{x}}\nolimits^{*}$ and $\mathop{\boldsymbol{x}}\nolimits_{1}$ be two given points in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , and let $(\mathcal{A},\mathcal{B})$ be the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits_{1}$ , where $\mathcal{A}=(A_{0},A_{1},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},B_{1},\cdots,B_{k})$ , and where $k>0$ . Assume that all inequalities (8) and (10), with $\mathop{\boldsymbol{x}}\nolimits_{1},\mathop{\boldsymbol{x}}\nolimits_{2}$ replaced by $\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits_{1}$ respectively, are strict except that, for $i=i_{0}>0$ and unique non-trivial partitions $C_{i_{0}1}\cup C_{i_{0}2}$ for $A_{i_{0}}$ and $D_{i_{0}1}\cup D_{i_{0}2}$ for $B_{i_{0}}$ , (10) is an equality and that the corresponding orthant $\mathop{\mathcal{O}}\nolimits^{\prime}$ given by (9) with $i=i_{0}$ is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ .

If the orthant

[TABLE]

is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , there is a neighbourhood $\mathcal{N}$ of $\mathop{\boldsymbol{x}}\nolimits_{1}$ within its stratum such that the form of the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ is the same, for all $\mathop{\boldsymbol{x}}\nolimits\in\mathcal{N}$ . Then, the common form of the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ is determined by $(\mathcal{A}^{\prime},\mathcal{B}^{\prime})$ , where

[TABLE]

and similarly for $\mathcal{B}^{\prime}$ .

If $\mathop{\mathcal{O}}\nolimits^{\prime\prime\prime}$ is not an orthant of $\mathop{\boldsymbol{X}}\nolimits^{m}$ then, in any neighbourhood $\mathcal{N}$ of $\mathop{\boldsymbol{x}}\nolimits_{1}$ within its stratum, there are $\mathop{\boldsymbol{x}}\nolimits^{\prime}$ and $\mathop{\boldsymbol{x}}\nolimits^{\prime\prime}$ such that the form for $\Phi(\mathop{\boldsymbol{x}}\nolimits^{\prime};\mathop{\boldsymbol{x}}\nolimits^{*})$ is the same as that for $\Phi(\mathop{\boldsymbol{x}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*})$ and that for $\Phi(\mathop{\boldsymbol{x}}\nolimits^{\prime\prime};\mathop{\boldsymbol{x}}\nolimits^{*})$ is determined by $(\mathcal{A}^{\prime},\mathcal{B}^{\prime})$ . When $\mathcal{N}$ is sufficiently small, there are no other possibilities.

The carrier of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ will also change when $\mathop{\boldsymbol{x}}\nolimits$ moves from one stratum to another, which necessarily involves, as initial, final or intermediate stratum, a stratum of locally positive co-dimension. The set of all such strata, together with the quadratic hyper-surfaces determined by equalities in each of the relevant equations (8), form the defining boundaries for the (pre)-vistal polyhedral subdivision, with respect to $\mathop{\boldsymbol{x}}\nolimits^{*}$ , in [15]. The points in any component of the complement of these surfaces all have the same carrier. However, for our analysis, we shall only be concerned with changes in the forms of the expressions taken by $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ , or equivalently by $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ , when $\mathop{\boldsymbol{x}}\nolimits$ or $\mathop{\boldsymbol{x}}\nolimits^{*}$ vary within their strata rather than changes in the underlying carrier. For this, we note that the results in Propositions 3 and 4 where the changed support must be used to obtain the correct expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ are reflections of each other where an orthant is removed or introduced, respectively, in the carrier. Thus, we may encapsulate as follows the hyper-surfaces across which, though not at which, it is necessary to take account of the change of support to obtain the correct value for the logarithm map.

Definition 11.

Given a point $\mathop{\boldsymbol{x}}\nolimits^{*}\in\mathop{\boldsymbol{X}}\nolimits^{m}$ , $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ denotes the set that consists of all points $\mathop{\boldsymbol{x}}\nolimits\in\mathop{\boldsymbol{X}}\nolimits^{m}$ for which the support $(\mathcal{A},\mathcal{B})$ , where $\mathcal{A}=(A_{0},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},\cdots,B_{k})$ , of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ has the property that, for one or more $i=i_{0}>0$ , there are non-trivial partitions $A_{i_{0}}=C_{i_{0}1}\cup C_{i_{0}2}$ and $B_{i_{0}}=D_{i_{0}1}\cup D_{i_{0}2}$ with

[TABLE]

where the corresponding orthant $\mathop{\mathcal{O}}\nolimits^{\prime}$ of (9) is contained in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , but $\mathop{\mathcal{O}}\nolimits^{\prime\prime\prime}$ of (16) is not.

In view of the symmetry that reverses the geodesics at the same time as it reverses the order of the strata and interchanges the roles of the sequences $\mathcal{A}$ and $\mathcal{B}$ of edge sets in the support, the definition is symmetric: $\mathop{\boldsymbol{x}}\nolimits\in\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ if and only if $\mathop{\boldsymbol{x}}\nolimits^{*}\in\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits}$ . Since each stratum is a Euclidean orthant, it is preserved under multiplication by $\lambda>0$ in $\mathbb{R}^{M}$ which also multiplies the length of each curve by $\lambda$ . Then, since the geodesic $\gamma$ joining $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ is the shortest curve through the strata of $\mathop{\boldsymbol{X}}\nolimits^{m}$ from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ , it follows that $\gamma$ is mapped onto the geodesic from $\lambda\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\lambda\mathop{\boldsymbol{x}}\nolimits$ . In particular, these two geodesics have the same carrier. Thus, $\mathcal{D}_{\lambda\mathop{\boldsymbol{x}}\nolimits^{*}}=\lambda\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ and, since the equations (17) are homogeneous, $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}=\lambda\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ .

The pseudo-partition of $\mathop{\boldsymbol{X}}\nolimits^{m}$ with respect to $\mathop{\boldsymbol{x}}\nolimits^{*}$ determined by $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ gives rise to a polyhedral subdivision of each stratum by restriction. It is coarser than the (pre)-vistal subdivision of [15] and, if $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a tree space and if $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a top-dimensional stratum, it is equivalent to the polyhedral subdivision defined in [3].

5 Limits, projections and derivatives

We now turn to certain limits and projections of the translated logarithm map that, in particular, will enable us to calculate the directional derivatives we require.

Firstly, we obtain an expression for the limit of the translated logarithm map as the reference point $\mathop{\boldsymbol{x}}\nolimits^{*}$ moves along a geodesic. For a vector $\mathop{\boldsymbol{w}}\nolimits$ in the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ , write $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits)$ for the point distant $\lambda\|\mathop{\boldsymbol{w}}\nolimits\|$ along the geodesic $\gamma$ starting at $\mathop{\boldsymbol{x}}\nolimits^{*}$ with initial tangent vector $\mathop{\boldsymbol{w}}\nolimits$ . Then, we have the following result.

Theorem 2.

Let $\sigma=\mathcal{O}(E)$ be a stratum of $\mathop{\boldsymbol{X}}\nolimits^{m}$ , $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ and $\mathop{\boldsymbol{x}}\nolimits$ be a fixed choice of point anywhere in $\mathop{\boldsymbol{X}}\nolimits^{m}$ .

$(i)$

If $\mathop{\boldsymbol{w}}\nolimits\in\mathbb{R}(E)$ is tangent to $\sigma$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ , then

[TABLE]

$(ii)$

If $\sigma$ bounds $\tau=\mathcal{O}(E\cup F)$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ and $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ is tangent to $\tau$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ , then the limit

[TABLE]

exists. Moreover, there exist $\epsilon>0$ and sequences $\mathcal{A}=(A_{0},A_{1},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},B_{1},\cdots,B_{k})$ of sets of axes such that, for each $\lambda\in(0,\epsilon)$ , $(\mathcal{A},\mathcal{B})$ forms the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ to $\mathop{\boldsymbol{x}}\nolimits$ . In terms of these $\mathcal{A}$ and $\mathcal{B}$ ,

[TABLE]

where $W_{i}=P_{A_{i}\cap E}(\mathop{\boldsymbol{x}}\nolimits^{*})$ , unless $P_{A_{i}\cap E}(\mathop{\boldsymbol{x}}\nolimits^{*})=0$ , in which case $W_{i}=P_{A_{i}\cap F}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$ , the projection of $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ on $\mathbb{R}(A_{i})$ , and $\jmath$ is the linear transformation given by Definition 5.

For $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma\subseteq\mathop{\boldsymbol{X}}\nolimits^{m}$ , $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ defined by (18) is the limit of the translated logarithm map $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{\prime})$ as $\mathop{\boldsymbol{x}}\nolimits^{\prime}\rightarrow\mathop{\boldsymbol{x}}\nolimits^{*}$ from the direction $\mathop{\boldsymbol{w}}\nolimits$ . When the direction $\mathop{\boldsymbol{w}}\nolimits$ is clear from the context we shall, in the following, call $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ simply the directional limit of $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{\prime})$ .

Proof.

( $i$ ) This follows from the uniform continuity of geodesics with respect to their end points (cf. [6], pp. 195-196) and also from a minor modification of the proof of ( $ii$ ) below.

( $ii$ ) Note that, since $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ , $\mathop{\boldsymbol{x}}\nolimits^{*}$ and $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ lie in different strata. Writing $\gamma_{\lambda}$ for the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ to $\mathop{\boldsymbol{x}}\nolimits$ , as $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ moves along $\gamma$ the support of $\gamma_{\lambda}$ can only change when $\gamma$ meets transversally one or more of the hyper-surfaces where the carrier of the geodesic to $\mathop{\boldsymbol{x}}\nolimits$ changes. This can only happen at discrete points along $\gamma$ so, for some $\epsilon>0$ and $0<\lambda\leqslant\epsilon$ , the carriers of the geodesics $\gamma_{\lambda}$ will be independent of $\lambda$ . Let $(\mathcal{A},\mathcal{B})$ be the support of $\gamma_{\epsilon}$ from $\mathop{\boldsymbol{x}}\nolimits^{*}(\epsilon,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ to $\mathop{\boldsymbol{x}}\nolimits$ , where $\mathcal{A}=(A_{0},A_{1},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},B_{1},\cdots,B_{k})$ . Then, $A_{0}\cup A_{1}\cup\cdots\cup A_{k}=E\cup F$ and, for $0<\lambda\leqslant\epsilon$ , the integer $k$ and the support $(\mathcal{A},\mathcal{B})$ will remain constant for the expression

[TABLE]

replacing $\mathop{\boldsymbol{x}}\nolimits^{*}$ in (11) by $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ . Then, since the $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ lie in $\tau$ for all sufficiently small positive $\lambda$ , the vectors $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau}))$ all lie in $\mathbf{C}_{\tau}$ so that it makes sense to take the limit as $\lambda\rightarrow 0+$ , where $\mathbf{C}_{\tau}$ is the common translated cone of the tangent cone at $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ as introduced in Section 2.

To evaluate it, we take the limit in the above expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau}))$ . Since $\mathop{\boldsymbol{x}}\nolimits^{*}\in{\mathcal{O}}(E)$ , $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})=\mathop{\boldsymbol{x}}\nolimits^{*}+\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau}$ for sufficiently small $\lambda>0$ and it follows that $P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau}))=P_{A_{i}\cap E}(\mathop{\boldsymbol{x}}\nolimits^{*})+\lambda P_{A_{i}}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$ . So the limit as $\lambda\rightarrow 0+$ of this term is $P_{A_{i}\cap E}(\mathop{\boldsymbol{x}}\nolimits^{*})$ if that is non-zero. If it is zero, then $A_{i}\cap E=\emptyset$ since $\|P_{\{e\}}(\mathop{\boldsymbol{x}}\nolimits^{*})\|>0$ for all $e\in E$ . Then $P_{A_{i}}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$ , the projection of $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ on $\mathbb{R}(A_{i})$ is, in fact, $P_{A_{i}\cap F}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$ . ∎
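The selection rule for $W_{i}$ used in this last step can be illustrated with a small sketch. It is an illustration only: axes are identified with 0-based coordinate indices in $\mathbb{R}^{M}$ , the helper names `project` and `limit_term` are hypothetical, and the way $W_{i}$ enters (19) is not reproduced here.

```python
import numpy as np


def project(v, axes):
    """P_axes(v): keep the coordinates of v indexed by `axes`, zero the rest."""
    out = np.zeros(len(v), dtype=float)
    idx = np.array(sorted(axes), dtype=int)
    out[idx] = np.asarray(v, dtype=float)[idx]
    return out


def limit_term(x_star, w_tau, A_i, E, F):
    """W_i as described in Theorem 2 (ii).

    Returns P_{A_i ∩ E}(x*) if that is non-zero; otherwise A_i ∩ E is empty
    and the result is P_{A_i ∩ F}(w_tau).
    """
    head = project(x_star, set(A_i) & set(E))
    if np.any(head != 0):
        return head
    return project(w_tau, set(A_i) & set(F))


# Hypothetical data: sigma = O(E) with E = {0, 1}, tau = O(E ∪ F) with F = {2}.
E, F = {0, 1}, {2}
x_star = np.array([1.0, 2.0, 0.0, 0.0])   # strictly positive on E, zero elsewhere
w_tau = np.array([0.5, -1.0, 3.0, 0.0])   # in R(E) x O(F): positive on F

print(limit_term(x_star, w_tau, {0}, E, F))   # A_i meets E: uses x*
print(limit_term(x_star, w_tau, {2}, E, F))   # A_i inside F: uses w_tau

# The identity P_{A_i}(x* + lam*w) = P_{A_i ∩ E}(x*) + lam*P_{A_i}(w) from the proof:
lam, A_i = 1e-3, {0, 2}
lhs = project(x_star + lam * w_tau, A_i)
rhs = project(x_star, A_i & E) + lam * project(w_tau, A_i)
print(np.allclose(lhs, rhs))
```

The second call shows the case $A_{i}\cap E=\emptyset$ : the term $P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*}+\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau})$ itself tends to zero, so only the $F$ -coordinates of $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ are retained.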

If $\sigma$ has co-dimension $l$ and $\tau$ co-dimension $l^{\prime}$ then, when $l-l^{\prime}=1$ and so $|F|=1$ , there is no $i>0$ such that $|A_{i}|>1$ and $P_{A_{i}\cap E}(\mathop{\boldsymbol{x}}\nolimits^{*})=0$ as all the axes involved in the carrier that are not in $E\cup F$ are in $A_{0}=B_{0}$ . If further $l=1$ and $l^{\prime}=0$ , that is, $\sigma$ is a stratum of local co-dimension one and $\tau$ co-bounding $\sigma$ is a locally top-dimensional stratum, then $\Psi(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ obtained here is identical with the map resulting from the ‘folding map’ composed with $\Phi(\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ used in [3] when $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a tree space, noting that $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ in this case is unique up to a positive scalar multiple.

Example 3.

Consider the orthant space $\mathop{\boldsymbol{X}}\nolimits^{2}$ in Example 2. Take $\sigma=\{o\}$ and $\tau=\mathop{\mathcal{O}}\nolimits(u_{1},u_{2})$ . Recall from Example 2 that the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{2}$ at $o$ is $\mathop{\boldsymbol{X}}\nolimits^{2}$ itself. Take $\mathop{\boldsymbol{w}}\nolimits_{\tau}=\mathop{\boldsymbol{x}}\nolimits^{*}$ , where $\mathop{\boldsymbol{x}}\nolimits^{*}\in\tau$ is indicated in Figure 2. Then, $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};o)=\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ for any $\mathop{\boldsymbol{x}}\nolimits\in\mathop{\boldsymbol{X}}\nolimits^{2}$ . Since the light grey region in Figure 2 may change if $\mathop{\boldsymbol{x}}\nolimits^{*}$ changes, $\Phi(\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ may change as a map when $\mathop{\boldsymbol{x}}\nolimits^{*}$ changes. Hence, the directional limit $\Psi(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits_{\tau};o)$ of $\Phi(\,\cdot\,;\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau})$ from the direction $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ as $\lambda\rightarrow 0$ , as a map, also depends on $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ .
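A short numerical sketch of Example 3 for $\mathop{\boldsymbol{x}}\nolimits=(0,0,x_{3},x_{4},0)$ follows. The closed form used in the dark grey region, $(-x_{3},-x_{4})$ in $(u_{1},u_{2})$ -coordinates, is an assumption obtained by unfolding the quadrants $\mathop{\mathcal{O}}\nolimits(u_{1},u_{2})$ , $\mathop{\mathcal{O}}\nolimits(u_{2},u_{3})$ , $\mathop{\mathcal{O}}\nolimits(u_{3},u_{4})$ into the plane and taking $\Phi=\log+\mathop{\boldsymbol{x}}\nolimits^{*}$ ; the light grey form has the direction and length described in Example 2. The function name `phi_34` is hypothetical.

```python
import math


def phi_34(x3, x4, w1, w2):
    """(u1, u2)-coordinates of Phi((0,0,x3,x4,0); (w1,w2,0,0,0)), with w1, w2 > 0.

    Assumed closed forms: (-x3, -x4) in the dark grey region, and a vector of
    length ||x|| in the direction -w/||w|| in the light grey region.
    """
    if x4 * w1 < x3 * w2:            # x4/x3 < w2/w1 = tan(alpha): dark grey
        return (-x3, -x4)
    r = math.hypot(x3, x4)           # light grey (the two forms agree on the boundary)
    n = math.hypot(w1, w2)
    return (-r * w1 / n, -r * w2 / n)


# Two directions of approach to the cone point o give different limits at x = (0,0,1,1,0).
print(phi_34(1.0, 1.0, 1.0, 2.0))    # x lies in the dark grey region for this w
print(phi_34(1.0, 1.0, 2.0, 1.0))    # x lies in the light grey region for this w

# Along a fixed direction the value does not depend on lambda, so the limit is attained.
for lam in (1.0, 0.1, 0.001):
    print(lam, phi_34(1.0, 1.0, lam * 2.0, lam * 1.0))
```

Since both branches are unchanged when $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ is replaced by $\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau}$ , the sketch is consistent with $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};o)=\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ when $\mathop{\boldsymbol{w}}\nolimits_{\tau}=\mathop{\boldsymbol{x}}\nolimits^{*}$ , while the two directions above give different values at the same $\mathop{\boldsymbol{x}}\nolimits$ .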

For $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ as given in Theorem 2 $(ii)$ , write $\mathop{\boldsymbol{w}}\nolimits_{\tau}^{\perp}$ for the component of $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ orthogonal to $\sigma$ , that is, the component in $\{\textbf{0}\}\times\mathop{\mathcal{O}}\nolimits(F)\subset\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ . Then, the following consequences of Theorem 2 imply that, although the directional limit $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ generally depends on $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ , for given $\mathop{\boldsymbol{x}}\nolimits$ and $\mathop{\boldsymbol{x}}\nolimits^{*}$ , as noted in Example 3 above, it remains constant in some circumstances. In particular, to consider the changes of $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ as $\mathop{\boldsymbol{x}}\nolimits$ varies, it suffices to restrict attention to $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits_{\tau\setminus\sigma}^{l-l^{\prime}}$ , recalling that $\mathop{\mathcal{S}}\nolimits_{\tau\setminus\sigma}^{l-l^{\prime}}$ is the open unit spherical segment of $\{{\bf 0}\}\times\mathop{\mathcal{O}}\nolimits(F)$ given by Definition 4.

Corollary 3.

With the notation and hypotheses of Theorem 2 $(ii)$ ,

$(i)$

$\Psi(\,\cdot\,,\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})=\Psi(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ for all $\lambda>0$ ;

$(ii)$

$\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})=\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{\perp};\mathop{\boldsymbol{x}}\nolimits^{*})$ .

Proof.

$(i)$ is obvious from the expression (19) and $(ii)$ is immediate since $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ , $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ and only the $F$ -coordinates of $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ are potentially involved in (19). ∎

When $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a stratum $\sigma$ of positive co-dimension that is not locally top-dimensional, the vector $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ , and so $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ , will usually have non-zero components both tangent to $\sigma$ and orthogonal to it. In order to discuss the projections, onto these components, of the translated logarithm map and of its directional limits, as well as to discuss their derivatives, we extend the notation $P$ for projection maps on $\mathop{\boldsymbol{X}}\nolimits^{m}$ given by Definition 9 to include projection maps on tangent cones, or their translated cones. However, since we are more interested in the orthant itself rather than the axes determining it, we shall use $P_{\sigma}$ instead of $P_{E}$ , where $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ . In particular, for any stratum $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ co-bounding $\sigma$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , $P_{\sigma}$ and $P_{\tau\setminus\sigma}$ respectively are the projections onto the two factors of the corresponding stratum $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ in the common translated cone $\mathop{\boldsymbol{C}}\nolimits_{\sigma}$ , or equivalently in the tangent cone at a point of $\sigma$ , depending on the context.

For $\mathop{\boldsymbol{x}}\nolimits^{*}$ in $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ or in $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ co-bounding $\sigma$ , we shall denote $P_{\sigma}(\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits))$ by $\log^{\sigma}_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ and $P_{\sigma}(\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*}))$ by $\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ . Note that, on $\mathop{\boldsymbol{C}}\nolimits_{\sigma}$ , $P_{\sigma}$ so defined is the tangential projection onto $\sigma$ and $P_{\tau\setminus\sigma}$ is one of several possible normal projections. In particular, for $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ , $\mathop{\boldsymbol{w}}\nolimits_{\tau}^{\perp}=P_{\tau\setminus\sigma}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$ . We shall further extend the notation $P_{\sigma}$ to include top-dimensional, or locally top-dimensional, strata by taking it to be the identity in that case, so that in particular $\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})=\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ if $\sigma$ is a top-dimensional, or locally top-dimensional, stratum.

For $\mathop{\boldsymbol{x}}\nolimits^{*}$ in $\sigma$ of locally positive co-dimension, the non-zero components of $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ orthogonal to $\sigma$ correspond to axes with respect to which $\mathop{\boldsymbol{x}}\nolimits^{*}$ has zero coefficient and $\mathop{\boldsymbol{x}}\nolimits$ has non-zero coefficient. Hence, these axes are in $A_{0}=B_{0}=E(\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits)$ , the set of axes common to all strata in the carrier of the geodesic between these two points, so that they correspond to components of $v_{0}$ in (12). This implies, in particular, that $\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ is given by (11) with $P_{B_{0}}(\mathop{\boldsymbol{x}}\nolimits)$ there replaced by $P_{B_{0}\cap E}(\mathop{\boldsymbol{x}}\nolimits)$ . Then, since the restriction to each stratum of the set $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits}$ given by Definition 11 is relatively closed, the form of the expression for $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{\prime})$ will remain constant for $\mathop{\boldsymbol{x}}\nolimits^{\prime}$ varying in a neighbourhood of $\mathop{\boldsymbol{x}}\nolimits^{*}$ in $\sigma$ when $\mathop{\boldsymbol{x}}\nolimits^{*}$ is restricted to avoid $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits}$ . 
Hence, the proof of Lemma 4 in [3] of the differentiability of $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ with respect to $\mathop{\boldsymbol{x}}\nolimits^{*}$ for the case that $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a tree space and $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a top-dimensional stratum will give the following generalisation of that result to the derivative of $\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ with respect to $\mathop{\boldsymbol{x}}\nolimits^{*}$ . Since the proof is similar to that for Lemma 4 in [3], we omit it here.

Proposition 5.

Let $\mathop{\boldsymbol{x}}\nolimits$ and $\mathop{\boldsymbol{x}}\nolimits^{*}$ be fixed points in $\mathop{\boldsymbol{X}}\nolimits^{m}$ with $\mathop{\boldsymbol{x}}\nolimits^{*}$ in the stratum $\sigma=\mathcal{O}(E)$ and $\mathop{\boldsymbol{x}}\nolimits\not\in\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ , where the set $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ is given by Definition 11. Then, the map

[TABLE]

is differentiable with respect to $\mathop{\boldsymbol{x}}\nolimits^{\prime}$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ with derivative given by

[TABLE]

where the sequences $\mathcal{A}=(A_{0},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},\cdots,B_{k})$ form the support of the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ and $J$ is the matrix representation of the linear transformation $\jmath$ given by Definition 5, and where, for $\mathop{\boldsymbol{y}}\nolimits=(y_{1},\cdots,y_{l})\not=0$ ,

[TABLE]

is the derivative of the map $\mathop{\boldsymbol{y}}\nolimits\mapsto\frac{1}{\|\mathop{\boldsymbol{y}}\nolimits\|}\mathop{\boldsymbol{y}}\nolimits$ .

Note that, if $l>1$ , $\|\mathop{\boldsymbol{y}}\nolimits\|\,M^{\dagger}_{\mathop{\boldsymbol{y}}\nolimits}$ is the projection onto the hyper-plane in $\mathbb{R}^{l}$ orthogonal to $\mathop{\boldsymbol{y}}\nolimits$ and, when $l=1$ , $M^{\dagger}_{y_{1}}=0$ . Hence, if $k=0$ or if $k>0$ and $|A_{i}|=1$ for all $1\leqslant i\leqslant k$ , then the derivative of $\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{\prime})$ , with respect to $\mathop{\boldsymbol{x}}\nolimits^{\prime}$ , at $\mathop{\boldsymbol{x}}\nolimits^{\prime}=\mathop{\boldsymbol{x}}\nolimits^{*}$ is zero. Recall that the corresponding translated logarithm map in the Euclidean space is the identity map, independent of $\mathop{\boldsymbol{x}}\nolimits^{\prime}$ , and so its derivative with respect to $\mathop{\boldsymbol{x}}\nolimits^{\prime}$ is identically zero. Hence, in a broad sense, Proposition 5 captures where and how the derivative of $\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{\prime})$ differs from that of the corresponding translated Euclidean logarithm map.
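The two facts about $M^{\dagger}_{\mathop{\boldsymbol{y}}\nolimits}$ used in this remark can be checked numerically. The Python sketch below is illustrative only: it assumes the standard closed form $(I-\mathop{\boldsymbol{y}}\nolimits^{\top}\mathop{\boldsymbol{y}}\nolimits/\|\mathop{\boldsymbol{y}}\nolimits\|^{2})/\|\mathop{\boldsymbol{y}}\nolimits\|$ for the Jacobian of $\mathop{\boldsymbol{y}}\nolimits\mapsto\mathop{\boldsymbol{y}}\nolimits/\|\mathop{\boldsymbol{y}}\nolimits\|$, which is consistent with the projection property stated above, and the function names and the test vector are hypothetical choices made for this example.

```python
import math


def normalise(y):
    """Return y / ||y|| for a non-zero vector y given as a list of floats."""
    n = math.sqrt(sum(c * c for c in y))
    return [c / n for c in y]


def m_dagger(y):
    """Assumed closed form of the Jacobian: (I - y^T y / ||y||^2) / ||y||."""
    n2 = sum(c * c for c in y)
    n = math.sqrt(n2)
    size = len(y)
    return [[((1.0 if i == j else 0.0) - y[i] * y[j] / n2) / n
             for j in range(size)] for i in range(size)]


def numerical_jacobian(y, h=1e-6):
    """Central differences; entry [i][j] approximates d(normalise(y)[j]) / d y[i]."""
    rows = []
    for i in range(len(y)):
        up = list(y)
        up[i] += h
        down = list(y)
        down[i] -= h
        f_up, f_down = normalise(up), normalise(down)
        rows.append([(a - b) / (2 * h) for a, b in zip(f_up, f_down)])
    return rows


y = [3.0, 4.0, 12.0]  # hypothetical test vector with ||y|| = 13
M = m_dagger(y)
J = numerical_jacobian(y)

# Largest entrywise gap between the closed form and the finite differences.
max_gap = max(abs(M[i][j] - J[i][j]) for i in range(3) for j in range(3))

# ||y|| * M projects onto the hyperplane orthogonal to y, so y itself is sent to 0.
image_of_y = [sum(y[i] * M[i][j] for i in range(3)) for j in range(3)]

# Case l = 1: the normalised "vector" is constant (+1 or -1), so the derivative vanishes.
one_dim = m_dagger([2.5])
```

Here `max_gap` should be at the level of finite-difference error, `image_of_y` should vanish up to rounding, and `one_dim` is the $1\times 1$ zero matrix, matching the case $l=1$ of the remark.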

Returning to the directional limit $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ of $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{\prime})$ with $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma=\mathop{\mathcal{O}}\nolimits(E)$ , where $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ co-bounds $\sigma$ and $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ is in $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ , since $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ is in $\mathop{\boldsymbol{C}}\nolimits_{\tau}$ , both projections $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})=P_{\tau}(\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*}))$ and $P_{\sigma}(\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*}))$ are well defined. In particular, $\Psi_{\tau}(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ is a map from $\mathop{\boldsymbol{X}}\nolimits^{m}$ onto $\mathbb{R}(E\cup F)$ . Then, we also have the following consequences of Theorem 2, giving the relationships between the projections of the directional limit of the translated logarithm map and the directional limit of the projections of the translated logarithm map.

Corollary 4.

With the notation and hypotheses of Theorem 2 $(ii)$ ,

$(i)$

$\lim\limits_{\lambda\rightarrow 0+}\Phi_{\tau}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau}))=\lim\limits_{\lambda\rightarrow 0+}\Phi_{\tau}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{\perp}))=\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ ;

$(ii)$

$P_{\sigma}\left(\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})\right)=\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ .

Proof.

The equality of the extreme terms in $(i)$ follows since the $W_{i}$ in (19) are determined by the axes in $E\cup F$ , so that it does not matter whether we project on $\mathop{\mathcal{O}}\nolimits(E\cup F)$ before or after taking the limit, and the remaining term $P_{B_{0}\cap(E\cup F)}(\mathop{\boldsymbol{x}}\nolimits)$ remains constant throughout the limiting process. The equality with the central term in $(i)$ follows from Corollary 3 $(ii)$ : $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})=\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{\perp};\mathop{\boldsymbol{x}}\nolimits^{*})$ , which is $\lim\limits_{\lambda\rightarrow 0+}\Phi_{\tau}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{\perp}))$ by the case already established.

Note that, since projection onto $\mathbb{R}(E)\subset\mathbb{R}(E\cup F)$ is unaffected by first projecting onto $\mathbb{R}(E\cup F)$ , $(ii)$ is equivalent to

[TABLE]

To establish (24), we need to allow for the fact that the geodesics $\gamma_{\lambda}$ from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ to $\mathop{\boldsymbol{x}}\nolimits$ and the geodesic $\gamma_{0}$ from $\mathop{\boldsymbol{x}}\nolimits^{*}$ to $\mathop{\boldsymbol{x}}\nolimits$ may have different carriers. We assume that $\lambda$ is restricted to the range $0<\lambda<\epsilon$ such that the initial segments of $\gamma_{\lambda}$ all lie in $\zeta=\mathcal{O}(E\cup F\cup G)$ , where possibly $G=\emptyset$ and so $\zeta=\tau$ , and let $K$ be the set of axes with respect to which the initial segment of $\gamma_{0}$ has positive coordinates. Then, $K\supseteq E\cup G$ . Now, $e\in E\cup F\cup G$ if, and only if, for each $\lambda$ and some maximal $\delta(\lambda)>0$ , $\|P_{\{e\}}(\gamma_{\lambda}(s))\|>0$ for $s\in(0,\delta(\lambda))$ . From the uniform continuity of geodesics with respect to their endpoints, it is clear that we must have $\delta(\lambda)\rightarrow\delta_{0}\geqslant 0$ as $\lambda\rightarrow 0$ . If $\delta_{0}>0$ , then $\|P_{\{e\}}(\gamma_{0}(s))\|>0$ for $s\in(0,\delta_{0})$ and so $e\in K$ . Conversely, $e\in K\cap(E\cup F\cup G)$ implies that $\|P_{\{e\}}(\gamma_{0}(s))\|>0$ for $s\in(0,\delta(0))$ and we must have $\delta_{0}=\delta(0)$ .

Thus, for any axis $e$ in $K\cap(E\cup F\cup G)$ , the projections $P_{\{e\}}(\gamma_{\lambda}(s))$ and $P_{\{e\}}(\gamma_{0}(s))$ of the initial segments of these geodesics all lie in the closure of the stratum $\mathcal{O}(E\cup F\cup G)$ . The uniform continuity of these geodesics, and so of their projections, with respect to their endpoints, together with their linearity within that closed stratum, implies that the components $P_{\{e\}}\left(\dot{\gamma}_{\lambda}(0)\right)$ converge to $P_{\{e\}}\left(\dot{\gamma}_{0}(0)\right)$ as $\lambda\rightarrow 0$ . In particular, since $E\subseteq K$ , this is valid for any axis $e$ in $E$ , which establishes (24). ∎

The comments made prior to Proposition 5 regarding the form of the expression for $\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ can be generalised to apply to $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ : using the notation in Theorem 2 $(ii)$ for $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ we have that

[TABLE]

Recall that $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ denotes the set of unit vectors in $\{{\bf 0}\}\times\mathop{\mathcal{O}}\nolimits(F)\subset\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ that comprises all unit vectors that are tangent to $\tau$ and orthogonal to $\sigma$ . If $l-l^{\prime}=1$ , $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ comprises a single point. When $l-l^{\prime}>1$ , for any fixed $\mathop{\boldsymbol{x}}\nolimits\in\mathop{\boldsymbol{X}}\nolimits^{m}$ , the pseudo-partition of $\mathop{\boldsymbol{X}}\nolimits^{m}$ determined by $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits}$ induces a polyhedral subdivision of $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ where, in each cell of the induced polyhedral subdivision, the form of the expression (19) for $\Psi(\mathop{\boldsymbol{x}}\nolimits,\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ , and so the form of the expression for $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ , remains the same. In particular, this implies that, for fixed $\mathop{\boldsymbol{x}}\nolimits$ , $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ is a continuous function of $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ . 
In fact, the directional derivatives of $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ with respect to $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ also exist in directions $\mathop{\boldsymbol{v}}\nolimits$ in the tangent space to $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ at $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ that we denote by $\mathcal{T}_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}(\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma})$ . These derivatives have the property given in the following proposition, where we note that $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)\subset\mathbb{R}(E)\times\mathbb{R}(F)$ so that, for fixed $\mathop{\boldsymbol{x}}\nolimits$ and $\mathop{\boldsymbol{x}}\nolimits^{*}$ , $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ and $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ lie in the same Euclidean space.

Proposition 6.

Let the stratum $\sigma=\mathcal{O}(E)$ of co-dimension $l(\geqslant 2)$ bound, in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the stratum $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ of co-dimension $l^{\prime}(<l-1)$ . Fix $\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{x}}\nolimits^{*}\in\mathop{\boldsymbol{X}}\nolimits^{m}$ with $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ . Then, as a function of $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , the directional derivative $D$ of $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ at $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ in the direction $\mathop{\boldsymbol{v}}\nolimits\in\mathcal{T}_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}(\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma})$ exists and satisfies

[TABLE]

Proof.

Without loss of generality, we may assume that $\|\mathop{\boldsymbol{v}}\nolimits\|=1$ . Consider the geodesic on $\mathop{\mathcal{S}}\nolimits_{\tau\setminus\sigma}^{l-l^{\prime}}$ given by $\alpha(s)=\mathop{\boldsymbol{w}}\nolimits_{\tau}\cos s+\mathop{\boldsymbol{v}}\nolimits\sin s$ . Write $w_{1}$ for a vector whose coordinates comprise a subset of those of $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ , and $v_{1}$ , $\alpha_{1}$ for the corresponding components of $\mathop{\boldsymbol{v}}\nolimits$ and $\alpha$ respectively. Then, the initial tangent vector of the function $f(s)=\frac{\alpha_{1}(s)}{\|\alpha_{1}(s)\|}$ is $\dot{f}(0)=v_{1}M^{\dagger}_{w_{1}}$ , where $M^{\dagger}_{\mathop{\boldsymbol{y}}\nolimits}$ is given by (23). Clearly, $\langle w_{1},\dot{f}(0)\rangle=0$ , since the image of $M^{\dagger}_{w_{1}}$ is orthogonal to $w_{1}$ .

On the other hand, it follows from the argument in the proof of Theorem 2 that, for all sufficiently small $s$ , the expressions for $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\alpha(s);\mathop{\boldsymbol{x}}\nolimits^{*})$ all have the same form, provided that, when $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ lies on the boundary of a cell of the induced polyhedral subdivision on $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , we use for $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ the expression valid for $s>0$ . Thus, we may use the expression for $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ given by (25) to express $D\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})(\mathop{\boldsymbol{v}}\nolimits)$ in the form $\mathop{\boldsymbol{v}}\nolimits M_{\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$ , where

[TABLE]

and where, using the notation of Theorem 2, $W_{l_{i}}=P_{A_{l_{i}}\cap F}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$ are just those components in the expression for $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ for which $P_{A_{l_{i}}\cap E}(\mathop{\boldsymbol{x}}\nolimits^{*})=0$ and $|A_{l_{i}}\cap F|>1$ . Since $\|y\|M^{\dagger}_{y}$ is the projection onto the hyperplane orthogonal to $y$ in the Euclidean space where $y$ lies as noted after the statement of Proposition 5, the result follows. ∎
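The orthogonality used in the first step of this proof can be illustrated numerically. The following Python sketch is not part of the argument: it takes a hypothetical unit vector $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ and unit tangent direction $\mathop{\boldsymbol{v}}\nolimits$ in $\mathbb{R}^{4}$, selects a hypothetical block of three coordinates to play the role of $w_{1}$, and compares a finite-difference estimate of $\dot{f}(0)$ with $v_{1}M^{\dagger}_{w_{1}}$, where $M^{\dagger}$ is assumed to have the standard form $(I-w_{1}^{\top}w_{1}/\|w_{1}\|^{2})/\|w_{1}\|$.

```python
import math


def unit(u):
    """Return u / ||u||."""
    n = math.sqrt(sum(c * c for c in u))
    return [c / n for c in u]


def alpha(w, v, s):
    """Great circle through w with initial velocity v: w cos s + v sin s."""
    return [wi * math.cos(s) + vi * math.sin(s) for wi, vi in zip(w, v)]


def f(w, v, s, block):
    """Normalised sub-vector alpha_1(s) / ||alpha_1(s)|| for the chosen coordinates."""
    a = alpha(w, v, s)
    return unit([a[i] for i in block])


def v_m_dagger(v1, w1):
    """Row vector v1 times the assumed M-dagger at w1."""
    n2 = sum(c * c for c in w1)
    dot = sum(a * b for a, b in zip(v1, w1))
    return [(a - dot * b / n2) / math.sqrt(n2) for a, b in zip(v1, w1)]


# Hypothetical data: w and v are orthogonal unit vectors in R^4.
w = unit([1.0, 2.0, 2.0, 4.0])
v = unit([2.0, -1.0, 4.0, -2.0])
block = [0, 1, 2]  # coordinates making up w_1 (an illustrative choice)

h = 1e-6
f_dot = [(a - b) / (2 * h)
         for a, b in zip(f(w, v, h, block), f(w, v, -h, block))]

w1 = [w[i] for i in block]
v1 = [v[i] for i in block]
closed = v_m_dagger(v1, w1)

# <w_1, f'(0)> should vanish, and f'(0) should agree with v_1 M-dagger.
inner = sum(a * b for a, b in zip(w1, f_dot))
gap = max(abs(a - b) for a, b in zip(f_dot, closed))
```

Both `inner` and `gap` should be zero up to finite-difference error, in line with $\langle w_{1},\dot{f}(0)\rangle=0$.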

The proof of Proposition 6 also shows that, if $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ lies in the interior of a single cell of the induced polyhedral subdivision of $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , then $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{\prime}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ is differentiable with respect to $\mathop{\boldsymbol{w}}\nolimits^{\prime}_{\tau}$ at $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ . However, if $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ lies in the boundary of a cell of the induced polyhedral subdivision, this no longer holds, although directional derivatives still exist.

The directional derivative of $\langle\mathop{\boldsymbol{w}}\nolimits_{\tau},\,\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})\rangle$ , as a function of $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ , now follows from Proposition 6.

Corollary 5.

Assume that all assumptions in Proposition 6 hold. Then, for any $\mathop{\boldsymbol{v}}\nolimits\in\mathcal{T}_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}(\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma})$ , the derivative $D$ in the direction $\mathop{\boldsymbol{v}}\nolimits$ of $\langle\mathop{\boldsymbol{w}}\nolimits_{\tau},\,\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})\rangle$ at $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ is given by

[TABLE]

Proof.

The second term in the expansion

[TABLE]

vanishes by Proposition 6. The result then follows since the directional derivative $D\mathop{\boldsymbol{w}}\nolimits_{\tau}(\mathop{\boldsymbol{v}}\nolimits)$ is given by the derivative at $s=0$ of the geodesic $\alpha(s)=\mathop{\boldsymbol{w}}\nolimits_{\tau}\cos s+\mathop{\boldsymbol{v}}\nolimits\sin s$ . ∎

6 Characterisation of Fréchet means

In the remainder of this paper, we use the knowledge obtained so far on the translated logarithm map to investigate Fréchet means of probability measures on $\mathop{\boldsymbol{X}}\nolimits^{m}$ . So, from now on we assume that $\mu$ is a probability measure on $\mathop{\boldsymbol{X}}\nolimits^{m}$ and that its Fréchet function defined by (1), where $\mathop{\boldsymbol{M}}\nolimits=\mathop{\boldsymbol{X}}\nolimits^{m}$ , is finite at one point. The latter ensures that the Fréchet function of $\mu$ is finite everywhere.

Since the squared distance on a CAT(0)-space is a convex function with respect to each of its variables, it follows that the Fréchet mean of $\mu$ is unique and that the condition for $\mathop{\boldsymbol{x}}\nolimits^{*}$ to be the Fréchet mean of $\mu$ , that is, the condition for $\mathop{\boldsymbol{x}}\nolimits^{*}$ to satisfy

[TABLE]

is equivalent to this inequality holding in any neighbourhood of $\mathop{\boldsymbol{x}}\nolimits^{*}$ . Then, since the Fréchet function of $\mu$ is differentiable at $\mathop{\boldsymbol{x}}\nolimits^{*}$ if $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a top-dimensional, or locally top-dimensional, stratum, the above condition for such $\mathop{\boldsymbol{x}}\nolimits^{*}$ to be the Fréchet mean of $\mu$ is equivalent to the condition that

[TABLE]

similar to the condition for Fréchet means in Riemannian manifolds of non-positive curvature.

When $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a stratum $\sigma$ of locally positive co-dimension, the squared distance $d(\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits)^{2}$ is no longer differentiable at $\mathop{\boldsymbol{x}}\nolimits^{*}$ for any fixed $\mathop{\boldsymbol{x}}\nolimits$ . Nevertheless, it has directional derivatives along all possible directions and then the above condition becomes that, at $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ , the Fréchet function of $\mu$ has non-negative directional derivatives along all possible directions. The fact that $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a CAT(0)-space also implies that the derivative at $\mathop{\boldsymbol{x}}\nolimits^{*}$ in the direction $\mathop{\boldsymbol{w}}\nolimits$ of the distance function $d_{\mathop{\boldsymbol{x}}\nolimits}=d(\cdot,\mathop{\boldsymbol{x}}\nolimits)$ can be expressed as

[TABLE]

where $\ll\,\,,\,\,\gg$ is defined by (2) (cf. [14], (2.5), p417). Thus, the criterion for a point $\mathop{\boldsymbol{x}}\nolimits^{*}$ lying in a stratum $\sigma$ of locally positive co-dimension to be the Fréchet mean of $\mu$ is equivalent to the condition that

[TABLE]

for all tangent vectors $\mathop{\boldsymbol{w}}\nolimits$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ .
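As a sanity check on the directional derivative of the distance function, the Python sketch below works in the Euclidean plane, the simplest CAT(0)-space, where $\ll\,,\gg$ is the usual inner product and $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)=\mathop{\boldsymbol{x}}\nolimits-\mathop{\boldsymbol{x}}\nolimits^{*}$. It assumes the form $Dd_{\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits)=-\ll\mathop{\boldsymbol{w}}\nolimits,\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)\gg/d(\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits)$, as quoted in the proof of Lemma 1; the points and the direction are hypothetical values chosen for illustration.

```python
import math


def dist(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


x_star = (1.0, 2.0)  # base point (hypothetical)
x = (4.0, 6.0)       # target point, at distance 5 from x_star
w = (2.0, 1.0)       # direction at x_star, not normalised

# One-sided difference quotient: only this limit is meaningful at a singular point.
h = 1e-6
moved = (x_star[0] + h * w[0], x_star[1] + h * w[1])
numeric = (dist(moved, x) - dist(x_star, x)) / h

# Assumed formula: -<w, log_{x*}(x)> / d(x*, x), with log_{x*}(x) = x - x*.
log_vec = (x[0] - x_star[0], x[1] - x_star[1])
formula = -(w[0] * log_vec[0] + w[1] * log_vec[1]) / dist(x_star, x)
```

With these values `formula` equals $-2$, and `numeric` agrees with it up to the discretisation error of the difference quotient.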

For any vector $\mathop{\boldsymbol{w}}\nolimits$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ which is tangent to $\sigma$ , the fact that $-\mathop{\boldsymbol{w}}\nolimits$ is also tangent to $\sigma$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ implies that the inequality (28) must be an equality for all such $\mathop{\boldsymbol{w}}\nolimits$ . From this it follows that

[TABLE]

analogous to the condition (27). On the other hand, for any given stratum $\tau$ co-bounding $\sigma$ and any vector $\mathop{\boldsymbol{w}}\nolimits$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ tangent to $\tau$ , it is possible to link the derivative, at $\mathop{\boldsymbol{x}}\nolimits^{*}$ , of the Fréchet function in the direction $\mathop{\boldsymbol{w}}\nolimits$ with $\Psi_{\tau}(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ , the projection of the directional limit of $\Phi(\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ . To show this, we need the following limiting property of the directional derivatives on general CAT(0)-spaces.

Lemma 1.

Let $X$ be a ${\rm CAT}(0)$ -space, and let $x_{0}$ and $x$ be two distinct fixed points in $X$ . For some $\epsilon>0$ , assume that $\gamma:[0,\epsilon)\rightarrow X$ is a geodesic with $\gamma(0)=x$ and $\dot{\gamma}(0)=v_{x}$ . Then, if $\{x_{i}\,:\,i\geqslant 1\}$ is a sequence of points along $\gamma$ convergent to $x$ , the derivative $D$ at $x$ in the direction $v_{x}$ of the distance function $d_{x_{0}}=d(x_{0},\cdot)$ has the property that

[TABLE]

where $v_{x_{i}}$ denotes the tangent vector at $x_{i}$ of the geodesic $\gamma$ .

Proof.

For $x,y,z\in X$ , denote by $\angle_{x}(y,z)$ the Alexandrov angle at $x$ between the geodesics from $x$ to $y$ and $z$ respectively. Since $Dd_{x_{0}}(v_{x})=-\ll v_{x},\log_{x}(x_{0})\gg/d_{x_{0}}(x)=-\|v_{x}\|\cos\angle_{x}(x^{\prime},x_{0})$ where $x^{\prime}$ is a point on the geodesic $\gamma$ , it is sufficient to show that, for a fixed point $x^{\prime}$ chosen on $\gamma$ , $\angle_{x}(x^{\prime},x_{0})=\lim\limits_{i\rightarrow\infty}\angle_{x_{i}}(x^{\prime},x_{0})$ .

For this, we write $\gamma_{a,b}$ for the (unique) geodesic segment joining $a$ and $b$ , for any two distinct points $a$ and $b$ in $X$ . Then, given sequences of points $a_{i}\rightarrow a$ , $b_{i}\rightarrow b$ and $c_{i}\rightarrow c$ in $X$ , it follows from the Cartan-Hadamard theorem that the geodesic segments $\gamma_{a_{i},b_{i}}$ and $\gamma_{a_{i},c_{i}}$ converge uniformly, as maps, to $\gamma_{a,b}$ and $\gamma_{a,c}$ respectively. From this it follows that $\angle_{a}(b,c)\geqslant\limsup_{i\rightarrow\infty}\angle_{a_{i}}(b_{i},c_{i})$ (cf. [7], Theorem 4.3.11, p.119). Applying this to the sequence of geodesic triangles $\Delta(x^{\prime}x_{i}x_{0})$ , we obtain

[TABLE]

On the other hand, using (4.3) p.124 of [7], we have

[TABLE]

where, as in Section 2, $\overline{\angle}$ denotes the corresponding comparison angle in $\mathbb{R}^{2}$ . Then, since $\overline{\angle}_{x_{i}}(x,x_{0})\geqslant\angle_{x_{i}}(x,x_{0})$ , the above implies that

[TABLE]

However, since $X$ has non-positive curvature, if $x_{i}$ lies between $x$ and $x^{\prime}$ on the geodesic segment $\gamma_{x,x^{\prime}}$ , then $\angle_{x_{i}}(x^{\prime},x_{0})+\angle_{x_{i}}(x_{0},x)\geqslant\pi$ (cf. [7], p117, line 5). Hence,

[TABLE]

This, together with (30), gives that

[TABLE]

so that the required result follows. ∎

Recalling that $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})=\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)+\mathop{\boldsymbol{x}}\nolimits^{*}$ , the criteria (28) and (29) for a point $\mathop{\boldsymbol{x}}\nolimits^{*}$ to be the Fréchet mean of $\mu$ may now be recast, the former in terms of the standard Euclidean inner product and $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ , the projection of the directional limit of $\Phi(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ , when $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a stratum $\sigma$ of positive co-dimension and $\mathop{\boldsymbol{w}}\nolimits$ is tangent to a co-bounding stratum $\tau$ .

Theorem 3.

Let $\sigma$ be a stratum in $\mathop{\boldsymbol{X}}\nolimits^{m}$ of co-dimension $l(\geqslant 0)$ . The necessary and sufficient conditions for a given point $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ to be the Fréchet mean of $\mu$ are

$(i)$

for any stratum $\tau$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ of co-dimension $l^{\prime}$ , $0\leqslant l^{\prime}<l$ , co-bounding $\sigma$ and any $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits_{\tau\setminus\sigma}^{l-l^{\prime}}$ ,

[TABLE]

where $\mathop{\mathcal{S}}\nolimits_{\tau\setminus\sigma}^{l-l^{\prime}}$ is given by Definition 4;

$(ii)$

for all $l\geqslant 0$ ,

[TABLE]

Note that case $(i)$ may only occur if $l>0$ , but need not occur then. Note also that, if $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a tree space, the special case $l=0$ of this result is the same as that of Lemma 3 of [3]; and the special case $l=1$ , so that $l^{\prime}=0$ , is equivalent to that given by Lemma 5 of [3]: on the one hand, $\mathcal{S}^{l-l^{\prime}}_{\tau\setminus\sigma}$ contains a single unit vector and, on the other hand, as we noted earlier, $\Psi_{\tau}(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})=\Psi(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ is identical with the composition of the ‘folding map’ with $\Phi(\,\cdot\,;\mathop{\boldsymbol{x}}\nolimits^{*})$ in [3].
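The special case $l=1$ can be made concrete on the simplest example, three half-lines glued at the origin (the 3-spider). The Python sketch below is a hedged illustration, not the construction of this paper: points are encoded as hypothetical pairs `(leg, r)` with $r\geqslant 0$, the map `folded_mean` averages the folding map onto leg $k$ ($+r$ on leg $k$, $-r$ on the other legs), and the sign convention assumed is that the origin is the Fréchet mean exactly when every folded mean is non-positive, which corresponds to the Fréchet function having non-negative directional derivatives along each leg. The sample values are invented for the example.

```python
def dist(p, q):
    """Geodesic distance on the spider: same leg |r - s|, different legs r + s."""
    (i, r), (j, s) = p, q
    return abs(r - s) if i == j else r + s


def frechet(p, sample):
    """Sample Frechet function: half the mean squared distance to p."""
    return sum(dist(p, q) ** 2 for q in sample) / (2 * len(sample))


def folded_mean(k, sample):
    """Mean of the folding map onto leg k: +r on leg k, -r on every other leg."""
    return sum(r if j == k else -r for j, r in sample) / len(sample)


def frechet_mean(sample, legs=3):
    """At most one folded mean can be positive; if none is, the mean is the origin."""
    for k in range(legs):
        m = folded_mean(k, sample)
        if m > 0:
            return (k, m)
    return (0, 0.0)  # the origin, written with leg label 0 by convention


ORIGIN = (0, 0.0)
balanced = [(0, 1.0), (1, 1.0), (2, 1.0)]  # hypothetical sample, one point per leg
skewed = [(0, 5.0), (1, 1.0), (2, 1.0)]    # hypothetical sample dominated by leg 0

sticky_mean = frechet_mean(balanced)  # every folded mean is -1/3: mean at the origin
leg_mean = frechet_mean(skewed)       # folded mean on leg 0 is 1: mean at (0, 1.0)

# One-sided derivative of the Frechet function at the origin along leg 0;
# under the stated assumptions it equals -folded_mean(0, balanced) = 1/3.
h = 1e-6
slope = (frechet((0, h), balanced) - frechet(ORIGIN, balanced)) / h
```

For the balanced sample every directional derivative at the origin is positive, so the mean stays at the origin; for the skewed sample the inequality fails on leg 0 and the mean moves into that leg, where it satisfies the usual first-order condition.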

Proof.

Noting that ( $ii$ ) is precisely (29), it is sufficient to show that ( $i$ ) is equivalent to (28) for any tangent vector $\mathop{\boldsymbol{w}}\nolimits$ that is not tangent to $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ . For this, we fix any stratum $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ , of co-dimension $l^{\prime}$ , co-bounding $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ and take $\mathop{\boldsymbol{w}}\nolimits=\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ . Then, it follows from Lemma 1 that (28) is equivalent to

[TABLE]

where $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})=\mathop{\boldsymbol{x}}\nolimits^{*}+\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau}$ , as defined prior to Theorem 2. Since $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ is tangent to $\tau$ at $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})$ for sufficiently small $\lambda>0$ and, for any given $\mathop{\boldsymbol{x}}\nolimits$ , $\log_{\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})}(\mathop{\boldsymbol{x}}\nolimits)$ is tangent either to $\tau$ or to one of the strata that co-bound $\tau$ , we have

[TABLE]

However,

[TABLE]

Hence, by Corollary 4 $(i)$ and then Corollary 3 $(ii)$ , (33) is equivalent to

[TABLE]

where $\mathop{\boldsymbol{w}}\nolimits_{\tau}^{\perp}=P_{\tau\setminus\sigma}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$ . Decomposing $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ as $\mathop{\boldsymbol{w}}\nolimits_{\tau}=\mathop{\boldsymbol{w}}\nolimits_{\sigma}+\mathop{\boldsymbol{w}}\nolimits_{\tau}^{\perp}$ , where $\mathop{\boldsymbol{w}}\nolimits_{\sigma}=P_{\sigma}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$ , leads to

[TABLE]

where the second equality follows from Corollary 4 $(ii)$ . The required result now follows by noting $(ii)$ , noting that $\langle\mathop{\boldsymbol{w}}\nolimits_{\tau},\mathop{\boldsymbol{x}}\nolimits^{*}\rangle=\langle\mathop{\boldsymbol{w}}\nolimits_{\sigma},\mathop{\boldsymbol{x}}\nolimits^{*}\rangle=0$ and noting that, by applying the projection $P_{\tau}$ to the result of Corollary 3 $(i)$ , $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{\perp}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})=\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{\perp}_{\tau}/\|\mathop{\boldsymbol{w}}\nolimits^{\perp}_{\tau}\|;\mathop{\boldsymbol{x}}\nolimits^{*})$ . ∎

From now on, we assume that $\mathop{\boldsymbol{\xi}}\nolimits$ is a random variable defined on a probability space $(\mathop{\boldsymbol{\Omega}}\nolimits,\mathcal{F},{\bf P})$ with values in $\mathop{\boldsymbol{X}}\nolimits^{m}$ and that $\mu$ is the distribution (measure) of $\mathop{\boldsymbol{\xi}}\nolimits$ , i.e. $\mu(B)={\bf P}(\mathop{\boldsymbol{\xi}}\nolimits^{-1}(B))$ for any Borel set $B$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ . When the stratum containing the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of the probability measure $\mu$ on $\mathop{\boldsymbol{X}}\nolimits^{m}$ is of locally positive co-dimension, (31) being an equality has a significant influence on the nature of the distributions of the Euclidean random variables $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ , which will be seen in Propositions 7, 8 and 9. We shall also see, in Proposition 10, its link with the long term behaviour of sample Fréchet means.

Definition 12.

For the stratum $\sigma$ of co-dimension $l(\geqslant 1)$ , in which the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ lies, and the stratum $\tau$ , of co-dimension $l^{\prime}$ , co-bounding $\sigma$ , the subset $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ of $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ is defined as

[TABLE]

where $\mathop{\mathcal{S}}\nolimits_{\tau\setminus\sigma}^{l-l^{\prime}}$ is given by Definition 4.

The convexity of the directional derivative $D(d_{\mathop{\boldsymbol{x}}\nolimits}^{2})(\mathop{\boldsymbol{w}}\nolimits)$ in $\mathop{\boldsymbol{w}}\nolimits$ (cf. [14], pp. 416-417) ensures that $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ is a convex subset of $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ and that

[TABLE]

is a convex subset of $\bigcup\limits_{\tau\supset\sigma}\mathcal{S}_{\tau\setminus\sigma}^{l-l^{\prime}}\subseteq\mathop{\boldsymbol{C}}\nolimits_{\sigma}$ . If $l-l^{\prime}=1$ , $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ consists of a single unit vector so that $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ is either $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ itself or an empty set. In general, if the closure of $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ is contained in $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , the fact that $\langle\mathop{\boldsymbol{w}}\nolimits_{\tau},\,\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})\rangle$ is continuous in $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ implies that $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ itself must be closed.

The following result gives a relationship between the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ and the Euclidean mean of $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ . Here, and henceforth, by interior we mean the relative interior, that is, the interior with respect to the subspace topology.

Proposition 7.

Let the stratum $\sigma$ of co-dimension $l(\geqslant 2)$ bound, in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the stratum $\tau$ of co-dimension $l^{\prime}(<l-1)$ . Assume that the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ lies in $\sigma$ and that ${\rm int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))\not=\emptyset$ . Then, for any $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ ,

[TABLE]

Note that, if $l^{\prime}=l-1$ , equality (36) holds automatically since its left hand side is a 1-dimensional vector so that the equality follows from the assumption that $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ .

Proof.

By the continuity of $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ in $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ , we may assume that $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\hbox{int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ . Then equality holds in (31) in a neighbourhood of $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ , so that

[TABLE]

By Corollary 5, this implies that

[TABLE]

On the other hand, it follows from $\int_{\mathop{\boldsymbol{X}}\nolimits^{m}}\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})\,d\mu(\mathop{\boldsymbol{x}}\nolimits)=\mathop{\boldsymbol{x}}\nolimits^{*}$ and $\langle\mathop{\boldsymbol{w}}\nolimits_{\tau},\mathop{\boldsymbol{x}}\nolimits^{*}\rangle=0$ that

[TABLE]

Hence, taking the directional derivative of the left hand side as a function of $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , we have

[TABLE]

for all $\mathop{\boldsymbol{v}}\nolimits\in\mathcal{T}_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}(\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma})$ . Noting that the left hand side of (36) is a vector lying in the $(l-l^{\prime})$ -dimensional Euclidean space containing $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , the fact that $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ , together with the above, implies that the required result holds for any $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\hbox{int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ . ∎

One immediate consequence of Proposition 7 is the following.

Corollary 6.

Assume that the conditions given in Proposition 7 are satisfied. If $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ and $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ then, for all $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ ,

[TABLE]

That is, the point $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ , as a point in $\mathbb{R}(E\cup F)$ , is the Euclidean mean of each of the Euclidean random variables $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ for such $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ .

If a stratum $\sigma=\mathcal{O}(E)$ of co-dimension $l(\geqslant 1)$ bounds, in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the stratum $\tau$ of co-dimension $l^{\prime}(<l)$ and if $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ , then it follows from the proof of Proposition 6 that the maps $\Psi_{\tau}(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ and $\Psi_{\tau}(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ from $\mathop{\boldsymbol{X}}\nolimits^{m}$ to $\mathbb{R}(E\cup F)$ are generally not identical for any given distinct $\mathop{\boldsymbol{w}}\nolimits^{i}_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , $i=1,2$ . With the insight obtained from that proof, to characterise the places where they differ we introduce the subset $\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau})$ of $\mathop{\boldsymbol{X}}\nolimits^{m}$ as follows. It will be clear later, in the proof of Proposition 9, that the set of $\mathop{\boldsymbol{x}}\nolimits\in\mathop{\boldsymbol{X}}\nolimits^{m}$ where $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{1};\mathop{\boldsymbol{x}}\nolimits^{*})\not=\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{2};\mathop{\boldsymbol{x}}\nolimits^{*})$ is contained in the set $\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau}^{1})\cup\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau}^{2})\cup\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ . 
Thus, in particular, for $\mathop{\boldsymbol{\xi}}\nolimits$ lying outside of the latter set, the Euclidean random variables $\Psi(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{1};\mathop{\boldsymbol{x}}\nolimits^{*})$ and $\Psi(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{2};\mathop{\boldsymbol{x}}\nolimits^{*})$ are identical. This fact will be used in the derivation of the limiting distribution of sample Fréchet means in the next section.

Definition 13.

Let the stratum $\sigma=\mathcal{O}(E)$ of co-dimension $l(\geqslant 1)$ bound, in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the stratum $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ of co-dimension $l^{\prime}(<l)$ . For $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ and $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , a point $\mathop{\boldsymbol{x}}\nolimits\in\mathop{\boldsymbol{X}}\nolimits^{m}$ is called singular with respect to $(\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{w}}\nolimits_{\tau})$ if at least one $A_{i}$ with $A_{i}\cap E=\emptyset$ has $|A_{i}\cap F|>1$ , where $i\geqslant 1$ and the sequences $\mathcal{A}=(A_{0},A_{1},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},B_{1},\cdots,B_{k})$ form the support of the geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}+\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau}$ to $\mathop{\boldsymbol{x}}\nolimits$ for all sufficiently small $\lambda>0$ . The set $\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau})$ consists of all points $\mathop{\boldsymbol{x}}\nolimits$ that are singular with respect to $(\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{w}}\nolimits_{\tau})$ .
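
Purely as an illustration of Definition 13, the singularity condition can be phrased as a predicate on a given support. The following minimal Python sketch assumes that the sequence $\mathcal{A}=(A_{0},\cdots,A_{k})$ has already been obtained by other means and is supplied as a list of sets of axis labels; the function name `is_singular`, its parameter names and the labels used in any call are hypothetical and do not come from the paper.

```python
from typing import AbstractSet, Hashable, Sequence


def is_singular(
    support_a: Sequence[AbstractSet[Hashable]],
    e_axes: AbstractSet[Hashable],
    f_axes: AbstractSet[Hashable],
) -> bool:
    """Check the condition of Definition 13 on a precomputed support.

    support_a is the sequence (A_0, A_1, ..., A_k); e_axes and f_axes play
    the roles of E and F. Computing the support itself is NOT done here:
    it is assumed to be given (hypothetical input format).
    """
    # Only the A_i with i >= 1 are inspected, so A_0 is skipped.
    for a_i in support_a[1:]:
        # Singular if some A_i misses E entirely but contains
        # more than one axis of F.
        if not (a_i & e_axes) and len(a_i & f_axes) > 1:
            return True
    return False
```

When `f_axes` contains a single label, the predicate can never return `True`, which agrees with the case $l-l^{\prime}=1$.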

For example, in the orthant space $\mathop{\boldsymbol{X}}\nolimits^{2}$ of Example 3, using the notation there, $\Sigma_{\tau,\{o\}}(o;\mathop{\boldsymbol{w}}\nolimits_{\tau})$ is the closure of the light grey region in Figure 2. It follows from comparison of the corresponding expressions (19) and (25) that the singularity of $\mathop{\boldsymbol{x}}\nolimits$ with respect to $(\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{w}}\nolimits_{\tau})$ has the same effect on $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ as it does on $\Psi(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ . In particular, in terms of the matrix $M_{\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits)$ given by (26), we can express $\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau})$ defined above as

[TABLE]

Note that $\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau})=\emptyset$ if $l-l^{\prime}=1$ , since then $\mathcal{S}^{l-l^{\prime}}_{\tau\setminus\sigma}$ contains a single unit vector $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ which leads to the impossibility that $|A_{i}\cap F|>1$ . Generally, if $l-l^{\prime}>1$ , which implies that $l\geqslant 2$ , $\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau})$ could be relatively substantial. Nevertheless, we have the following result on the measure of $\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau})$ .

Proposition 8.

Let the stratum $\sigma$ of co-dimension $l(\geqslant 2)$ bound, in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the stratum $\tau$ of co-dimension $l^{\prime}(<l-1)$ . Assume that the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ lies in $\sigma$ and that $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in{\rm int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ , where $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ is defined by (34). Then, $\mu\left(\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau})\right)=0$ .

Proof.

Let $\alpha(s)$ be a unit speed geodesic in $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , write $\mathop{\boldsymbol{v}}\nolimits(s)=\dot{\alpha}(s)$ and define $h(s)=\left\langle\mathop{\boldsymbol{v}}\nolimits(s),\int_{\mathop{\boldsymbol{X}}\nolimits^{m}}\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\alpha(s);\mathop{\boldsymbol{x}}\nolimits^{*})\,d\mu(\mathop{\boldsymbol{x}}\nolimits)\right\rangle$ . Since $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ is an open subset of a Euclidean sphere, we have $\dot{\mathop{\boldsymbol{v}}}\nolimits(s)=\ddot{\alpha}(s)=-\alpha(s)$ and so, by Proposition 6 and its proof,

[TABLE]

where $M_{\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits)$ is given by (26). The expression for $M_{\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits)$ implies that, for $\mathop{\boldsymbol{w}}\nolimits\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ and any fixed $\mathop{\boldsymbol{x}}\nolimits\in\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits)$ , $\langle\mathop{\boldsymbol{v}}\nolimits,\mathop{\boldsymbol{v}}\nolimits M_{\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits)\rangle$ can be written in the form

[TABLE]

for some $1\leqslant j\leqslant k$ , where $W_{l_{i}}$ and $P_{B_{l_{i}}}(\mathop{\boldsymbol{x}}\nolimits)$ are those required for the expression (26) for $M_{\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits}(\mathop{\boldsymbol{w}}\nolimits)$ in the proof of Proposition 6. This implies that $\dot{h}(0)$ must be non-positive. Moreover, for any open or closed subset $\mathcal{E}\subseteq\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\alpha(0))$ such that $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\alpha(0);\mathop{\boldsymbol{x}}\nolimits^{*})$ has the same expression for all $\mathop{\boldsymbol{x}}\nolimits\in\mathcal{E}$ , there is a vector $\mathop{\boldsymbol{v}}\nolimits(0)\in\mathcal{T}_{\alpha(0)}(\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma})$ such that $\langle\mathop{\boldsymbol{v}}\nolimits(0),\,\mathop{\boldsymbol{v}}\nolimits(0)\,M_{\mathop{\boldsymbol{x}}\nolimits^{*},\mathop{\boldsymbol{x}}\nolimits}(\alpha(0))\rangle<0$ for all $\mathop{\boldsymbol{x}}\nolimits\in\mathcal{E}$ . Then, if $\mu(\mathcal{E})\not=0$ , the corresponding $h$ satisfies

[TABLE]

Clearly, $\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\alpha(0))$ can be decomposed as a finite disjoint union of such sets $\mathcal{E}$ .

If $\mathop{\boldsymbol{w}}\nolimits_{\tau}=\alpha(0)\in\hbox{int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ then, for any $\mathop{\boldsymbol{v}}\nolimits(0)\in\mathcal{T}_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}(\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma})$ , the corresponding geodesic $\alpha(s)$ lies in $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ for all sufficiently small $s\geqslant 0$ . Using a similar argument to that for the proof of Proposition 7, the corresponding $h(s)$ must be identically zero for all sufficiently small $s\geqslant 0$ , which implies that $\dot{h}(0)=0$ . Hence, we must have $\mu(\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau}))=0$ . ∎
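
The identity $\ddot{\alpha}(s)=-\alpha(s)$ used at the start of the proof is the standard property of unit speed great circles on a Euclidean unit sphere. A minimal numerical check in Python is sketched below, assuming the parametrisation $\alpha(s)=\cos(s)\,w+\sin(s)\,v$ with $w,v$ orthonormal; the function names and the finite-difference step are illustrative choices, not part of the paper.

```python
import math
from typing import List


def great_circle(w: List[float], v: List[float], s: float) -> List[float]:
    """Unit speed great circle through w with initial velocity v.

    Assumes (not checked) that w and v are orthonormal vectors.
    """
    return [math.cos(s) * wi + math.sin(s) * vi for wi, vi in zip(w, v)]


def second_derivative(
    w: List[float], v: List[float], s: float, h: float = 1e-4
) -> List[float]:
    """Central finite-difference approximation of the acceleration at s."""
    plus = great_circle(w, v, s + h)
    mid = great_circle(w, v, s)
    minus = great_circle(w, v, s - h)
    return [(p - 2.0 * c + m) / (h * h) for p, c, m in zip(plus, mid, minus)]
```

Comparing `second_derivative(w, v, s)` with the negative of `great_circle(w, v, s)` component by component should give agreement up to the finite-difference error.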

If a stratum $\sigma$ bounds $\tau$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ and $\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau}$ , $\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau}$ are two different vectors at $\mathop{\boldsymbol{x}}\nolimits^{*}$ tangent to $\tau$ , then it follows from the map $\Psi_{\tau}(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ generally differing from $\Psi_{\tau}(\,\cdot\,,\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ that the distribution of the Euclidean random variable $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ generally differs from that of $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ . Nevertheless, under the conditions in Proposition 8, the $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ are in fact a.s. identical for $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in{\rm int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ .

Proposition 9.

Assume that $\mathop{\boldsymbol{\xi}}\nolimits$ is a random variable on $\mathop{\boldsymbol{X}}\nolimits^{m}$ with distribution measure $\mu$ having Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ . Assume further that $\mu\left(\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}\right)=0$ and that $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in the stratum $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ of co-dimension $l(\geqslant 2)$ . Let the stratum $\tau$ of co-dimension $l^{\prime}(<l-1)$ co-bound $\sigma$ , in $\mathop{\boldsymbol{X}}\nolimits^{m}$ . If ${\rm int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))\not=\emptyset$ , then the distributions of the Euclidean random variables $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ are independent of $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in{\rm int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ , where the set $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ is defined by (34).

Note that the example in the next section makes it clear that the condition $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in{\rm int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ in the statement of Proposition 9 cannot be relaxed to $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ .

Proof.

First, we show that, for any given distinct $\mathop{\boldsymbol{w}}\nolimits^{j}_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , $j=1,2$ , and for $\mathop{\boldsymbol{x}}\nolimits\not\in\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau})\bigcup\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau})\bigcup\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ , $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})=\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ . Then, it follows from the assumption and Proposition 8 that $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})=\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ a.s. Recall from the proof of Theorem 2 $(ii)$ that, for fixed $\mathop{\boldsymbol{x}}\nolimits\in\mathop{\boldsymbol{X}}\nolimits^{m}$ , $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ and $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ , the supports of the geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})=\mathop{\boldsymbol{x}}\nolimits^{*}+\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau}$ to $\mathop{\boldsymbol{x}}\nolimits$ are the same, for all sufficiently small $\lambda>0$ , and that the expression for $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ is determined by this common support. 
Thus, $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{j};\mathop{\boldsymbol{x}}\nolimits^{*})$ is identical if the geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{j})$ to $\mathop{\boldsymbol{x}}\nolimits$ have the same support when $\lambda>0$ is sufficiently small.

Suppose now that the supports $(\mathcal{A}^{j},\mathcal{B}^{j})$ , $j=1,2$ , of the geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau})$ and $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau})$ respectively to $\mathop{\boldsymbol{x}}\nolimits$ are different, for all sufficiently small $\lambda>0$ . Then, the geodesic $\gamma_{\lambda}$ between $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau})$ and $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau})$ must meet at least one hyper-surface in $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits}$ . If there is more than one, but necessarily finitely many, such hyper-surfaces then, by introducing a point on $\gamma_{\lambda}$ between each pair of consecutive such hyper-surfaces, the change of the supports of the geodesics from points of $\gamma_{\lambda}$ to $\mathop{\boldsymbol{x}}\nolimits$ can be considered inductively to reduce the case to that where $\gamma_{\lambda}$ meets only one such hyper-surface.

Hence, without loss of generality, we assume that $\gamma_{\lambda}$ only meets $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits}$ at a point $\mathop{\boldsymbol{x}}\nolimits_{\lambda}$ on one of the hyper-surfaces, $H$ say, in $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits}$ . That is, $\mathop{\boldsymbol{x}}\nolimits_{\lambda}$ satisfies (17) for a particular $i_{0}$ with $\mathop{\boldsymbol{x}}\nolimits^{*}$ being replaced by $\mathop{\boldsymbol{x}}\nolimits_{\lambda}$ and all the other relevant inequalities in Proposition 2, with $\mathop{\boldsymbol{x}}\nolimits_{1}$ and $\mathop{\boldsymbol{x}}\nolimits_{2}$ replaced by $\mathop{\boldsymbol{x}}\nolimits_{\lambda}$ and $\mathop{\boldsymbol{x}}\nolimits$ , are strict. If $\mathop{\boldsymbol{x}}\nolimits\not\in\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ so that $\mathop{\boldsymbol{x}}\nolimits^{*}\not\in\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits}$ , we may assume that the points $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits^{j}_{\tau})$ lie on opposite sides of $H$ for all sufficiently small $\lambda>0$ . Then, by Proposition 4, as $\gamma_{\lambda}$ moves through $\mathop{\boldsymbol{x}}\nolimits_{\lambda}$ , the supports of the geodesics from $\gamma_{\lambda}$ to $\mathop{\boldsymbol{x}}\nolimits$ change, with the relevant subset $A^{1}_{i_{0}}=C_{i_{0}1}\cup C_{i_{0}2}$ of the sequence $\mathcal{A}^{1}=(A^{1}_{0},\cdots,A^{1}_{k})$ in the support $(\mathcal{A}^{1},\mathcal{B}^{1})$ on the one side splitting, say, into two subsets $C_{i_{0}1},C_{i_{0}2}$ on the other, and similarly for $B^{1}_{i_{0}}$ in $\mathcal{B}^{1}$ . 
That is, the support $(\mathcal{A}^{2},\mathcal{B}^{2})$ of the geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau}^{2})$ to $\mathop{\boldsymbol{x}}\nolimits$ is related to $(\mathcal{A}^{1},\mathcal{B}^{1})$ by $\mathcal{A}^{2}=(A^{1}_{0},\cdots,A^{1}_{i_{0}-1},C_{i_{0}1},C_{i_{0}2},A^{1}_{i_{0}+1},\cdots,A^{1}_{k})$ , and similarly $\mathcal{B}^{2}$ to $\mathcal{B}^{1}$ . We show now that neither of these subsets $C_{i_{0}1}$ and $C_{i_{0}2}$ can meet $E$ . If only one of these two sets meets $E$ , say $C_{i_{0}1}$ , then since $P_{C_{i_{0}2}}(\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau}))\rightarrow 0$ as $\lambda\rightarrow 0$ , it is impossible that there are $\mathop{\boldsymbol{x}}\nolimits_{\lambda}$ such that the corresponding equality (17) holds for all sufficiently small $\lambda>0$ . Similarly, if both of these sets meet $E$ , then the proof of Corollary 4 $(ii)$ shows that $P_{C_{i_{0}s}}(\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits^{j}_{\tau}))\rightarrow P_{C_{i_{0}s}}(\mathop{\boldsymbol{x}}\nolimits^{*})$ , as $\lambda\rightarrow 0$ , for $j=1,2$ . This implies that, for $j=1$ , the corresponding strict inequality (10) holds for $\mathop{\boldsymbol{x}}\nolimits^{*}$ while, for $j=2$ , it is reversed. Hence, that is also impossible.

Thus, in the case when the supports $(\mathcal{A}^{i},\mathcal{B}^{i})$ are different, we still have $A^{1}_{j}=A^{2}_{j}$ for all $j>0$ such that $A^{1}_{j}\cap E\not=\emptyset$ .

If further $\mathop{\boldsymbol{x}}\nolimits\not\in\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau})\bigcup\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau})$ , then the change of the support described above cannot happen when both $C_{i_{0}1}\cap E$ and $C_{i_{0}2}\cap E$ are empty, as then $|A_{i_{0}}^{1}\cap F|>1$ , and so we would have $\mathop{\boldsymbol{x}}\nolimits\in\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau})$ . Since $A^{1}_{0}=A^{2}_{0}$ , the above implies that we must have $(\mathcal{A}^{i},\mathcal{B}^{i})$ identical for $i=1,2$ and so, for such $\mathop{\boldsymbol{x}}\nolimits$ , $\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})=\Psi_{\tau}(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ .

Next, assume that the two $\mathop{\boldsymbol{w}}\nolimits^{j}_{\tau}$ are chosen to be sufficiently close that, for any given $\mathop{\boldsymbol{x}}\nolimits$ and all sufficiently small $\lambda>0$ , the geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits^{j}_{\tau})$ to $\mathop{\boldsymbol{x}}\nolimits$ have the same support. Then, if $\mathop{\boldsymbol{w}}\nolimits_{\tau}(\alpha)$ , $\alpha\in[0,1]$ , is the geodesic between $\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau}$ and $\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau}$ , an elementary argument on the relevant parameters in the inequalities (8) and (10) that determine the carrier will show that these parameters are monotonic in $\alpha$ along the geodesic. So, the geodesic from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau}(\alpha))$ to $\mathop{\boldsymbol{x}}\nolimits$ will have the same support as that for the geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits^{j}_{\tau})$ to $\mathop{\boldsymbol{x}}\nolimits$ . This implies that $\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits_{\tau}(\alpha))\subseteq\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau})\bigcup\Sigma_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mathop{\boldsymbol{w}}\nolimits^{2}_{\tau})$ , so that $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau}(\alpha);\mathop{\boldsymbol{x}}\nolimits^{*})$ are a.s. independent of $\alpha\in[0,1]$ .

Finally, since $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ is convex, there is a sequence $\{\mathop{\boldsymbol{w}}\nolimits^{n}_{\tau}\mid n\geqslant 1\}\subset\hbox{int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ such that

[TABLE]

where $C_{n}$ is the convex hull in $\mathop{\mathcal{S}}\nolimits_{\tau\setminus\sigma}^{l-l^{\prime}}$ of $\{\mathop{\boldsymbol{w}}\nolimits^{1}_{\tau},\cdots,\mathop{\boldsymbol{w}}\nolimits^{n}_{\tau}\}$ . The above argument implies that, without loss of generality, we may also assume that $\{\mathop{\boldsymbol{w}}\nolimits^{n}_{\tau}\mid n\geqslant 1\}$ have the property that, for any $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in C_{n}$ ,

[TABLE]

This shows that

[TABLE]

so that $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits,\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ are a.s. independent of $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in C_{n}$ . Hence, it follows from (38) that

[TABLE]

which gives the required result. ∎

7 The limiting distribution of sample Fréchet means

In this section, we assume that $\{\mathop{\boldsymbol{\xi}}\nolimits_{i}\,:\,i\geqslant 1\}$ is a sequence of i.i.d. random variables defined on a common probability space $(\mathop{\boldsymbol{\Omega}}\nolimits,\mathcal{F},{\bf P})$ with values in $\mathop{\boldsymbol{X}}\nolimits^{m}$ ; that $\mu$ is the distribution measure of $\mathop{\boldsymbol{\xi}}\nolimits_{1}$ ; and that $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ is the sample Fréchet mean of $\mathop{\boldsymbol{\xi}}\nolimits_{1},\cdots,\mathop{\boldsymbol{\xi}}\nolimits_{n}$ . Then, $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ converges to the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ almost surely as $n$ tends to infinity (cf. [23]).
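
To make the notion of a sample Fréchet mean concrete, the following minimal Python sketch computes it on the 3-spider (three half-lines glued at a common origin). This toy space is an assumption made for illustration only and is not an example taken from the paper. Points are encoded as hypothetical pairs `(leg, radius)`. The closed form used comes from differentiating the Fréchet function $\frac{1}{2n}\sum_{i}d(\mathop{\boldsymbol{x}}\nolimits,\mathop{\boldsymbol{\xi}}\nolimits_{i})^{2}$ along each leg: its derivative at radius $t$ on leg $k$ is $t-m_{k}$, where $m_{k}$ averages the radii with a plus sign on leg $k$ and a minus sign elsewhere, and at most one $m_{k}$ can be positive.

```python
from typing import List, Tuple

Point = Tuple[int, float]  # (leg index, distance from the origin)


def spider_distance(p: Point, q: Point) -> float:
    """Geodesic distance on the spider: paths between legs pass the origin."""
    (k, r), (j, s) = p, q
    if k == j or r == 0.0 or s == 0.0:
        return abs(r - s)
    return r + s


def frechet_function(x: Point, sample: List[Point]) -> float:
    """Half the mean squared distance from x to the sample points."""
    return sum(spider_distance(x, p) ** 2 for p in sample) / (2 * len(sample))


def sample_frechet_mean(sample: List[Point], legs: int = 3) -> Point:
    """Minimiser of frechet_function on the spider (toy closed form)."""
    n = len(sample)
    for k in range(legs):
        # Signed average: radii on leg k count positively, others negatively.
        m_k = sum(r if leg == k else -r for leg, r in sample) / n
        if m_k > 0.0:
            return (k, m_k)
    # No leg pulls strongly enough: the mean sits at the origin.
    return (0, 0.0)
```

For a sample such as `[(0, 1.0), (1, 1.0), (2, 1.0)]` every $m_{k}$ is negative, so the mean stays at the origin, the lowest-dimensional stratum of this toy space.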

7.1 On the support of the limiting distribution

If $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a top-dimensional stratum, $\mathop{\boldsymbol{X}}\nolimits^{m}$ is locally an $m$ -dimensional manifold. One would expect that the limiting behaviour of sample Fréchet means $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ is similar, to some extent, to that of sample Fréchet means in a Riemannian manifold as obtained in [4] and [13]. In particular, the support of the limiting distribution of $\sqrt{n}\log_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})=\sqrt{n}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*})$ is the entire tangent space to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ , as long as $\hbox{cov}(\Phi(\mathop{\boldsymbol{\xi}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*}))$ has rank $m$ . This fact was proved for the case of open books in [10] and for the case of tree spaces in [2] and [3]. We shall see in the following that the argument used in [3] can be generalised to $\mathop{\boldsymbol{X}}\nolimits^{m}$ , so that the corresponding conclusion is also valid for orthant spaces.

However, when $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a stratum of locally positive co-dimension, the limiting behaviour of sample Fréchet means is generally very different. In the case that $\mathop{\boldsymbol{X}}\nolimits^{m}$ is an open book or a tree space and that the stratum containing $\mathop{\boldsymbol{x}}\nolimits^{*}$ is of co-dimension one, this phenomenon was observed and studied in [10], [2] and [3]. Similarly, for general orthant spaces, the strictness or otherwise of the inequality (31) affects the limiting behaviour of $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ . In particular, when (31) is strict, there is a constraint on the support of the limiting distribution. To describe this, we recall that, for $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ of co-dimension $l$ and $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ of co-dimension $l^{\prime}<l$ co-bounding $\sigma$ , we are denoting the set of unit vectors in $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ by $\mathop{\mathcal{S}}\nolimits^{m-l^{\prime}}_{\tau,\sigma}$ and those in $\{\mathbf{0}\}\times\mathop{\mathcal{O}}\nolimits(F)$ by $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ . Then, for $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ in the latter, denote by $\mathcal{H}_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}$ the intersection of the half hyper-plane $\mathbb{R}(E)\times\{c\mathop{\boldsymbol{w}}\nolimits_{\tau}\mid c>0\}$ with $\mathop{\mathcal{S}}\nolimits^{m-l^{\prime}}_{\tau,\sigma}$ , namely

[TABLE]

and let

[TABLE]

Proposition 10.

Let the stratum $\sigma=\mathcal{O}(E)$ of co-dimension $l(\geqslant 1)$ bound, in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the stratum $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ of co-dimension $l^{\prime}(<l)$ . Assume that the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ lies in $\sigma$ and that $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}\setminus\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ , where $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ and $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ are given by Definitions 4 and 12 respectively. Then,

[TABLE]

Proof.

For $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ as given in the proposition, let

[TABLE]

Then, the set $\Omega_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}$ consists of points with the property that, for arbitrary $\epsilon>0$, there exist arbitrarily large $n$ such that $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ lies in $\tau$ and $(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*})/\|\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*}\|$ is within a distance $\epsilon$ of $\mathcal{H}_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}$. Since $\Omega_{n}^{k}(\mathop{\boldsymbol{w}}\nolimits_{\tau})\supseteq\Omega_{n}^{k+1}(\mathop{\boldsymbol{w}}\nolimits_{\tau})$, the required result is equivalent to ${\bf P}(\Omega_{\mathop{\boldsymbol{w}}\nolimits_{\tau}})=0$.

Without loss of generality, we may assume that, restricted to $\Omega_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}$ , $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ lie in $\tau$ for all $n$ and $\mathop{\boldsymbol{w}}\nolimits_{n}=(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*})/\|\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*}\|\rightarrow\mathop{\boldsymbol{w}}\nolimits$ as $n\rightarrow\infty$ for some (random) unit vector $\mathop{\boldsymbol{w}}\nolimits\in\mathcal{H}_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}$ .

Recall that, for given $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}\setminus\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$, each $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits_{i},\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ is a Euclidean random variable on $\mathbb{R}(E\cup F)$. Then, let

[TABLE]

and write $\Omega_{0}$ for the subset of $\mathop{\boldsymbol{\Omega}}\nolimits$ consisting of points such that $\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\mathop{\boldsymbol{w}}\nolimits_{\tau}}_{n}$ converges to

[TABLE]

It follows from the classical Law of Large Numbers that ${\bf P}(\Omega_{0})=1$. Hence, restricted to $\Omega_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}\cap\Omega_{0}$, the assumption on $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ implies that, for some constant $c<0$, there is an $n_{0}$ such that, for $n>n_{0}$, $\langle\mathop{\boldsymbol{w}}\nolimits_{\tau},\,\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\mathop{\boldsymbol{w}}\nolimits_{\tau}}_{n}\rangle<c$. However, the assumption that $\mathop{\boldsymbol{w}}\nolimits_{n}\rightarrow\mathop{\boldsymbol{w}}\nolimits\in\mathcal{H}_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}$ implies that $\langle\mathop{\boldsymbol{w}}\nolimits_{\tau},\,\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*}\rangle>0$ for all sufficiently large $n$. Putting these two conclusions together, we have that, restricted to $\Omega_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}\cap\Omega_{0}$,

[TABLE]

as $\langle\mathop{\boldsymbol{w}}\nolimits_{\tau},\mathop{\boldsymbol{x}}\nolimits^{*}\rangle=0$ .

On the other hand, restricted to $\Omega_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}\cap\Omega_{0}$ , $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ is in $\tau$ by the assumption made earlier. Then, it follows from (32) that

[TABLE]

Thus, we can express the difference $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\mathop{\boldsymbol{w}}\nolimits_{\tau}}_{n}$ as

[TABLE]

Decompose $\mathop{\boldsymbol{w}}\nolimits_{n}=(\mathop{\boldsymbol{w}}\nolimits_{n})_{\sigma}+(\mathop{\boldsymbol{w}}\nolimits_{n})^{\perp}$ , where $(\mathop{\boldsymbol{w}}\nolimits_{n})_{\sigma}=P_{\sigma}(\mathop{\boldsymbol{w}}\nolimits_{n})$ and $(\mathop{\boldsymbol{w}}\nolimits_{n})^{\perp}=P_{\tau\setminus\sigma}(\mathop{\boldsymbol{w}}\nolimits_{n})$ . Then, by Corollary 3 $(ii)$ , for each $1\leqslant i\leqslant n$ ,

[TABLE]

where $(\mathop{\boldsymbol{w}}\nolimits_{n})_{\tau}=(\mathop{\boldsymbol{w}}\nolimits_{n})^{\perp}/\|(\mathop{\boldsymbol{w}}\nolimits_{n})^{\perp}\|\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$. Without loss of generality, we assume that the carriers of the geodesics from $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ to $\mathop{\boldsymbol{\xi}}\nolimits_{i}$ remain constant. The general case follows from a similar inductive argument to that outlined in the beginning of the proof of Proposition 9 and from the fact that $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ converges to $\mathop{\boldsymbol{x}}\nolimits^{*}$ a.s. Let $(\mathcal{A},\mathcal{B})$ be the common support of the geodesics, where $\mathcal{A}=(A_{0},\cdots,A_{k})$ and $\mathcal{B}=(B_{0},\cdots,B_{k})$, and, for $0<j\leqslant k$, write $W_{j}$ for $P_{A_{j}\cap E}(\mathop{\boldsymbol{x}}\nolimits^{*})$ if $A_{j}\cap E\not=\emptyset$ and for $P_{A_{j}\cap F}(\mathop{\boldsymbol{w}}\nolimits_{n})$ otherwise. Then Theorem 1 tells us that the $j$th set of components of $(\jmath^{-1})(\Phi(\mathop{\boldsymbol{\xi}}\nolimits_{i},\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}))$ is the vector $-\frac{\|P_{B_{j}}(\mathop{\boldsymbol{\xi}}\nolimits_{i})\|}{\|P_{A_{j}}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})\|}P_{A_{j}}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})$ and, for $(\jmath^{-1})(\Psi(\mathop{\boldsymbol{\xi}}\nolimits_{i},\mathop{\boldsymbol{w}}\nolimits_{n};\mathop{\boldsymbol{x}}\nolimits^{*}))$, Theorem 2 $(ii)$ tells us that the corresponding vector is $-\frac{\|P_{B_{j}}(\mathop{\boldsymbol{\xi}}\nolimits_{i})\|}{\|W_{j}\|}W_{j}$.
Hence, the proof of Theorem 2 $(ii)$ shows that, when $A_{j}\cap E=\emptyset$, these two vectors are identical, since $P_{A_{j}}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})=P_{A_{j}\cap F}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})=P_{A_{j}\cap F}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*})$. If, instead, $A_{j}\cap E\not=\emptyset$, the difference between these two vectors is of the same order as $\frac{P_{A_{j}\cap F}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})}{\|P_{A_{j}}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})\|}$, whose limit as $n\rightarrow\infty$ is zero, since $\|P_{A_{j}}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})\|\geqslant\|P_{A_{j}\cap E}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})\|\rightarrow\|P_{A_{j}\cap E}(\mathop{\boldsymbol{x}}\nolimits^{*})\|>0$ while $P_{A_{j}\cap F}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})\rightarrow 0$ a.s. It follows that, as $n\rightarrow\infty$,

[TABLE]

Moreover, since $\mathop{\boldsymbol{w}}\nolimits^{\perp}=P_{\tau\setminus\sigma}(\mathop{\boldsymbol{w}}\nolimits)\not=0$ , $\mathop{\boldsymbol{w}}\nolimits_{n}\rightarrow\mathop{\boldsymbol{w}}\nolimits$ implies that $(\mathop{\boldsymbol{w}}\nolimits_{n})_{\tau}\rightarrow\frac{\mathop{\boldsymbol{w}}\nolimits^{\perp}}{\|\mathop{\boldsymbol{w}}\nolimits^{\perp}\|}=\mathop{\boldsymbol{w}}\nolimits_{\tau}$ . Then, it follows from a similar argument to that of the proof of Proposition 6 that, for sufficiently large $n$ ,

[TABLE]

where $\mathop{\boldsymbol{v}}\nolimits_{n}$ is the component of $(\mathop{\boldsymbol{w}}\nolimits_{n})_{\tau}-\mathop{\boldsymbol{w}}\nolimits_{\tau}$ orthogonal to $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ , so that as $n\rightarrow\infty$ ,

[TABLE]

by Proposition 6. Then, (40), (43), (44) and (45) together imply that, when it is restricted to $\Omega_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}\cap\Omega_{0}$ , $\langle\mathop{\boldsymbol{w}}\nolimits_{\tau},\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\mathop{\boldsymbol{w}}\nolimits_{\tau}}_{n}\rangle\rightarrow 0$ a.s., as $n\rightarrow\infty$ , contradicting (39). Hence, ${\bf P}(\Omega_{\mathop{\boldsymbol{w}}\nolimits_{\tau}}\cap\Omega_{0})=0$ , so that ${\bf P}(\Omega_{\mathop{\boldsymbol{w}}\nolimits_{\tau}})=0$ as required. ∎

When $l-l^{\prime}=1$ , $\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ contains a single unit vector, so that we have the following special case. In particular, taking $l=1$ and so $l^{\prime}=0$ recovers the result of Lemma 6 in [3] for the case of co-dimension one when $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a tree space.

Corollary 7.

Let the stratum $\sigma$ of co-dimension $l(\geqslant 1)$ bound, in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the stratum $\tau$ of co-dimension $l^{\prime}=l-1$ . Assume that the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ lies in $\sigma$ . If the inequality (31) corresponding to the unique $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ is strict then, for all sufficiently large $n$ , $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ cannot lie in $\tau$ .

Thus, when $l-l^{\prime}=1$ , the support of the limiting distribution of any appropriately scaled difference $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*}$ intersects the stratum $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ in the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ only if the inequality (31) corresponding to the unique $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$ is an equality.
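The sticking behaviour asserted in Corollary 7 can be seen numerically in the simplest tree space, the 3-spider formed by three half-lines glued at a cone point, so that $l=1$ and $l^{\prime}=0$. The following Python sketch is an illustration under stated assumptions and not part of the argument: the function names are hypothetical, $\mu$ is taken to put mass $1/3$ at unit distance along each leg, and the sample Fréchet mean is computed from the elementary closed form for a spider, namely that the mean lies on leg $j$ at distance $m_{j}$ if $m_{j}>0$ and at the cone point otherwise, where $m_{j}$ is the average of the distances from the cone point with points on leg $j$ counted positively and all other points negatively. We assume, without checking it against (31), that strictness of that inequality corresponds here to $E[m_{j}]<0$ for every leg; for this choice of $\mu$ the value is $-1/3$.

```python
import random


def spider_frechet_mean(legs, radii, k):
    """Sample Frechet mean on a k-spider.

    legs[i] is the leg (0..k-1) carrying the i-th point and radii[i] its
    distance from the cone point.  Returns (leg, distance), with leg == -1
    when the mean is the cone point itself.
    """
    n = len(radii)
    total = sum(radii)
    for j in range(k):
        on_leg = sum(r for leg, r in zip(legs, radii) if leg == j)
        # m_j = (sum on leg j - sum off leg j) / n
        m_j = (2.0 * on_leg - total) / n
        if m_j > 0:
            return j, m_j
    return -1, 0.0


def sticky_fraction(n, reps, k=3, seed=0):
    """Fraction of replicates whose sample mean sits exactly at the cone point.

    Assumption for illustration: each observation falls on a uniformly
    chosen leg at distance one from the cone point.
    """
    rng = random.Random(seed)
    stuck = 0
    for _ in range(reps):
        legs = [rng.randrange(k) for _ in range(n)]
        leg, _ = spider_frechet_mean(legs, [1.0] * n, k)
        if leg == -1:
            stuck += 1
    return stuck / reps
```

Under these assumptions the sample mean leaves the cone point only when more than half of the observations fall on one leg, so `sticky_fraction(5, 200)` is well below one while `sticky_fraction(500, 200)` is one in practice, in line with the statement that $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ cannot lie in $\tau$ for all sufficiently large $n$.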

Similar to the case where $l-l^{\prime}=1$ , Proposition 10 has the following consequence on the support of the limiting distribution when $l-l^{\prime}>1$ , where $\mathcal{C}(\Theta)$ denotes the Euclidean cone on $\Theta$ .

Corollary 8.

Let the stratum $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ of co-dimension $l(\geqslant 2)$ bound, in $\mathop{\boldsymbol{X}}\nolimits^{m}$ , the stratum $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ of co-dimension $l^{\prime}\leqslant l-2$ . Assume that $\mathop{\boldsymbol{x}}\nolimits^{*}\in\sigma$ is the Fréchet mean of $\mu$ . Then the support of the limiting distribution of an appropriately scaled difference $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*}$ , if it meets the stratum $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ in the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ , must be contained in $\mathbb{R}(E)\times\mathcal{C}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ , where $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ is defined by (34).

Hence, the support of the limiting distribution of an appropriately scaled difference $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*}$ is contained in $\mathcal{K}_{\mu}$ where, for the closed sets

[TABLE]

in the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ ,

[TABLE]

and where we regard $\sigma$ as co-bounding itself. Nevertheless, the following example shows that

$(i)$ if it is non-empty, $\mathbb{R}(E)\times\mathcal{C}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$ is not necessarily an entire stratum $\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$;

$(ii)$ even if it is the entire stratum, the support of the limiting distribution of $\sqrt{n}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*})$ does not necessarily intersect that stratum; and

$(iii)$ it is possible that the support of the limiting distribution, when restricted to the stratum, is only a subset of $\mathbb{R}(E)\times\mathcal{C}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))$.

Example 4.

Consider the orthant space $\mathop{\boldsymbol{X}}\nolimits^{2}$ of Example 2. Let $\mu$ have mass $1/2$ at the two points $p_{1}$ and $p_{2}$ equidistant from the cone point $o$ along a geodesic through that point as illustrated in Figure 4.

Then its Fréchet mean is at the cone point and the sample Fréchet means always lie on this geodesic segment. This, in particular, implies that the support of the limiting distribution of $\sqrt{n}\{\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-o\}$ is the union of the cone point with two half lines, one in $\{o\}\times\tau^{\phantom{A}}_{1,5}$ and the other in $\{o\}\times\tau^{\phantom{A}}_{2,3}$, each extending the relevant geodesic segment, where $\tau^{\phantom{A}}_{i,j}=\mathop{\mathcal{O}}\nolimits(u_{i},u_{j})$.

($a$) For any direction $\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,2}}\in\mathop{\mathcal{S}}\nolimits^{2}_{\tau^{\phantom{A}}_{1,2}\setminus\{o\}}$, $\Psi_{\tau^{\phantom{A}}_{1,2}}(p,\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,2}};o)$ lies in the plane spanned by the orthant $\tau^{\phantom{A}}_{1,2}$, for any $p$, and, by identifying $u_{3}$ and $u_{5}$ with $-u_{1}$ and $-u_{2}$ respectively, $\Psi_{\tau^{\phantom{A}}_{1,2}}(p_{i},\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,2}};o)=p_{i}$ for $i=1,2$. Thus, $\int_{\mathop{\boldsymbol{X}}\nolimits^{2}}\Psi_{\tau^{\phantom{A}}_{1,2}}(p,\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,2}};o)\,d\mu(p)=0$ and so $\Theta_{\tau^{\phantom{A}}_{1,2},\{o\}}(o;\mu)=\mathop{\mathcal{S}}\nolimits^{2}_{\tau^{\phantom{A}}_{1,2}\setminus\{o\}}$. Since, in this case, the support of the limiting distribution does not intersect $\{0\}\times\tau^{\phantom{A}}_{1,2}$, this illustrates $(ii)$ above with $\sigma=\{o\}$ and $\tau=\tau^{\phantom{A}}_{1,2}$.

($b$) For any direction $\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}}\in\mathop{\mathcal{S}}\nolimits^{2}_{\tau^{\phantom{A}}_{1,5}\setminus\{o\}}$ such that the angle between $\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}}$ and the $u_{1}$-axis is less than or equal to $\alpha$, a similar argument shows that

[TABLE]

Hence, such $\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}}$ are always contained in $\Theta_{\tau^{\phantom{A}}_{1,5},\{o\}}(o;\mu)$ , i.e.

[TABLE]

where $\theta\in\mathop{\mathcal{S}}\nolimits^{2}_{\tau^{\phantom{A}}_{1,5}\setminus\{o\}}$ is measured from the $u_{1}$-axis.

($c$) However, for any direction $\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}}\in\mathop{\mathcal{S}}\nolimits^{2}_{\tau^{\phantom{A}}_{1,5}\setminus\{o\}}$ such that the angle between $\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}}$ and the $u_{1}$-axis is greater than $\alpha$, the vector $\Psi_{\tau^{\phantom{A}}_{1,5}}(p_{1},\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}};o)=p_{1}$, but the vector $\Psi_{\tau^{\phantom{A}}_{1,5}}(p_{2},\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}};o)$ lies on the line spanned by the unit vector $\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}}$ in the $(u_{1},u_{5})$-plane. Hence, these two vectors do not lie on the same line in the $(u_{1},u_{5})$-plane through the origin. This gives

[TABLE]

Hence, if the angle between $\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}}$ and the $u_{1}$-axis is greater than $\alpha$, then $\mathop{\boldsymbol{w}}\nolimits_{\tau^{\phantom{A}}_{1,5}}\not\in\Theta_{\tau^{\phantom{A}}_{1,5},\{o\}}(o;\mu)$. Combining this with the conclusion $(b)$ shows that $\Theta_{\tau^{\phantom{A}}_{1,5},\{o\}}(o;\mu)=\{\theta\in\mathop{\mathcal{S}}\nolimits^{2}_{\tau^{\phantom{A}}_{1,5}\setminus\{o\}}\mid\theta\leqslant\alpha\}$, illustrating $(i)$ and $(iii)$ above with $\sigma=\{o\}$ and $\tau=\tau^{\phantom{A}}_{1,5}$.
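The opening observation of Example 4, that the sample Fréchet means lie on the geodesic through $o$ joining $p_{1}$ and $p_{2}$, reduces the computation of $\sqrt{n}\{\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-o\}$ to a one-dimensional problem. The following Python sketch simulates it under labelled assumptions: the geodesic is parametrised by a signed coordinate with $o$ at $0$, $p_{1}$ at $+r$ and $p_{2}$ at $-r$ (the sign convention, the distance $r$ and the function name are ours), and the sample Fréchet mean on the segment is taken to be the arithmetic mean of the signed coordinates, which is the minimiser of the sum of squared distances along a segment isometric to an interval.

```python
import math
import random
import statistics


def scaled_coordinates(n, reps, r=1.0, seed=0):
    """Return reps draws of sqrt(n) times the signed coordinate of the sample mean.

    Positive values point along the half line towards p_1 and negative
    values along the half line towards p_2 (a convention chosen here).
    """
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        # Number of the n observations equal to p_1: Binomial(n, 1/2),
        # obtained by counting the set bits among n fair random bits.
        k = bin(rng.getrandbits(n)).count("1")
        mean_coord = r * (2.0 * k / n - 1.0)  # always within [-r, r]
        out.append(math.sqrt(n) * mean_coord)
    return out


if __name__ == "__main__":
    vals = scaled_coordinates(n=400, reps=20000, r=1.0)
    print("empirical variance:", statistics.pvariance(vals))
    print("fraction on the p_1 side:", sum(v > 0 for v in vals) / len(vals))
```

In this reduced model the scaled coordinate has variance exactly $r^{2}$ for every $n$ and is asymptotically normal, so its limiting support is a full line through the cone point, consistent with the union of the cone point and two half lines described in the example.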

7.2 The limiting distribution

To describe the limiting distribution of $\sqrt{n}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*})$ , where the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ lies in a stratum $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ of local co-dimension $l\geqslant 0$ , we continue to regard $\sigma$ as co-bounding itself so that, in this case, the set $F$ of additional axes in the ‘co-bounding’ stratum is empty. Moreover, we shall relate the form of the limiting distribution in the set (46) for each $\tau$ co-bounding $\sigma$ to a limiting distribution of the Euclidean means of various Euclidean random variables depending on $\tau$ :

$(i)$ for $\tau=\sigma$, corresponding to the set $\mathbb{R}(E)\times\{\mathbf{0}\}$ in (46), the relevant Euclidean random variable is $\Phi_{\sigma}(\mathop{\boldsymbol{\xi}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*})$;

$(ii)$ for $\tau\not=\sigma$ the relevant Euclidean random variable is $\Psi_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits_{1},\mathop{\boldsymbol{w}}\nolimits_{\tau};\mathop{\boldsymbol{x}}\nolimits^{*})$ where, if $l-l^{\prime}>1$, $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ is any chosen vector in ${\rm int}\left(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)\right)$ if this set is not empty and, if $l-l^{\prime}=1$ with $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)\not=\emptyset$, $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ is its unique element;

$(iii)$ we take the zero random variable otherwise.

Note that, by Proposition 9, different choices of $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ in the case $l-l^{\prime}>1$ of $(ii)$ give random variables that are a.s. equal. Note also that, by Corollary 8, the random variables in the case $(iii)$ play no role in the description of the limiting distribution so that they can be replaced by any other random variables. For simplicity, we denote the relevant random variable above in each case by $\tilde{\Psi}_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*})$ . With this context and notation write, for each $\tau$ co-bounding $\sigma$ ,

[TABLE]

where $M_{\mathop{\boldsymbol{x}}\nolimits^{*}}^{\sigma}(x)$ is defined by (22), $U_{\tau}$ is the $M\times(m-l^{\prime})$ matrix whose entries are all zero except for those at $(l_{i},i)$ being one, and $u^{\phantom{A}}_{l_{1}},\cdots,u^{\phantom{A}}_{l_{m-l^{\prime}}}$ are the ordered axes that span $\jmath(\mathbb{R}(E\cup F))$ . Note that, since $M_{\mathop{\boldsymbol{x}}\nolimits^{*}}^{\sigma}(\mathop{\boldsymbol{x}}\nolimits)$ is negative semi-definite, the above inverse is well defined when $E[M_{\mathop{\boldsymbol{x}}\nolimits^{*}}^{\sigma}(\mathop{\boldsymbol{\xi}}\nolimits_{1})]$ exists. Then, letting $Z_{\tau}$ be a random variable in $\mathbb{R}(E\cup F)$ with normal distribution $N(0,A_{\sigma,\tau}^{\top}V_{\tau}A_{\sigma,\tau})$ , where $V_{\tau}={\rm cov}(\tilde{\Psi}_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*}))$ , we have the following result.

Theorem 4.

Let $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ be a stratum in $\mathop{\boldsymbol{X}}\nolimits^{m}$ of co-dimension $l(\geqslant 0)$ . Assume that

$(i)$ the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ lies in $\sigma$;

$(ii)$ $\mu(\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}})=0$, where $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ is given by Definition 11;

$(iii)$ $E\left[M^{\sigma}_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{\xi}}\nolimits_{1})\right]$ exists, where $M^{\sigma}_{\mathop{\boldsymbol{x}}\nolimits^{*}}(\mathop{\boldsymbol{x}}\nolimits)$ is given by (22);

$(iv)$ for any stratum $\tau$ in $\mathop{\boldsymbol{X}}\nolimits^{m}$ which co-bounds $\sigma$ and has co-dimension $l^{\prime}\leqslant l-2$, if $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)\not=\emptyset$ then ${\rm int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu))\not=\emptyset$.

Then, if there exists a random variable $\eta$ on the tangent cone at $\mathop{\boldsymbol{x}}\nolimits^{*}$ such that

[TABLE]

then $\eta$ has the following property: for any stratum $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ of co-dimension $l^{\prime}(\leqslant l)$ co-bounding $\sigma$ , if ${\bf P}\left(\eta\in\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)\right)>0$ then, for $Z_{\tau}$ defined as above and $\mathcal{K}_{\mu}$ by (47),

[TABLE]

for any Borel set $B$ contained in

[TABLE]

Proof.

We assume that $l^{\prime}\leqslant l-2$ . The case for $l^{\prime}=l-1$ can be similarly derived by noting Corollary 7, whereas for $l^{\prime}=l$ the result can be derived directly by simplifying the following arguments.

Write $\Xi_{\tau}=\mathop{\mathcal{O}}\nolimits(E)\times\mathcal{C}(\hbox{int}(\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)))$ . By Corollary 7, given $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}\in\tau$ , $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}\in\Xi_{\tau}$ for sufficiently large $n$ and, by Theorem 3, we also have

[TABLE]

For any $\mathop{\boldsymbol{x}}\nolimits^{\prime}\in\tau$ and $\mathop{\boldsymbol{x}}\nolimits\in\mathop{\boldsymbol{X}}\nolimits^{m}$, denote the projection $P_{\tau\setminus\sigma}\left(\Phi_{\tau}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{\prime})\right)$ of $\Phi_{\tau}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{\prime})$ by $\Phi_{\tau\setminus\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{\prime})$. Define $\tilde{\Psi}_{\tau\setminus\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ similarly. Then, $\tilde{\Psi}_{\tau\setminus\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})=\tilde{\Psi}_{\tau}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})-\Phi_{\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ by Corollary 4 $(ii)$. Since $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ is in $\Xi_{\tau}$ and converges to $\mathop{\boldsymbol{x}}\nolimits^{*}$ a.s., the result of Proposition 9 and the argument for the proof of Theorem 2 together imply that, for any given $\mathop{\boldsymbol{x}}\nolimits$ and all sufficiently large $n$, $\Phi_{\tau\setminus\sigma}(\mathop{\boldsymbol{x}}\nolimits;\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})=\tilde{\Psi}_{\tau\setminus\sigma}(\mathop{\boldsymbol{x}}\nolimits;\mathop{\boldsymbol{x}}\nolimits^{*})$ a.s. Hence, in particular, for sufficiently large $n$,

[TABLE]

is a.s. the Euclidean mean of $\tilde{\Psi}_{\tau\setminus\sigma}(\mathop{\boldsymbol{\xi}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*}),\cdots,\tilde{\Psi}_{\tau\setminus\sigma}(\mathop{\boldsymbol{\xi}}\nolimits_{n};\mathop{\boldsymbol{x}}\nolimits^{*})$ , so that

[TABLE]

Thus, the limiting distribution of $\sqrt{n}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*})1_{\Xi_{\tau}}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})$ is the same as that of

[TABLE]

Since $\tilde{\Psi}_{\sigma}(\mathop{\boldsymbol{\xi}}\nolimits_{i};\mathop{\boldsymbol{x}}\nolimits^{*})=\Phi_{\sigma}(\mathop{\boldsymbol{\xi}}\nolimits_{i};\mathop{\boldsymbol{x}}\nolimits^{*})$ , Proposition 5 implies that the limiting distribution of $\sqrt{n}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*})1_{\Xi_{\tau}}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})$ is equal to that of

[TABLE]

Hence, by (37), the required result follows from a similar argument to that used in [2] and [3]. ∎

As for $\Theta_{\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ defined by (35), the convexity in $\mathop{\boldsymbol{w}}\nolimits$ of the directional derivative $D(d_{\mathop{\boldsymbol{x}}\nolimits}^{2})(\mathop{\boldsymbol{w}}\nolimits)$ implies that $\mathcal{K}_{\mu}$ is a convex subset of the tangent cone to $\mathop{\boldsymbol{X}}\nolimits^{m}$ at $\mathop{\boldsymbol{x}}\nolimits^{*}$ . This, together with the structure of an orthant space, implies that the result of Theorem 4 refers to the behaviour of the limiting distribution only within the interior of $\mathcal{K}_{\mu}$ . Its behaviour at the boundaries will depend on how these sets relate to each other and on the shape of the boundary $\partial\mathcal{K}_{\mu}$ .

The assumption in Theorem 4 that $\mu(\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}})=0$ ensures that we are able to employ the so-called delta method for the approximate probability distribution of a function of an asymptotically normal statistical estimator. In principle, it is possible to relax this assumption by using directional derivatives and combining that with the use of the law of total probability. However, it is clear from the definition of $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$ that its structure, although conceptually straightforward, is generally more complex than will admit a simple algebraic representation, and the ensuing results will consequently depend heavily on the behaviour of $\mu$ on $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$.
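For readers less familiar with the delta method invoked above, the following Python sketch checks it in the simplest one-dimensional setting. It is a generic illustration and not the orthant-space argument: the choices of distribution (i.i.d. exponential variables with mean and variance one), of smooth function ($g(x)=x^{2}$) and of function name are all assumptions made here. The delta method predicts that $\sqrt{n}\,(g(\bar{X}_{n})-g(1))$ is approximately normal with variance $g^{\prime}(1)^{2}\cdot 1=4$.

```python
import math
import random
import statistics


def delta_method_variance(n=1000, reps=20000, seed=0):
    """Empirical variance of sqrt(n) * (g(mean) - g(1)) with g(x) = x**2.

    The mean of n i.i.d. Exp(1) variables is distributed as Gamma(n, 1) / n,
    so each replicate is drawn directly from that gamma distribution.
    """
    rng = random.Random(seed)
    scaled = []
    for _ in range(reps):
        xbar = rng.gammavariate(n, 1.0) / n
        scaled.append(math.sqrt(n) * (xbar ** 2 - 1.0))
    return statistics.pvariance(scaled)


if __name__ == "__main__":
    # Delta-method prediction for the limiting variance: g'(1)^2 * Var(X) = 4.
    print("empirical:", delta_method_variance(), "predicted:", 4.0)
```

The method relies on $g$ being differentiable at the limit point, which is the role played in Theorem 4 by the requirement that $\mu$ gives no mass to $\mathcal{D}_{\mathop{\boldsymbol{x}}\nolimits^{*}}$.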

To observe special cases of Theorem 4, let $\sigma=\mathop{\mathcal{O}}\nolimits(E)$ be a stratum in $\mathop{\boldsymbol{X}}\nolimits^{m}$ of co-dimension $l(\geqslant 0)$ in which the Fréchet mean $\mathop{\boldsymbol{x}}\nolimits^{*}$ of $\mu$ lies, assume that the conditions of Theorem 4 are satisfied and write

[TABLE]

where we assume that $l(\mu)=l$ if there is no $\tau$ with co-dimension $l^{\prime}<l$ which satisfies the above required condition. We assume further that, for $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ of co-dimension $l(\mu)$ co-bounding $\sigma$ and, if $l(\mu)<l$ , with $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)\not=\emptyset$ , $V_{\tau}=\hbox{cov}(\tilde{\Psi}_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*}))$ is of full rank $m-l(\mu)$ . Then, it is clear from the proof of Theorem 4 that ${\bf P}(\eta\in\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F))>0$ .

Case $l(\mu)=l$ : in this case, $\mathcal{K}_{\mu}=\mathbb{R}(E)$ and the support of the distribution of $\eta$ is contained in the tangent space of $\sigma$ . Then, Theorem 4 says that $\eta$ is a normal random variable with mean zero and covariance matrix $A_{\sigma,\sigma}^{\top}\hbox{cov}(\Phi_{\sigma}(\mathop{\boldsymbol{\xi}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*}))\,A_{\sigma,\sigma}$ , where $A_{\sigma,\tau}$ is defined by (48). This generalises the limiting distribution of $\sqrt{n}\{\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*}\}$ when $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a top-dimensional stratum of a tree space obtained in [3].

Case $l(\mu)=l-1$ so that $l\geqslant 1$: if $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ is a stratum of co-dimension $l^{\prime}=l-1$ such that $\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)\not=\emptyset$, then $F$ contains only one axis. By taking the Borel set $B=\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$, we see that ${\bf P}\left(\eta\in\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)\right)=1/2$ since the corresponding $Z_{\tau}$ is a normal random variable in $\mathbb{R}^{m-l+1}$ with mean zero. Hence, there are at most two strata of co-dimension $l(\mu)$ co-bounding $\sigma$ on which infinitely many $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ lie. Moreover, if there is only one such stratum, ${\bf P}(\eta\in\sigma)=1/2$ and, if there are two such strata, ${\bf P}(\eta\in\sigma)=0$.
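The value $1/2$ above depends only on $Z_{\tau}$ being a centred normal vector whose component along the single axis in $F$ is non-degenerate, not on the particular covariance. The following Python sketch checks this by Monte Carlo. It is an illustration only: the mixing matrix `mix` is an arbitrary stand-in chosen here, not the matrix $A_{\sigma,\tau}^{\top}V_{\tau}A_{\sigma,\tau}$ of the text, and the function name is hypothetical.

```python
import random


def halfspace_fraction(mix, axis, n_samples=100000, seed=0):
    """Estimate P(Z[axis] > 0) for Z = mix @ g with g standard normal.

    Z is centred normal with covariance mix @ mix^T; the row mix[axis]
    must be non-zero so that Z[axis] is non-degenerate.
    """
    rng = random.Random(seed)
    dim = len(mix[0])
    hits = 0
    for _ in range(n_samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Only the coordinate along the chosen axis decides the half-space.
        z_axis = sum(a * b for a, b in zip(mix[axis], g))
        if z_axis > 0.0:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    mix = [[1.0, 0.0, 0.0], [0.5, 2.0, 0.0], [-1.0, 0.3, 0.7]]
    print(halfspace_fraction(mix, axis=2))
```

Here the last coordinate plays the role of the axis in $F$ and the remaining coordinates that of $\mathbb{R}(E)$; correlation between them does not change the probability of the half-space.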

Case that $0\leqslant l(\mu)<l$ , that there is a single $\tau_{0}=\mathop{\mathcal{O}}\nolimits(E\cup F_{0})$ such that the co-dimension of $\tau_{0}$ is $l(\mu)$ and that $\Theta_{\tau_{0},\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)=\mathcal{S}^{l-l(\mu)}_{\tau_{0}\setminus\sigma}$ : in this case, we have the following full description of the distribution of $\eta$ in terms of $\phi_{\tau_{0}}$ , the probability density function of the random variable $Z_{\tau_{0}}$ defined prior to Theorem 4. We first note that, since $\mathcal{K}_{\mu}$ defined by (47) is convex and closed, the result of Proposition 9 implies that, in this case,

[TABLE]

Then, we extend the projection map $P$ to $\mathbb{R}(E\cup F_{0})$ in an obvious fashion and, for any $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ , where $F\subseteq F_{0}$ , and any $\mathop{\boldsymbol{z}}\nolimits\in\mathbb{R}(E\cup F_{0})$ , write $\mathop{\boldsymbol{z}}\nolimits_{\tau}=P_{\tau}(\mathop{\boldsymbol{z}}\nolimits)$ and $\mathop{\boldsymbol{z}}\nolimits_{\tau_{0}\setminus\tau}=P_{\tau_{0}\setminus\tau}(\mathop{\boldsymbol{z}}\nolimits)=\mathop{\boldsymbol{z}}\nolimits-\mathop{\boldsymbol{z}}\nolimits_{\tau}$ .

Proposition 11.

Under the above assumptions and notation, the limiting distribution of $\sqrt{n}\{\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*}\}$ is given as follows: for any $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ , where $F\subseteq F_{0}$ , and any Borel subset $B\subseteq\mathbb{R}(E)\times\mathop{\mathcal{O}}\nolimits(F)$ ,

[TABLE]

where

[TABLE]

The special case that $l(\mu)=l-1$ of this Proposition, together with the comments in the previous two paragraphs, generalises the limiting distribution of $\sqrt{n}\{\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}-\mathop{\boldsymbol{x}}\nolimits^{*}\}$ when $\mathop{\boldsymbol{X}}\nolimits^{m}$ is a tree space and $\mathop{\boldsymbol{x}}\nolimits^{*}$ lies in a stratum of co-dimension one obtained in [3].

Proof.

By Theorem 4, we only need to consider the case where $F\not=F_{0}$. Assume that $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ has co-dimension $l^{\prime}$ and fix $\mathop{\boldsymbol{w}}\nolimits_{\tau}\in\mathop{\mathcal{S}}\nolimits^{l-l^{\prime}}_{\tau\setminus\sigma}$. We first show that

[TABLE]

Recall, from the proof of Theorem 2, that the geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}})=\mathop{\boldsymbol{x}}\nolimits^{*}+\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}$ to $\mathop{\boldsymbol{x}}\nolimits$ have the same support for all $\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}\in\mathop{\mathcal{S}}\nolimits^{l-l(\mu)}_{\tau_{0}\setminus\sigma}$ sufficiently close to $\mathop{\boldsymbol{w}}\nolimits_{\tau}$ and all sufficiently small $\lambda>0$. For such $\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}$ and $\lambda$, by Definition 13, Corollary 4 $(i)$, Propositions 8 and 9, the sequence $\mathcal{A}=(A_{0},\cdots,A_{k})$ in the support $(\mathcal{A},\mathcal{B})$ of the geodesics from $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}})=\mathop{\boldsymbol{x}}\nolimits^{*}+\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}$ to $\mathop{\boldsymbol{\xi}}\nolimits_{1}$ has the property that, if $i>0$ and if $A_{i}\cap E=\emptyset$, then $A_{i}$ consists of a single axis in $F_{0}$ a.s., so that $P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}))/\|P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}))\|$ is independent of the value of $\lambda$ a.s.
This, together with the fact implied by Corollary 4 $(ii)$ that, if $A_{i}\cap E\not=\emptyset$ , $P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}))\rightarrow P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*})$ as $\lambda\rightarrow 0$ , shows that, with probability one, each $P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}))/\|P_{A_{i}}(\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}))\|$ in the expression (11) for $\Phi_{\tau_{0}}(\mathop{\boldsymbol{\xi}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*}+\lambda\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}})$ is a continuous function at $\mathop{\boldsymbol{x}}\nolimits^{*}$ in the corresponding Euclidean space. It follows that

[TABLE]

exists a.s. and so, in particular,

[TABLE]

Thus, the definition of $\Psi$ gives

[TABLE]

Since the limit on the right-hand side exists, we may compute it along a particular path on which $\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}$ approaches $\mathop{\boldsymbol{w}}\nolimits_{\tau}$: $\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}=\sin\alpha\,\mathop{\boldsymbol{w}}\nolimits^{\perp}+\cos\alpha\,\mathop{\boldsymbol{w}}\nolimits_{\tau}$, where $\langle\mathop{\boldsymbol{w}}\nolimits^{\perp},\mathop{\boldsymbol{w}}\nolimits_{\tau}\rangle=0$ and $\|\mathop{\boldsymbol{w}}\nolimits^{\perp}\|=1$. Then, writing $\beta=\lambda\sin\alpha$, we have

[TABLE]

where the second equality follows from Corollary 3 $(ii)$ . Hence, it follows from Corollary 4 $(ii)$ that

[TABLE]

as $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits_{\tau})\in\tau$ . Hence, (49) follows.
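
For reference, the substitution behind the chosen path can be written out explicitly. The display below is a sketch rather than the paper's own equation. It uses only the notation $\mathop{\boldsymbol{x}}\nolimits^{*}(\lambda,\mathop{\boldsymbol{w}}\nolimits)=\mathop{\boldsymbol{x}}\nolimits^{*}+\lambda\mathop{\boldsymbol{w}}\nolimits$ and the orthogonality of $\mathop{\boldsymbol{w}}\nolimits^{\perp}$ and $\mathop{\boldsymbol{w}}\nolimits_{\tau}$, and it assumes $\|\mathop{\boldsymbol{w}}\nolimits_{\tau}\|=1$, which is not restated in this passage.

```latex
\begin{aligned}
\boldsymbol{x}^{*}+\lambda\boldsymbol{w}_{\tau_{0}}
  &= \boldsymbol{x}^{*}+\lambda\cos\alpha\,\boldsymbol{w}_{\tau}
     +\lambda\sin\alpha\,\boldsymbol{w}^{\perp}
   = \boldsymbol{x}^{*}(\lambda\cos\alpha,\boldsymbol{w}_{\tau})
     +\beta\,\boldsymbol{w}^{\perp},\\
\|\boldsymbol{w}_{\tau_{0}}\|^{2}
  &= \sin^{2}\alpha\,\|\boldsymbol{w}^{\perp}\|^{2}
     +\cos^{2}\alpha\,\|\boldsymbol{w}_{\tau}\|^{2}
     +2\sin\alpha\cos\alpha\,\langle\boldsymbol{w}^{\perp},\boldsymbol{w}_{\tau}\rangle
   = 1 .
\end{aligned}
```

Under that assumption $\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}$ stays on the unit sphere, and letting $\alpha\rightarrow 0$ sends $\mathop{\boldsymbol{w}}\nolimits_{\tau_{0}}\rightarrow\mathop{\boldsymbol{w}}\nolimits_{\tau}$ and $\beta\rightarrow 0$ for fixed $\lambda$.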

Since $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ will lie in $\mathcal{K}_{\mu}$ for sufficiently large $n$ a.s. by Corollary 7, we assume, without loss of generality, that this holds for all $n$. Let $\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\tau}_{n}$ denote the sample Euclidean mean of $\tilde{\Psi}_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits_{1};\mathop{\boldsymbol{x}}\nolimits^{*}),\cdots,\tilde{\Psi}_{\tau}(\mathop{\boldsymbol{\xi}}\nolimits_{n};\mathop{\boldsymbol{x}}\nolimits^{*})$. Then, $\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\tau}_{n}\in\mathbb{R}(E\cup F)$ and, by Corollary 6, $\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\tau}_{n}\rightarrow\mathop{\boldsymbol{x}}\nolimits^{*}$ a.s. Also, application of (49) gives $P_{\tau\setminus\sigma}\left(\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\tau}_{n}\right)=P_{\tau\setminus\sigma}\left(\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\tau_{0}}_{n}\right)$. On the other hand, the argument for the proof of Theorem 4 implies that, for all sufficiently large $n$,

[TABLE]

so that, for all sufficiently large $n$ ,

[TABLE]

However, given that $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}$ is in $\mathcal{K}_{\mu}$, since $P_{\sigma}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n})=P_{\sigma}(\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\tau_{0}}_{n})$ by Corollary 4 $(ii)$ and Corollary 6, the condition $\hat{\mathop{\boldsymbol{\xi}}}\nolimits_{n}\in\tau$ is equivalent to the condition that $P_{\tau\setminus\sigma}\left(\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\tau_{0}}_{n}\right)$ lies in $\mathop{\mathcal{O}}\nolimits(F)$ and $-P_{\tau_{0}\setminus\tau}\left(\hat{\mathop{\boldsymbol{\xi}}}\nolimits^{\tau_{0}}_{n}\right)\in\mathop{\mathcal{O}}\nolimits(F_{0}\setminus F)$. Hence, we can re-express the above equality as

[TABLE]

The required result then follows by a slight modification to the proof of Theorem 4. ∎
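
The membership condition used in the last step of the proof can be illustrated numerically. The sketch below is not part of the paper. It assumes that the vectors $\tilde{\Psi}_{\tau_{0}}(\mathop{\boldsymbol{\xi}}\nolimits_{i};\mathop{\boldsymbol{x}}\nolimits^{*})$ have already been computed and are supplied as coordinate lists, and that the hypothetical arguments `f_axes` and `f0_minus_f_axes` give the coordinate indices of the axes in $F$ and in $F_{0}\setminus F$. It forms the Euclidean sample mean and checks that its $F$-coordinates are strictly positive and its $(F_{0}\setminus F)$-coordinates strictly negative, which is the open-orthant condition stated above.

```python
from statistics import fmean


def euclidean_mean(vectors):
    """Coordinate-wise mean of a list of equal-length coordinate vectors."""
    return [fmean(coords) for coords in zip(*vectors)]


def mean_lies_in_stratum(vectors, f_axes, f0_minus_f_axes):
    """Check the open-orthant condition on the Euclidean sample mean.

    vectors          -- assumed precomputed images of the sample points
    f_axes           -- hypothetical indices of the axes in F
    f0_minus_f_axes  -- hypothetical indices of the axes in F_0 \\ F
    """
    mean = euclidean_mean(vectors)
    # Projection onto the F axes must lie in the open orthant O(F).
    positive_on_f = all(mean[i] > 0 for i in f_axes)
    # Minus the projection onto F_0 \ F must lie in O(F_0 \ F).
    negative_on_rest = all(mean[j] < 0 for j in f0_minus_f_axes)
    return positive_on_f and negative_on_rest


if __name__ == "__main__":
    sample = [[1.0, 2.0, -1.0], [3.0, 0.0, -3.0]]
    print(euclidean_mean(sample))
    print(mean_lies_in_stratum(sample, f_axes=[0, 1], f0_minus_f_axes=[2]))
```

The inequalities are strict because $\mathop{\mathcal{O}}\nolimits(F)$ is an open orthant, so a mean with a zero coordinate on one of these axes fails the test.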

In fact, the argument for the proof of Proposition 11, in particular (49), also shows that, if $\tau=\mathop{\mathcal{O}}\nolimits(E\cup F)$ has co-dimension greater than $l(\mu)$ and if $\mathbb{R}\times\Theta_{\tau,\sigma}(\mathop{\boldsymbol{x}}\nolimits^{*};\mu)$ is contained in the interior of $\mathcal{K}_{\mu}$ , then ${\bf P}(\eta\in\mathbb{R}\times\mathop{\mathcal{O}}\nolimits(F))=0$ .

Acknowledgements. The authors are indebted to Megan Owen for her continuing helpful discussions, following her collaboration in [2] and [3]. We are also indebted to the referees for helpful suggestions to improve the description and presentation of our results. The second author acknowledges funding from the Engineering and Physical Sciences Research Council.

Bibliography

[1] M. Bacak (2014). Computing medians and means in Hadamard spaces, SIAM J. Optimiz. 24, 1542-1566.
[2] D. Barden, H. Le and M. Owen (2013). Central limit theorems for Fréchet means in the space of phylogenetic trees, Electron. J. Probab. 18, no. 25.
[3] D. Barden, H. Le and M. Owen (2016). Limiting behaviour of Fréchet means in the space of phylogenetic trees. To appear in Annals of the Institute of Statistical Mathematics.
[4] R. Bhattacharya and V. Patrangenaru (2005). Large sample theory of intrinsic and extrinsic sample means on manifolds-II, Ann. Statist. 33, 1225-1259.
[5] L.J. Billera, S.P. Holmes and K. Vogtmann (2001). Geometry of the space of phylogenetic trees, Advances in Applied Mathematics 27, 733-767.
[6] M.R. Bridson and A. Haefliger (1999). Metric Spaces of Non-positive Curvature. Springer-Verlag, Berlin/New York.
[7] B. Burago, Y. Burago and S. Ivanov (2001). A Course in Metric Geometry. American Mathematical Society.
[8] M. Goresky and R. MacPherson (1980). Stratified Morse Theory. Springer-Verlag, Berlin/New York.
