Uniqueness questions in a scaling-rotation geometry on the space of symmetric positive-definite matrices

David Groisser; Sungkyu Jung; Armin Schwartzman

arXiv:1702.03237·math.MG·December 22, 2017


TL;DR

This paper investigates the geometric structure of the space of symmetric positive-definite matrices, focusing on the uniqueness of minimal scaling-rotation curves and their relation to Grassmannian geometry, with results for dimensions up to 4 and from 11 onwards.

Contribution

It characterizes conditions for the uniqueness of minimal scaling-rotation curves in the eigen-decomposition geometry of SPD matrices, linking fiber structure and Grassmannian geometry.

Findings

01. Characterizes when the MSSR curve between two SPD matrices $X$ and $Y$ is unique.

02. Translates one type of non-uniqueness into a question about Grassmannian geometry, answered for $p\leq 4$ and $p\geq 11$.

03. Introduces a half-angle formula for principal angles between subspaces.


Equations

$$M(p):=SO(p)\times{\rm Diag}^{+}(p),$$

$$F(U,D)=UDU^{T}.$$

$$[(U,D)]=\{(UR,D):R\in G_{D}^{0}\},$$

$$g_{SO}|_{I}(A_{1},A_{2})=-\frac{1}{2}{\rm tr}(A_{1}A_{2}),$$

$$g_{{\cal D}^{+}}|_{D}(L_{1},L_{2})={\rm tr}(D^{-1}L_{1}D^{-1}L_{2})$$

$$g_{M}:=k\,g_{SO}\oplus g_{{\cal D}^{+}},$$

$$d_{M}^{2}((U,D),(V,\Lambda))=\cdots$$

$$(g,(U,D))\mapsto g\cdot(U,D):=(UP_{g}^{-1},\pi_{g}\cdot D)$$

$$d_{\cal SR}(X,Y):=\inf_{(U,D)\in{\cal E}_{X},\,(V,\Lambda)\in{\cal E}_{Y}}d_{M}((U,D),(V,\Lambda)).$$

$$d_{\cal SR}(X,Y)=\inf\{\ell(\gamma)\mid\gamma:[0,1]\to M(p)\ \mbox{is a geodesic with}\ \gamma(0)\in{\cal E}_{X},\ \gamma(1)\in{\cal E}_{Y}\}.$$

$$d_{\cal SR}(X,Y)$$

$$\wideparen{d}(g;(U,D),(V,\Lambda))=\min_{R_{U}\in G_{D}^{0},\,R_{V}\in G_{\Lambda}^{0}}\{d_{SO}(UR_{U},VR_{V}P_{g}^{-1})\}.$$

$${\cal E}_{X}\times{\cal E}_{Y}=\{(g_{1}\cdot(UR_{U},D),\,g_{2}\cdot(VR_{V},\Lambda)):R_{U}\in G_{D}^{0},\,R_{V}\in G_{\Lambda}^{0};\ g_{1},g_{2}\in\tilde{S}_{p}^{+}\}.$$

$$d_{M}(g_{1}\cdot(UR_{U},D),\,g_{2}\cdot(VR_{V},\Lambda))=d_{M}((UR_{U},D),\,(g_{1}^{-1}g_{2})\cdot(VR_{V},\Lambda)).$$

$$d_{M}((UR_{U},D),\,g_{2}\cdot(VR_{V},\Lambda))=d_{M}((UR_{U},D),\,h_{D}\cdot g_{1}\cdot h_{\Lambda}\cdot(VR_{V},\Lambda))$$

$$\chi'(t)=e^{tA}\{[A,U\Lambda(t)U^{T}]+UL\Lambda(t)U^{T}\}e^{-tA},$$

$$[A,U\Lambda(t_{0})U^{T}]+UL\Lambda(t_{0})U^{T}=0.$$

$$R_{V,2}P_{g_{2}}^{-1}R_{U,2}^{-1}=R_{V,1}P_{g_{1}}^{-1}R_{U,1}^{-1},$$

$$({\rm proj}_{SO(p)}\gamma_{1}'(0))R_{U_{1}}^{-1}=({\rm proj}_{SO(p)}\gamma_{2}'(0))R_{U_{2}}^{-1},$$

$$D$$

$$\Lambda_{2}$$

$$\mbox{and}\ R_{U,1}^{-1}R_{U,2}$$

$$\|\log(D^{-1}(\pi\cdot\Lambda))\|^{2}\geq|c_{1}+a_{\pi^{-1}(i)}-a_{i}|^{2}>(2pc_{1})^{2}=\|\log(D^{-1}\Lambda)\|^{2}+c.$$

$$\|\log(D^{-1}(\pi\cdot\Lambda))\|^{2}>\|\log(D^{-1}\Lambda)\|^{2}+k\,{\rm diam}(SO(p))^{2}$$

$$d_{\cal SR}(X,Y)^{2}$$

$$d_{M}((U,D),\,g_{1}\cdot(V,\Lambda))^{2}$$

$$d_{\cal SR}(X,Y)^{2}$$

$$\Phi_{m,p}(W)=P_{W^{\perp}}-P_{W}=I-2P_{W},$$

$${\sf R}(\theta_{1},\dots,\theta_{k})=\left[\begin{array}{cccc}C(\theta_{1})&&&\\ &C(\theta_{2})&&\\ &&\ddots&\\ &&&C(\theta_{k})\end{array}\right],$$

$$C(\theta)=\left[\begin{array}{cc}\cos\theta&-\sin\theta\\ \sin\theta&\cos\theta\end{array}\right]$$

$$C(\theta)=\exp(\theta J)\ \ \mbox{where}\ \ J=\left[\begin{array}{cc}0&-1\\ 1&0\end{array}\right].$$


Full text

Uniqueness questions in a scaling-rotation geometry on the space of symmetric positive-definite matrices

David Groisser

Sungkyu Jung

Armin Schwartzman

Department of Mathematics, University of Florida, Gainesville, FL 32611, USA

Department of Statistics, University of Pittsburgh, Pittsburgh, PA 15260, USA

Division of Biostatistics, University of California, San Diego, CA 92903, USA

Abstract

Jung et al. (2015) introduced a geometric structure on ${\rm Sym}^{+}(p)$ , the set of $p\times p$ symmetric positive-definite matrices, based on eigen-decomposition. Eigenstructure determines both a stratification of ${\rm Sym}^{+}(p)$ , defined by eigenvalue multiplicities, and fibers of the “eigen-composition” map $F:M(p):=SO(p)\times{\rm Diag}^{+}(p)\to{\rm Sym}^{+}(p)$ . When $M(p)$ is equipped with a suitable Riemannian metric, the fiber structure leads to notions of scaling-rotation distance between $X,Y\in{\rm Sym}^{+}(p)$ , the distance in $M(p)$ between fibers $F^{-1}(X)$ and $F^{-1}(Y)$ , and minimal smooth scaling-rotation (MSSR) curves, images in ${\rm Sym}^{+}(p)$ of minimal-length geodesics connecting two fibers. In this paper we study the geometry of the triple $(M(p),F,{\rm Sym}^{+}(p))$ , focusing on some basic questions: For which $X,Y$ is there a unique MSSR curve from $X$ to $Y$ ? More generally, what is the set ${\cal M}(X,Y)$ of MSSR curves from $X$ to $Y$ ? This set is influenced by two potential types of non-uniqueness. We translate the question of whether the second type can occur into a question about the geometry of Grassmannians $G_{m}({\bf R}^{p})$ , with $m$ even, that we answer for $p\leq 4$ and $p\geq 11$ . Our method of proof also yields an interesting half-angle formula concerning principal angles between subspaces of ${\bf R}^{p}$ whose dimensions may or may not be equal. The general- $p$ results concerning MSSR curves and scaling-rotation distance that we establish here underpin the explicit $p=3$ results in Groisser et al. (2017). Addressing the uniqueness-related questions requires a thorough understanding of the fiber structure of $M(p)$ , which we also provide.

keywords: eigen-decomposition, stratified spaces, scaling-rotation distance, signed-permutation group, geometric structures on quotient spaces, principal angles, geometry of Grassmannians

MSC: [2010] 53C99, 53C15, 57R15, 53C22, 51F25, 15A18, 58A35

This work was supported by NIH grant R21EB012177 and NSF grant DMS-1307178.

1 Introduction

In this work, we investigate a geometric structure on ${\rm Sym}^{+}(p)$ , the set of $p\times p$ symmetric positive-definite (SPD) matrices, $p>1$ , and special curves that this structure gives rise to. Both the geometric structure and these special curves are built from eigen-decomposition of SPD matrices.

Let ${\rm Diag}^{+}(p)$ denote the set of $p\times p$ diagonal matrices with positive diagonal entries. By an (orthonormal) eigen-decomposition of $X\in{\rm Sym}^{+}(p)$ we will mean a pair $(U,D)\in SO(p)\times{\rm Diag}^{+}(p)$ such that $X=UDU^{-1}=UDU^{T}$ . The space of such decompositions,

$$M(p):=SO(p)\times{\rm Diag}^{+}(p),$$

thus comes naturally equipped with a smooth surjective map $F:M\to{\rm Sym}^{+}(p)$ defined by

$$F(U,D)=UDU^{T}.$$

For each $X\in{\rm Sym}^{+}(p)$ we call the set ${\cal E}_{X}:=F^{-1}(X)$ the fiber over $X$. However, $M(p)$ is not a fiber bundle over ${\rm Sym}^{+}(p)$ with projection $F$; the map $F$ is not even a submersion. (Rather, the relation of $M(p)$ to ${\rm Sym}^{+}(p)$ is reminiscent of the notion of blow-up in algebraic geometry: $M(p)$ can be viewed as a sort of blow-up of ${\rm Sym}^{+}(p)$ along several subvarieties.) The natural action $SO(p)\times{\rm Sym}^{+}(p)\to{\rm Sym}^{+}(p)$, $(U,X)\mapsto UXU^{T}$, endows ${\rm Sym}^{+}(p)$ with a stratification by orbit-type, and the derivative of $F$ is nonsingular only on the pre-image of the “top” stratum. This stratification is identical to the stratification by “eigenvalue-multiplicity type”, in which the strata are labeled by partitions of the integer $p$. Eigenvalue multiplicities also determine a more refined stratification of the space $M(p)$, in which the strata are labeled by partitions of the set $\{1,\dots,p\}$. Appendix B reviews these stratifications.

The fiber structure of $M(p)$ formalizes the notion of minimal smooth scaling-rotation curves [10]. In 2006, motivated by applications to diffusion-tensor imaging, Schwartzman [14] introduced smooth scaling-rotation curves as a way of interpolating between SPD matrices in such a way that eigenvectors and eigenvalues both change at uniform speed. Minimal smooth scaling-rotation curves were defined in [10] as smooth curves of shortest length as determined by an appropriate Riemannian metric on $M(p)$ —curves that minimize a suitable measure of the amount of scaling and rotation needed to transform one SPD matrix into another.

More precisely, each factor of $M(p)$ is a Lie group, and for our Riemannian metric $g_{M}$ on $M(p)$ we take a product metric determined by choosing bi-invariant metrics $g_{SO},{g_{{\cal D}^{+}}}$ on the factors. We define smooth scaling-rotation (SSR) curves in ${\rm Sym}^{+}(p)$ to be the projections to ${\rm Sym}^{+}(p)$ of geodesics in $(M(p),g_{M})$ . In this scaling-rotation framework, the “distance” $d_{\cal SR}(X,Y)$ between any two matrices $X,Y\in{\rm Sym}^{+}(p)$ is defined to be the distance between the fibers ${\cal E}_{X}$ and ${\cal E}_{Y}$ (nonzero if $X\neq Y$ since each fiber is compact). We use the term $F$ -minimal geodesic for a minimal-length geodesic connecting two fibers ${\cal E}_{X}$ and ${\cal E}_{Y}$ , and minimal pair for the pair of endpoints of such a geodesic. A minimal smooth scaling-rotation (MSSR) curve is the image under $F$ of an $F$ -minimal geodesic.

As shown in [10], $d_{\cal SR}$ restricts to a metric on the top stratum of ${\rm Sym}^{+}(p)$ , but is not a metric on all of ${\rm Sym}^{+}(p)$ . In [8], we show that $d_{\cal SR}$ generates a true metric $\rho_{\cal SR}$ on ${\rm Sym}^{+}(p)$ and investigate features of this metric. But fully understanding the geometry of the metric $\rho_{\cal SR}$ relies on first understanding MSSR curves, the function $d_{\cal SR}$ , and related issues we address in the present article.

This paper is devoted primarily to uniqueness-related issues that arise in studying MSSR curves, and to some unanticipated geometric results (described in more detail below), potentially of independent interest, that were discovered as a result of studying these issues.

A thorough understanding of the fibers of $F$ is key to analyzing several features of the scaling-rotation framework, including these uniqueness-related issues. Appendix A provides a thorough picture of the fiber structure of $M(p)$ , including its inextricable tie to the group ${\tilde{S}}_{p}^{+}$ of “even signed-permutations”, a group not to be confused with a more familiar group of the same order and similar-sounding description in terms of signs and permutations, the Weyl group of the simple Lie algebra $D_{p}$ . Some results proven in Appendix A are applied earlier in the main body of this paper, and some were previously stated without proof in [7] and applied there.

The uniqueness-related results in this paper contribute to a rigorous and systematic description of the geometry and topology of the triple $(M(p),F,{\rm Sym}^{+}(p))$ , and to a firm foundation for further study of the scaling-rotation framework, such as in [7] and [8].

Some of the uniqueness issues we study are related directly to (non-)uniqueness of MSSR curves themselves, while others are related more directly to (non-)uniqueness of minimal pairs. It is easy to see that for all $X,Y\in{\rm Sym}^{+}(p)$ , at least one MSSR curve from $X$ to $Y$ exists; however, such curves are not always unique. The dependence on $X$ and $Y$ of the set ${\cal M}(X,Y)$ of such curves is quite intricate, and relates strongly to the stratified nature of ${\rm Sym}^{+}(p)$ . Non-uniqueness issues for minimal pairs are important because computing $d_{\cal SR}(X,Y)$ and MSSR curves from $X$ to $Y$ requires finding minimal pairs in ${\cal E}_{X}\times{\cal E}_{Y}$ . Even when the resulting MSSR curve from $X$ to $Y$ is unique, minimal pairs in ${\cal E}_{X}\times{\cal E}_{Y}$ are never unique, because ${\tilde{S}}_{p}^{+}$ acts on $M(p)$ in a nontrivial isometric, fiber-preserving fashion. This action carries minimal pairs to minimal pairs. For some $(X,Y)$ , there are also minimal pairs that are not related to each other by this action.

The broad structure of this paper is as follows. Section 2 establishes notation. Sections 3 and 4 contain the statements of most of our main results, which we will describe below, and those proofs that can be given quickly. The proofs of many of our results—especially the “bonus” results that are applicable outside the scaling-rotation framework entirely—are quite long; these occupy Sections 5–7.

We devote the remainder of this introduction to a more detailed outline of the paper, and more detailed descriptions of the questions we study and the results we achieve.

In Section 3.1 we review the basics of SSR curves, before restricting attention to MSSR curves in Section 3.2 and beyond. In Section 3.2 we discuss the computational-complexity problem arising from the non-uniqueness of minimal pairs. Proposition 3.7 takes advantage of the $\tilde{S}_{p}^{+}$ -action by using double-cosets in ${\tilde{S}}_{p}^{+}$ to reduce the complexity of computing $d_{\cal SR}(X,Y)$ , of characterizing all minimal pairs, and of finding all MSSR curves from $X$ to $Y$ . Proposition 3.7 was applied in [7] to help derive closed-form formulas for $d_{\cal SR}$ and MSSR curves for $p=3$ . Also discussed and proven in Section 3.2 is a general result about scaling-rotation curves: All such curves are either constant-maps or immersions. This result is important for an understanding of MSSR curves.

In Section 3.3, we begin to address uniqueness questions for MSSR curves, the most basic of which is: under what conditions on $X,Y\in{\rm Sym}^{+}(p)$ is there more than one MSSR curve from $X$ to $Y$ ? A more refined version of this question is: for each pair $(X,Y)$ , what is the set ${\cal M}(X,Y)$ explicitly? By characterizing all minimal pairs, Proposition 3.7 provides a starting point for answering this question. Among this proposition’s outcomes is also the fact that, for each pair $(X,Y)$ , every MSSR curve from $X$ to $Y$ is represented by a minimal pair whose first point lies in any given connected component of ${\cal E}_{X}$ . But to completely understand ${\cal M}(X,Y)$ —or even just determine its cardinality—we still need a way to tell whether MSSR curves corresponding to two (not necessarily distinct) minimal pairs with first point in a given connected component of ${\cal E}_{X}$ are the same. Proposition 3.11 gives a necessary and sufficient criterion. This result was applied in [7], where it enabled an explicit computation of the sets ${\cal M}(X,Y)$ for $p=3$ when $X$ and $Y$ do not both lie in the top stratum.

In Section 3.3 we also define two different ways that non-uniqueness of MSSR curves can occur. Given $X,Y\in{\rm Sym}^{+}(p)$ , for there to be more than one MSSR curve from $X$ to $Y$ , there must exist distinct shortest-length geodesics $\gamma_{1},\gamma_{2}:[0,1]\to M(p)$ from ${\cal E}_{X}$ to ${\cal E}_{Y}$ such that $F\circ\gamma_{1}\neq F\circ\gamma_{2}$ . There are essentially two ways, not mutually exclusive, that this can happen: (i) there can exist such $\gamma_{i}$ ( $i=1,2$ ) whose endpoint-pairs are distinct minimal pairs, and (ii) there exist such $\gamma_{i}$ whose endpoint-pairs are the same minimal pair. We call these possibilities “Type I” and “Type II” non-uniqueness, respectively. Proposition 3.11 applies to both.

The study of Type II non-uniqueness, which we begin in Section 3.4, turns out to be especially fruitful. A minimal pair $((U,D),(V,{\Lambda}))\in M(p)\times M(p)$ has more than one minimal geodesic connecting its points if and only if the pair $(U,V)\in SO(p)\times SO(p)$ is geodesically antipodal (Definition 3.10), which is equivalent to $V^{-1}U$ being an involution. Our chief tool for determining whether such minimal pairs exist is a property we call sign-change reducibility: we say that the pair $(U,V)$ is sign-change reducible if $d_{SO}(U,V)$ can be reduced by multiplying $U$ or $V$ by a (positive-determinant) “sign-change matrix”, a diagonal matrix each of whose diagonal entries is $\pm 1$.
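To make these two notions concrete, here is a minimal sketch for $p=3$ that builds a geodesically antipodal pair and checks that a sign-change matrix reduces its distance. It rests on assumptions that are not stated in the paragraph above: the sign-change matrix is applied to $U$ on the right, and $d_{SO}(U,V)$ on $SO(3)$ is taken to be the rotation angle of $U^{-1}V$ (which is what the metric $g_{SO}|_{I}(A_{1},A_{2})=-\frac{1}{2}{\rm tr}(A_{1}A_{2})$ gives for $p=3$). All function names are hypothetical.

```python
from math import acos, pi, sqrt

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def rotation_angle(R):
    """Angle in [0, pi] of a rotation R in SO(3), from trace(R) = 1 + 2 cos(angle)."""
    cosine = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    return acos(max(-1.0, min(1.0, cosine)))

def d_SO(U, V):
    """Assumed geodesic distance on SO(3): the rotation angle of U^{-1} V."""
    return rotation_angle(matmul(transpose(U), V))

def half_turn(axis):
    """Rotation by pi about a unit vector n: 2 n n^T - I, an involution in SO(3)."""
    return [[2.0 * axis[i] * axis[j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

def diagonal(entries):
    return [[float(entries[i]) if i == j else 0.0 for j in range(3)]
            for i in range(3)]

# The four positive-determinant sign-change matrices for p = 3.
SIGN_CHANGES = [diagonal(s) for s in ((1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1))]

n = [1.0 / sqrt(3.0)] * 3
U, V = half_turn(n), diagonal((1, 1, 1))   # V^{-1} U is an involution

original = d_SO(U, V)                      # pi, up to rounding
reduced = min(d_SO(matmul(U, S), V) for S in SIGN_CHANGES)
print(original, reduced, reduced < original)
```

For a half-turn about a unit vector $n$, the trace of $U\,{\rm diag}(1,-1,-1)$ works out to $4n_{1}^{2}-1$, so some sign change shortens the distance whenever a coordinate of $n$ is nonzero, which is always the case.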

We show in Proposition 3.20 that if $(U,V)\in SO(p)\times SO(p)$ is not sign-change reducible, then there exist $D,{\Lambda}$ in the top stratum of ${\rm Diag}^{+}(p)$ such that $((U,D),(V,{\Lambda}))$ is a minimal pair. We show in Proposition 3.18 that for $p\leq 4$ , every geodesically antipodal pair $(U,V)$ is sign-change reducible, and that for $p\geq 11$ , there exist geodesically antipodal pairs that are not sign-change reducible. From these propositions we deduce that Type II non-uniqueness never occurs for $p\leq 4$ (Corollary 3.19), and that it always occurs for some $(X,Y)\in{\rm Sym}^{+}(p)\times{\rm Sym}^{+}(p)$ if $p\geq 11$ (Corollary 3.21). We do not believe that either of the numbers 4 and 11 above is sharp; our methods are simply not conclusive when $5\leq p\leq 10$ .

Together, Proposition 3.20 and Corollary 3.21 show that sign-change reducibility is the only obstruction to having points $X,Y$ in the top stratum of ${\rm Sym}^{+}(p)$ for which the set ${\cal M}(X,Y)$ exhibits Type II non-uniqueness.

Even without Proposition 3.18, for $p\leq 3$ it is rather trivial that all geodesically antipodal pairs are sign-change reducible, and for $p=4$ an independent proof relying on quaternions is also possible. However, our proof of the $p\leq 4$ part of Proposition 3.18 makes no use of quaternions, and unifies these low- $p$ results.

Our proof of Proposition 3.18, completed in Section 7 after laying groundwork in Sections 4–6, takes us in unexpected directions, with unanticipated consequences. We initially introduced the notion of sign-change reducibility into our scaling-rotation-curve study as an ad hoc tool to help us determine whether Type II non-uniqueness of MSSR curves, impossible for $p\leq 4$ , is ever possible. This is equivalent to answering the question “Are all geodesically antipodal pairs in $SO(p)\times SO(p)$ sign-change reducible?” But as we show in Proposition 4.11, a refined version of the latter question is equivalent to a question purely about the geometry of Grassmannians equipped with a standard Riemannian metric: for $m$ even and positive, is every $m$ -dimensional subspace of ${\bf R}^{p}$ within a certain distance $c(m)$ of a coordinate $m$ -plane? (This question can, of course, be asked without restricting the parity of $m$ , but the above equivalence leads us to consider only even $m$ in this paper.) By constructing examples, we show that for $m=2$ , the answer to the Grassmannian question is no for $p\geq 11$ . This, combined with the equivalence result in Proposition 4.11, yields the “ $p\geq 11$ ” part of Proposition 3.18 mentioned above. The “ $p\leq 4$ ” part of Proposition 3.18 is proven by other means (via the more technical Proposition 4.6).

While the possibility of Type-II non-uniqueness is what led us to the question above about Grassmannians, this question and our study of it may be of independent interest. Our study led us to investigate several related questions concerning distances between (even-dimensional) subspaces of ${\bf R}^{p}$ and (even-dimensional) coordinate planes not necessarily of the same dimension. Perhaps the most unexpected of these is a half-angle relation stated in Proposition 4.10 and proven in Section 5: for any two involutions $R_{1},R_{2}\in SO(p)$ , each of the principal angles between the $(-1)$ -eigenspaces of $R_{1}$ and $R_{2}$ is exactly half a correspondingly indexed normal-form angle of $R_{1}R_{2}$ . This relationship holds whether or not the dimensions of the $(-1)$ -eigenspaces are equal. When the dimensions are equal, we use this relationship to show that a natural correspondence between ${\rm Gr}_{m}({\bf R}^{p})$ and a connected component of the set of involutions in $SO(p)$ is a metric-space isometry up to a constant factor of 2 (Proposition 4.9). This isometric relation is also deducible (and may already be known) from a purely Riemannian approach, but our proof uses essentially no Riemannian geometry (see Remark 5.5 for a more precise statement, and an additional interpretation of what our proof of Proposition 4.9 shows).
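The half-angle relation can be checked numerically in the simplest case, $p=3$, where every involution in $SO(3)$ is a half-turn about a unit vector $n$ and its $(-1)$-eigenspace is the plane $n^{\perp}$. The sketch below is only an illustration of that special case, not the paper's proof; it assumes (a standard fact, not stated in the text above) that the nonzero principal angle between two planes in ${\bf R}^{3}$ is the acute angle between their normals. Helper names are hypothetical.

```python
from math import acos, cos, sin

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_angle(R):
    """Angle in [0, pi] of a rotation R in SO(3), from trace(R) = 1 + 2 cos(angle)."""
    cosine = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    return acos(max(-1.0, min(1.0, cosine)))

def half_turn(axis):
    """Involution in SO(3) whose (-1)-eigenspace is the plane orthogonal to axis."""
    return [[2.0 * axis[i] * axis[j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

n1 = [0.0, 0.0, 1.0]
n2 = [sin(0.4), 0.0, cos(0.4)]             # unit vector at angle 0.4 from n1

# Acute angle between the normals = nonzero principal angle between the planes.
dot = sum(a * b for a, b in zip(n1, n2))
principal_angle = acos(abs(dot))

# Rotation angle of the product of the two involutions.
product_angle = rotation_angle(matmul(half_turn(n1), half_turn(n2)))

print(principal_angle, product_angle)
```

Here ${\rm tr}(R_{1}R_{2})=4(n_{1}\cdot n_{2})^{2}-1$, so the rotation angle $\theta$ of $R_{1}R_{2}$ satisfies $\cos\theta=2(n_{1}\cdot n_{2})^{2}-1$, the double-angle identity behind the factor of two.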

The most important results coming from our study of sign-change reducibility are stated in Section 4, with the proofs deferred to Sections 5, 6, and 7. These results include those mentioned above, and one more whose statement involves terminology not included in this Introduction: Proposition 4.8, a special case of a more general conjecture we make about sign-change reducibility (Conjecture 4.7). Key to almost all of these results is the technical Lemma 5.2, which establishes several facts concerning the product of a general involution in $SO(p)$ and a positive-determinant sign-change matrix.

We mention in passing that there is a vast body of literature devoted to defining and studying “distance-functions” (not necessarily true metrics) on ${\rm Sym}^{+}(p)$ different from the scaling-rotation distance $d_{\cal SR}$ and metric $\rho_{\cal SR}$ ; for a discussion and comparison see [7] and the references therein.

2 Notational preliminaries

In this paper, when a group $G$ acts from the left on a set $X$ in a previously specified way, we generally denote the action simply by $(g,x)\mapsto g\cdot x$.

Let ${\rm Part}(\{1,\dots,p\})$ denote the set of partitions of $\{1,2,\dots,p\}$, and ${\rm Part}(p)$ the set of partitions of the integer $p$. Let ${\rm Diag}(p)$ denote the set of $p\times p$ diagonal matrices. Each $D\in{\rm Diag}(p)$ naturally determines an element ${\sf J}_{D}\in{\rm Part}(\{1,\dots,p\})$ according to “which eigenvalues are equal” (see Notation A.1). The group $SO(p)$ acts on ${\rm Sym}^{+}(p)$ on the left via $(U,X)\mapsto UXU^{T}$. The stabilizer $G_{D}$ of $D\in{\rm Diag}^{+}(p)$ under this action depends only on ${\sf J}_{D}$; that is, $G_{D_{1}}=G_{D_{2}}$ if ${\sf J}_{D_{1}}={\sf J}_{D_{2}}$. For each ${\sf J}\in{\rm Part}(\{1,\dots,p\})$ we may therefore define a subgroup $G_{\sf J}\subset SO(p)$ by declaring $G_{\sf J}$ to be $G_{D}$ for any $D$ for which ${\sf J}_{D}={\sf J}$. We write $G_{D}^{0},G_{\sf J}^{0}$ for the identity components of $G_{D},G_{\sf J}$ respectively. See Appendix A for an alternative definition of $G_{\sf J}$ and additional facts concerning these groups.

We call an element $U\in O(p)$ a signed-permutation matrix if every entry of $U$ is either $0$ or $\pm 1$, and call a signed-permutation matrix even if it lies in $SO(p)$. The set of even signed-permutation matrices forms a subgroup ${\tilde{S}}_{p}^{+}\subset SO(p)$ of order $2^{p-1}p!$. As discussed in Appendix A (Section A.2), we view this subgroup as a canonical copy of an “abstract” group ${\tilde{S}}_{p}^{+}$ of even signed-permutations, an extension of the symmetric group $S_{p}$. We will typically denote an even signed-permutation by the letter $g$, and the corresponding matrix by $P_{g}$. We denote the natural epimorphism ${\tilde{S}}_{p}^{+}\to S_{p}$ by $g\mapsto\pi_{g}$. The group ${\tilde{S}}_{p}^{+}$ plays a critical role in understanding the fibers of $F$ (starting with Corollary 3.2 in the next section) and in simplifying computations of $d_{\cal SR}$. This group, which is not encountered in geometry as often as another group of the same order, is discussed in greater detail in Appendix A.
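The order $2^{p-1}p!$ can be confirmed by brute force for small $p$. The sketch below enumerates signed-permutation matrices and keeps those of determinant $+1$, using the fact that the determinant of such a matrix is the sign of the underlying permutation times the product of its nonzero entries. The function names are hypothetical and the enumeration is only practical for small $p$.

```python
from itertools import permutations, product
from math import factorial

def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one (inversion count)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def even_signed_permutation_matrices(p):
    """List all p x p signed-permutation matrices with determinant +1."""
    matrices = []
    for perm in permutations(range(p)):
        for signs in product((1, -1), repeat=p):
            # det = sign(perm) * product of the nonzero entries
            det = permutation_sign(perm)
            for s in signs:
                det *= s
            if det == 1:
                # Column j has its single nonzero entry, signs[j], in row perm[j].
                matrix = [[0] * p for _ in range(p)]
                for j in range(p):
                    matrix[perm[j]][j] = signs[j]
                matrices.append(matrix)
    return matrices

for p in (2, 3, 4):
    count = len(even_signed_permutation_matrices(p))
    print(p, count, 2 ** (p - 1) * factorial(p))
```

Exactly half of the $2^{p}p!$ signed-permutation matrices have determinant $+1$, which is where the factor $2^{p-1}$ comes from.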

We define ${\cal I}_{p}^{+}={\tilde{S}}_{p}^{+}\cap{\rm Diag}(p)$, and call elements of ${\cal I}_{p}^{+}$ (even) sign-change matrices. We view ${\cal I}_{p}^{+}$ as a copy of a (certain) index-two subgroup of $({\bf Z}_{2})^{p}$, as discussed in Appendix B. We will typically denote an element of the abstract group ${\cal I}_{p}^{+}$ by the letter $\sigma$, and the corresponding matrix by $I_{\boldsymbol{\sigma}}$.

Notation 2.1

(a) For ${\sf J}=\{J_{1},\dots,J_{r}\}\in{\rm Part}(\{1,\dots,p\})$, define (i) $\Gamma_{\sf J}={\tilde{S}}_{p}^{+}\cap G_{\sf J}$, (ii) $\Gamma_{\sf J}^{0}=\Gamma_{\sf J}\cap G_{\sf J}^{0}={\tilde{S}}_{p}^{+}\cap G_{\sf J}^{0}$, (iii) $K_{\sf J}=\{\pi\in S_{p}:\pi(J_{i})=J_{i},\ 1\leq i\leq r\}\subset S_{p}$, and (iv) $\tilde{K}_{\sf J}=\{g\in{\tilde{S}}_{p}^{+}:\pi_{g}\in K_{\sf J}\}$. Observe that ${\cal I}_{p}^{+}\subset\tilde{K}_{\sf J}$, and that $K_{\sf J}=\{\pi\in S_{p}\mid\pi\cdot D=D\ \mbox{for {\em some} $D$ with ${\sf J}_{D}={\sf J}$}\}=\{\pi\in S_{p}\mid\pi\cdot D=D\ \mbox{for {\em all} $D$ with ${\sf J}_{D}={\sf J}$}\}.$

(b) For any $X\in{\rm Sym}^{+}(p)$ and $(U,D)\in{\cal E}_{X}$ , define

$$[(U,D)]=\{(UR,D):R\in G_{D}^{0}\},$$

the connected component of ${\cal E}_{X}$ containing $(U,D)$ . We write ${\rm Comp}({\cal E}_{X})$ for the set of connected components of ${\cal E}_{X}$ .

(c) For any Lie group $G$ and closed subgroup $K$ , we write $G/K$ and $K\backslash G$ for the spaces of left- and right-cosets, respectively, of $K$ in $G$ .

3 The scaling-rotation framework and some results for scaling-rotation curves

The Lie groups $SO(p)$ and ${\rm Diag}^{+}(p)$ carry natural bi-invariant Riemannian metrics. If we endow $M(p)=SO(p)\times{\rm Diag}^{+}(p)$ with a product Riemannian metric $g_{M}$ constructed from these, the geodesics $\gamma$ in $(M(p),g_{M})$ are easily computed. We define smooth scaling-rotation (SSR) curves in ${\rm Sym}^{+}(p)$ to be the projections to ${\rm Sym}^{+}(p)$ of the geodesics in $(M(p),g_{M})$ , i.e. curves of the form $F\circ\gamma$ . (In [14] and [10] these were called simply “scaling-rotation curves”. In Section 3.2 we explain why we have added “smooth” to this name.)

3.1 Smooth scaling-rotation curves

The Lie algebra ${\mathfrak{so}}(p)=T_{I}(SO(p))$ is the space of $p\times p$ antisymmetric matrices. The bi-invariant Riemannian metric $g_{SO}$ on $SO(p)$ we will use is defined at the identity $I\in SO(p)$ by

$$g_{SO}|_{I}(A_{1},A_{2})=-\frac{1}{2}{\rm tr}(A_{1}A_{2}).\tag{3.1}$$

(The requirement of bi-invariance determines a Riemannian metric on $SO(p)$ up to a constant factor unless $p=4$ , of course, but for all $p\geq 3$ the inner product (3.1) is a multiple of the Killing form.)

Since the abelian Lie group ${\rm Diag}^{+}(p)$ is an open subset of the vector space ${\rm Diag}(p)$ , for each $D\in{\rm Diag}^{+}(p)$ we will identify $T_{D}({\rm Diag}^{+}(p))$ canonically with ${\rm Diag}(p)$ . With this identification understood, the invariant Riemannian metric ${g_{{\cal D}^{+}}}$ we use is defined by

$$g_{{\cal D}^{+}}|_{D}(L_{1},L_{2})={\rm tr}(D^{-1}L_{1}D^{-1}L_{2}),\tag{3.2}$$

where $D\in{\rm Diag}^{+}(p)$ and $L_{1},L_{2}\in T_{D}({\rm Diag}^{+}(p)).$ Up to a constant factor, ${g_{{\cal D}^{+}}}$ is the unique (bi-)invariant metric on ${\rm Diag}^{+}(p)$ that is also invariant under the natural action of the symmetric group $S_{p}$ .

Naturally identifying $T_{(U,D)}M(p)$ with $T_{U}(SO(p))\oplus T_{D}({\rm Diag}^{+}(p))$, the Riemannian metric on $M(p)$ we will use is

$$g_{M}:=k\,g_{SO}\oplus g_{{\cal D}^{+}},\tag{3.3}$$

where $k>0$ is an arbitrary parameter that can be chosen as desired for applications.

Definition 3.1

A smooth scaling-rotation (SSR) curve is a curve $\chi$ in ${\rm Sym}^{+}(p)$ of the form $F\circ\gamma$, where $\gamma:I\to M(p)$ is a geodesic defined on some interval $I$.

In this paper, we use curve sometimes to mean a parametrized curve (a map with domain some interval), and sometimes to mean an equivalence class of such maps, where two maps are regarded as equivalent if one is a monotone reparametrization of the other. Also, we use the noun geodesic sometimes to mean a complete geodesic and sometimes to mean a geodesic segment. Our intended meanings should always be clear from context.

The geodesics $\gamma$ in $M(p)$ are exactly the curves of the form $t\mapsto(\gamma_{1}(t),\gamma_{2}(t))$ , where $\gamma_{1}$ is a geodesic in $(SO(p),g_{SO})$ and $\gamma_{2}$ is a geodesic in $({\rm Diag}^{+}(p),{g_{{\cal D}^{+}}})$ . Since the metrics $g_{SO}$ and ${g_{{\cal D}^{+}}}$ are bi-invariant, the geodesics in $(SO(p),g_{SO})$ and $({\rm Diag}^{+}(p),{g_{{\cal D}^{+}}})$ can be obtained as either left-translates or right-translates of geodesics through the identity. For agreement with [10] and [7], in this paper we use right-translates.
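A small $p=2$ sketch may help picture an SSR curve. It assumes the right-translate parametrization $\gamma(t)=(\exp(tA)U,\exp(tL)D)$ with $A\in{\mathfrak{so}}(2)$ and $L\in{\rm Diag}(2)$, so that $\chi(t)=F(\gamma(t))$; for $p=2$ the rotation factor is just a planar rotation whose angle grows linearly in $t$. The parameter names (`rotation_rate`, `log_rates`, and so on) are hypothetical.

```python
from math import cos, sin, exp

def rotation(theta):
    """2x2 rotation matrix exp(theta * J), with J = [[0, -1], [1, 0]]."""
    return [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]

def eigen_composition(U, diagonal):
    """F(U, D) = U D U^T, with the diagonal matrix D given by its diagonal entries."""
    n = len(diagonal)
    return [[sum(U[i][k] * diagonal[k] * U[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]

def ssr_curve(t, rotation_rate, initial_angle, initial_scales, log_rates):
    """chi(t) = F(exp(tA) U, exp(tL) D) for p = 2.

    A = rotation_rate * J, U = rotation(initial_angle),
    D = diag(initial_scales), L = diag(log_rates).
    """
    # exp(tA) U: planar rotations commute, so the angles simply add.
    U_t = rotation(t * rotation_rate + initial_angle)
    # exp(tL) D: each eigenvalue changes at a constant logarithmic rate.
    scales_t = [d * exp(t * l) for d, l in zip(initial_scales, log_rates)]
    return eigen_composition(U_t, scales_t)

for t in (0.0, 0.5, 1.0):
    chi = ssr_curve(t, rotation_rate=0.8, initial_angle=0.2,
                    initial_scales=(1.0, 3.0), log_rates=(0.5, -0.25))
    print(t, chi)
```

Along the curve the eigenvectors rotate at a constant rate while the eigenvalues evolve as $d_{i}e^{tl_{i}}$, so the trace and determinant of $\chi(t)$ are determined by the scaling part alone.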

It is well known that in the Riemannian manifold $(SO(p),g_{SO})$, the cut-locus of the identity is the set of all involutions, $\{R\in SO(p)\mid R^{2}=I\neq R\}$. For every non-involution $R\in SO(p)$, there is a unique $A\in{\mathfrak{so}}(p)$ of smallest norm such that $\exp(A)=R$ (see Section 4.1); we define $\log(R)=A$. If $R$ is an involution, there is not a unique such $A$, but all minimal-norm $A$’s with $\exp(A)=R$ have the same norm, which we denote $\|\log(R)\|$. (Thus $\|\log(R)\|$ is a well-defined real number for all $R\in SO(p)$, even when there is no uniquely defined element “$\log R$” in ${\mathfrak{so}}(p)$.) With this understood, the geodesic-distance function $d_{M}$ on $M(p)$ is given by

[TABLE]

where in (3.5) and for the rest of this paper, $\|\ \|$ denotes the Frobenius norm on matrices: $\|A\|^{2}=\|A\|_{F}^{2}={\rm tr}(A^{T}A)$ for any matrix $A$ .

The invariances of the metrics $d_{SO}$ and $d_{{\cal D}^{+}}$ lead to the following proposition, key to many of our results (e.g. Proposition 3.7, Proposition A.6, and Corollary A.7).

Proposition 3.2

The map ${\tilde{S}}_{p}^{+}\times M(p)\to M(p)$ defined by

[TABLE]

is a free, isometric, left-action of ${\tilde{S}}_{p}^{+}$ on $M(p)$ that preserves every fiber of $F$.

3.2 Scaling-rotation distance and MSSR curves

Definition 3.3 ([10, Definition 3.10])

For $X,Y\in{\rm Sym}^{+}(p)$ , the scaling-rotation distance $d_{\cal SR}(X,Y)$ between $X$ and $Y$ is defined by

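$$d_{\cal SR}(X,Y):=\inf_{(U,D)\in{\cal E}_{X},\ (V,{\Lambda})\in{\cal E}_{Y}}d_{M}((U,D),(V,{\Lambda})).\tag{3.7}$$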

Definition 3.4

Let $\gamma$ be a piecewise-smooth curve in $M(p)$ and let $\ell(\gamma)$ denote the length of $\gamma$. For $X,Y\in{\rm Sym}^{+}(p)$, we call $\gamma:[0,1]\to M(p)$ an $F$-minimal geodesic (from ${\cal E}_{X}$ to ${\cal E}_{Y}$) if $\gamma(0)\in{\cal E}_{X}$, $\gamma(1)\in{\cal E}_{Y}$, and $\ell(\gamma)=d_{\cal SR}(X,Y)$. We call a pair of points $((U,D),(V,{\Lambda}))\in{\cal E}_{X}\times{\cal E}_{Y}$ a minimal pair if $(U,D)=\gamma(0)$ and $(V,{\Lambda})=\gamma(1)$ for some $F$-minimal geodesic $\gamma$. A minimal smooth scaling-rotation (MSSR) curve from $X$ to $Y$ is a curve $\chi$ in ${\rm Sym}^{+}(p)$ of the form $F\circ\gamma$ where $\gamma$ is an $F$-minimal geodesic. We say that the MSSR curve $\chi=F\circ\gamma$ corresponds to the minimal pair formed by the endpoints of $\gamma$. We let ${\cal M}(X,Y)$ denote the set of MSSR curves from $X$ to $Y$.

Obviously an $F$ -minimal geodesic is a minimal geodesic in the usual sense of Riemannian geometry: it is a curve of shortest length among all piecewise-smooth curves with the same endpoints. (From the general theory of geodesics, the image of any such curve $\gamma$ is actually smooth.) Thus a definition equivalent to (3.7) is

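$$d_{\cal SR}(X,Y)=\inf\{\ell(\gamma)\ :\ \gamma:[0,1]\to M(p)\ \text{is a geodesic with}\ \gamma(0)\in{\cal E}_{X},\ \gamma(1)\in{\cal E}_{Y}\}.\tag{3.8}$$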

Thus an $F$ -minimal geodesic can alternatively be defined as a geodesic of minimal length among all geodesics starting in one given fiber and ending in another.

Every fiber of $F$ is compact (an explicit description is given in Corollary A.7), so the infimum in (3.7) is always achieved. Hence for all $X,Y\in{\rm Sym}^{+}(p)$ , there always exists an $F$ -minimal geodesic, a minimal pair in ${\cal E}_{X}\times{\cal E}_{Y}$ , and an MSSR curve from $X$ to $Y$ .
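For readers who want to experiment, the following is a minimal sketch of a brute-force evaluation of $d_{\cal SR}$ for $p=2$ when both matrices have distinct eigenvalues, so that each fiber consists of four points. It rests on assumptions that are not taken from the text above: the product form $d_{M}^{2}=k\,d_{SO}^{2}+d_{{\cal D}^{+}}^{2}$, the normalization in which $d_{SO}$ between two planar rotations equals the absolute angle between them, and $d_{{\cal D}^{+}}(D,{\Lambda})=\|\log(D^{-1}{\Lambda})\|$. The function names (`wrap`, `fiber`, `d_M`, `d_SR`) and the encoding of a point $(R(\phi),{\rm diag}(e))$ as the pair `(phi, e)` are hypothetical choices made for illustration.

```python
import math
from itertools import product

def wrap(theta):
    """Wrap an angle to the interval (-pi, pi]."""
    t = math.fmod(theta, 2 * math.pi)
    if t > math.pi:
        t -= 2 * math.pi
    elif t <= -math.pi:
        t += 2 * math.pi
    return t

def fiber(theta, d):
    """The four points of F^{-1}(X) for X = R(theta) diag(d) R(theta)^T, d[0] != d[1].

    A point (R(phi), diag(e)) of M(2) is stored as the pair (phi, e).
    """
    points = []
    for j in range(4):
        # Odd quarter-turns swap the two eigenvalues; even ones keep them.
        eigs = d if j % 2 == 0 else (d[1], d[0])
        points.append((theta + j * math.pi / 2, eigs))
    return points

def d_M(a, b, k=1.0):
    """Assumed product distance sqrt(k * d_SO^2 + d_D^2) on M(2)."""
    (ta, da), (tb, db) = a, b
    rot = wrap(tb - ta)  # rotation angle (assumed normalization of d_SO)
    scale_sq = sum(math.log(y / x) ** 2 for x, y in zip(da, db))
    return math.sqrt(k * rot ** 2 + scale_sq)

def d_SR(theta_x, d_x, theta_y, d_y, k=1.0):
    """Minimum of d_M over all pairs of fiber points."""
    return min(d_M(a, b, k)
               for a, b in product(fiber(theta_x, d_x), fiber(theta_y, d_y)))
```

With $X={\rm diag}(4,1)$ and $Y={\rm diag}(1,4)$, the minimum is attained by a pure quarter-turn rotation when $k=1$ and by a pure scaling when $k=4$, which illustrates how the parameter $k$ trades rotation against scaling. The double loop over both fibers is deliberately naive.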

Remark 3.5

Observe that we have not defined a Riemannian metric on ${\rm Sym}^{+}(p)$, so there is no “automatic” meaning attached to the phrase length of a smooth curve in ${\rm Sym}^{+}(p)$. However, for an SSR curve $\chi$ in ${\rm Sym}^{+}(p)$ we define the length of $\chi$ to be $\ell(\chi):=\inf\{\ell(\gamma):\gamma\ \text{is a geodesic in}\ M(p)\ \text{and}\ F\circ\gamma=\chi\}$. With this definition, (3.8) becomes

$$d_{\cal SR}(X,Y)=\inf\{\ell(\chi)\ :\ \chi\ \text{is an SSR curve from}\ X\ \text{to}\ Y\}.\tag{3.9}$$

A priori, given $X,Y\in{\rm Sym}^{+}(p)$, a concrete computation of $d_{\cal SR}(X,Y)$ involves computing the distance in $M(p)$ between each connected component of ${\cal E}_{X}$ and each connected component of ${\cal E}_{Y}$, then taking the minimum over all component-pairs. For $X=F(U,D)$, the number of connected components of ${\cal E}_{X}$ is $|{\tilde{S}}_{p}^{+}|/|\Gamma^{0}_{{\sf J}_{D}}|$ (see Proposition A.6 in Appendix A), which tends to be a rather large number (see Corollary A.7). It is obvious from Propositions 3.2 and A.6 that computing all the distances between fiber-components is redundant. It is not so obvious exactly how much redundancy there is (more than one might guess just from looking at these two propositions). As a practical matter, it is desirable to reduce the number of component-pair computations as much as possible, taking advantage of less-obvious redundancy. We will do this in Proposition 3.7 below. This proposition plays a crucial role in [7], where for $p=3$ we apply it to compute all scaling-rotation distances, and to help compute and classify all MSSR curves. The proof of Proposition 3.7 (which is given only in the present paper, not in [7]) relies on the characterization of fibers given in Appendix A as Corollary A.4.

Definition 3.6

Recall that given any group $G$ and subgroups $H_{1},H_{2}$ , an $(H_{1},H_{2})$ double-coset is an equivalence class under the equivalence relation $\sim$ on $G$ defined by declaring $g_{1}\sim g_{2}$ if there exist $h_{1}\in H_{1},h_{2}\in H_{2}$ such that $g_{2}=h_{1}g_{1}h_{2}$ . The set of equivalence classes under this relation is denoted $H_{1}\backslash G/H_{2}$ . By a set of representatives of $H_{1}\backslash G/H_{2}$ we mean a subset of $G$ consisting of exactly one element from each $(H_{1},H_{2})$ double-coset. Since every left or right coset is also a double-coset, this defines “set of representatives” for ordinary cosets as well.
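As an illustration of Definition 3.6, the following sketch partitions a small group into double cosets and extracts a set of representatives. The group $S_{3}$ and the two order-2 subgroups used here are chosen purely for illustration; they are not the groups ${\tilde{S}}_{p}^{+}$ and $\Gamma^{0}_{{\sf J}}$ of this paper. The helper names `compose` and `double_cosets` are hypothetical.

```python
from itertools import permutations

def compose(a, b):
    """Composition (a*b)(i) = a(b(i)) of permutations stored as tuples."""
    return tuple(a[b[i]] for i in range(len(a)))

def double_cosets(G, H1, H2):
    """Partition G into the (H1, H2) double cosets H1 g H2."""
    remaining = set(G)
    cosets = []
    while remaining:
        g = min(remaining)  # deterministic choice of the next representative
        coset = frozenset(compose(compose(h1, g), h2) for h1 in H1 for h2 in H2)
        cosets.append(coset)
        remaining -= coset
    return cosets

G = list(permutations(range(3)))   # the symmetric group S_3
e = (0, 1, 2)
H1 = [e, (1, 0, 2)]                # subgroup generated by the swap of 0 and 1
H2 = [e, (0, 2, 1)]                # subgroup generated by the swap of 1 and 2

cosets = double_cosets(G, H1, H2)
representatives = sorted(min(c) for c in cosets)
```

In this example the six elements of $S_{3}$ split into two double cosets of sizes 4 and 2. Unlike ordinary cosets, double cosets need not all have the same cardinality, so the number of representatives cannot in general be read off from the orders of the groups alone.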

Proposition 3.7

Let $X,Y\in{\rm Sym}^{+}(p)$ and let $(U,D)\in{\cal E}_{X},(V,{\Lambda})\in{\cal E}_{Y}$. Let $Z$ be any set of representatives of $\Gamma_{{\sf J}_{D}}^{0}\backslash{\tilde{S}}_{p}^{+}/\Gamma_{{\sf J}_{\Lambda}}^{0}$. Then the scaling-rotation distance from $X$ to $Y$ is given by

where

[TABLE]

Every MSSR curve from $X$ to $Y$ corresponds to some minimal pair whose first element lies in the connected component $[(U,D)]$ of ${\cal E}_{X}$ .

Proof: From Corollary A.4 we have

[TABLE]

By Proposition 3.2, for all $g_{1},g_{2}\in{\tilde{S}}_{p}^{+}$ we have

[TABLE]

Proposition 3.2 implies that the action of $g_{1}^{-1}$ on $M(p)$ carries a geodesic $\gamma_{1}$ with endpoints $g_{1}\cdot(UR_{U},D),\ g_{2}\cdot(VR_{V},{\Lambda})$ into a geodesic $\gamma_{2}$ with endpoints $(UR_{U},D),\ (g_{1}^{-1}g_{2})\cdot(VR_{V},{\Lambda})$ and that satisfies $F\circ\gamma_{1}=F\circ\gamma_{2}$. Hence, every smooth scaling-rotation (SSR) curve from $X$ to $Y$ is of the form $F\circ\gamma$ where $\gamma:[0,1]\to M(p)$ is a geodesic with $\gamma(0)=(UR_{U},D)\in[(U,D)]$ and $\gamma(1)\in{\cal E}_{Y}$.

Suppose $\gamma_{1},\gamma_{2}$ are two such geodesics, with $\gamma_{i}(1)=(VR_{V}P_{g_{i}}^{-1},\pi_{g_{i}}\cdot{\Lambda})$, $i=1,2$. If $g_{2}=h_{D}g_{1}h_{\Lambda}$, with $h_{D}\in\Gamma_{{\sf J}_{D}}^{0}$ and $h_{\Lambda}\in\Gamma_{{\sf J}_{\Lambda}}^{0}$, then

[TABLE]

where $R_{U,1}=R_{U}h_{D}\in G_{D}^{0}$ and $R_{V,1}=R_{V}h_{\Lambda}^{-1}\in G_{\Lambda}^{0}$. The same argument as in the preceding paragraph shows that the SSR curve determined by the pair $((UR_{U},D),(VR_{V}P_{g_{2}}^{-1},\pi_{g_{2}}\cdot{\Lambda}))$ is the same as the SSR curve determined by the pair $((UR_{U,1},D),(VR_{V,1}\,P_{g_{1}}^{-1},\pi_{g_{1}}\cdot{\Lambda}))$. Hence any representative $g\in{\tilde{S}}_{p}^{+}$ of a given $(\Gamma_{{\sf J}_{D}}^{0},\Gamma_{{\sf J}_{\Lambda}}^{0})$ double-coset determines the same set of SSR curves as does any other representative of that double-coset. The Proposition now follows.

We end this subsection with a discussion and results that motivate our inclusion of the word smooth in “smooth scaling-rotation curve”. By its definition, every SSR curve $\chi:I\to{\rm Sym}^{+}(p)$ is a smooth map, but it is not clear whether the image of $\chi$ is “geometrically smooth”, i.e. locally (in $I$) a smooth submanifold or submanifold-with-boundary of ${\rm Sym}^{+}(p)$. For the image of $\chi$ to be geometrically smooth in this sense, $\chi$ must admit a regular parametrization, one that is an immersion. It turns out that all SSR curves do, except for those whose images are single points:

Proposition 3.8

If $\gamma$ is a non-constant geodesic, then $F\circ\gamma$ is either an immersion or a constant map.

Proof: Let $\gamma:[0,1]\to M(p)$ be a non-constant geodesic and let $\chi=F\circ\gamma$.

Let $(U,D)=\gamma(0)$ and let $X=\chi(0)=F(U,D)$ . Since $\gamma$ is a geodesic there exist unique $A\in{\mathfrak{so}}(p),L\in{\rm Diag}(p)$ such that $\gamma(t)=(e^{tA}U,e^{tL}D)$ . Non-constancy implies $(A,L)\neq(0,0)$ . Direct computation yields

$$\chi^{\prime}(t)=e^{tA}\left([A,U{\Lambda}(t)U^{T}]+UL{\Lambda}(t)U^{T}\right)e^{-tA},\tag{3.13}$$

where ${\Lambda}(t)=e^{tL}D$ and $[\ ,\ ]$ denotes matrix commutator.

Suppose that $t_{0}\in[0,1]$ is such that $\chi^{\prime}(t_{0})=0$ . Then

$$[A,U{\Lambda}(t_{0})U^{T}]+UL{\Lambda}(t_{0})U^{T}=0.$$

Multiplying on the left by $U^{T}$ and on the right by $U$ yields $[\tilde{A},{\Lambda}(t_{0})]+L{\Lambda}(t_{0})=0,$ where $\tilde{A}=U^{T}AU$. But because ${\Lambda}(t_{0})$ is diagonal, the diagonal entries of any commutator $[B,{\Lambda}(t_{0})]$ are zero. Since $L{\Lambda}(t_{0})$ is a diagonal matrix, this implies that $[\tilde{A},{\Lambda}(t_{0})]=0=L{\Lambda}(t_{0})$. But ${\Lambda}(t_{0})$ is invertible, so the second equality implies $L=0$. Thus ${\Lambda}(t)=D$ for all $t$, and plugging this into (3.13) with $t=t_{0}$ we find $[A,X]=0$. It follows that $X$ commutes with $e^{tA}$ for every $t$. Hence $\chi(t)=e^{tA}UDU^{T}e^{-tA}=e^{tA}Xe^{-tA}=X$ for all $t$.

Thus either $\chi^{\prime}(t)$ is nonzero for every $t\in[0,1]$ or $\chi$ is constant.

As noted in [10], the “scaling-rotation distance” $d_{\cal SR}$ is not a metric on ${\rm Sym}^{+}(p)$ ; it does not satisfy the triangle inequality. In [8], we show that the pseudometric $\rho_{\cal SR}$ generated by the semimetric $d_{\cal SR}$ is a true metric on ${\rm Sym}^{+}(p)$ . (It is not trivial to show that $\rho_{\cal SR}(X,Y)\neq 0$ for $X\neq Y$ .) Effectively, the construction enlarges the class of scaling-rotation (SR) curves $\chi$ considered in (3.9) from smooth maps to piecewise-smooth maps (with $\ell(\chi)$ redefined correspondingly). This definition of the scaling-rotation metric $\rho_{\cal SR}$ is analogous to the definition of “distance between two points in a Riemannian manifold”: the infimum of the lengths of piecewise-smooth curves joining the points. But some minimal-length SR curves are geometrically non-smooth (having corners); an MSSR curve from $X$ to $Y$ has minimal length only among smooth scaling-rotation curves from $X$ to $Y$ . (This phenomenon does not occur in Riemannian geometry; in a Riemannian manifold, minimal piecewise-smooth curves between two points are always geometrically smooth.) It is for this reason we have made “smooth” part of the terminology used in Definition 3.1.

Remark 3.9

It seems likely that a non-constant MSSR curve $\chi$ is actually an embedding (for this, it suffices that $\chi$ be injective, since $[0,1]$ is compact), but we have not proven this. There do exist non-minimal non-constant SSR curves that are not one-to-one. One example is any nonconstant periodic SSR curve: $t\mapsto F(\exp(tA)U,D)$ where $(U,D)\in M(p)$ and $A\in{\mathfrak{so}}(p)$ is any nonzero element for which there exists $t_{1}\neq 0$ such that $\exp(t_{1}A)=I$. (For $p\leq 3$, the latter condition is redundant.) The restriction of this curve to $[0,|t_{1}|\,]$ is an SSR curve of positive length from $F(U,D)$ to $F(U,D)$. A nonperiodic example with $p=2$ is the following. Let $J=\left(\begin{array}{rr}0&-1\\ 1&0\end{array}\right)$, $U(t)=\exp(t\frac{\pi}{2}J)$, $D(t)=\left(\begin{array}{rr}e^{1-t}&0\\ 0&e^{t}\end{array}\right).$ Then the curve $t\mapsto\gamma(t):=(U(t),D(t))$ is a geodesic in $M(2)$. Let $\chi$ be the SSR curve $F\circ\gamma$. Then, as the reader may check, if $t_{1}<t_{2}$ we have $\chi(t_{1})=\chi(t_{2})$ if (and only if) for some integer $n\geq 0$ we have $t_{1}=-n$ and $t_{2}=n+1$. Now let $n_{1},n_{2}$ be non-negative integers, let $t_{1}\in(-n_{1}-1,-n_{1})$, $t_{2}\in(n_{2}+1,n_{2}+2)$, and let $n=\min\{n_{1},n_{2}\}$, $X=\chi(t_{1})$, and $Y=\chi(t_{2})$. Then $\chi|_{[t_{1},t_{2}]}$ is an SSR curve from $X$ to $Y$ with $n+1$ self-crossings. Note that the presence of self-crossings does not directly imply that $\chi|_{[t_{1},t_{2}]}$ is not an MSSR curve: if we remove the closed curve $\chi|_{[-n,n+1]}$ from $\chi|_{[t_{1},t_{2}]}$, the piecewise-smooth curve $\chi_{1}$ from $X$ to $Y$ that remains is not an SSR curve. (As the reader may check, the set $\{\chi^{\prime}(-n),\chi^{\prime}(n+1)\}$ is linearly independent, so $\chi_{1}$ cannot be reparametrized as an immersion. Hence, by Proposition 3.8, there is no geodesic $\gamma_{1}$ in $M(2)$ such that $\chi_{1}$ can be reparametrized as $F\circ\gamma_{1}$.) Hence $\chi_{1}$ is not a candidate for an SSR curve from $X$ to $Y$ that is shorter than $\chi$. However, with a little effort one can check by direct computation that there is an $F$-minimal geodesic from $X$ to $Y$ that is shorter than $\gamma|_{[t_{1},t_{2}]}$. (One can compute the length of the minimal geodesic from any of the four points in ${\cal E}_{X}$ to any of the four points in ${\cal E}_{Y}$, and see that each of these lengths is less than $\ell(\gamma|_{[t_{1},t_{2}]})$.)
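The self-crossings of the nonperiodic example in Remark 3.9 can be checked numerically. The sketch below evaluates $\chi(t)=U(t)D(t)U(t)^{T}$ with $U(t)=\exp(t\frac{\pi}{2}J)$ and $D(t)={\rm diag}(e^{1-t},e^{t})$ in closed form and compares $\chi(-n)$ with $\chi(n+1)$. The function names `chi` and `close`, and the representation of a symmetric $2\times 2$ matrix by its three independent entries, are hypothetical conveniences; the check is a floating-point sanity test, not a proof.

```python
import math

def chi(t):
    """Entries (xx, xy, yy) of chi(t) = U(t) D(t) U(t)^T.

    U(t) is the rotation by angle t*pi/2 and D(t) = diag(e^{1-t}, e^{t}).
    """
    c = math.cos(t * math.pi / 2)
    s = math.sin(t * math.pi / 2)
    a, b = math.exp(1 - t), math.exp(t)
    xx = c * c * a + s * s * b
    xy = c * s * (a - b)
    yy = s * s * a + c * c * b
    return (xx, xy, yy)

def close(P, Q, tol=1e-9):
    """Entrywise comparison of two symmetric 2x2 matrices up to a tolerance."""
    return all(math.isclose(x, y, rel_tol=tol, abs_tol=tol) for x, y in zip(P, Q))

# chi(-n) and chi(n+1) coincide for n = 0, 1, 2, 3 ...
crossings = [close(chi(-n), chi(n + 1)) for n in range(4)]
# ... whereas, for instance, chi(-1) and chi(1) differ.
not_a_crossing = close(chi(-1), chi(1))
```

The coincidence arises because the rotation angles at $t=-n$ and $t=n+1$ differ by an odd multiple of $\pi/2$, which interchanges the two coordinate axes, while the diagonal entries of $D(-n)$ and $D(n+1)$ are interchanged as well.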

3.3 Geodesic antipodality and two types of non-uniqueness

As noted in Section 3.1, for all $X,Y\in{\rm Sym}^{+}(p)$ there always exists an MSSR curve from $X$ to $Y$ , the projection of some $F$ -minimal geodesic. A priori, different $F$ -minimal geodesics could project to the same MSSR curve or to different MSSR curves. It is natural to ask: Under what conditions on $(X,Y)$ is there a unique MSSR curve from $X$ to $Y$ ? When uniqueness fails, how does it fail, and what can we say about the set ${\cal M}(X,Y)$ ?

For uniqueness to fail for given $X,Y$ , there must be distinct $F$ -minimal geodesics $\gamma_{i}:[0,1]\to M(p)$ , whose endpoints are minimal pairs $((U_{i},D_{i}),(V_{i},{\Lambda}_{i}))\in{\cal E}_{X}\times{\cal E}_{Y},$ $i=1,2$ , such that $F\circ\gamma_{1}\neq F\circ\gamma_{2}$ . The “how” question above concerns the following two possibilities (not mutually exclusive):

1. “Type I non-uniqueness”: There exist such $\gamma_{i}$ whose endpoints are distinct minimal pairs $((U_{i},D_{i}),(V_{i},{\Lambda}_{i}))$.

2. “Type II non-uniqueness”: There exist such $\gamma_{i}$ whose endpoints are the same minimal pair $((U,D),(V,{\Lambda}))$.

Since for any $D,{\Lambda}\in{\rm Diag}^{+}(p)$ the minimal geodesic from $D$ to ${\Lambda}$ is unique, Type II non-uniqueness with minimal pair $((U,D),(V,{\Lambda}))$ is equivalent to the existence of two or more minimal geodesics from $U$ to $V$ , which is equivalent to each of $U,V$ being in the cut-locus (in $SO(p)$ ) of the other. It will be convenient for us to have some other terminology for such pairs:

Definition 3.10

Call a pair of points $(U,V)$ in $SO(p)\times SO(p)$ geodesically antipodal if one point is in the cut-locus of the other (equivalently, if each point is in the cut-locus of the other) and geodesically non-antipodal otherwise. Call a pair of points $((U,D),(V,{\Lambda}))$ in $M(p)\times M(p)$ geodesically antipodal if $(U,V)$ is a geodesically antipodal pair in $SO(p)\times SO(p)$, and geodesically non-antipodal otherwise.

As mentioned earlier, the cut-locus of the identity $I\in SO(p)$ is precisely the set of all involutions in $SO(p)$ . Furthermore, because of the invariance of the Riemannian metric $g_{SO}$ , an element $V\in SO(p)$ is in the cut-locus of element $U$ if and only if $V^{-1}U$ is in the cut-locus of $I$ . Note that, as would be true in any group, if any of the elements $V^{-1}U,UV^{-1},U^{-1}V,VU^{-1}$ is an involution, so are all the others.

Note that a pair $(U,V)$ in $SO(p)$ can be geodesically antipodal without either point being maximally remote from the other. (For example, with $p=4$ , the matrix ${\rm diag}(-1,-1,1,1)$ is an involution, but is closer to the identity $I$ than is the involution $-I$ .) However, if $(U,V)$ is geodesically antipodal, then there exists a (not necessarily unique) closed geodesic in $SO(p)$ containing $U$ and $V$ , isometric to a circle of some radius, such that $U$ and $V$ are antipodal points of this circle in the usual sense.
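The example with $p=4$ can be made concrete. An involution $R\in SO(p)$ is symmetric with eigenvalues $\pm 1$; if $m$ of them equal $-1$ then $m=(p-{\rm tr}\,R)/2$, and a minimal-norm logarithm rotates by $\pi$ in $m/2$ mutually orthogonal planes, so that $\|\log(R)\|=\pi\sqrt{m}$ in the Frobenius norm. The sketch below tests geodesic antipodality of a pair via the involution criterion for $V^{-1}U$ and evaluates this norm. All function names are hypothetical, and the tolerance `1e-12` is an arbitrary choice.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def diag(entries):
    n = len(entries)
    return [[float(entries[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def is_involution(R):
    """True if R^2 = I and R != I."""
    I = diag([1.0] * len(R))
    return close(matmul(R, R), I) and not close(R, I)

def is_geodesically_antipodal(U, V):
    """Involution test for V^{-1} U, using V^{-1} = V^T for V in SO(p)."""
    return is_involution(matmul(transpose(V), U))

def log_norm_of_involution(R):
    """Frobenius norm pi*sqrt(m) of a minimal logarithm; m = number of -1 eigenvalues."""
    p = len(R)
    m = round((p - sum(R[i][i] for i in range(p))) / 2)
    return math.pi * math.sqrt(m)

R1 = diag([-1, -1, 1, 1])      # involution with m = 2
R2 = diag([-1, -1, -1, -1])    # the involution -I, with m = 4
I4 = diag([1, 1, 1, 1])
```

Here `log_norm_of_involution(R1)` equals $\pi\sqrt{2}$ while `log_norm_of_involution(R2)` equals $2\pi$, in agreement with the statement that ${\rm diag}(-1,-1,1,1)$ is closer to the identity than $-I$ is.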

Proposition 3.7 is a starting-point for understanding the set ${\cal M}(X,Y)$ for all $p$ and all $X,Y\in{\rm Sym}^{+}(p)$ : it assures us that, for any $(U,D)\in{\cal E}_{X}$ , every MSSR curve from $X$ to $Y$ corresponds to some minimal pair whose first element lies in the connected component $[(U,D)]$ of ${\cal E}_{X}$ . But even once we know all the minimal pairs, to completely understand ${\cal M}(X,Y)$ —or even just determine its cardinality—we need a way to tell whether MSSR curves corresponding to two (not necessarily distinct) minimal pairs with first point in $[(U,D)]$ are the same. (This is true whether the non-uniqueness, if any, in ${\cal M}(X,Y)$ is of Type I, Type II, or a mixture of both). Proposition 3.11 below provides such a tool. This proposition, like Proposition 3.7, plays a crucial role in **[7]** (where it is stated without proof), enabling an explicit computation of the sets ${\cal M}(X,Y)$ for $p=3$ .

Proposition 3.11

Let $X,Y\in{\rm Sym}^{+}(p),X\neq Y$. For $i=1,2$ assume that $\chi_{i}=F\circ\gamma_{i}$ is a minimal smooth scaling-rotation curve from $X$ to $Y$ corresponding to the minimal pair $((UR_{U,i},D),(VR_{V,i}\,P_{g_{i}}^{-1},{\Lambda}_{i})),$ where $R_{U,i}\in G_{D}^{0},R_{V,i}\in G_{\Lambda}^{0}$, $g_{i}\in{\tilde{S}}_{p}^{+}$, ${\Lambda}_{i}=\pi_{g_{i}}\cdot{\Lambda}$, and $\gamma_{i}:[0,1]\to M(p)$ is a geodesic. (We do not assume that the two minimal pairs are distinct.) Then $\chi_{1}=\chi_{2}$ if and only if the following two conditions hold.

(i)

Both pairs $(UR_{U,i},VR_{V,i}P_{g_{i}}^{-1})$ are geodesically non-antipodal and

[TABLE]

or both pairs are geodesically antipodal and

[TABLE]

where for any $(U^{\prime},D^{\prime})\in M(p)$ , ${\rm proj}_{SO(p)}$ denotes the natural projection $T_{(U^{\prime},D^{\prime})}(M(p))\to T_{U^{\prime}}(SO(p))$ .

(ii)

There exist $g\in{\tilde{S}}_{p}^{+},R\in G_{D,{\Lambda}_{1}}^{0}$ such that

[TABLE]

Equation (3.15) implies equation (3.14), so (3.14) is always a necessary condition for the equality $\chi_{1}=\chi_{2}$ .

In Proposition 3.11, in the geodesically non-antipodal case we use endpoint data to tell whether the projections to ${\rm Sym}^{+}(p)$ of two minimal geodesics from ${\cal E}_{X}$ to ${\cal E}_{Y}$ are equal. We will deduce this proposition from the following theorem, proven in **[10]**, that gives a criterion based on initial-value data to tell whether the projections of two geodesics emanating from ${\cal E}_{X}$ are equal. In this theorem, $G_{D,L}:=G_{D}\cap G_{L}$, ${\mathfrak{g}}_{D,L}:={\mathfrak{g}}_{D}\cap{\mathfrak{g}}_{L}$ (the Lie algebra of $G_{D,L}$), and for $A\in{\mathfrak{so}}(p)$, ${\rm ad}_{A}:{\mathfrak{so}}(p)\to{\mathfrak{so}}(p)$ is the linear map defined by ${\rm ad}_{A}(B)=[A,B]$.

Notation 3.12

For $(U,D)\in M(p)$, $A\in{\mathfrak{so}}(p)$, $L\in{\rm Diag}(p)$, and any interval $I$ containing $0$, we write $\gamma_{U,D,A,L}$ for the geodesic $I\to M(p)$ defined by $t\mapsto(e^{tA}U,e^{tL}D)$.

Theorem 3.13 ([10, Theorem 3.8])

For $i=1,2$ let $(U_{i},D_{i})\in M(p)$, $A_{i}\in{\mathfrak{so}}(p)$, $L_{i}\in{\rm Diag}(p)$, and let $\check{A}_{i}=U_{1}^{-1}A_{i}U_{1}$. Let $I$ be a positive-length interval containing $0$. Then the smooth scaling-rotation curves $\chi_{i}:=F\circ\gamma_{U_{i},D_{i},A_{i},L_{i}}:I\to{\rm Sym}^{+}(p)$ are identical if and only if (i) $\check{A}_{2}-\check{A}_{1}\in{\mathfrak{g}}_{D_{1},L_{1}}$, (ii) $({\rm ad}_{\check{A}_{2}})^{j}(\check{A}_{1})\in{\mathfrak{g}}_{D_{1},L_{1}}$ for all $j\geq 1$, and (iii) there exist $R\in G_{D_{1},L_{1}}$ and $g\in{\tilde{S}}_{p}^{+}$ such that $U_{2}=U_{1}RP_{g}^{-1}$, $D_{2}=\pi_{g}\cdot D_{1}$, and $L_{2}=\pi_{g}\cdot L_{1}$.¹

¹ In [10, Theorem 3.8], $g$ was actually required to be a particular pre-image of $\pi$ in ${\tilde{S}}_{p}^{+}$, but the same argument as in the proof of Proposition A.3 of the present paper shows that this restriction can be removed.

To deduce Proposition 3.11 from Theorem 3.13, we first prove two lemmas. Beyond helping us to prove the Proposition, these lemmas may be useful in future analysis of MSSR curves. In these lemmas, for any $X\in{\rm Sym}^{+}(p)$ we write ${\mathfrak{g}}_{X}$ for the Lie algebra of the stabilizer $G_{X}:=\{U\in G:UXU^{T}=X\}$ ; thus ${\mathfrak{g}}_{X}=\{A\in{\mathfrak{so}}(p):AX=XA\}$ . (Observe that the notation $G_{X}$ is consistent with the notation $G_{D}$ introduced earlier for diagonal matrices.)

Lemma 3.14

Let $X,Y\in{\rm Sym}^{+}(p)$ and suppose that $\chi:[0,1]\to{\rm Sym}^{+}(p)$ is a minimal smooth scaling-rotation curve with $X:=\chi(0)\neq Y:=\chi(1)$. Let $\gamma=\gamma_{U,D,A,L}:[0,1]\to SO(p)\times{\rm Diag}^{+}(p)$ be a geodesic for which $\chi=F\circ\gamma$. Then $A\in({\mathfrak{g}}_{X})^{\perp}\cap({\mathfrak{g}}_{Y})^{\perp}$, where the orthogonal complements are taken in ${\mathfrak{so}}(p)$.

Proof: Since $\gamma$ is a smooth curve of minimal length connecting the submanifolds ${\cal E}_{X}$ and ${\cal E}_{Y}$ of $M(p)$, the velocity vectors $\gamma^{\prime}(0),\gamma^{\prime}(1)$ must be perpendicular to the tangent spaces $T_{\gamma(0)}{\cal E}_{X},T_{\gamma(1)}{\cal E}_{Y}$, respectively ([2, Proposition 1.5]). Making natural tangent-space identifications, we have $T_{\gamma(0)}{\cal E}_{X}=T_{(U,D)}{\cal E}_{X}=U{\mathfrak{g}}_{D}\oplus\{0\}\subset U{\mathfrak{g}}_{D}\oplus{\rm Diag}(p)$, where $U{\mathfrak{g}}_{D}:=\{UC:C\in{\mathfrak{g}}_{D}\}$. Let $\check{A}=U^{-1}AU$. Since $\gamma^{\prime}(0)=(U\check{A},DL)$, and the Riemannian metric we are using on $SO(p)$ is left-invariant, the condition $\gamma^{\prime}(0)\perp T_{\gamma(0)}{\cal E}_{X}$ is equivalent to $\check{A}\in({\mathfrak{g}}_{D})^{\perp}$, hence to $A\in U({\mathfrak{g}}_{D})^{\perp}U^{-1}$. Using additionally the right-invariance of the metric on ${\mathfrak{so}}(p)$, we have $U({\mathfrak{g}}_{D})^{\perp}U^{-1}=(U{\mathfrak{g}}_{D}U^{-1})^{\perp}$. From general group-action properties, it is easily seen that $U{\mathfrak{g}}_{D}U^{-1}={\mathfrak{g}}_{UDU^{-1}}$. Since $UDU^{-1}=X$, it follows that $A\in({\mathfrak{g}}_{X})^{\perp}$. A similar argument at the point $(V,{\Lambda}):=\gamma(1)$ shows that $A\in({\mathfrak{g}}_{V{\Lambda}V^{-1}})^{\perp}=({\mathfrak{g}}_{Y})^{\perp}$.

Lemma 3.15

In the setting of Theorem 3.13, assume that the smooth scaling-rotation curve $\chi_{1}$ is minimal. Then conditions (i) and (ii) in the theorem can be replaced by the single condition $A_{2}=A_{1}$ .

Proof: With notation as in Theorem 3.13, assume that $\chi_{2}=\chi_{1}$. Then the Theorem implies that $U^{-1}(A_{2}-A_{1})U\in{\mathfrak{g}}_{D,L}\subset{\mathfrak{g}}_{D},$ implying that $A_{2}-A_{1}\in U{\mathfrak{g}}_{D}U^{-1}={\mathfrak{g}}_{X}$ (as in the proof of Lemma 3.14). But since $\chi_{1}$ is minimal, Lemma 3.14 implies that both $A_{2}$ and $A_{1}$ lie in $({\mathfrak{g}}_{X})^{\perp}$, hence that $A_{2}-A_{1}\in({\mathfrak{g}}_{X})^{\perp}$. Hence $A_{2}-A_{1}=0$, i.e. $A_{2}=A_{1}$.

Conversely, assume that $A_{2}=A_{1}$. Then conditions (i) and (ii) are satisfied trivially.

Proof of Proposition 3.11: For $i\in\{1,2\}$ let $U_{i}=UR_{U,i},\ V_{i}=VR_{V,i}P_{g_{i}}^{-1},$ and ${\Lambda}_{i}=\pi_{g_{i}}\cdot{\Lambda}$.

By hypothesis $\chi_{i}=F\circ\gamma_{i}$ , where $\gamma_{i}=\gamma_{U_{i},D,A_{i},L_{i}}:[0,1]\to M(p)$ (for some $A_{i}\in{\mathfrak{so}}(p),L_{i}\in{\rm Diag}(p)$ ) is a minimal geodesic from $(U_{i},D)$ to $(V_{i},{\Lambda}_{i})$ . Hence $L_{i}=\log({\Lambda}_{i}D^{-1})$ and $A_{i}\in\log(V_{i}U_{i}^{-1})$ (we write “ $\in$ ” rather than “ $=$ ” since if $R$ is an involution, “ $\log R$ ”, as we have defined it, is a set with more than one element; see Section 3.1).

It is straightforward to show that $G_{D,L_{i}}=G_{D,{\Lambda}_{i}}$ . From Lemma 3.15, the conditions (i) and (ii) in Theorem 3.13 in the equality-conditions for $\chi_{1}$ and $\chi_{2}$ can be replaced by the single condition $A_{2}=A_{1}$ .

If $A_{2}=A_{1}$ then $V_{2}U_{2}^{-1}=V_{1}U_{1}^{-1}$ , implying that either both pairs $(U_{i},V_{i})$ are geodesically antipodal or both are geodesically non-antipodal. In the converse direction, suppose that the pairs $(U_{i},V_{i})$ are geodesically non-antipodal and that $V_{2}U_{2}^{-1}=V_{1}U_{1}^{-1}$ . Then $A_{2}=\log(V_{2}U_{2}^{-1})=\log(V_{1}U_{1}^{-1})=A_{1}$ . Whether or not the pairs $(U_{i},V_{i})$ are geodesically antipodal, by definition $({\rm proj}_{SO(p)}\gamma_{i}^{\prime}(0))U_{i}^{-1}=A_{i}$ , so if (3.15) holds then $A_{2}=A_{1}$ . Hence the condition $A_{2}=A_{1}$ is equivalent to condition (i) in Proposition 3.11.

Next, letting $D$ play the role of $D_{1}$ in Theorem 3.13, condition (iii) in the Theorem is equivalent to the existence of $g\in{\tilde{S}}_{p}^{+},R\in G_{D,{\Lambda}_{1}}$ such that $D=\pi_{g}\cdot D$, $L_{2}=\pi_{g}\cdot L_{1}$, and $U_{2}=U_{1}RP_{g}^{-1}$. But for all such $R,g$, we have $RP_{g}^{-1}=R_{0}P_{g_{0}}^{-1}$ for some $R_{0}\in G_{D,{\Lambda}_{1}}^{0}$ and $g_{0}\in{\tilde{S}}_{p}^{+}$ with $\pi_{g_{0}}=\pi_{g}$. Furthermore, for any $\pi\in S_{p}$, if $\pi\cdot D=D$ then $L_{2}=\pi\cdot L_{1}\iff{\Lambda}_{2}=\pi\cdot{\Lambda}_{1}$. Hence, under the hypotheses of Proposition 3.11, condition (iii) in Theorem 3.13 is equivalent to condition (ii) stated in the Proposition.

This establishes the “if and only if” statement in the Proposition. The final statement of the proposition follows from the fact that, in the notation of this proof, (3.15) is the equality $A_{2}=A_{1}$ (after multiplying both sides of (3.15) on the right by $U^{-1}$), an equality that implies $V_{2}U_{2}^{-1}=\exp(A_{2})=\exp(A_{1})=V_{1}U_{1}^{-1}$.

3.4 Type I and Type II non-uniqueness

Within the scaling-rotation framework, the motivation to understand Type II non-uniqueness is its effect on a true scaling-rotation metric $\rho_{\cal SR}$ on ${\rm Sym}^{+}(p)$ , mentioned earlier, that we construct from $d_{\cal SR}$ in **[8]**. Various constructions and assertions concerning this metric are simplified when we know that Type II non-uniqueness does not occur. But, as we shall see, the study of Type II non-uniqueness also leads to geometric results outside the scaling-rotation framework.

For small enough values of $p$ , Type II non-uniqueness never occurs; for large enough $p$ , it always occurs (see Corollaries 3.19 and 3.21 below). Our main tool for ruling out Type II non-uniqueness is based on a property we call sign-change reducibility (for want of a better term), defined shortly.

To motivate the definition, let $X,Y\in{\rm Sym}^{+}(p)$ and let $((U,D),(V,{\Lambda}))\in{\cal E}_{X}\times{\cal E}_{Y}$ be a minimal pair. Then one minimizer $(g,R_{U},R_{V})$ of the expression in brackets on the right-hand side of the distance formula in Proposition 3.7 is the triple $(e,I,I)$, where $e$ is the identity element of ${\tilde{S}}_{p}^{+}$. Hence for all $g\in{\tilde{S}}_{p}^{+}$ with $\pi_{g}\cdot{\Lambda}={\Lambda}$ (i.e. for all $g\in\tilde{K}_{{\sf J}_{\Lambda}}$; see Notation 2.1) we must have $d_{SO}(UP_{g},V)=d_{SO}(U,VP_{g}^{-1})\geq d_{SO}(U,V).$ But ${\cal I}_{p}^{+}\subset\tilde{K}_{\sf J}$ for all ${\sf J}$, so, in particular, we must have $d_{SO}(UI_{\boldsymbol{\sigma}},V)\geq d_{SO}(U,V)$ for all ${\boldsymbol{\sigma}}\in{\cal I}_{p}^{+}$.

Definition 3.16

Call a pair of points $(U,V)\in SO(p)\times SO(p)$ sign-change reducible if $d_{SO}(UI_{\boldsymbol{\sigma}},V)<d_{SO}(U,V)$ for some ${\boldsymbol{\sigma}}\in{\cal I}_{p}^{+}$.

From the discussion preceding Definition 3.16, we have the following:

Corollary 3.17

Let $((U,D),(V,{\Lambda}))\in M(p)\times M(p)$. If $(U,V)\in SO(p)\times SO(p)$ is sign-change reducible, then $((U,D),(V,{\Lambda}))$ is not a minimal pair.

Sign-change reducibility is studied in more detail in Sections 4–7; a long digression from the topic of scaling-rotation distance and MSSR curves is needed (but has bonuses). Below, we summarize some results proven there, and their consequences. Two of the main results are given in the following Proposition (proven in Section 7):

Proposition 3.18

(a) For $p\leq 4$ , every geodesically antipodal pair $(U,V)$ in $SO(p)\times SO(p)$ is sign-change reducible. (b) For $p\geq 11$ , there exist geodesically antipodal pairs $(U,V)$ in $SO(p)\times SO(p)$ that are not sign-change reducible.
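A single instance of part (a) with $p=3$ can be checked by hand or by machine. The sketch below is an illustration, not a proof: it assumes that for $p=3$ the distance $d_{SO}(U,V)$ is an increasing function of the rotation angle of $V^{-1}U$ (which holds when $d_{SO}$ is proportional to $\|\log(V^{-1}U)\|$), and it enumerates the four even sign-change matrices $I_{\boldsymbol{\sigma}}\in SO(3)$. The helper names, the test axis $(1,2,2)$, and the tolerance are hypothetical choices.

```python
import math
from itertools import product

def rotation_angle(R):
    """Rotation angle in [0, pi] of R in SO(3), from trace(R) = 1 + 2 cos(theta)."""
    c = (R[0][0] + R[1][1] + R[2][2] - 1) / 2
    return math.acos(max(-1.0, min(1.0, c)))

def half_turn(n):
    """Rotation by pi about the axis n, i.e. the involution 2 u u^T - I with u = n/|n|."""
    norm = math.sqrt(sum(x * x for x in n))
    u = [x / norm for x in n]
    return [[2 * u[i] * u[j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

# Diagonal sign patterns with an even number of -1 entries (determinant +1).
EVEN_SIGNS = [s for s in product((1, -1), repeat=3) if s[0] * s[1] * s[2] == 1]

def angle_after(U, V, s):
    """Rotation angle of V^T (U I_sigma); I_sigma scales column j of U by s[j]."""
    W = [[sum(V[k][i] * U[k][j] for k in range(3)) * s[j] for j in range(3)]
         for i in range(3)]
    return rotation_angle(W)

def sign_change_reducible(U, V, tol=1e-9):
    """True if some even sign change strictly decreases the rotation angle."""
    base = angle_after(U, V, (1, 1, 1))
    return min(angle_after(U, V, s) for s in EVEN_SIGNS) < base - tol

I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
U = half_turn((1.0, 2.0, 2.0))   # (U, I3) is a geodesically antipodal pair
best_angle = min(angle_after(U, I3, s) for s in EVEN_SIGNS)
```

For this pair the rotation angle drops from $\pi$ to $\arccos(-1/9)$, so the pair is sign-change reducible, consistent with part (a).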

Thus the largest dimension $p_{1}$ for which every geodesically antipodal pair $(U,V)$ in $SO(p_{1})\times SO(p_{1})$ is sign-change reducible satisfies $4\leq p_{1}\leq 10$ . A combination of theory and numerical evidence leads the authors to believe that $p_{1}$ is closer to 10 than to 4.

An immediate consequence of Proposition 3.18 (a) is the following. (Again, we do not believe the number “4” here is sharp.)

Corollary 3.19

For $p\leq 4$ , every minimal pair in $M(p)\times M(p)$ is geodesically non-antipodal. Hence for $p\leq 4$ , for all $X,Y\in{\rm Sym}^{+}(p)$ for which $|{\cal M}(X,Y)|>1$ , the non-uniqueness is purely of Type I.

Part of the importance of sign-change reducibility comes from the following:

Proposition 3.20

Suppose that $(U,V)$ is a pair in $SO(p)\times SO(p)$ that is not sign-change reducible. Then there exist $D,{\Lambda}\in{{\cal D}}_{{\sf J}_{\rm top}}$ such that the pair $((U,D),(V,{\Lambda}))$ is minimal.

We will prove this below. But first note that an immediate corollary of Propositions 3.18(b) and 3.20 is:

Corollary 3.21

For $p\geq 11$ , there exist geodesically antipodal, minimal pairs $((U,D),(V,{\Lambda}))\in{\cal S}_{{\sf J}_{\rm top}}\times{\cal S}_{{\sf J}_{\rm top}}\subset M(p)\times M(p)$ . Hence, for $p\geq 11$ , there exist $X,Y\in{\cal S}_{[{\sf J}_{\rm top}]}\subset{\rm Sym}^{+}(p)$ for which the set ${\cal M}(X,Y)$ exhibits Type II non-uniqueness.

Thus sign-change reducibility is more than an ad hoc criterion for ruling out Type II non-uniqueness for small enough $p$ . Proposition 3.20 and Corollary 3.21 show that, in some sense, sign-change reducibility is the only obstruction to having points $X,Y$ in the top stratum of ${\rm Sym}^{+}(p)$ for which ${\cal M}(X,Y)$ exhibits Type II non-uniqueness.

For $X$ or $Y$ not in the top stratum of ${\rm Sym}^{+}(p)$, the relationship between Type II non-uniqueness and sign-change reducibility of minimal pairs in ${\cal E}_{X}\times{\cal E}_{Y}$ is more complicated to analyze. We do not investigate this relationship further in this paper.

To prove Proposition 3.20 we start with a lemma:

Lemma 3.22

Let $c>0$. There exist $D,{\Lambda}\in{\cal D}_{\rm top}:={\cal D}_{{\sf J}_{\rm top}}$ such that $\|\log(D^{-1}(\pi\cdot{\Lambda}))\|^{2}>\|\log(D^{-1}{\Lambda})\|^{2}+c$ for all non-identity $\pi\in S_{p}$.

Proof: Let $c_{1}=\sqrt{c/(3p)}$ and let $\{a_{i}\}_{i=1}^{p}$ be a sequence of numbers satisfying $a_{i+1}-a_{i}>(2\sqrt{p}+1)c_{1}$ for $1\leq i\leq p-1$. Then $|c_{1}+a_{j}-a_{i}|>2\sqrt{p}\,c_{1}$ for all $i\neq j$. Let $D={\rm diag}(e^{a_{1}},\dots,e^{a_{p}})$ and let ${\Lambda}=e^{c_{1}}D$. Then $D,{\Lambda}\in{\cal D}_{\rm top}$ and $\|\log(D^{-1}{\Lambda})\|^{2}=\|c_{1}I\|^{2}=pc_{1}^{2}.$

Let $\pi\in S_{p},\pi\neq{\rm id},$ and let $i$ be such that $\pi^{-1}(i)\neq i$ . Then

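$$\|\log(D^{-1}(\pi\cdot{\Lambda}))\|^{2}\ \geq\ (c_{1}+a_{\pi^{-1}(i)}-a_{i})^{2}\ >\ 4pc_{1}^{2}\ =\ pc_{1}^{2}+3pc_{1}^{2}\ =\ \|\log(D^{-1}{\Lambda})\|^{2}+c.$$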
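The construction in this proof is easy to test numerically. The sketch below builds the logarithms of the diagonal entries of $D$ and ${\Lambda}$ for given $p$ and $c$, taking $c_{1}=\sqrt{c/(3p)}$ and equally spaced $a_{i}$ whose gap exceeds $(2\sqrt{p}+1)c_{1}$, and then checks the inequality of Lemma 3.22 over all non-identity permutations of the entries of ${\Lambda}$. The function names and the extra margin `1.0` added to the gap are hypothetical choices; since the check ranges over every non-identity permutation, it does not depend on the convention used for the action of $S_{p}$.

```python
import math
from itertools import permutations

def lemma_322_pair(p, c):
    """Logs of the diagonal entries of D and Lambda = e^{c1} D from the proof."""
    c1 = math.sqrt(c / (3 * p))
    # Any gap strictly larger than (2 sqrt(p) + 1) c1 works; 1.0 is an arbitrary margin.
    gap = (2 * math.sqrt(p) + 1) * c1 + 1.0
    log_D = [i * gap for i in range(p)]
    log_Lam = [a + c1 for a in log_D]
    return log_D, log_Lam

def sq_log_dist(log_D, log_Lam):
    """Squared Frobenius norm of log(D^{-1} Lambda) for diagonal D, Lambda."""
    return sum((y - x) ** 2 for x, y in zip(log_D, log_Lam))

p, c = 4, 2.5
log_D, log_Lam = lemma_322_pair(p, c)
base = sq_log_dist(log_D, log_Lam)   # equals p * c1^2 = c / 3
identity = tuple(range(p))
lemma_holds = all(
    sq_log_dist(log_D, [log_Lam[i] for i in perm]) > base + c
    for perm in permutations(range(p)) if perm != identity
)
```

The unpermuted squared distance is $pc_{1}^{2}=c/3$, while any non-identity permutation moves at least one entry across a gap and therefore contributes more than $4pc_{1}^{2}=c/3+c$.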


Proof of Proposition 3.20. Let $D,{\Lambda}\in{\cal D}_{\rm top}$ be such that

[TABLE]

for all non-identity $\pi\in S_{p}$ ; such $D,{\Lambda}$ exist by Lemma 3.22. Let $X=F(U,D),Y=F(V,{\Lambda})$ . The subgroups $G_{D}^{0},G_{{\Lambda}}^{0}$ of $SO(p)$ are trivial, as are the subgroups $\Gamma_{{\sf J}_{D}}^{0}$ and $\Gamma_{{\sf J}_{{\Lambda}}}^{0}$ of ${\tilde{S}}_{p}^{+}$ . Hence in Proposition 3.7 we have $Z={\tilde{S}}_{p}^{+}$ and

[TABLE]

For all non-identity $\pi\in S_{p}$ and all $g_{1},g_{2}\in{\tilde{S}}_{p}^{+}$ with $\pi_{g_{1}}={\rm id.}$ and $\pi_{g_{2}}=\pi$ , using (3.19) we then have

[TABLE]

Hence the identity permutation is the only element of $S_{p}$ for which the expression inside the outer braces in (3.4) achieves the minimum over all $\pi\in S_{p}$ . But $\{g\in{\tilde{S}}_{p}^{+}:\pi_{g}={\rm id.}\}$ is precisely the sign-change subgroup ${\cal I}_{p}^{+}$ , and by hypothesis $(U,V)$ is not sign-change reducible. Hence

[TABLE]

Thus $((U,D),(V,{\Lambda}))$ is a minimal pair.

4 Involutions, sign-change reducibility, and distance between subspaces of ${\bf R}^{p}$

In this section we begin our study of sign-change reducibility. This culminates in Section 7 with the proof of Proposition 3.18 (which, as we have seen, implies Corollary 3.21, our main result concerning Type II non-uniqueness), but we discover some other interesting facts along the way. As we shall see, questions concerning the seemingly ad hoc notion of sign-change reducibility can be translated into questions about distances between subspaces of ${\bf R}^{p}$ ; for example, Proposition 4.11 states the equivalence between a sign-change-reducibility question and a question purely about the geometry of the Grassmannian ${\rm Gr}_{m}({\bf R}^{p})$ (endowed with a standard metric). Thus, some unexpected benefits of our investigation of Type II non-uniqueness are results, possibly of independent interest, concerning the geometry of Grassmannians and, more generally, principal angles between subspaces of ${\bf R}^{p}$ .

Since $d_{SO}(U,V)=d_{SO}(V^{-1}U,I)$ for $U,V\in SO(p)$ , the set of distances between geodesically antipodal points in $SO(p)$ is the same as the set of distances between the identity and involutions. Thus to understand which (if any) geodesically antipodal pairs $(U,V)$ in $SO(p)$ are sign-change reducible, it suffices to study the case $(U,V)=(R,I)$ , where $R$ is an involution.

Definition 4.1

1. Call $R\in SO(p)$ sign-change reducible if $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$ for some $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}$ (equivalently, if the pair $(R,I)$ is sign-change reducible). Note that sign-change reducibility of the pair $(U,V)$ , as previously defined in Definition 3.16, is equivalent to sign-change reducibility of $V^{-1}U$ .

2. For $\boldsymbol{\sigma}=(\sigma_{1},\dots,\sigma_{p})\in{\cal I}_{p}$ , define the level of $\boldsymbol{\sigma}$ , written ${\rm level}(\boldsymbol{\sigma})$ , to be $\#\{i:\sigma_{i}=-1\}$ .

3. For any involution $R\in SO(p)$ , define the level of $R$ , written ${\rm level}(R)$ , to be $\dim(E_{-1}(R)),$ where $E_{-1}(R)$ is the $(-1)$ -eigenspace of $R$ . We write ${\rm Inv}(p)$ for the set of involutions in $SO(p)$ , and for $0<m\leq p$ we write ${\rm Inv}_{m}(p)$ for the set of involutions in $SO(p)$ of level $m$ . Note that $\dim(E_{-1}(R))$ is even for any $R\in SO(p)$ , so ${\rm Inv}_{m}(p)$ is empty unless $m$ is even and at least 2. Thus ${\rm Inv}(p)=\bigcup_{{\rm even}\ m\geq 2}{\rm Inv}_{m}(p)$ (a disjoint union).

4. Let $R\in SO(p)$ be an involution. We say that $R$ is reducible by a sign-change of level $m$ if there exists $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}$ of level $m$ such that $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$ .

Observe that for non-identity ${\mbox{\boldmath$ \sigma $\unboldmath}\mbox{}}\in{\cal I}_{p}^{+}$ , the matrix $I_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}$ is an involution in $SO(p)$ , and ${\rm level}({\mbox{\boldmath$ \sigma $\unboldmath}\mbox{}})={\rm level}(I_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}})$ .
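To make the notion of level concrete, the following Python sketch enumerates sign-change vectors and their levels. It assumes that ${\cal I}_{p}^{+}$ consists of the sign vectors with an even number of entries equal to $-1$ (so that the corresponding diagonal matrix has determinant 1, consistent with the observation above); the helper names are hypothetical.

```python
from itertools import product


def level(sigma):
    """Number of entries of sigma equal to -1."""
    return sum(1 for s in sigma if s == -1)


def even_sign_changes(p):
    """All sigma in {+1,-1}^p with an even number of -1 entries (assumed to be I_p^+)."""
    return [s for s in product((1, -1), repeat=p) if level(s) % 2 == 0]


def sign_change_matrix(sigma):
    """Diagonal matrix I_sigma with the entries of sigma on the diagonal."""
    p = len(sigma)
    return [[sigma[i] if i == j else 0 for j in range(p)] for i in range(p)]
```

Under this assumption there are $2^{p-1}$ such vectors, every one of them has even level, and the level of $\boldsymbol{\sigma}$ is the dimension of the $(-1)$ -eigenspace of the diagonal matrix returned by `sign_change_matrix`.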

Remark 4.2 (Involutions and Grassmannians)

The space ${\rm Inv}(p)$ can be naturally identified with a disjoint union of Grassmannians, because an involution $R\in SO(p)$ is completely determined by its $(-1)$ -eigenspace $E_{-1}(R)$ . Let ${\rm Gr}_{m}({\bf R}^{p})$ denote the Grassmannian of $m$ -planes in ${\bf R}^{p}$ , and for even $m\in(0,p]$ define $\Phi_{m,p}:{\rm Gr}_{m}({\bf R}^{p})\to{\rm Inv}_{m}(p)$ to be the map carrying $W\in{\rm Gr}_{m}({\bf R}^{p})$ to the involution in $SO(p)$ whose $(-1)$ -eigenspace is $W$ . (Thus $E_{-1}(R)=\Phi_{m,p}^{-1}(R)$ for all $R\in{\rm Inv}_{m}(p)$ .) Concretely, letting $\pi_{V}:{\bf R}^{p}\to V$ denote orthogonal projection onto any subspace $V,$ and letting $P_{V}$ denote the matrix of $\pi_{V}$ with respect to the standard basis of ${\bf R}^{p}$ , the map $\Phi_{m,p}$ is given by

[TABLE]

reflection about the $(p-m)$ -plane $W^{\perp}$ . It is not hard to show that ${\rm Inv}_{m}(p)$ is a submanifold of $SO(p)$ and that $\Phi_{m,p}$ is a diffeomorphism from ${\rm Gr}_{m}({\bf R}^{p})$ to this submanifold.
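As a small illustration of this identification, the Python sketch below builds the involution with prescribed $(-1)$ -eigenspace $W$ as $I-2P_{W}$ , using the fact that the reflection about $W^{\perp}$ is the identity minus twice the orthogonal projection onto $W$ . The function names and the example subspace are hypothetical, and the input vectors are assumed to be orthonormal.

```python
import math


def involution_from_basis(basis):
    """Return I - 2 P_W, where W is spanned by the given orthonormal vectors."""
    p = len(basis[0])
    R = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for w in basis:
        for i in range(p):
            for j in range(p):
                R[i][j] -= 2.0 * w[i] * w[j]
    return R


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def example_involution():
    """Involution of level 2 in SO(4) for W = span{(e1+e2)/sqrt2, (e3+e4)/sqrt2}."""
    s = 1.0 / math.sqrt(2.0)
    return involution_from_basis([[s, s, 0.0, 0.0], [0.0, 0.0, s, s]])
```

For the example subspace, the result squares to the identity up to rounding and has trace $p-2m=0$ , as expected for an involution of level $m=2$ in $SO(4)$ .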

Our study of sign-change reducibility of involutions will make frequent use of the normal form of an element of $SO(p)$ , so we review this before proceeding.

4.1 Normal form and distance to the identity in $SO(p)$

Let $k=\lfloor\frac{p}{2}\rfloor$ . Recall that every $R\in SO(p)$ has a normal form: a block-diagonal matrix that, for $p$ even, is of the form

[TABLE]

where

[TABLE]

and where $\theta_{i}\in[0,\pi],1\leq i\leq k$ . (This can be derived quickly from the normal form of an antisymmetric matrix, since the compactness of $SO(p)$ guarantees that the exponential map ${\mathfrak{so}}(p)\to SO(p)$ is onto.) For the odd- $p$ case, the normal-form matrix is the matrix (4.2) with one more row and column appended, and with a 1 in the lower right-hand corner (and zeroes everywhere else in the last row and column). In this case we define $\theta_{k+1}=0$ , so that for both even and odd $p$ we can use the notation ${\sf R}(\theta_{1},\dots,\theta_{\lceil p/2\rceil})$ for the normal form.

Note that

[TABLE]

For each $R\in SO(p)$ there exists an orthonormal basis of ${\bf R}^{p}$ with respect to which the linear transformation ${\bf R}^{p}\to{\bf R}^{p}$ , $v\mapsto Rv$ , has matrix ${\sf R}(\theta_{1},\dots,\theta_{\lceil p/2\rceil})$ . Thus there exists $Q\in O(p)$ such that

[TABLE]

The normal form of a given $R$ is unique up to ordering of the blocks; the multi-set $\{\theta_{1},\dots,\theta_{\lceil p/2\rceil}\}$ is uniquely determined by $R$ . From (4.3) and (4.5) we have

[TABLE]

where $A(\theta_{1},\dots,\theta_{\lceil p/2\rceil})$ is the block-diagonal matrix obtained by replacing $C(\theta_{i})$ by $\theta_{i}J$ in (4.2), $1\leq i\leq\lfloor p/2\rfloor$ , and, in the odd- $p$ case, replacing the 1 in the lower right-hand corner by 0. Since the normal form is unique up to block-ordering, it follows that

[TABLE]

Furthermore, from (4.5) and (4.3) it follows that

[TABLE]

if $p$ is even; for odd $p$ we again just append one more row and column of the middle matrix, with a 1 in the lower right-hand corner. Hence the values $\cos\theta_{i}$ (and therefore the values $\theta_{i}\in[0,\pi]$ ) can be recovered from $R$ as the eigenvalues of $R_{\rm sym}$ , with the multiplicity of an eigenvalue $\lambda$ of $R_{\rm sym}$ equal to twice the multiplicity $m_{\lambda}$ of $\lambda$ in the list $\cos\theta_{1},\dots,\cos\theta_{k}$ in the even- $p$ case; for odd $p$ the only difference is that the multiplicity of the eigenvalue 1 of $R_{\rm sym}$ is $2m_{1}+1$ .

Remark 4.3 (Normal form, involutions, and distances to identity)

Writing $R\in SO(p)$ in the form (4.5), it is easily seen that $R$ is an involution if and only if (i) for each $i$ , $\theta_{i}$ is either $0$ or $\pi$ , and (ii) $\theta_{i}=\pi$ for at least one $i$ . For such $R$ , if $\theta_{i}=\pi$ for exactly $m$ values of $i$ , then $\|A(\theta_{1},\dots,\theta_{\lceil p/2\rceil})\|^{2}=m\pi^{2}$ . Hence if $R\in SO(p)$ is an involution of level $m$ , then

[TABLE]

Thus

[TABLE]

Using (4.6) it can also be shown that for every non-involution $R\in SO(p)$ , there is a unique $A\in{\mathfrak{so}}(p)$ of smallest norm such that $\exp(A)=R$ .
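The characterization of involutions by their normal-form angles can be illustrated with a short Python sketch. It assumes the standard $2\times 2$ rotation block for $C(\theta)$ (the sign convention does not affect the check), and all function names are hypothetical.

```python
import math


def normal_form(thetas, p):
    """Block-diagonal matrix with 2x2 rotation blocks; a trailing 1 is added if p is odd."""
    R = [[0.0] * p for _ in range(p)]
    for k, t in enumerate(thetas[: p // 2]):
        i = 2 * k
        R[i][i], R[i][i + 1] = math.cos(t), -math.sin(t)
        R[i + 1][i], R[i + 1][i + 1] = math.sin(t), math.cos(t)
    if p % 2 == 1:
        R[p - 1][p - 1] = 1.0
    return R


def is_involution(R, tol=1e-12):
    """True if R squares to the identity and R is not itself the identity."""
    p = len(R)
    eye = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    sq = [[sum(R[i][k] * R[k][j] for k in range(p)) for j in range(p)] for i in range(p)]
    squares_to_identity = all(abs(sq[i][j] - eye[i][j]) < tol for i in range(p) for j in range(p))
    is_identity = all(abs(R[i][j] - eye[i][j]) < tol for i in range(p) for j in range(p))
    return squares_to_identity and not is_identity


def level_from_angles(thetas, tol=1e-12):
    """Each angle equal to pi contributes a 2-dimensional piece of the (-1)-eigenspace."""
    return 2 * sum(1 for t in thetas if abs(t - math.pi) < tol)
```

For instance, the angles $(\pi,0)$ with $p=4$ give an involution of level 2, whereas $(\pi/2,0)$ gives a rotation whose square is not the identity.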

Notation 4.4

1.

Given $R\in SO(p)$ and angles $\theta_{1},\dots,\theta_{\lceil p/2\rceil}\in[0,\pi]$ for which ${\sf R}(\theta_{1},\dots,\theta_{\lceil p/2\rceil})$ is a normal form of $R$ , we define “redundant normal-form angles” $\tilde{\theta}_{i}\in[0,\pi]$ , $1\leq i\leq p$ , by

[TABLE]

2.

For any square matrix $A$ we write $E_{\lambda}(A)$ for the $\lambda$ -eigenspace of $A$ .

Note that (4.7) can now be written as

[TABLE]

4.2 Sign-change reducibility, distances in Grassmannians, and a half-angle relation

In this section we state and discuss several results, but defer their proofs to later sections.

For $p\leq 4$ one can show, without appealing to Proposition 4.6 below, that every involution in $SO(p)$ is sign-change reducible. (This sign-change reducibility holds for trivial reasons when $p=2$ ; holds for slightly less trivial reasons, mentioned later in Remark 4.12, for $p=3$ ; and can be shown to hold for $p=4$ using a quaternionic approach.) It is reasonable to wonder whether this holds for all $p$ :

Question 4.5

Let $p\geq 2$ . Is every involution in $SO(p)$ sign-change reducible?

Our motivation for this question is not just generalization for its own sake, however. Potential Type II non-uniqueness complicates several aspects of the analysis of scaling-rotation distance and the associated metric $\rho_{{\cal SR}}$ studied in [8]. To understand whether the “Type II non-uniqueness” defined in Section 3.4 can occur, we need to know whether a geodesically antipodal pair in $M(p)$ can be minimal. (As discussed in Section 3.4, a geodesically non-antipodal minimal pair in $M(p)$ uniquely determines an MSSR curve in ${\rm Sym}^{+}(p)$ .) A sufficient condition for any pair $((U,D),(V,{\Lambda}))$ in $M(p)\times M(p)$ to be non-minimal is that the pair $(U,V)\in SO(p)\times SO(p)$ be sign-change reducible. Since sign-change reducibility of involutions rules out the possibility of Type II non-uniqueness, and all involutions are sign-change reducible for $p\leq 4$ , it is natural to ask Question 4.5 and wish for the answer to be yes.

The answer, however, is more complicated. We shall see that the answer to Question 4.5 is yes for $p\leq 4$ and no for $p\geq 11$ (we do not know the answer for $5\leq p\leq 10$ ), but that for all $p$ , involutions of high enough level are sign-change reducible; moreover, by a sign-change of the same level:

Proposition 4.6

Let $R\in SO(p)$ be an involution for which ${\rm level}(R)\geq\frac{1}{2}p$ . Then there exists $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}$ , with ${\rm level}(\boldsymbol{\sigma})={\rm level}(R)$ , such that $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$ .

We defer the proof to Section 6.

Since ${\rm level}(R)=\dim(E_{-1}(R))\geq 2$ for every involution $R$ , Proposition 4.6 (once proved) immediately establishes Proposition 3.18(a) and Corollary 3.19: for $p\leq 4$ , all involutions are sign-change reducible, and hence all minimal pairs in $M(p)\times M(p)$ are geodesically non-antipodal.

We shall see below (Proposition 4.11) that sign-change reducibility by a sign-change of the same level is equivalent to a statement purely about the geometry of Grassmannians. For reasons given shortly, it seems likely to the authors that the “same level” condition appearing in Proposition 4.6 is optimal (even without the “ ${\rm level}(R)\geq\frac{1}{2}p$ ” restriction) in the sense that $\min_{\boldsymbol{\sigma}\in{\cal I}_{p}^{+}}\{d_{SO}(RI_{\boldsymbol{\sigma}},I)\}$ is achieved by a sign-change matrix $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}$ for which ${\rm level}(\boldsymbol{\sigma})={\rm level}(R)$ . If this is true, then the analysis of whether an involution $R$ is sign-change reducible simplifies; we need only consider $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}$ of the same level as $R$ . This (potential) simplification is actually of greater value to us than knowing, for a given $R\in{\rm Inv}(p)$ , whether all minimizers of $d_{SO}(RI_{\boldsymbol{\sigma}},I)$ have the same level as $R$ , so we state only the following weaker conjecture:

Conjecture 4.7

Let $m\geq 2$ be even, and let $R\in SO(p)$ be an involution of level $m$ . If $R$ is sign-change reducible, then it is reducible by a sign-change of level $m$ .

In Section 6 we will prove the following special case of this conjecture:

Proposition 4.8

Conjecture 4.7 is true for $m=2$ .

The reason we expect more generally that $\min_{\boldsymbol{\sigma}\in{\cal I}_{p}^{+}}\{d_{SO}(RI_{\boldsymbol{\sigma}},I)\}$ is achieved by a $\boldsymbol{\sigma}$ for which ${\rm level}(\boldsymbol{\sigma})={\rm level}(R)$ is as follows. Every sign-change matrix $I_{\boldsymbol{\sigma}_{1}}\in{\cal I}_{p}^{+}$ is itself an involution, and satisfies

[TABLE]

Thus for $R\in SO(p)$ sufficiently close to $I_{{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}_{1}}$ , we have

[TABLE]

The function carrying an involution $R\in SO(p)$ to ${\rm level}(R)$ is continuous, so for $R\in{\rm Inv}(p)$ sufficiently close to $I_{\boldsymbol{\sigma}_{1}}$ we also have ${\rm level}(R)={\rm level}(\boldsymbol{\sigma}_{1})$ . Hence for every $R\in{\rm Inv}(p)$ sufficiently close to a sign-change matrix, $\min_{\boldsymbol{\sigma}\in{\cal I}_{p}^{+}}\{d_{SO}(RI_{\boldsymbol{\sigma}},I)\}$ is achieved by a sign-change matrix having the same level as $R$ . It seems plausible that this remains true even without the “sufficiently close to a sign-change matrix” restriction.

As noted in Remark 4.2, for even $m\geq 2$ the space ${\rm Inv}_{m}(p)$ is diffeomorphic to the Grassmannian ${\rm Gr}_{m}({\bf R}^{p})$ . This Grassmannian carries a Riemannian metric induced by Riemannian submersion from $(SO(p),g_{SO})$ . It is known that the associated squared geodesic-distance between two points $W,Z\in{\rm Gr}_{m}({\bf R}^{p})$ is, up to a constant factor, simply the sum of squares of the principal angles between the two $m$ -planes $W,Z$ . (Footnote 2: This fact follows from Wong’s results on geodesics in [16], and has been cited elsewhere in the literature (e.g. [4, p. 337]), though the explicit statement does not appear in [16].) Choosing the normalization in which the squared geodesic distance $d_{Gr}(W,Z)^{2}$ equals the sum of squares of the principal angles (equation (5.1) below), we will prove the following in Section 5:

Proposition 4.9

The map $\Phi=\Phi_{m,p}:({\rm Gr}_{m}({\bf R}^{p}),d_{Gr})\to({\rm Inv}_{m}(p),d_{SO})$ (see (4.1)) is an isometry, up to a constant factor of 2:

[TABLE]

for all $W,V\in{\rm Gr}_{m}({\bf R}^{p})$ .

We derive Proposition 4.9 from a general half-angle relation proven in Section 5:

Proposition 4.10

Let $R_{1},R_{2}$ be involutions in $SO(p)$ . For $i=1,2$ let $m_{i}=\dim(E_{-1}(R_{i}))$ , and let $m=\min\{m_{1},m_{2}\}$ . Let $\{\theta_{i}\in[0,\pi]\}_{i=1}^{\lceil p/2\rceil}$ be angles for which ${\sf R}(\theta_{1},\dots,\theta_{\lceil p/2\rceil})$ is a normal form of the product $R_{1}R_{2}$ , and let $\{\tilde{\theta}_{i}\}_{i=1}^{p}$ be as defined in (4.11). Then for some injective map $\iota:\{1,2,\dots,m\}\to\{1,2,\dots,p\}$ , the principal angles between $E_{-1}(R_{1})$ and $E_{-1}(R_{2})$ satisfy

[TABLE]

For every $i\notin{\rm range}(\iota)$ , the angle $\tilde{\theta}_{i}$ is either $0$ or $\pi$ .

In other words, as stated in the introduction: for any two involutions $R_{1},R_{2}\in SO(p)$ , each of the principal angles between $E_{-1}(R_{1})$ and $E_{-1}(R_{2})$ is exactly half a correspondingly indexed normal-form angle of $R_{1}R_{2}$ .
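The half-angle relation can be observed directly in a simple configuration. The Python sketch below is an illustration under stated assumptions, not the general proof: it takes $p=4$ , $E_{-1}(R_{1})={\rm span}\{{\bf e}_{1},{\bf e}_{3}\}$ , and $E_{-1}(R_{2})$ spanned by $\cos\phi_{1}{\bf e}_{1}+\sin\phi_{1}{\bf e}_{2}$ and $\cos\phi_{2}{\bf e}_{3}+\sin\phi_{2}{\bf e}_{4}$ with $\phi_{1},\phi_{2}\in[0,\pi/2]$ , so that the principal angles are $\phi_{1},\phi_{2}$ and the product $R_{1}R_{2}$ is block diagonal with two rotation blocks. The function names are hypothetical.

```python
import math


def reflection_with_minus_eigenspace(basis):
    """Return I - 2 P_W for W spanned by the given orthonormal vectors."""
    p = len(basis[0])
    R = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for w in basis:
        for i in range(p):
            for j in range(p):
                R[i][j] -= 2.0 * w[i] * w[j]
    return R


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def normal_form_angles_of_product(phi1, phi2):
    """Rotation angles in [0, pi] of the two 2x2 blocks of R1 R2 in this special configuration."""
    W = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    Z = [[math.cos(phi1), math.sin(phi1), 0.0, 0.0],
         [0.0, 0.0, math.cos(phi2), math.sin(phi2)]]
    prod = matmul(reflection_with_minus_eigenspace(W), reflection_with_minus_eigenspace(Z))
    clamp = lambda x: max(-1.0, min(1.0, x))
    # Each diagonal block is a rotation, so its angle is the arccosine of a diagonal entry.
    return [math.acos(clamp(prod[0][0])), math.acos(clamp(prod[2][2]))]
```

In this configuration the returned angles equal $2\phi_{1}$ and $2\phi_{2}$ up to rounding, so each principal angle is half of a normal-form angle of $R_{1}R_{2}$ .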

Proposition 4.9 can also be proven by purely Riemannian methods, but the proof we give, via Proposition 4.10, is independent in the sense that it does not make any use of a Riemannian metric on ${\rm Gr}_{m}({\bf R}^{p})$ ; see Remark 5.5.

In Section 5, after proving Proposition 4.9 we will use it to deduce the following:

Proposition 4.11

Let $m,p$ be integers with $m$ even and $0<m\leq p$ . Then the following two statements are equivalent:

1. Every involution $R\in SO(p)$ of level $m$ is sign-change reducible by a sign-change of level $m$ .

2. For every $W\in{\rm Gr}_{m}({\bf R}^{p})$ , there exists a coordinate $m$ -plane ${\bf R}^{J}$ (see Notation 5.1) such that

[TABLE]

In other words, the sign-change reducibility asserted in Statement 1 of the Proposition is equivalent to a statement purely about the geometry of Grassmannians (with the metric $d_{Gr}$ ), namely that the coordinate $m$ -planes in ${\bf R}^{p}$ form a “lattice” of ${p\choose m}$ points in ${\rm Gr}_{m}({\bf R}^{p})$ such that every point in ${\rm Gr}_{m}({\bf R}^{p})$ is within distance $(m\pi^{2}/8)^{1/2}$ of some lattice-point. This gives us a geometric way to tackle Question 4.5, at least for sign-change reducibility of an involution $R$ by a sign-change matrix of the same level. However, the authors do not know a formula for $\min_{J\in{\cal J}_{m}}\{d_{Gr}(W,{\bf R}^{J})\}$ for general $W\in{\rm Gr}_{m}({\bf R}^{p})$ , or (more importantly) a formula for $\max_{W\in{\rm Gr}_{m}({\bf R}^{p})}\left\{\min_{J\in{\cal J}_{m}}\{d_{Gr}(W,{\bf R}^{J})\}\right\}$ .
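Although no closed formula is known in general, the quantity $\min_{J}\{d_{Gr}(W,{\bf R}^{J})\}$ is easy to evaluate by brute force for a given $W$ . The following Python sketch treats only the case $m=2$ , where the principal angles can be read off from the singular values of a $2\times 2$ matrix of inner products; it assumes that $d_{Gr}^{2}$ is the sum of squared principal angles, and the function names are hypothetical.

```python
import itertools
import math


def principal_angles_to_coordinate_plane(Q, J):
    """Principal angles between W = span(Q[0], Q[1]) (orthonormal) and the coordinate
    2-plane indexed by the pair J of 0-based indices."""
    # M holds the inner products between the coordinate basis vectors and the basis of W.
    M = [[Q[c][j] for c in range(2)] for j in J]
    T = sum(M[i][j] ** 2 for i in range(2) for j in range(2))   # sum of squared singular values
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]                 # signed product of singular values
    disc = math.sqrt(max(0.0, T * T - 4.0 * det * det))
    squared_singular_values = [(T + disc) / 2.0, (T - disc) / 2.0]
    return [math.acos(min(1.0, math.sqrt(max(0.0, s)))) for s in squared_singular_values]


def min_squared_distance_to_lattice(Q, p):
    """Minimum over all coordinate 2-planes of the sum of squared principal angles."""
    return min(sum(a * a for a in principal_angles_to_coordinate_plane(Q, J))
               for J in itertools.combinations(range(p), 2))
```

For example, with $p=4$ and $W$ spanned by $({\bf e}_{1}+{\bf e}_{2})/\sqrt{2}$ and $({\bf e}_{3}+{\bf e}_{4})/\sqrt{2}$ , the minimum squared distance is $\pi^{2}/8$ , which is below the threshold $m\pi^{2}/8=\pi^{2}/4$ .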

Note that Proposition 4.6 asserts that statement 1 of Proposition 4.11 is true whenever $m\geq\frac{p}{2}$ . To put into perspective the number $\frac{m}{8}\pi^{2}$ appearing in statement 2 of Proposition 4.11, and better understand the relevance of the comparison between $m$ and $\frac{p}{2}$ , note that the squared diameter of ${\rm Gr}_{m}({\bf R}^{p})$ is $\min\{m,p-m\}\frac{\pi^{2}}{4}$ . So for $m\leq\frac{p}{2}$ , (4.15) is equivalent to

[TABLE]

For $m>\frac{p}{2}$ , the right-hand side of (4.15) is a greater fraction of ${\rm diam}({\rm Gr}_{m}({\bf R}^{p}))^{2}$ , so it is “easier” for statement 2 of Proposition 4.11 to be true for $m>\frac{p}{2}$ than for $m<\frac{p}{2}$ .

Remark 4.12

It is relatively easy to show that for any involution $R$ , there exists $\boldsymbol{\sigma}$ for which $RI_{\boldsymbol{\sigma}}$ is not an involution. For $p=2,3$ , we have $d_{SO}(I,R)\leq\pi$ for every $R\in SO(p)$ , and $d_{SO}(I,R)=\pi$ for every involution $R$ , so any non-involution is closer to the identity than is any involution. Hence for these values of $p$ , Proposition 4.11 is easy to prove. However, for $p\geq 4$ , given an involution $R$ and a $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}$ for which $RI_{\boldsymbol{\sigma}}$ is not an involution, (4.10) shows that we cannot immediately deduce that $d_{SO}(I,RI_{\boldsymbol{\sigma}})<d_{SO}(I,R)$ .

5 Proofs of the half-angle relation and results related to Grassmannians: Propositions 4.9, 4.10, and 4.11

The half-angle relation in Proposition 4.10 underlies our proofs of most of the other results stated in Section 4.2 (all but Proposition 4.8). When the dimensions of the eigenspaces in Proposition 4.10 are equal, the half-angle relation leads to the elegant distance-relation (4.13). This equidimensional case is actually the only one we need for the application to Type-II non-uniqueness of MSSR curves. However, the half-angle relation (4.14) holds whether or not $\dim(E_{-1}(R_{1}))=\dim(E_{-1}(R_{2}))$ . Since this fact may be of interest outside the scope of this paper, and is not much harder to prove without the equal-dimensions restriction, we have stated (and will prove) the more general relation.

Section 5.1 is devoted to establishing Proposition 4.10. In Section 5.2, we apply this proposition to establish Propositions 4.9 and 4.11.

5.1 The half-angle relation

We start with some notation.

Notation 5.1

1.

For $1\leq i\leq p$ let ${\bf e}_{i}$ denote the $i^{\rm th}$ standard basis vector of ${\bf R}^{p}$ .

2.

For $0\leq m\leq p$ , let ${\cal J}_{m,p}$ denote the collection of $m$ -element subsets of $\{1,\dots,p\}$ .

(a)

For $0\leq m\leq p$ and $J\in{\cal J}_{m,p}$ , define ${\mbox{\boldmath$ \sigma $\unboldmath}\mbox{}}^{J}=(\sigma_{1},\dots,\sigma_{p})\in{\cal I}_{p}$ by $\sigma_{i}=-1$ for $i\in J$ and $\sigma_{i}=1$ for $i\notin J$ . Similarly, for ${\mbox{\boldmath$ \sigma $\unboldmath}\mbox{}}=(\sigma_{1},\dots,\sigma_{p})\in{\cal I}_{p}$ , define $J^{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}=\{i\in\{1,\dots,p\}:\sigma_{i}=-1\}$ . (The maps $J\mapsto{\mbox{\boldmath$ \sigma $\unboldmath}\mbox{}}^{J}$ and ${\mbox{\boldmath$ \sigma $\unboldmath}\mbox{}}\mapsto J^{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}$ are inverse to each other.)

(b)

If $1\leq m\leq p$ and $J=\{i_{1},\dots,i_{m}\}\in{\cal J}_{m,p}$ , with $i_{1}<i_{2}<\dots<i_{m}$ , let ${\sf E}_{J}$ denote the $p\times m$ matrix whose $k^{\rm th}$ column is ${\bf e}_{i_{k}}$ , $1\leq k\leq m$ .

(c)

For $0\leq m\leq p$ and $J\in{\cal J}_{m,p}$ , define ${\bf R}^{J}=\{(x^{1},x^{2},\dots,x^{p})\in{\bf R}^{p}:x^{i}=0\ \mbox{if}\ i\notin J\}$ .

The collection $\{{\bf R}^{J}:J\in{\cal J}_{m,p}\}$ is the set of “coordinate $m$ -planes” in ${\bf R}^{p}$ .

3.

For any $J\subset\{1,\dots,p\}$ , let $J^{\prime}$ denote the complement of $J$ in $\{1,\dots,p\}$ .

4.

For $m_{1},m_{2}\in\{1,2,\dots,p\}$ , $W\in{\rm Gr}_{m_{1}}({\bf R}^{p}),$ $Z\in{\rm Gr}_{m_{2}}({\bf R}^{p})$ , and $J\in{\cal J}_{m_{1},p}$ , writing $m=\min\{m_{1},m_{2}\}$ ,

(a)

let $\phi_{1}(W,Z),\dots,\phi_{m}(W,Z)$ denote the principal angles between the $m_{1}$ -plane $W$ and the $m_{2}$ -plane $Z$ (see [6, Section 12.4.3]), and

(b)

let $\phi_{J,i}(W)=\phi_{i}(W,{\bf R}^{J})$ , $1\leq i\leq m_{1}$ .

5.

For $1\leq m\leq p$ define $d_{Gr}:{\rm Gr}_{m}({\bf R}^{p})\times{\rm Gr}_{m}({\bf R}^{p})\to{\bf R}$ by

[TABLE]

As noted earlier, $d_{Gr}$ is the distance-function defined by the standard $SO(p)$ -invariant Riemannian metric on ${\rm Gr}_{m}({\bf R}^{p})$ (up to a constant factor).
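The index bookkeeping in Notation 5.1 translates directly into code. The Python sketch below is illustrative only, with hypothetical helper names; it uses 1-based indices to match the text and represents $J$ as a set of integers.

```python
from itertools import combinations


def sigma_from_J(J, p):
    """sigma^J: entry -1 at positions in J, +1 elsewhere (positions are 1-based)."""
    return tuple(-1 if i in J else 1 for i in range(1, p + 1))


def J_from_sigma(sigma):
    """J^sigma: the set of 1-based positions where sigma equals -1."""
    return frozenset(i for i, s in enumerate(sigma, start=1) if s == -1)


def E_J(J, p):
    """p x m matrix whose k-th column is e_{i_k}, where i_1 < ... < i_m are the elements of J."""
    cols = sorted(J)
    return [[1 if row == c else 0 for c in cols] for row in range(1, p + 1)]


def coordinate_subsets(m, p):
    """The collection J_{m,p} of m-element subsets of {1, ..., p}."""
    return [frozenset(c) for c in combinations(range(1, p + 1), m)]
```

The two maps `sigma_from_J` and `J_from_sigma` are mutually inverse, and the columns of `E_J(J, p)` form the standard basis of the coordinate plane ${\bf R}^{J}$ .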

The following long but far-reaching technical lemma, giving several detailed relations between a general involution in $SO(p)$ and its product with a sign-change matrix, is our key tool for establishing the results stated in Section 4.2. It is best thought of as a series of lemmas, all with the same hypotheses, that have been rolled into one long lemma in order to avoid restating hypotheses and notational definitions. After proving the lemma, we build on it with two corollaries, completing the groundwork for the proofs (in later sections) of the Section 4.2 propositions.

Lemma 5.2

Let $R\in SO(p)$ be an involution, let ${\mbox{\boldmath$ \sigma $\unboldmath}\mbox{}}\in{\cal I}_{p}^{+}$ , assume $0<m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}:={\rm level}({\mbox{\boldmath$ \sigma $\unboldmath}\mbox{}})<p$ , and let $J=J^{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}$ (see Notation 5.1). Viewing ${\bf R}^{p}$ as ${\bf R}^{J^{\prime}}\oplus{\bf R}^{J}$ , below we write every $p\times p$ matrix in the block form $\left[\begin{array}[]{ll}A_{1}&A_{2}\\ A_{3}&A_{4}\end{array}\right]$ , where $A_{1}$ is $(p-m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}})\times(p-m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}})$ , $A_{2}$ is $(p-m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}})\times m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}$ , $A_{3}$ is $m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}\times(p-m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}})$ , and $A_{4}$ is $m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}\times m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}$ . Then:

(i) In this block form,

[TABLE]

where $R_{1}$ is a symmetric $(p-m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}})\times(p-m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}})$ matrix, $R_{4}$ is a symmetric $m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}\times m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}$ matrix, and $R_{2}$ is $(p-m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}})\times m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}$ .

(ii) In the same block form,

[TABLE]

(iii) All eigenvalues of $R_{1}$ and $R_{4}$ lie in the interval $[-1,1]$ .

(iv) For every $\lambda\in(-1,1)$ , if $\lambda$ is an eigenvalue of $R_{1}$ (respectively, $R_{4}$ ), then $-\lambda$ is an eigenvalue of $R_{4}$ (resp. $R_{1}$ ) with the same multiplicity.

(v) Let $l$ denote the number of eigenvalues of $R_{1}$ , counted with multiplicity, lying in the interval $(-1,1)$ . Then $l$ is also the number of eigenvalues of $R_{4}$ , counted with multiplicity, lying in $(-1,1)$ , and $l\leq\min\{m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}},p-m_{\mbox{\scriptsize\boldmath$ \sigma $\unboldmath}\mbox{}}\}$ .

(vi) The inclusion map ${\bf R}^{J^{\prime}}\to{\bf R}^{p}$ defined by $v\mapsto\left[\begin{array}{l}v\\ 0\end{array}\right]$ restricts to isomorphisms $E_{\pm 1}(R_{1})\to E_{\pm 1}(R)\cap{\bf R}^{J^{\prime}}$ . Similarly the inclusion map ${\bf R}^{J}\to{\bf R}^{p}$ defined by $w\mapsto\left[\begin{array}{l}0\\ w\end{array}\right]$ restricts to isomorphisms $E_{\pm 1}(R_{4})\to E_{\pm 1}(R)\cap{\bf R}^{J}$ .

(vii) Let $l_{-}=\dim(E_{1}(R)\cap{\bf R}^{J}),\ l_{+}=\dim(E_{-1}(R)\cap{\bf R}^{J^{\prime}})$ . (Footnote 3: The $\pm$ subscripts are chosen according to the eigenspaces of $I_{\boldsymbol{\sigma}}$ rather than $R$ : ${\bf R}^{J}=E_{-1}(I_{\boldsymbol{\sigma}})$ , ${\bf R}^{J^{\prime}}=E_{1}(I_{\boldsymbol{\sigma}})$ .) Then $\dim(E_{1}(R_{4}))=l_{-}$ and $\dim(E_{-1}(R_{1}))=l_{+}$ . (Thus $l_{-}+l_{+}$ is the multiplicity of $-1$ as an eigenvalue of $(RI_{\boldsymbol{\sigma}})_{\rm sym}$ in (5.3), hence of $RI_{\boldsymbol{\sigma}}$ itself, and therefore yields a lower bound on $d_{SO}(RI_{\boldsymbol{\sigma}},I)$ .) Furthermore,

[TABLE]

and

[TABLE]

(viii) There exist an orthonormal $R_{1}$ -eigenbasis $\{v_{i}\}_{i=1}^{p-m_{\boldsymbol{\sigma}}}$ of ${\bf R}^{p-m_{\boldsymbol{\sigma}}}$ (i.e. an orthonormal basis of ${\bf R}^{p-m_{\boldsymbol{\sigma}}}$ consisting of eigenvectors of $R_{1}$ ) and an orthonormal $R_{4}$ -eigenbasis $\{w_{i}\}_{i=1}^{m_{\boldsymbol{\sigma}}}$ of ${\bf R}^{m_{\boldsymbol{\sigma}}}$ . For any such bases $\{v_{i}\}$ of ${\bf R}^{p-m_{\boldsymbol{\sigma}}}$ , $\{w_{i}\}$ of ${\bf R}^{m_{\boldsymbol{\sigma}}}$ , let $\{\lambda^{\prime}_{i}\},\{\lambda_{i}\}$ be the corresponding eigenvalues (i.e. $R_{1}v_{i}=\lambda_{i}^{\prime}v_{i}$ and $R_{4}w_{i}=\lambda_{i}w_{i}$ ), and define

[TABLE]

Then

[TABLE]

(ordered arbitrarily) is an orthonormal basis of $E_{1}(R)$ , and the set

[TABLE]

(ordered arbitrarily) is an orthonormal basis of $E_{-1}(R)$ . Note that the cardinality of the second set in (5.22) (respectively (5.23)) is $l_{-}$ (resp. $l_{+}$ ).
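Parts (i), (iii), and (iv) of the lemma can be checked numerically in a small example. The Python sketch below is a sanity check under stated assumptions, not a substitute for the proof: it takes $p=4$ , $J=\{3,4\}$ , and $E_{-1}(R)$ spanned by $(\cos a,0,\sin a,0)$ and $(0,\cos b,0,\sin b)$ , in which case the blocks $R_{1},R_{2},R_{4}$ are all diagonal and the eigenvalues of $R_{1}$ and $R_{4}$ are their diagonal entries. The three relations tested are the block identities that follow from $R^{2}=I$ for a symmetric $R$ ; the function names are hypothetical.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def lemma_5_2_blocks(a, b):
    """Blocks R1, R2, R4 of R = I - 2 P_W, split with respect to R^{J'} + R^J, J = {3, 4}."""
    basis = [[math.cos(a), 0.0, math.sin(a), 0.0],
             [0.0, math.cos(b), 0.0, math.sin(b)]]
    R = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for w in basis:
        for i in range(4):
            for j in range(4):
                R[i][j] -= 2.0 * w[i] * w[j]
    R1 = [row[:2] for row in R[:2]]
    R2 = [row[2:] for row in R[:2]]
    R4 = [row[2:] for row in R[2:]]
    return R1, R2, R4


def check_lemma_5_2(a, b):
    """Return (largest residual of the block identities, eigenvalues of R1, eigenvalues of R4)."""
    R1, R2, R4 = lemma_5_2_blocks(a, b)
    R2T = transpose(R2)
    A, B = matmul(R1, R1), matmul(R2, R2T)   # R1^2 + R2 R2^T should equal I
    C, D = matmul(R1, R2), matmul(R2, R4)    # R1 R2 + R2 R4 should equal 0
    E, F = matmul(R2T, R2), matmul(R4, R4)   # R2^T R2 + R4^2 should equal I
    residual = 0.0
    for i in range(2):
        for j in range(2):
            delta = 1.0 if i == j else 0.0
            residual = max(residual,
                           abs(A[i][j] + B[i][j] - delta),
                           abs(C[i][j] + D[i][j]),
                           abs(E[i][j] + F[i][j] - delta))
    # In this configuration R1 and R4 are diagonal, so their eigenvalues are the diagonal entries.
    return residual, [R1[0][0], R1[1][1]], [R4[0][0], R4[1][1]]
```

In this example the eigenvalues of $R_{1}$ are $-\cos 2a,-\cos 2b$ and those of $R_{4}$ are $\cos 2a,\cos 2b$ , so they lie in $[-1,1]$ and occur in pairs $\lambda,-\lambda$ as asserted in (iii) and (iv).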

Proof: To simplify notation in this proof, we let $m=m_{\boldsymbol{\sigma}}$ .

Since $R\in SO(p)$ is an involution, $R=R^{-1}=R^{T}$ . Hence $R$ is symmetric, implying assertion (i), and ${\bf R}^{p}$ is the orthogonal direct sum of $E_{1}(R)$ and $E_{-1}(R)$ (since the only possible eigenvalues of an involution are $\pm 1$ ).

For (ii), observe that in the block-form decomposition we are using,

[TABLE]

A simple calculation then yields (5.3).

Next, because $R^{2}=I$ , we have the following relations:

[TABLE]

From (5.24) and (5.27), for any $v\in{\bf R}^{p-m},w\in{\bf R}^{m}$ , we have

[TABLE]

It follows from (5.28)–(5.29) that if $\lambda$ is an eigenvalue of $R_{1}$ or $R_{4}$ , then $|\lambda|\leq 1$ , yielding (iii).

To obtain (iv), consider the operators $L:{\bf R}^{m}\to{\bf R}^{p-m}$ and $L^{*}:{\bf R}^{p-m}\to{\bf R}^{m}$ defined by $L(w)=R_{2}w$ and $L^{*}(v)=R_{2}^{T}v$ . Suppose that $R_{1}$ has an eigenvalue $\lambda$ with $|\lambda|<1$ , and let $0\neq v\in E_{\lambda}(R_{1})$ . Let $w=R_{2}^{T}v$ ; note that (5.28) implies $w\neq 0$ . Using (5.26),

[TABLE]

Hence $L^{*}$ maps $E_{\lambda}(R_{1})$ injectively to $E_{-\lambda}(R_{4})$ . Similarly, if $R_{4}$ has an eigenvalue $-\lambda$ with $|\lambda|<1$ , then $L$ maps $E_{-\lambda}(R_{4})$ injectively to $E_{\lambda}(R_{1})$ .

It follows that, for any $\lambda\in{\bf R}$ with $|\lambda|<1$ , $\lambda$ is an eigenvalue of $R_{1}$ if and only if $-\lambda$ is an eigenvalue of $R_{4}$ , and that the maps

[TABLE]

are isomorphisms. This establishes (iv). Statement (v) is an immediate corollary of (iv).

For (vi), let $\iota:{\bf R}^{J^{\prime}}\to{\bf R}^{p}$ be the first inclusion map in the lemma. Note that $R\left[\begin{array}{l}v\\ 0\end{array}\right]=\left[\begin{array}{l}R_{1}v\\ R_{2}^{T}v\end{array}\right]$ . If $v\in E_{\lambda}(R_{1})$ with $\lambda=\pm 1$ , equation (5.28) implies that $R_{2}^{T}v=0$ , hence that $R\iota(v)=\lambda\iota(v)$ . Conversely, if $R\iota(v)=\lambda\iota(v)$ , then $R_{1}v=\lambda v$ (and $R_{2}^{T}v=0$ ). Hence $\iota$ carries $E_{\lambda}(R_{1})$ isomorphically to $E_{\lambda}(R)\cap{\bf R}^{J^{\prime}}$ . The argument for the inclusion map ${\bf R}^{J}\to{\bf R}^{p}$ is essentially identical. This establishes (vi).

Part (vi) implies that $\dim(E_{1}(R_{4}))=\dim(E_{1}(R)\cap{\bf R}^{J})=l_{-}$ and that $\dim(E_{-1}(R_{1}))=\dim(E_{-1}(R)\cap{\bf R}^{J^{\prime}})=l_{+}$ , the first assertion in (vii). To obtain (5.4)–(5.5), note that for any subspaces $V,W$ of ${\bf R}^{p}$ , we have

[TABLE]

(The proof of (5.32) is straightforward linear algebra.) Applying this to the case $V=E_{-1}(R),V^{\perp}=E_{1}(R),W=E_{-1}(I_{\boldsymbol{\sigma}})={\bf R}^{J},W^{\perp}=E_{1}(I_{\boldsymbol{\sigma}})={\bf R}^{J^{\prime}}$ , we have $l_{-}=\dim(V^{\perp}\cap W)$ and $l_{+}=\dim(V\cap W^{\perp})$ , so (5.5) follows from (5.32). The inequalities in (5.4) follow directly from (5.5).

(viii) Since $R_{1}$ (respectively $R_{4}$ ) is symmetric, an orthonormal $R_{1}$ -eigenbasis $\{v_{i}\}$ of ${\bf R}^{p-m}$ (resp., orthonormal $R_{4}$ -eigenbasis $\{w_{i}\}$ of ${\bf R}^{m}$ ) exists. Select such eigenbases, and let $\{\lambda_{i}\}$ , $\{\lambda_{i}^{\prime}\}$ be eigenvalues as defined in the Lemma. Note that the second set in (5.22) is a basis of $E_{1}(R_{4})$ , which by (vi) is isomorphic to $E_{1}(R)\cap{\bf R}^{J}$ . Hence the cardinality of this set is $\dim(E_{1}(R)\cap{\bf R}^{J})$ , i.e. $l_{-}$ . Similarly, the second set in (5.23) is a basis of $E_{-1}(R_{1})$ and has cardinality $l_{+}$ .

Without loss of generality, we may assume that the eigenvectors $v_{i}$ with eigenvalue $-1$ , if any, are the last $l_{+}$ , and that the eigenvectors $w_{i}$ with eigenvalue 1, if any, are the last $l_{-}$ . Using (5.27), for $1\leq i\leq m$ we have $R_{2}^{T}R_{2}w_{i}=(1-\lambda_{i}^{2})w_{i}$ , while using (5.25) we find $R_{1}R_{2}w_{i}=-\lambda_{i}R_{2}w_{i}$ . Then, using (5.2), a simple calculation shows that $R{\bf w}_{i}=-{\bf w}_{i}$ . Hence ${\bf w}_{i}\in E_{-1}(R)$ for $1\leq i\leq m-l_{-}$ , while from part (vi), ${\bf v}_{i}\in E_{-1}(R)$ for $p-m-l_{+}<i\leq p-m$ .

Let $\langle\cdot\,,\cdot\rangle$ denote the standard inner product on ${\bf R}^{n}$ for any $n$. As seen in the proof of part (vi), $v\in E_{-1}(R_{1})$ implies $R_{2}^{T}v=0$. Hence for $p-m-l_{+}<i\leq p-m$ and $1\leq j\leq m-l_{-}$, $\langle{\bf v}_{i},{\bf w}_{j}\rangle\propto\langle v_{i},R_{2}w_{j}\rangle=\langle R_{2}^{T}v_{i},w_{j}\rangle=0$, while for $p-m-l_{+}<i,j\leq p-m$ we have $\langle{\bf v}_{i},{\bf v}_{j}\rangle=\langle v_{i},v_{j}\rangle=\delta_{ij}$. Finally, for $i,j\leq m-l_{-}$, using the fact that $\langle R_{2}w_{i},R_{2}w_{j}\rangle=\langle w_{i},R_{2}^{T}R_{2}w_{j}\rangle=\langle w_{i},(1-\lambda_{j}^{2})w_{j}\rangle$, a simple computation yields $\langle{\bf w}_{i},{\bf w}_{j}\rangle=\frac{2}{1-\lambda_{i}}\delta_{ij}$. Thus $\{\sqrt{\frac{1-\lambda_{i}}{2}}{\bf w}_{i}:1\leq i\leq m-l_{-}\}\cup\{{\bf v}_{i}:p-m-l_{+}<i\leq p-m\}$ is an orthonormal subset of $E_{-1}(R)$. Using (5.5), the cardinality of this subset is $m-l_{-}+l_{+}={\rm level}(\boldsymbol{\sigma})-({\rm level}(\boldsymbol{\sigma})-{\rm level}(R))={\rm level}(R)=\dim(E_{-1}(R))$. Hence (5.23) is an orthonormal basis of $E_{-1}(R)$.

The proof that (5.22) is an orthonormal basis of $E_{1}(R)$ is similar.

Corollary 5.3

Hypotheses and notation as in Lemma 5.2. Let $l_{+}=\dim(E_{-1}(R_{1}))$ and $l_{-}=\dim(E_{1}(R_{4}))$ (as in Lemma 5.2(vii)). In addition let $\{\theta_{i}\in[0,\pi]\}_{i=1}^{\lceil p/2\rceil}$ be angles for which ${\sf R}(\theta_{1},\dots,\theta_{\lceil p/2\rceil})$ is a normal form of $RI_{\boldsymbol{\sigma}}$, and let $\{\tilde{\theta}_{i}\}_{i=1}^{p}$ be as defined in (4.11). Let $J_{*}=\{j\in J:0<\tilde{\theta}_{j}<\pi\}$. Then $|J_{*}|\leq\min\{m_{\boldsymbol{\sigma}},p-m_{\boldsymbol{\sigma}}\}$, and

[TABLE]

If ${\rm level}(\boldsymbol{\sigma})={\rm level}(R)$, then

[TABLE]

Proof: Let $\beta^{\prime}:J^{\prime}\to\{1,\dots,p-m_{\boldsymbol{\sigma}}\}$ and $\beta:J\to\{1,\dots,m_{\boldsymbol{\sigma}}\}$ be order-preserving bijections. By (4.8), the eigenvalues of $(RI_{\boldsymbol{\sigma}})_{\rm sym}$, counted with multiplicity, are $\{\cos\tilde{\theta}_{i}\}_{i=1}^{p}$. But from Lemma 5.2(ii), we can read off the eigenvalues of $(RI_{\boldsymbol{\sigma}})_{\rm sym}$ from (5.3); they are $\lambda_{1}^{\prime},\dots,\lambda_{p-m_{\boldsymbol{\sigma}}}^{\prime},-\lambda_{1},\dots,-\lambda_{m_{\boldsymbol{\sigma}}}$ (ordered arbitrarily). Thus, reordering the $\lambda^{\prime}_{j}$ and the $\lambda_{j}$ appropriately, for $1\leq j\leq p$ we have

[TABLE]

Recall that $J_{*}=\{j\in J:0<\tilde{\theta}_{j}<\pi\}$. Observe that $J_{*}$ can also be characterized as $\{j\in J:\lambda_{\beta(j)}\neq\pm 1\}$. Similarly, define $J^{\prime}_{*}=\{j\in J^{\prime}:\lambda^{\prime}_{\beta^{\prime}(j)}\neq\pm 1\}=\{j\in J^{\prime}:0<\tilde{\theta}_{j}<\pi\}$. By part (v) of Lemma 5.2, $|J^{\prime}_{*}|=|J_{*}|=l\leq\min\{m_{\boldsymbol{\sigma}},p-m_{\boldsymbol{\sigma}}\}$, and by part (iv) of the Lemma there is a bijection $b:J_{*}\to J^{\prime}_{*}$ such that $-\lambda_{j}=\lambda^{\prime}_{b(j)}$ for all $j\in J_{*}$. Hence

[TABLE]

In particular,

[TABLE]

Next, note that

[TABLE]

and similarly $\sum_{j\in J\setminus J_{*}}\tilde{\theta}_{j}^{2}=l_{-}\pi^{2}.$ From (4.12) we therefore have

[TABLE]

establishing (5.33).

If ${\rm level}(\boldsymbol{\sigma})={\rm level}(R)$, then equation (5.5) implies that $l_{+}=l_{-}$, so (5.33) implies the first equality in (5.34). For the second equality, observe that $j\in J\setminus J_{*}$ if and only if $\tilde{\theta}_{j}$ is 0 or $\pi$. The number of $j$’s in $J$ for which $\tilde{\theta}_{j}=\pi$ is exactly $l_{-}$, while the $j$’s in $J$ for which $\tilde{\theta}_{j}=0$ have no effect on $\sum_{j\in J}\tilde{\theta}_{j}^{2}$. Hence the second equality in (5.34) holds.

Corollary 5.4

Hypotheses and notation as in Lemma 5.2, except that we additionally write $m_{R}:={\rm level}(R)$ and $m=\min\{m_{\boldsymbol{\sigma}},m_{R}\}$. Let $\{\theta_{i}\in[0,\pi]\}_{i=1}^{\lceil p/2\rceil}$ be angles for which ${\sf R}(\theta_{1},\dots,\theta_{\lceil p/2\rceil})$ is a normal form of $RI_{\boldsymbol{\sigma}}$, let $\{\tilde{\theta}_{i}\}_{i=1}^{p}$ be as defined in (4.11), let the elements of $J$ be $i_{1}<i_{2}<\dots<i_{m_{\boldsymbol{\sigma}}}$, and let $\phi_{J,j}=\phi_{J,j}(E_{-1}(R))$, $1\leq j\leq m$. Then:

(i) Up to ordering,

[TABLE]

(ii) If $m_{\boldsymbol{\sigma}}=m_{R}$ then

[TABLE]

Proof: (i). Let $\widetilde{W}$ be the $p\times m_{R}$ matrix formed by the columns of the basis (5.23) of $E_{-1}(R)$, with the elements of the first set in (5.23) comprising the first $m_{\boldsymbol{\sigma}}-l_{-}$ columns, and the elements of the second set comprising the last $l_{+}$ columns. (Here $l_{\pm}$ are defined as in Lemma 5.2(vii).) Without loss of generality we order the $R_{4}$-eigenvectors $w_{i}$ such that the first $m_{\boldsymbol{\sigma}}-l_{-}$ are the ones for which $\lambda_{i}\neq 1$.

Since the columns of $\widetilde{W}$ form an orthonormal basis of $E_{-1}(R)$, the numbers $\{\cos\phi_{J,i}\}_{i=1}^{m}$ are the singular values of the $m_{R}\times m_{\boldsymbol{\sigma}}$ matrix $\widetilde{W}^{T}{\sf E}_{J}$. (This is true whether $m_{R}\leq m_{\boldsymbol{\sigma}}$ or $m_{R}>m_{\boldsymbol{\sigma}}$.) But, relative to the block-decomposition of matrices used in Lemma 5.2, the upper $(p-m_{\boldsymbol{\sigma}})\times m_{\boldsymbol{\sigma}}$ block of ${\sf E}_{J}$ is $0$, and the lower $m_{\boldsymbol{\sigma}}\times m_{\boldsymbol{\sigma}}$ block is $I_{m_{\boldsymbol{\sigma}}\times m_{\boldsymbol{\sigma}}}$. Hence, writing $\widetilde{W}_{*}$ for the $m_{\boldsymbol{\sigma}}\times(m_{\boldsymbol{\sigma}}-l_{-})$ matrix formed by the last $m_{\boldsymbol{\sigma}}$ rows of the first $m_{\boldsymbol{\sigma}}-l_{-}$ columns of $\widetilde{W}$, and noting that $m_{\boldsymbol{\sigma}}-l_{-}=m_{R}-l_{+}$ (by (5.5)), we have $\widetilde{W}^{T}{\sf E}_{J}=\left[\begin{array}{c}\widetilde{W}_{*}^{T}\\ 0_{l_{+}\times m_{\boldsymbol{\sigma}}}\end{array}\right]$, where the $i^{\rm th}$ row of the $(m_{\boldsymbol{\sigma}}-l_{-})\times m_{\boldsymbol{\sigma}}$ matrix $\widetilde{W}_{*}^{T}$ is a multiple of $w_{i}^{T}$. Hence for $i,j\leq m_{\boldsymbol{\sigma}}-l_{-}=m_{R}-l_{+}$,

[TABLE]

and all other entries of the $m_{R}\times m_{R}$ matrix $\widetilde{W}^{T}{\sf E}_{J}(\widetilde{W}^{T}{\sf E}_{J})^{T}$ are 0. But for $m_{\boldsymbol{\sigma}}-l_{-}<i\leq m_{\boldsymbol{\sigma}}$, we have $\lambda_{i}=1$, so $\left((\widetilde{W}^{T}{\sf E}_{J})(\widetilde{W}^{T}{\sf E}_{J})^{T}\right)_{ij}=\frac{1-\lambda_{j}}{2}\delta_{ij}$ for all $i,j\leq m=\min\{m_{R},m_{\boldsymbol{\sigma}}\}$. Thus the upper left-hand $m\times m$ block of $(\widetilde{W}^{T}{\sf E}_{J})(\widetilde{W}^{T}{\sf E}_{J})^{T}$ (the entire $m_{R}\times m_{R}$ matrix if $m_{R}\leq m_{\boldsymbol{\sigma}}$) is ${\rm diag}(\frac{1-\lambda_{1}}{2},\dots,\frac{1-\lambda_{m}}{2})$, so the numbers $\sqrt{\frac{1-\lambda_{j}}{2}}$, $1\leq j\leq m$, are the singular values of $\widetilde{W}^{T}{\sf E}_{J}$. Thus, up to ordering, the principal angles $\{\phi_{J,i}\}$ are given by

[TABLE]

The bijection $\beta:J\to\{1,\dots,m_{\boldsymbol{\sigma}}\}$ used in the proof of Corollary 5.3 is simply the inverse of the map $j\mapsto i_{j}$. Thus from (5.35), we have

[TABLE]

Combining (5.42) with (5.43),

[TABLE]

But $\tilde{\theta}_{i_{j}}\in[0,\pi]$ , so both $\phi_{J,j}$ and $\frac{\tilde{\theta}_{i_{j}}}{2}$ lie in $[0,\frac{\pi}{2}]$ . Hence (5.44) implies that $\phi_{J,j}=\tilde{\theta}_{i_{j}}/2$ , $1\leq j\leq m$ .

(ii) Assume $m_{\boldsymbol{\sigma}}=m_{R}$; then both equal $m$. Corollary 5.3 then implies that

[TABLE]

But from part (i) we have $\tilde{\theta}_{i_{j}}=2\phi_{J,j}$ for $1\leq j\leq m$, so, using (5.34), $d_{SO}(RI_{\boldsymbol{\sigma}},I)^{2}=4d_{Gr}(E_{-1}(R),{\bf R}^{J})^{2}$, implying (5.40).
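The half-angle relation of Corollary 5.4 can be sanity-checked numerically. The following is a minimal sketch, not the authors' code: it assumes NumPy, picks hypothetical sizes $p=7$, $m=2$ (so that $2m\leq p$), builds the involution $R=I-2\widetilde{W}\widetilde{W}^{T}$ from a random $m$-plane, and takes $d_{SO}(\cdot,I)^{2}$ to be half the sum of the squared eigenvalue angles, an assumption chosen to agree with the value $m\pi^{2}/2$ used in the text for involutions of level $m$.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles between the column spans of A and B (orthonormal columns)."""
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    return np.sort(np.arccos(np.clip(s, -1.0, 1.0)))

def rotation_angles(Q):
    """Sorted absolute arguments of the eigenvalues of an orthogonal matrix Q."""
    return np.sort(np.abs(np.angle(np.linalg.eigvals(Q))))

rng = np.random.default_rng(0)
p, m = 7, 2            # hypothetical sizes with 2m <= p
J = [2, 5]             # 0-based indices of the coordinate plane R^J

# Orthonormal basis of a random m-plane W, and the basis E_J of R^J.
W, _ = np.linalg.qr(rng.standard_normal((p, m)))
E_J = np.eye(p)[:, J]

R = np.eye(p) - 2 * W @ W.T              # involution with E_{-1}(R) = W
I_sigma = np.eye(p) - 2 * E_J @ E_J.T    # sign-change matrix with E_{-1} = R^J

phi = principal_angles(W, E_J)
theta = rotation_angles(R @ I_sigma)

# Each principal angle should appear doubled, twice; the rest are zero.
expected = np.sort(np.concatenate([np.zeros(p - 2 * m), 2 * phi, 2 * phi]))
half_angle_ok = np.allclose(theta, expected, atol=1e-6)

d_gr_sq = np.sum(phi**2)             # squared Grassmannian distance
d_so_sq = 0.5 * np.sum(theta**2)     # assumed normalization of d_SO^2
print(half_angle_ok, d_so_sq, 4 * d_gr_sq)
```

For generic random input the two printed distances should agree up to rounding, which is the content of conclusion (ii).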

We are now ready to establish the general half-angle relation:

Proof of Proposition 4.10. Let $U\in O(p)$ and let $T_{U}:{\bf R}^{p}\to{\bf R}^{p}$ be the corresponding orthogonal transformation. For any even $m^{\prime}>0$ and any $R\in{\rm Inv}_{m^{\prime}}(p)$ , we have

[TABLE]

Now let $T:{\bf R}^{p}\to{\bf R}^{p}$ be an orthogonal transformation carrying $E_{-1}(R_{2})$ to a coordinate plane ${\bf R}^{J}$, and let $U\in O(p)$ be the matrix for which $T=T_{U}$. Then $UR_{2}U^{-1}=I_{\boldsymbol{\sigma}}$, where $\boldsymbol{\sigma}=\boldsymbol{\sigma}^{J}$. For $i=1,2$ let $R^{\prime}_{i}=UR_{i}U^{-1}$. Since $T$ is an orthogonal transformation, the (multi-)set of principal angles between $E_{-1}(R_{1}^{\prime})=T(E_{-1}(R_{1}))$ and $E_{-1}(R_{2}^{\prime})=T(E_{-1}(R_{2}))$ is identical to the (multi-)set of principal angles between $E_{-1}(R_{1})$ and $E_{-1}(R_{2})$. But $R_{1}^{\prime}I_{\boldsymbol{\sigma}}=R_{1}^{\prime}R_{2}^{\prime}=UR_{1}R_{2}U^{-1}$, so ${\sf R}(\theta_{1},\dots,\theta_{\lceil p/2\rceil})$ is a normal form of $R_{1}^{\prime}I_{\boldsymbol{\sigma}}$ as well as of $R_{1}R_{2}$. The result now follows from Corollary 5.4(i) and equation (5.36) (the latter being needed only for the final statement of the result).

5.2 The proofs of Propositions 4.9 and 4.11

Proof of Proposition 4.9. Since $d_{SO}(RI_{\boldsymbol{\sigma}},I)=d_{SO}(R,I_{\boldsymbol{\sigma}}^{-1})=d_{SO}(R,I_{\boldsymbol{\sigma}})$, conclusion (ii) of Corollary 5.4 can be written equivalently as:

[TABLE]

Fix any $J\in{\cal J}_{m,p}$. Letting “$\cdot$” denote the natural left-action of $SO(p)$ on ${\rm Gr}_{m}({\bf R}^{p})$, observe that (in the notation of the proof of Proposition 4.10) for all $U\in SO(p)$ and $W\in{\rm Gr}_{m}({\bf R}^{p})$ we have $\Phi(U\cdot W)=U\Phi(W)U^{-1}$ (simply another way of writing (5.46)). Clearly $d_{Gr}$ is invariant under this action, and $d_{SO}$ is both left- and right-invariant, so (5.47) implies that

[TABLE]

Now let $W,V\in{\rm Gr}_{m}({\bf R}^{p})$. Since the action of $SO(p)$ on ${\rm Gr}_{m}({\bf R}^{p})$ is transitive, there exists $U\in SO(p)$ such that $U\cdot{\bf R}^{J}=V$. Using any such $U$, we then have

[TABLE]


Remark 5.5

Of course, Proposition 4.9 can be deduced from computations with the principal fibration

[TABLE]

the standard Riemannian metric on ${\rm Gr}_{m}({\bf R}^{p})$ (for which $d_{Gr}$ is the geodesic-distance function) is defined so as to make $\pi$ a Riemannian submersion up to a normalization constant. Our proof of Proposition 4.9 is independent of this Riemannian proof in the sense that it establishes equality between the left-hand side of (5.47) and the right-hand side as defined by equation (5.1). Without the a priori knowledge that $d_{Gr}$ is a geodesic-distance function, it is not obvious that $d_{Gr}$ satisfies the triangle inequality, hence whether $d_{Gr}$ is a metric. Thus Proposition 4.9 actually provides an independent proof that $d_{Gr}$ is a metric on ${\rm Gr}_{m}({\bf R}^{p})$ . The only use of Riemannian geometry in this proof is through the knowledge that $d_{SO}$ is, in fact, a metric (because it is a geodesic-distance function).

Proof of Proposition 4.11. Let “Statement 1” and “Statement 2” be the statements listed as 1 and 2 in the Proposition. As noted in the proof of Proposition 4.9, $d_{SO}(RI_{\boldsymbol{\sigma}},I)=d_{SO}(R,I_{\boldsymbol{\sigma}})$, so the inequality $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$ can be rewritten as

[TABLE]

Assume first that Statement 1 is true. Let $W\in{\rm Gr}_{m}({\bf R}^{p})$. Then $\Phi_{m,p}(W)$ is an involution of level $m$, so there exists $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}$ of level $m$ such that $d_{SO}(\Phi_{m,p}(W),I_{\boldsymbol{\sigma}})^{2}<\frac{m\pi^{2}}{2}$. Select such a $\boldsymbol{\sigma}$ and let $J=J^{\boldsymbol{\sigma}}$. Then $I_{\boldsymbol{\sigma}}=\Phi_{m,p}({\bf R}^{J})$, so

[TABLE]

Hence Statement 2 is true.

Conversely, assume that Statement 2 is true. Let $R\in{\rm Inv}_{m}(p)$. Then there exists $J\in{\cal J}_{m,p}$ such that $d_{Gr}(\Phi_{m,p}^{-1}(R),{\bf R}^{J})^{2}<\frac{m\pi^{2}}{8}$. Select such a $J$ and let $\boldsymbol{\sigma}=\boldsymbol{\sigma}^{J}$. Then $I_{\boldsymbol{\sigma}}=\Phi_{m,p}({\bf R}^{J})$, so

[TABLE]

Hence Statement 1 is true.

6 Proofs of sign-change reducibility results, part I: Propositions 4.6 and 4.8

We are now ready to attack the question of sign-change reducibility: given $R\in{\rm Inv}(p)$, can we find $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}$ such that $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$? Equations (4.9) and (5.33) tell us that this inequality is satisfied if and only if

[TABLE]

where $l_{\pm}=l_{\pm}(R,\boldsymbol{\sigma})$ are as in Lemma 5.2(vii). Since $\pi$ is the largest possible value for a normal-form angle in (4.2), it is reasonable to try to look for a $\boldsymbol{\sigma}$ such that $l_{+}$ and $l_{-}$ are as small as possible. However, to achieve (6.1), we have to make sure that we do not make $\sum_{j\in J_{*}}\tilde{\theta}_{j}^{2}$ too large while we are making $l_{\pm}$ small. We next prove a lemma that, via its subsequent corollary, will help us show that for ${\rm level}(R)=m\geq\frac{p}{2}$, we can choose $J\in{\cal J}_{m,p}$ to make $d_{SO}(RI_{\boldsymbol{\sigma}^{J}},I)$ as small as is needed to prove Proposition 4.6.

Lemma 6.1

For $1\leq m\leq p$ ,

[TABLE]

Proof: For $J=(i_{1},\dots,i_{m})\in{\cal J}_{m,p}$, we have

[TABLE]

Hence when $m=1$ and when $m=p$ , the left-hand side of (6.2) reduces to $I_{p\times p}$ , which is also true of the right-hand side.

We proceed by induction on $p$ . For each $p\geq 1$ , consider the statement

[TABLE]

We have already established that (6.2) holds for $m=1=p$, hence that statement $S(1)$ is true. Now suppose that $S(p)$ is true for some given $p$. To consider $S(p+1)$, let $\{{\bf e}_{i}\}_{i=1}^{p}$, $\{{\bf e}_{i}^{\prime}\}_{i=1}^{p+1}$ denote the standard bases of ${\bf R}^{p}$, ${\bf R}^{p+1}$ respectively. For $K=\{i_{1},\dots,i_{m}\}\in{\cal J}_{m,p+1}$ with $i_{1}<i_{2}<\dots<i_{m}$ we write $E^{\prime}_{K}$ for the $(p+1)\times m$ matrix whose $j^{\rm th}$ column is ${\bf e}^{\prime}_{i_{j}}$, $1\leq j\leq m$. Note that

[TABLE]

Hence for $1\leq m\leq p$ ,

[TABLE]

Hence (6.2) holds with $p$ replaced by $p+1$, as long as $1\leq m\leq p$. But we have already established that (6.2) holds whenever $m=p$; hence if $p$ is replaced by $p+1$, the equality holds for $m=p+1$. Thus (6.2) holds for all $m$ with $1\leq m\leq p+1$; i.e. statement $S(p+1)$ is true. By induction, $S(p)$ is true for all $p$, which is exactly what the Lemma asserts.
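Lemma 6.1 can also be confirmed by brute force for small $p$. The sketch below is an illustration only; it assumes the identity reads $\sum_{J\in{\cal J}_{m,p}}{\sf E}_{J}{\sf E}_{J}^{T}={p-1\choose m-1}I_{p\times p}$ (the form consistent with the boundary cases $m=1$, $m=p$ and with the averaging step in Corollary 6.2), and the helper name `coordinate_projector_sum` is hypothetical.

```python
from itertools import combinations
from math import comb

import numpy as np

def coordinate_projector_sum(p, m):
    """Sum of E_J E_J^T over all m-element subsets J of {0, ..., p-1}."""
    identity = np.eye(p)
    total = np.zeros((p, p))
    for J in combinations(range(p), m):
        E_J = identity[:, list(J)]   # p x m matrix of standard basis vectors
        total += E_J @ E_J.T         # orthogonal projector onto R^J
    return total

# Compare with binom(p-1, m-1) * I for every 1 <= m <= p <= 6.
lemma_holds = all(
    np.array_equal(coordinate_projector_sum(p, m), comb(p - 1, m - 1) * np.eye(p))
    for p in range(1, 7)
    for m in range(1, p + 1)
)
print(lemma_holds)
```

The count behind the identity is visible in the code: a fixed index $i$ lies in exactly ${p-1\choose m-1}$ of the subsets $J$, so each diagonal entry is hit that many times.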

Corollary 6.2

Let $m\in\{1,2,\dots,p\}$ and let $W\in{\rm Gr}_{m}({\bf R}^{p})$ . There exists $J\in{\cal J}_{m,p}$ such that

[TABLE]

Furthermore, the inequality in (6.13) is strict for some $J\in{\cal J}_{m,p}$ unless equality holds in (6.13) for all $J\in{\cal J}_{m,p}$ .

Proof: Let $\widetilde{W}$ be any $p\times m$ matrix whose columns are an orthonormal basis of $W$. Using Lemma 6.1,

[TABLE]

since $\widetilde{W}^{T}\widetilde{W}=I_{m\times m}$ .

Since $|{\cal J}_{m,p}|={p\choose m}$, the average of ${\rm tr}(\widetilde{W}^{T}{\sf E}_{J}{\sf E}_{J}^{T}\widetilde{W})$ over all $J\in{\cal J}_{m,p}$ is $m{p-1\choose m-1}/{p\choose m}=m^{2}/p$. Hence ${\rm tr}(\widetilde{W}^{T}{\sf E}_{J}{\sf E}_{J}^{T}\widetilde{W})\geq m^{2}/p$ for at least one $J\in{\cal J}_{m,p}$, and the inequality is strict for some $J$ unless it is an equality for all $J$. But for any $Z\in{\rm Gr}_{m}({\bf R}^{p})$, the principal angles $\phi_{1},\dots,\phi_{m}$ between $W$ and $Z$ are the numbers in $[0,\frac{\pi}{2}]$ for which $\cos\phi_{1},\dots,\cos\phi_{m}$ are the singular values of the $m\times m$ matrix $\widetilde{W}^{T}\widetilde{Z}$, where $\widetilde{Z}$ is any $p\times m$ matrix whose columns are an orthonormal basis of $Z$. Since for any $J\in{\cal J}_{m,p}$ the columns of ${\sf E}_{J}$ are an orthonormal basis of ${\bf R}^{J}$, it follows that $\sum_{i=1}^{m}\cos^{2}\phi_{J,i}={\rm tr}(\widetilde{W}^{T}{\sf E}_{J}(\widetilde{W}^{T}{\sf E}_{J})^{T})={\rm tr}(\widetilde{W}^{T}{\sf E}_{J}{\sf E}_{J}^{T}\widetilde{W})$. Thus, for some $J$, $\sum_{i=1}^{m}\cos^{2}\phi_{J,i}\geq\frac{m^{2}}{p}$, and the inequality is strict for some $J$ unless it is an equality for all $J$. But for any given $J$,

[TABLE]

and the first inequality in (6.14) is strict if and only if the second is strict. Thus (6.13) holds for some $J$, and the inequality in (6.13) is strict for some $J$ unless it is an equality for all $J$.
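The averaging argument in this proof is easy to reproduce numerically. The following sketch assumes NumPy and hypothetical sizes $p=6$, $m=3$; it uses the fact from the proof that $\sum_{i}\cos^{2}\phi_{J,i}={\rm tr}(\widetilde{W}^{T}{\sf E}_{J}{\sf E}_{J}^{T}\widetilde{W})$, which is just the sum of the squared entries of the rows of $\widetilde{W}$ indexed by $J$.

```python
from itertools import combinations

import numpy as np

def sum_cos_sq(W, J):
    """Sum of cos^2 of the principal angles between span(W) and R^J."""
    rows = W[list(J), :]        # equals E_J^T W
    return np.sum(rows**2)      # squared Frobenius norm = tr(W^T E_J E_J^T W)

rng = np.random.default_rng(1)
p, m = 6, 3                     # hypothetical sizes

# Orthonormal basis of a random m-plane in R^p.
W, _ = np.linalg.qr(rng.standard_normal((p, m)))

values = [sum_cos_sq(W, J) for J in combinations(range(p), m)]
average = np.mean(values)       # should equal m^2 / p
best = max(values)              # hence at least m^2 / p for some J
print(average, best, m * m / p)
```

Since the average over all $J$ is exactly $m^{2}/p$, the maximum cannot fall below it, which is the existence claim of Corollary 6.2 in its cosine form.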

Proof of Proposition 4.6.

If $m=p$ then $p$ is even, $R=-I$, and for $\boldsymbol{\sigma}=(-1,-1,\dots,-1)$ we have $I_{\boldsymbol{\sigma}}=-I$ and $d_{SO}(RI_{\boldsymbol{\sigma}},I)=0<d_{SO}(R,I)$. Henceforth we assume $m<p$.

Let $W=E_{-1}(R)$ and let $m=\dim(W)$ . Note that

[TABLE]

Let $J\in{\cal J}_{m,p}$ be such that $\sum_{i=1}^{m}\sin^{2}\phi_{J,i}=\min_{K\in{\cal J}_{m,p}}\{\sum_{i=1}^{m}\sin^{2}\phi_{K,i}\}$ . By Corollary 6.2, inequality (6.13) holds, and the inequality is strict unless

[TABLE]

for all $K\in{\cal J}_{m,p}$. Let $\boldsymbol{\sigma}=\boldsymbol{\sigma}^{J}$. By Corollary 5.4,

[TABLE]

where $\phi_{J,i}=\phi_{J,i}(W)$ .

The function $f:x\mapsto\frac{\sin x}{x}$ is strictly decreasing on the interval $(0,\frac{\pi}{2}]$ . Hence for all $x\in(0,\frac{\pi}{2}]$ we have $\frac{\sin x}{x}\geq f(\frac{\pi}{2})=\frac{2}{\pi}$ , with equality only if $x=\frac{\pi}{2}$ ; thus for $x\in[0,\frac{\pi}{2}]$ we have $x\leq\frac{\pi}{2}\sin x$ , with equality only if $x=0$ or $x=\frac{\pi}{2}$ . Hence

[TABLE]

Hence $d_{SO}(RI_{\boldsymbol{\sigma}},I)\leq d_{SO}(R,I)$, and this inequality is strict if any of the inequalities (6.18), (6.19), (6) is strict. Inequality (6.18) is strict if $0<\phi_{J,i}<\frac{\pi}{2}$ for some $i$, and, by our choice of $J$, (6.19) is strict unless equality holds in (6.16) for all $K\in{\cal J}_{m,p}$.

We claim that at least one of the inequalities (6.18), (6.19) is strict. Assume this is not so. Then, since equality holds in (6.18) with $J$ replaced by any $K\in{\cal J}_{m,p},$ it follows that for all $K\in{\cal J}_{m,p}$ and $i\in\{1,\dots,m\}$ the angle $\phi_{K,i}$ is either 0 or $\pi/2$ , and that $\sum_{i=1}^{m}\sin^{2}\phi_{K,i}=m(1-\frac{m}{p})$ for all $K$ . But for any $V\in{\rm Gr}_{m}({\bf R}^{p})$ , there always exists $K\in{\cal J}_{m,p}$ for which none of the principal angles $\phi_{K,i}(V,{\bf R}^{K})$ is $\pi/2$ . Choosing such $K$ for our $m$ -plane $W,$ all of the principal angles $\phi_{K,i}$ must therefore be 0 (since they are all either 0 or $\pi/2$ ). But then $\sum_{i=1}^{m}\sin^{2}\phi_{K,i}=0<m(1-\frac{m}{p})$ , a contradiction.

Thus at least one of the inequalities (6.18), (6.19) is strict, so $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$.

We will establish Proposition 4.8 (a weak version of Conjecture 4.7) as a consequence of a different weakened version of Conjecture 4.7:

Proposition 6.3

Let $m\geq 2$ be even, let $R\in SO(p)$ be an involution of level $m$, and let $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}$. If $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$, then ${\rm level}(\boldsymbol{\sigma})<2m$. (Hence if $R$ is sign-change reducible, then it is reducible by a sign-change of level less than $2m$.)

This proposition, which we will prove shortly, reduces Proposition 4.8 to a triviality:

Proof of Proposition 4.8, assuming Proposition 6.3: The only positive even integer less than $2\times 2$ is 2.

Proof of Proposition 6.3. Let $m_{\boldsymbol{\sigma}}={\rm level}(\boldsymbol{\sigma})$. Define $l_{\pm}$ as in Lemma 5.2. From (5.33),

[TABLE]

so

[TABLE]

But by (5.5) we have $l_{-}=l_{+}+m_{\boldsymbol{\sigma}}-m_{R}$, so substituting into (6.21), we have $2l_{+}+m_{\boldsymbol{\sigma}}-m_{R}<m_{R}$; equivalently,

[TABLE]

Since $l_{+}\geq 0$, we must have $m_{\boldsymbol{\sigma}}<2m_{R}$.

7 Proofs of sign-change reducibility results, part II: Proposition 3.18

As noted in Section 4.2, Proposition 4.6 proves part (a) of Proposition 3.18. Thus it remains only to prove part (b) of this Proposition.

The combination of Proposition 4.8 and Proposition 4.11 is what will guide our proof of part (b). To establish the result, it suffices to prove that for $p\geq 11$, the answer to Question 4.5 is no, i.e. that there exist involutions in $SO(p)$ that are not sign-change reducible. Hence it suffices to prove that there exist such involutions of level 2. By Proposition 4.8, it therefore suffices to establish (for $p\geq 11$) the existence of involutions that are not sign-change reducible by a sign-change of level 2; thus it suffices to show that Statement 2 of Proposition 4.11 is false when $p\geq 11$ and $m=2$. For this, we need only produce planes in ${\bf R}^{p}$ for which we can show that (4.15) is false for all $J\in{\cal J}_{2,p}$. Towards this end, we examine two (families of) examples in which $m=2$ and $p\geq 4$.

Example 7.1

Let $p=2k$ or $2k+1$ , where $k\geq 2$ . Define vectors $\hat{v},\hat{w}\in{\bf R}^{p}$ by

[TABLE]

The set $\{\hat{v},\hat{w}\}$ is orthonormal. Let $W_{p}={\rm span}\{\hat{v},\hat{w}\}$ , a 2-plane in ${\bf R}^{p}$ . We will compute the principal angles between $W_{p}$ and ${\bf R}^{J}$ for all $J\in{\cal J}_{2,p}$ . Write $J=\{i,j\}$ , where $1\leq i<j\leq p$ . Let $\widetilde{W}$ be the $p\times 2$ matrix whose first column is $\hat{v}$ and whose second column is $\hat{w}$ . Since the columns of $\widetilde{W}$ are an orthonormal basis of $W_{p}$ , the principal angles between $W_{p}$ and ${\bf R}^{J}$ are the arc-cosines of the singular values of $\widetilde{W}^{T}{\sf E}_{J}$ .

First suppose that $p$ is even. We divide the elements $\{i,j\}\in{\cal J}_{2,p}$ into two cases: Case I= $\{\{i,j\}:i<j\leq k\ \mbox{ or}\ k<i<j\}$ ; Case II= $\{\{i,j\}:i\leq k<j\}$ . The principal values of the $2\times 2$ matrix $\widetilde{W}^{T}{\sf E}_{J}$ are easily computed to be [math] and $\frac{4}{p}$ in Case I, and $\frac{2}{p}$ (with multiplicity 2) in Case II. Hence the principal angles are

[TABLE]

so

[TABLE]

We will return to (7.1) shortly, but first let us do the analogous computation for $p$ odd. For $p=2k+1$ , we divide the computation into three cases: Case I= $\{\{i,j\}:i<j\leq k\ \mbox{ or}\ k<i<j\leq 2k\}$ ; Case II= $\{\{i,j\}:i\leq k<j\leq 2k\}$ ; and Case III= $\{\{i,j\}:i\leq 2k,j=2k+1\}$ . The principal values of the matrix $\widetilde{W}^{T}{\sf E}_{J}$ are [math] and $\frac{2}{p}+\frac{2}{p-1}$ in Case I, $\frac{2}{p}$ and $\frac{2}{p-1}$ in Case II, and $\frac{2}{p-1}$ in Case III. Hence the principal angles are

[TABLE]

Clearly $\phi_{J,1}^{2}+\phi_{J,2}^{2}$ is larger in Case III than in Case II, so

[TABLE]

It follows from (7.1) and (7.2) that

[TABLE]

since $m=2$ in Example 7.1. Hence for large enough $p$ , Statement 2 in Proposition 4.11 is false, and therefore so is Statement 1. This already shows that for all $p$ sufficiently large, there exist geodesically antipodal pairs $(U,V)$ in $SO(p)\times SO(p)$ that are not sign-change reducible. However, to get the quantitative statement in Proposition 3.18(b), we have to continue working.

It can be shown (footnote 4: the authors did not find this exercise in Calculus 1 entirely trivial, but are nonetheless leaving it to the reader) that for $0<x\leq 1$,

[TABLE]

hence that in (7.1) and in (7.2), the second of the two expressions being compared is the smaller. Thus

[TABLE]

Since $m=2$ in Example 7.1, $\sqrt{m\pi^{2}/8}=\frac{\pi}{2}$ , so equation (7.5) shows that (4.15) (with $W=W_{p}$ ) is false for all $J\in{\cal J}_{2,p}$ if $\sqrt{2}\cos^{-1}(c_{p})\geq\frac{\pi}{2}$ ; equivalently, if $c_{p}\leq\cos\frac{\pi}{2\sqrt{2}}\approx 0.4440.$ This translates to $2\lfloor\frac{p}{2}\rfloor\geq 2\sec^{2}\frac{\pi}{2\sqrt{2}}\approx 10.14$ . Hence the answer to Question 4.5 is definitely “no” for all $p\geq 12$ . To complete the proof of Proposition 3.18(b), it remains only to show that this “12” can be reduced to “11”. We will accomplish this with the next example.
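The numerical thresholds quoted above can be reproduced in a few lines. This is only an arithmetic check of the stated constants using the Python standard library; the variable and function names are illustrative.

```python
import math

# cos(pi / (2*sqrt(2))), the upper bound on c_p quoted in the text.
threshold_cos = math.cos(math.pi / (2 * math.sqrt(2)))

# 2 * sec^2(pi / (2*sqrt(2))), the lower bound on 2*floor(p/2).
threshold = 2 / threshold_cos**2

def even_part(p):
    """The quantity 2*floor(p/2) appearing in the text."""
    return 2 * (p // 2)

# Smallest p (searching from p = 4) whose even part reaches the threshold.
smallest_p = next(p for p in range(4, 100) if even_part(p) >= threshold)
print(round(threshold_cos, 4), round(threshold, 2), smallest_p)
```

Because $2\lfloor 11/2\rfloor=10$ falls just short of the bound, Example 7.1 alone settles only $p\geq 12$, which is why a second example is needed for $p=11$.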

Example 7.2

Let $p=2k+1$ , where $k\geq 2.$ Define vectors $v,w,\hat{v},\hat{w}\in{\bf R}^{p}$ by

[TABLE]

As in the previous example, $\{\hat{v},\hat{w}\}$ is an orthonormal basis of a plane $W_{p}^{\prime}$. Just as in Example 7.1, we can compute the principal angles between $W_{p}^{\prime}$ and ${\bf R}^{J}$ for all $J\in{\cal J}_{2,p}$. We define Cases I, II, and III just as in the odd-$p$ case of the previous example. The principal values of the relevant $2\times 2$ matrices are [math] and $\frac{2}{p}+\frac{2}{p-1}$ in Case I, $\frac{2}{p}$ and $\frac{2}{p-1}$ in Case II, and

[TABLE]

in Case III. Hence

[TABLE]

Numerically, we find that for $p=11$, the middle line of the preceding display is the smallest of the three lines, so

[TABLE]

Since this number is larger than $\frac{\pi^{2}}{4}$, the answer to Question 4.5 is no for $p=11$. This completes the proof of Proposition 3.18.

Remarks 7.3

(1) We considered Example 7.2 only for odd $p$ because for even $p$ , the principal angles $\phi_{J,i}(W_{p}^{\prime})$ turn out to be the same as for $\phi_{J,i}(W_{p})$ in Example 7.1. In Example 7.2, we can also compute numerically that for $p=5,7$ , and $9$ , we have $\min_{J\in{\cal J}_{2,p}}\{d_{Gr}(W_{p}^{\prime},{\bf R}^{J})^{2}\}<\frac{\pi^{2}}{4}$ . However, we cannot conclude that the answer to Question 4.5 is “yes” for $p\leq 10$ , since we have not proven that this example represents the worst case, i.e. that $\min_{J\in{\cal J}_{2,p}}\{d_{Gr}(W_{p}^{\prime},{\bf R}^{J})\}\geq\min_{J\in{\cal J}_{2,p}}\{d_{Gr}(W,{\bf R}^{J})\}$ for all $W\in{\rm Gr}_{m}({\bf R}^{p})$ . Thus Question 4.5 remains open for $5\leq p\leq 10$ . However, based on computations, it seems likely to the authors that the largest $p$ for which the answer to Question 4.5 is yes is closer to 10 than to 4.

(2) The number $\frac{\pi^{2}}{2}$ in (7) is exactly the squared diameter of ${\rm Gr}_{2}({\bf R}^{p})$ for all $p\geq 4$. Thus, (7) shows that as $p\to\infty$, the distance between $W_{p}$ and the closest coordinate plane(s) ${\bf R}^{J}$ is approaching the largest possible distance between two points in ${\rm Gr}_{2}({\bf R}^{p})$.

Appendix A Partitions and Fibers

A.1 Partitions and eigenstructure

The strata of each of the stratified spaces in this paper are labeled naturally either by ${\rm Part}(\{1,\dots,p\})$ or by ${\rm Part}(p).$

The natural left-action of the symmetric group $S_{p}$ on $\{1,2,\dots,p\}$ induces left-actions of $S_{p}$ on ${\rm Part}(\{1,\dots,p\})$ and ${\bf R}^{p}$ . There is a canonical bijection between the quotient ${\rm Part}(\{1,\dots,p\})/S_{p}$ and the set ${\rm Part}(p)$ , so we implicitly regard these as the same set. For ${\sf J}\in{\rm Part}(\{1,\dots,p\})$ , we write $[{\sf J}]$ for the image of ${\sf J}$ in ${\rm Part}(p)$ under the quotient map.

The sets ${\rm Part}(\{1,\dots,p\})$ and ${\rm Part}(p)$ are partially ordered by the refinement relation. For ${\sf J},{\sf K}\in{\rm Part}(\{1,\dots,p\})$ , we write ${\sf J}\leq{\sf K}$ if ${\sf K}$ refines ${\sf J}$ . Similarly, for $[{\sf J}],[{\sf K}]\in{\rm Part}(p)$ we write $[{\sf J}]\leq[{\sf K}]$ if $[{\sf K}]$ refines $[{\sf J}]$ . In each of these partially ordered sets there is a well-defined “highest” (most refined) and “lowest” (least refined) element; we denote these with the subscripts “top” and “bot” respectively.

Notation A.1

For $D={\rm diag}(d_{1},\dots,d_{p})\in{\rm Diag}(p)$ , let ${\sf J}_{D}$ denote the partition of $\{1,2,\dots,p\}$ determined by the equivalence relation $i\sim_{D}j\iff d_{i}=d_{j}$ .
For $\emptyset\neq J\subset\{1,2,\dots,p\}$, let ${\bf R}^{J}\subset{\bf R}^{p}$ denote the subspace $\{(x_{1},\dots,x_{p})\in{\bf R}^{p}\mid x_{j}=0\ \forall j\notin J\}$. For a partition ${\sf J}=\{J_{1},\dots,J_{r}\}$ of $\{1,2,\dots,p\}$ (where the $J_{i}$ are the blocks of ${\sf J}$), let $\{W_{1},\dots,W_{r}\}=\{W_{1}^{\sf J},\dots,W_{r}^{\sf J}\}=\{{\bf R}^{J_{1}},\dots,{\bf R}^{J_{r}}\}$ denote the corresponding subspaces of ${\bf R}^{p}$; note that we have an orthogonal decomposition ${\bf R}^{p}={\bf R}^{J_{1}}\oplus\dots\oplus{\bf R}^{J_{r}}$. Define the subgroup $G_{\sf J}\subset SO(p)$ by

[TABLE]

We write $G_{\sf J}^{0}$ for the identity component of $G_{\sf J}$.

As the reader may check, the above definition of $G_{\sf J}$ agrees with the definition in Section 2: for all $D\in{\rm Diag}(p)$ we have $G_{D}=G_{{\sf J}_{D}}$ .
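The identity $G_{D}=G_{{\sf J}_{D}}$ can be illustrated numerically. The sketch below is only an illustration under two assumptions that are not restated in this appendix: that $G_{D}$ (Section 2) consists of the $R\in SO(p)$ commuting with $D$, and that $G_{\sf J}$ consists of the $R\in SO(p)$ preserving each $W_{i}^{\sf J}$. The function names are hypothetical, indices are 0-based, and membership in $SO(p)$ is not checked.

```python
def partition_of_diagonal(d):
    """Blocks of J_D: indices i, j share a block iff d[i] == d[j] (0-based)."""
    blocks = {}
    for i, value in enumerate(d):
        blocks.setdefault(value, []).append(i)
    return [tuple(b) for b in blocks.values()]


def preserves_blocks(r, blocks):
    """True iff R maps each coordinate subspace R^{J_i} into itself.

    Column b of R is R e_b, so the condition is that r[a][b] vanishes
    whenever a and b lie in different blocks.
    """
    block_of = {i: n for n, block in enumerate(blocks) for i in block}
    p = len(r)
    return all(
        r[a][b] == 0
        for a in range(p)
        for b in range(p)
        if block_of[a] != block_of[b]
    )


def commutes_with_diagonal(r, d):
    """True iff R D == D R for D = diag(d), compared entry by entry."""
    p = len(r)
    return all(r[a][b] * d[b] == d[a] * r[a][b] for a in range(p) for b in range(p))


# D = diag(2, 2, 5): one block of size 2 and one of size 1.
d = (2, 2, 5)
blocks = partition_of_diagonal(d)

# Rotation within the repeated-eigenvalue block {0, 1}.
rot_in_block = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
# Rotation mixing coordinates 1 and 2, which belong to different blocks.
rot_across = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]
```

For these two rotations the block-preservation test and the commutation test agree, as the identity predicts; this is a spot check on examples, not a proof.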

For any subgroup $H\subset O(p)$, we write $S(H)$ for $H\cap SO(p)$. Note that

[TABLE]

where $O(W_{i})$ denotes the orthogonal group of the subspace $W_{i}$ , which we identify with a subgroup of $O(p)$ . Hence, writing $k_{i}=|J_{i}|$ , we have

[TABLE]

A.2 Signed permutations and signed-permutation matrices

Let ${\cal I}_{p}=({\bf Z}_{2})^{p}$. The role of ${\bf Z}_{2}$ will be as the group of signs, so we write its elements as $\pm 1$. We write the identity element of ${\cal I}_{p}$ as ${\bf 1}$. For $\epsilon\in{\bf Z}_{2}$ and $\boldsymbol{\sigma}=(\sigma_{1},\sigma_{2},\dots,\sigma_{p})\in{\cal I}_{p}$ we define $\epsilon\boldsymbol{\sigma}=(\epsilon\sigma_{1},\epsilon\sigma_{2},\dots,\epsilon\sigma_{p})$.

Both ${\cal I}_{p}$ and $S_{p}$ have natural representations on ${\bf R}^{p}$ via sign-changes and permutations of coordinates, respectively. These representations, which we denote respectively as $\boldsymbol{\sigma}\mapsto I_{\boldsymbol{\sigma}}$ and $\pi\mapsto P_{\pi}$, embed ${\cal I}_{p}$ and $S_{p}$ as subgroups of $O(p)$, together generating the group of “signed-permutation matrices”. Abstractly, this group is a semidirect product ${\tilde{S}}_{p}={\cal I}_{p}\rtimes S_{p}$, a split extension of $S_{p}$ by ${\cal I}_{p}$, embedded naturally in $O(p)$ via $(\boldsymbol{\sigma},\pi)\mapsto I_{\boldsymbol{\sigma}}P_{\pi}$. Defining homomorphisms ${\rm sgn}:{\cal I}_{p}\to{\bf Z}_{2}$ and $\widetilde{{\rm sgn}}:{\tilde{S}}_{p}\to{\bf Z}_{2}$ by ${\rm sgn}(\sigma_{1},\dots,\sigma_{p})=\prod_{i=1}^{p}\sigma_{i}$ and $\widetilde{{\rm sgn}}(\boldsymbol{\sigma},\pi)={\rm sgn}(\boldsymbol{\sigma}){\rm sgn}(\pi)$ (where ${\rm sgn}(\pi)$ is the sign of the permutation $\pi$), we have $\widetilde{{\rm sgn}}(\boldsymbol{\sigma},\pi)=\det(I_{\boldsymbol{\sigma}}P_{\pi})$. Thus the group ${\tilde{S}}_{p}^{+}$ of even signed-permutations, defined in Section 2, is simply the kernel of $\widetilde{{\rm sgn}}$, and we have a short exact sequence

[TABLE]

Since ${\cal I}_{p}^{+}\cong({\bf Z}_{2})^{p-1}$ (non-canonically), ${\tilde{S}}_{p}^{+}$ is an extension of $S_{p}$ by $({\bf Z}_{2})^{p-1}$ , and $|{\tilde{S}}_{p}^{+}|=2^{p-1}p!$ .
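The determinant identity $\widetilde{{\rm sgn}}(\boldsymbol{\sigma},\pi)=\det(I_{\boldsymbol{\sigma}}P_{\pi})$ and the order $|{\tilde{S}}_{p}^{+}|=2^{p-1}p!$ can be checked by brute force for small $p$. The following is a minimal sketch, assuming the convention $P_{\pi}e_{j}=e_{\pi(j)}$ (0-based indices); the function names are illustrative and not taken from the paper.

```python
from itertools import permutations, product
from math import factorial


def perm_sign(pi):
    """Sign of a permutation tuple, computed from its inversion count."""
    n = len(pi)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if pi[i] > pi[j])
    return -1 if inversions % 2 else 1


def sign_product(sigma):
    """sgn(sigma): the product of the entries of a sign vector."""
    result = 1
    for s in sigma:
        result *= s
    return result


def signed_perm_matrix(sigma, pi):
    """Matrix I_sigma P_pi, assuming P_pi e_j = e_{pi(j)}."""
    p = len(pi)
    m = [[0] * p for _ in range(p)]
    for j in range(p):
        m[pi[j]][j] = sigma[pi[j]]
    return m


def det(m):
    """Determinant by the Leibniz expansion (adequate for small p only)."""
    p = len(m)
    total = 0
    for tau in permutations(range(p)):
        term = perm_sign(tau)
        for i in range(p):
            term *= m[i][tau[i]]
        total += term
    return total


def signed_permutations(p):
    """All pairs (sigma, pi) in the full signed-permutation group."""
    return [
        (sigma, pi)
        for sigma in product((1, -1), repeat=p)
        for pi in permutations(range(p))
    ]


def even_signed_permutations(p):
    """Pairs (sigma, pi) with sgn(sigma) * sgn(pi) = 1."""
    return [
        (sigma, pi)
        for sigma, pi in signed_permutations(p)
        if sign_product(sigma) * perm_sign(pi) == 1
    ]
```

The Leibniz determinant is used only to keep the sketch dependency-free; it is exponential in $p$ and intended for $p\leq 4$ or so.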

The group ${\tilde{S}}_{p}$ is a well-studied group encountered in other settings (rather different from this paper’s) as $W(B_{p})$, the Weyl group of the simple Lie algebra $B_{p}={\mathfrak{so}}(2p+1,{\bf C})$ [13]. Thus ${\tilde{S}}_{p}^{+}$ is an index-two subgroup of $W(B_{p})$. The application to eigenstructure motivates viewing ${\tilde{S}}_{p}^{+}$ as an extension of $S_{p}$: an element of ${\rm Sym}^{+}(p)$ determines an element of ${\rm Diag}^{+}(p)$ up to the action of $S_{p}$, but this action does not lift canonically to a fiber-preserving action of $S_{p}$ on $M(p)$ (at least not for $p$ even; see below); we need to extend $S_{p}$ to a larger group to obtain such an action. For each $X\in{\rm Sym}^{+}(p)$, the fiber ${\cal E}_{X}$ can be identified with the set of positively oriented orthonormal $X$-eigenbases of ${\bf R}^{p}$; the action of ${\tilde{S}}_{p}^{+}$ sends one such $X$-eigenbasis to another.

A familiar index-two subgroup of ${\tilde{S}}_{p}$ different from ${\tilde{S}}_{p}^{+}$ is the kernel of the map $(\boldsymbol{\sigma},\pi)\mapsto{\rm sgn}(\boldsymbol{\sigma})$. For $p\geq 4$, the latter subgroup is the Weyl group $W(D_{p})$ of the simple Lie algebra $D_{p}={\mathfrak{so}}(2p,{\bf C})$. However, the analog of (A.5) for $W(D_{p})$ splits for all $p$, while (A.5) splits if and only if $p$ is odd. For $p$ odd, the map ${\tilde{S}}_{p}^{+}\to W(D_{p})$ defined by $(\boldsymbol{\sigma},\pi)\mapsto({\rm sgn}(\boldsymbol{\sigma})\boldsymbol{\sigma},\pi)$ is an isomorphism, but it is known that for $p$ even, ${\tilde{S}}_{p}^{+}$ is not isomorphic to $W(D_{p})$ [12, p. 151].
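The claim about the map $(\boldsymbol{\sigma},\pi)\mapsto({\rm sgn}(\boldsymbol{\sigma})\boldsymbol{\sigma},\pi)$ can be tested on a small odd case. The sketch below is a finite check for $p=3$, not a proof. It assumes the semidirect-product law $(\boldsymbol{\sigma},\pi)(\boldsymbol{\tau},\rho)=(\boldsymbol{\sigma}\,(\pi\cdot\boldsymbol{\tau}),\pi\rho)$ with $(\pi\cdot\boldsymbol{\tau})_{i}=\tau_{\pi^{-1}(i)}$, which is what the embedding $(\boldsymbol{\sigma},\pi)\mapsto I_{\boldsymbol{\sigma}}P_{\pi}$ gives under the convention $P_{\pi}e_{j}=e_{\pi(j)}$. All names are hypothetical, and the target is taken to be the kernel of $(\boldsymbol{\sigma},\pi)\mapsto{\rm sgn}(\boldsymbol{\sigma})$, which the text identifies with $W(D_{p})$ for $p\geq 4$.

```python
from itertools import permutations, product


def perm_sign(pi):
    """Sign of a permutation tuple, computed from its inversion count."""
    n = len(pi)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if pi[i] > pi[j])
    return -1 if inversions % 2 else 1


def sign_product(sigma):
    """sgn(sigma): the product of the entries of a sign vector."""
    result = 1
    for s in sigma:
        result *= s
    return result


def multiply(g, h):
    """Product in the semidirect product (assumed law, see the lead-in)."""
    (sigma, pi), (tau, rho) = g, h
    p = len(pi)
    pi_inv = [0] * p
    for j in range(p):
        pi_inv[pi[j]] = j
    new_sigma = tuple(sigma[i] * tau[pi_inv[i]] for i in range(p))
    new_pi = tuple(pi[rho[j]] for j in range(p))
    return (new_sigma, new_pi)


def phi(g):
    """The map (sigma, pi) -> (sgn(sigma) * sigma, pi)."""
    sigma, pi = g
    s = sign_product(sigma)
    return (tuple(s * x for x in sigma), pi)


def all_elements(p):
    """All pairs (sigma, pi) in the full signed-permutation group."""
    return [
        (sigma, pi)
        for sigma in product((1, -1), repeat=p)
        for pi in permutations(range(p))
    ]


def even_subgroup(p):
    """Elements with sgn(sigma) * sgn(pi) = 1."""
    return [g for g in all_elements(p) if sign_product(g[0]) * perm_sign(g[1]) == 1]


def sign_kernel(p):
    """Elements with sgn(sigma) = 1."""
    return {g for g in all_elements(p) if sign_product(g[0]) == 1}
```

For $p=3$ the map is a bijective homomorphism onto the kernel; for $p=2$ its image already leaves the kernel, consistent with the restriction to odd $p$.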

Remark A.2

For a subspace $W\subset{\bf R}^{p}$ and $\epsilon\in{\bf Z}_{2}=\{\pm 1\}$, let $O_{\epsilon}(W)\subset O(W)$ denote the set of orthogonal transformations with determinant $\epsilon$. In the setting of (A.2), the connected components of $G_{\sf J}$ are $O_{\epsilon_{1}}(W_{1})\times O_{\epsilon_{2}}(W_{2})\times\dots\times O_{\epsilon_{r}}(W_{r})$, subject to the restriction $\prod_{i}\epsilon_{i}=1$. Thus a labeling of the blocks of an $r$-block partition ${\sf J}$ yields a 1-1 correspondence between ${\cal I}_{r}^{+}$ and the set of connected components of $G_{\sf J}$. In particular, the number of connected components is $2^{r-1}$.

Identifying ${\rm Diag}(p)$ with ${\bf R}^{p}$ , the natural left-action of $S_{p}$ on ${\bf R}^{p}$ yields a left-action of $S_{p}$ on ${\rm Diag}(p)$ . For $D\in{\rm Diag}(p)$ , we will write $[D]$ for its image in the quotient space ${\rm Diag}(p)/S_{p}$ .

Note that the action of $S_{p}$ on ${\rm Diag}^{+}(p)\subset{\rm Diag}(p)$ lifts to an action of ${\tilde{S}}_{p}$ on ${\rm Diag}^{+}(p)$ ,

[TABLE]

It is easily seen that $P_{g}DP_{g}^{-1}=\pi_{g}\cdot D$ for all $g\in{\tilde{S}}_{p},D\in{\rm Diag}(p)$.

A.3 Structure of the fibers

The starting point for a systematic description of the fibers of $F$ is the following proposition. The group-action notation is as in (3.6).

Proposition A.3

Let $X\in{\rm Sym}^{+}(p)$ and $(U,D)\in{\cal E}_{X}=F^{-1}(X)$ . Then

[TABLE]

Proof: This is a simple corollary of [10, Theorem 3.3]. Details are left to the reader.

Corollary A.4

Let $X\in{\rm Sym}^{+}(p)$ and $(U,D)\in{\cal E}_{X}$ . Then

[TABLE]

Proof: Clearly the right-hand side of (A.8) is contained in the right-hand side of (A.7), so it suffices to prove the opposite inclusion.

Let $R\in G_{D},g\in{\tilde{S}}_{p}^{+}$. Enumerate the blocks of ${\sf J}:={\sf J}_{D}$ as $J_{1},\dots,J_{r},$ and let $W_{i}$ be as in Notation A.1. As noted in Remark A.2, the enumeration of the blocks of ${\sf J}$ yields a 1-1 correspondence between ${\cal I}_{r}^{+}$ and the connected components of $G_{\sf J}$. Let $R$ lie in the component of $G_{\sf J}$ labeled by $(\epsilon_{1},\dots,\epsilon_{r})\in{\cal I}_{r}^{+}$. The cardinality of $\{j:\epsilon_{j}=-1\}$ is some even number $k$. Let $\boldsymbol{\sigma}=(\sigma_{1},\dots,\sigma_{p})\in{\cal I}_{p}$, where for $1\leq i\leq p$ we set

[TABLE]

Then $R_{1}:=I_{\boldsymbol{\sigma}}R\in G_{\sf J}^{0}$. But also $\left|\{i:\sigma_{i}=-1\}\right|=k$, so $\boldsymbol{\sigma}\in{\cal I}_{p}^{+}\subset{\tilde{S}}_{p}^{+}$, and $P_{g}I_{\boldsymbol{\sigma}}=P_{g_{1}}$ for some $g_{1}\in{\tilde{S}}_{p}^{+}$ with $\pi_{g_{1}}=\pi_{g}$. Since $I_{\boldsymbol{\sigma}}^{2}=I$, we have $P_{g}R=(P_{g}I_{\boldsymbol{\sigma}})(I_{\boldsymbol{\sigma}}R)=P_{g_{1}}R_{1}$, so $(U(P_{g}R)^{-1},\pi_{g}\cdot D)=(U(P_{g_{1}}R_{1})^{-1},\pi_{g_{1}}\cdot D)$, which lies in the right-hand side of (A.8). The desired inclusion follows.

To complete our characterization of the fibers of $F$ , we introduce one more bit of notation:

Notation A.5

For ${\sf J}=\{J_{1},\dots,J_{r}\}\in{\rm Part}(\{1,\dots,p\})$ , define

[TABLE]

(a subgroup of ${\cal I}_{p}^{+}$ ).

The groups ${\cal I}_{\sf J}^{+}$ generalize ${\cal I}_{p}^{+}$; we have ${\cal I}_{{\sf J}_{\rm top}}^{+}={\cal I}_{p}^{+}$. Observe that an equivalent definition of the group $\Gamma_{\sf J}^{0}$ defined in Notation 2.1 is $\Gamma_{\sf J}^{0}=\{(\boldsymbol{\sigma},\pi)\in{\tilde{S}}_{p}^{+}:\boldsymbol{\sigma}\in{\cal I}_{\sf J}^{+},\pi\in K_{\sf J}\}$. Thus, analogously to (A.5), we have a short exact sequence

[TABLE]

Next, observe that the action (3.6) of ${\tilde{S}}_{p}^{+}$ on $M(p)$ induces, for each $X\in{\rm Sym}^{+}(p)$ , an action of ${\tilde{S}}_{p}^{+}$ on ${\rm Comp}({\cal E}_{X})$ , given by

[TABLE]

This leads us to:

Proposition A.6

Let $X\in{\rm Sym}^{+}(p)$ . Then every $(U,D)\in{\cal E}_{X}$ determines a bijection between ${\rm Comp}({\cal E}_{X})$ and the set ${\tilde{S}}_{p}^{+}/\Gamma^{0}_{{\sf J}_{D}}$ .

Proof: Two elements $(U,D),(U^{\prime},D^{\prime})$ lie in the same component of ${\cal E}_{X}$ if and only if $D^{\prime}=D$ and $U^{\prime}=UR$ for some $R\in G_{D}^{0}$. Thus it is clear from (A.8) that the action (A.11) of ${\tilde{S}}_{p}^{+}$ on ${\rm Comp}({\cal E}_{X})$ is transitive. Therefore for any $(U,D)\in{\cal E}_{X}$, the map ${\tilde{S}}_{p}^{+}\to{\rm Comp}({\cal E}_{X}),\ g\mapsto g\cdot[(U,D)]$, induces a bijection ${\tilde{S}}_{p}^{+}/{\rm Stab}([(U,D)])\to{\rm Comp}({\cal E}_{X})$, where ${\rm Stab}([(U,D)])$ is the stabilizer of $[(U,D)]$ under the action (A.11). But, as is easily checked, ${\rm Stab}([(U,D)])$ is exactly the group $\Gamma_{{\sf J}_{D}}^{0}$.

An important special case of Proposition A.6 is the case in which all eigenvalues of $X$ are distinct. In this case, ${\sf J}_{D}={{\sf J}_{\rm top}}=\{\{1\},\{2\},\dots,\{p\}\}$ and $\Gamma^{0}_{{\sf J}_{D}}=\{{\rm id.}\}$. Thus the action of ${\tilde{S}}_{p}^{+}$ on ${\rm Comp}({\cal E}_{X})$ is free as well as transitive. Furthermore $G_{D}^{0}=\{I\}$, so each connected component of ${\cal E}_{X}$ is a single point; ${\rm Comp}({\cal E}_{X})={\cal E}_{X}$. Thus ${\cal E}_{X}$ itself is an orbit of ${\tilde{S}}_{p}^{+}$, and any choice of $(U,D)\in{\cal E}_{X}$ yields a bijection ${\tilde{S}}_{p}^{+}\to{\cal E}_{X}$, $g\mapsto g\cdot(U,D)$.
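The distinct-eigenvalue case is small enough to enumerate exactly. The sketch below builds the orbit of one point $(U,D)$ for $p=3$ in exact rational arithmetic and confirms that it has $2^{p-1}p!=24$ points, all mapping to the same $X$. It assumes that the action (3.6) takes the form $g\cdot(U,D)=(UP_{g}^{-1},P_{g}DP_{g}^{-1})$, as suggested by the formulas in this appendix, and that $F(U,D)=UDU^{T}$; the particular $U$, the eigenvalues $1,2,3$, and all function names are illustrative choices.

```python
from fractions import Fraction
from itertools import permutations, product


def matmul(a, b):
    """Product of two matrices given as sequences of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def freeze(m):
    """Hashable copy of a matrix."""
    return tuple(tuple(row) for row in m)


def perm_sign(pi):
    n = len(pi)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if pi[i] > pi[j])
    return -1 if inversions % 2 else 1


def even_signed_perm_matrices(p):
    """Matrices I_sigma P_pi with determinant +1 (P_pi e_j = e_{pi(j)})."""
    mats = []
    for sigma in product((1, -1), repeat=p):
        sign_sigma = 1
        for s in sigma:
            sign_sigma *= s
        for pi in permutations(range(p)):
            if sign_sigma * perm_sign(pi) != 1:
                continue
            m = [[0] * p for _ in range(p)]
            for j in range(p):
                m[pi[j]][j] = sigma[pi[j]]
            mats.append(m)
    return mats


def eigen_composition(u, d):
    """F(U, D) = U D U^T (assumed form of the eigen-composition map)."""
    return matmul(matmul(u, d), transpose(u))


def fiber_orbit(u, d):
    """Orbit of (U, D) under g -> (U P_g^{-1}, P_g D P_g^{-1})."""
    orbit = set()
    for pg in even_signed_perm_matrices(len(u)):
        pg_inv = transpose(pg)  # P_g is orthogonal, so its inverse is its transpose
        orbit.add((freeze(matmul(u, pg_inv)), freeze(matmul(matmul(pg, d), pg_inv))))
    return orbit


# A rotation with rational entries, and a diagonal matrix with distinct eigenvalues.
U = [[Fraction(x, 3) for x in row] for row in ((1, -2, 2), (2, -1, -2), (2, 2, 1))]
D = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
X = eigen_composition(U, D)
ORBIT = fiber_orbit(U, D)
```

Exact fractions are used so that the equality $F(g\cdot(U,D))=X$ can be tested without floating-point tolerances.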

Corollary A.7

Let $X\in{\rm Sym}^{+}(p)$, $(U,D)\in{\cal E}_{X}$, and let $k_{1},\dots,k_{r}$ be the parts of the partition $[{\sf J}_{D}]$ of $p$. Then ${\cal E}_{X}$ is diffeomorphic to a disjoint union of $2^{r-1}\frac{p!}{k_{1}!k_{2}!\dots k_{r}!}$ copies of $SO(k_{1})\times SO(k_{2})\times\dots\times SO(k_{r})$.

Proof: Let ${\sf J}={\sf J}_{D}$. It is clear from (2.1) that each connected component of ${\cal E}_{X}$ is a submanifold of $M(p)$ diffeomorphic to $G_{D}^{0}=G_{\sf J}^{0}$, which from (A.4) is isomorphic (hence diffeomorphic) to $SO(k_{1})\times SO(k_{2})\times\dots\times SO(k_{r})$. From Proposition A.6, the number of connected components is $|{\tilde{S}}_{p}^{+}/\Gamma^{0}_{{\sf J}_{D}}|=|{\tilde{S}}_{p}^{+}|/|\Gamma^{0}_{{\sf J}_{D}}|$. As noted earlier, $|{\tilde{S}}_{p}^{+}|=2^{p-1}p!$, while from (A.10) we have $|\Gamma^{0}_{{\sf J}_{D}}|=|{\cal I}_{\sf J}^{+}|\,|K_{\sf J}|$. It is easily seen that ${\cal I}_{\sf J}^{+}$ is isomorphic to $({\bf Z}_{2})^{p-r}$, and that $K_{\sf J}$ is isomorphic to $S_{k_{1}}\times S_{k_{2}}\times\dots\times S_{k_{r}}$, and hence that $|K_{\sf J}|=k_{1}!k_{2}!\dots k_{r}!$. The result follows.
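The counting step in this proof is easy to reproduce. The sketch below computes the number of connected components of ${\cal E}_{X}$ from a list of eigenvalues in two ways: directly from the formula of Corollary A.7, and as the quotient $|{\tilde{S}}_{p}^{+}|/(|{\cal I}_{\sf J}^{+}|\,|K_{\sf J}|)$ using the group orders quoted in the proof. The function names are hypothetical, and eigenvalues are compared by exact equality, so inputs should be exact (for example integers) rather than floating-point results of a numerical eigensolver.

```python
from collections import Counter
from math import factorial, prod


def multiplicities(eigenvalues):
    """Parts k_1 >= ... >= k_r of the partition [J_D] of p."""
    return sorted(Counter(eigenvalues).values(), reverse=True)


def component_count_formula(eigenvalues):
    """2^(r-1) * p! / (k_1! ... k_r!), as in Corollary A.7."""
    ks = multiplicities(eigenvalues)
    p, r = len(eigenvalues), len(ks)
    return 2 ** (r - 1) * factorial(p) // prod(factorial(k) for k in ks)


def component_count_group_orders(eigenvalues):
    """Order of the even signed-permutation group divided by the stabilizer order."""
    ks = multiplicities(eigenvalues)
    p, r = len(eigenvalues), len(ks)
    order_group = 2 ** (p - 1) * factorial(p)
    order_stabilizer = 2 ** (p - r) * prod(factorial(k) for k in ks)
    return order_group // order_stabilizer
```

For instance, eigenvalues $(1,1,2)$ give multiplicities $(2,1)$, so $r=2$ and the count is $2\cdot 3!/2!=6$ copies of $SO(2)\times SO(1)$.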

Remark A.8

An alternate, instructive route to Corollary A.7 is the following. (We merely sketch the ideas; the reader may fill in the details.) For ${\sf J}\in{\rm Part}(\{1,\dots,p\})$, define ${\cal Q}_{\sf J}=\{P_{g}R:g\in{\tilde{S}}_{p}^{+},R\in G_{\sf J}\}\subset SO(p)$. Thus the set ${\cal Q}_{\sf J}$ is a finite union of left-cosets of $G_{\sf J}$, each of which is diffeomorphic to the compact submanifold $G_{\sf J}\subset SO(p)$. If $X\in{\rm Sym}^{+}(p),$ $(U,D)\in{\cal E}_{X}$, and ${\sf J}={\sf J}_{D}$, the map ${\cal Q}_{\sf J}\to M(p)$, $Q\mapsto(UQ^{-1},QDQ^{-1})$, is an embedding with image ${\cal E}_{X}$. Hence ${\cal E}_{X}$ is a submanifold of $M(p)$ diffeomorphic to ${\cal Q}_{\sf J}$. But for any closed subgroups $H_{1},H_{2}$ of a compact Lie group $G$, the set $H_{1}H_{2}:=\{h_{1}h_{2}:h_{1}\in H_{1},h_{2}\in H_{2}\}\subset G$ is a submanifold of $G$ and a principal $H_{2}$-bundle over $H_{1}/(H_{1}\cap H_{2})$, with projection map given by $h_{1}h_{2}\mapsto h_{1}(H_{1}\cap H_{2})$. Applying this to the case $H_{1}={\tilde{S}}_{p}^{+},H_{2}=G_{\sf J}$, $G=SO(p)$, we have $H_{1}\cap H_{2}=\Gamma_{\sf J}$, so ${\cal Q}_{\sf J}$ is a principal $G_{\sf J}$-bundle over the finite set ${\tilde{S}}_{p}^{+}/\Gamma_{\sf J}$. But the natural map ${\tilde{S}}_{p}^{+}/\Gamma_{\sf J}\to S_{p}/K_{\sf J},\ g\Gamma_{\sf J}\mapsto{\rm proj}_{2}(g)K_{\sf J}$ (where ${\rm proj}_{2}$ is as in (A.5)), is a bijection, so ${\cal Q}_{\sf J}$ may be viewed as a principal $G_{\sf J}$-bundle over $S_{p}/K_{\sf J}$. The cardinality of this base-space is $|S_{p}|/|K_{\sf J}|$, which is simply the multinomial coefficient $\frac{p!}{k_{1}!k_{2}!\dots k_{r}!}$ if $[{\sf J}]=(k_{1},\dots,k_{r})\in{\rm Part}(p)$. Thus ${\cal E}_{X}$ is diffeomorphic to $\frac{p!}{k_{1}!k_{2}!\dots k_{r}!}$ copies of $G_{\sf J}$, and each copy of $G_{\sf J}$ is diffeomorphic to $2^{r-1}$ copies of $SO(k_{1})\times\dots\times SO(k_{r})$.

Appendix B Stratification of ${\rm Sym}^{+}(p)$ , $M(p)$ , and related spaces

We provide here a brief outline of the stratifications relevant to this paper. For a more detailed discussion, see [7, Section 2.7].

As noted in Section 2, $SO(p)$ acts on ${\rm Sym}^{+}(p)$ via $(U,X)\mapsto UXU^{T}.$ As with any group-action, elements $X,Y\in{\rm Sym}^{+}(p)$ are said to have the same orbit type if their stabilizers are conjugate; in this case the fibers ${\cal E}_{X},{\cal E}_{Y}$ are diffeomorphic. The orbit-type stratification of any manifold under the action of a compact Lie group is known to be a Whitney stratification ([5, p. 21]).

We use ${\rm Part}(\{1,\dots,p\})$ to define stratifications of the spaces ${\rm Diag}^{+}(p)$ and $M(p)$, and use ${\rm Part}(p)$ to define stratifications of ${\rm Diag}^{+}(p)/S_{p}$ and ${\rm Sym}^{+}(p)$. The commutative diagram in Figure 1 indicates the relationships among these spaces and label-sets. We define strata as the diagram suggests: for ${\sf J}\in{\rm Part}(\{1,\dots,p\})$ and $[{\sf K}]\in{\rm Part}(p)$, (i) ${\cal D}_{\sf J}:={\rm lbl}^{-1}({\sf J})\subset{\rm Diag}^{+}(p)$, (ii) ${\cal D}_{[{\sf K}]}:=\overline{{\rm lbl}}^{\,-1}([{\sf K}])\subset{\rm Diag}^{+}(p)/S_{p}$, (iii) ${\cal S}_{\sf J}:={\rm proj}_{2}^{-1}({\cal D}_{\sf J})=SO(p)\times{\cal D}_{\sf J}\subset M(p),$ and (iv) ${\cal S}_{[{\sf K}]}:=\overline{{\rm proj}_{2}}^{\,-1}({\cal D}_{[{\sf K}]})$. The maps ${\rm lbl},\overline{{\rm lbl}}$ label elements of ${\rm Diag}^{+}(p),{\rm Diag}^{+}(p)/S_{p}$ by partitions of the set $\{1,\dots,p\}$ and the integer $p$, respectively; ${\rm proj}_{2}:M(p)=SO(p)\times{\rm Diag}^{+}(p)\to{\rm Diag}^{+}(p)$ is projection onto the second factor; and $\overline{{\rm proj}_{2}}$ is the map induced by ${\rm proj}_{2}$ on the indicated quotients.

For $X\in{\cal S}_{[{\sf K}]},$ we may call the partition $[{\sf K}]\in{\rm Part}(p)$ the eigenvalue-multiplicity type of $X$ . The stratification of ${\rm Sym}^{+}(p)$ by eigenvalue-multiplicity type is identical to the orbit-type stratification.

In any stratified space, there is a natural partial ordering $\leq$ on the set of strata ${\cal T}_{i}$ defined by declaring ${\cal T}_{1}\leq{\cal T}_{2}$ if ${\cal T}_{1}\subset\overline{{\cal T}_{2}}$ . Using this partial ordering of strata for the spaces in the left-hand square in Figure 1, it is easily checked that all the maps in Figure 1 are either order-preserving themselves (in the case of ${\rm quo}_{2}$ ) or induce order-preserving maps on the corresponding sets of strata (in the case of all the other maps). In particular, each of the stratified spaces in the left-hand square in Figure 1 has a top stratum and a bottom stratum.

References

[1] L. J. Billera, S. P. Holmes, K. Vogtmann, Geometry of the space of phylogenetic trees, Adv. in Appl. Math. 27 (4) (2001) 733–767.

[2] J. Cheeger, D. G. Ebin, Comparison Theorems in Riemannian Geometry, North Holland/American Elsevier, Amsterdam, 1975.

[3] J. Damon, J. Marron, Backwards principal component analysis and principal nested relations, J. Math. Imaging and Vision 50 (1) (2014), 107–114.

[4] A. Edelman, T. A. Arias, S. T. Smith, The geometry of algorithms with orthogonality constraints, SIAM J. Matrix Anal. Appl. 20 (2) (1998) 303–353.

[5] C. G. Gibson, K. Wirthmüller, A. A. du Plessis, E. J. N. Looijenga, Topological Stability of Smooth Mappings, Lecture Notes in Mathematics, Vol. 552, Springer-Verlag, Berlin, 1976.

[6] G. H. Golub, C. F. Van Loan, Matrix Computations, 2nd edition, The Johns Hopkins University Press, 1989.

[7] D. Groisser, S. Jung, A. Schwartzman, Geometric foundations for scaling-rotation statistics on symmetric positive definite matrices: minimal smooth scaling-rotation curves in low dimensions, Electronic J. Stat. 11 (1), 1092–1159.

[8] D. Groisser, S. Jung, A. Schwartzman, A scaling-rotation metric on the space of symmetric positive-definite matrices, in preparation.

[9] T. Hotz, S. Huckemann, H. Le, J. S. Marron, J. C. Mattingly, E. Miller, J. Nolen, M. Owen, V. Patrangenaru, S. Skwerer, Sticky central limit theorems on open books, Ann. Appl. Prob. 23 (6) (2013) 2238–2258.

[10] S. Jung, A. Schwartzman, D. Groisser, Scaling-rotation distance and interpolation of symmetric positive-definite matrices, SIAM J. Matrix Anal. Appl. 36 (3) (2015) 1180–1201.

[11] D. G. Kendall, D. Barden, T. K. Carne, H. Le, Shape and Shape Theory, Wiley Series in Probability and Statistics, John Wiley & Sons Ltd., Chichester, 1999.

[12] H. Pahlings, Characterization of groups by their character tables, Comm. Alg. 4 (2) (1976), 111–153.

[13] H. Samelson, Notes on Lie Algebras, Van Nostrand Reinhold Company, 1969.

[14] A. Schwartzman, Random ellipsoids and false discovery rates: statistics for diffusion tensor imaging data, Ph.D. thesis, Stanford University (2006).

[15] A. Schwartzman, W. F. Mascarenhas, J. E. Taylor, Inference for eigenvalues and eigenvectors of Gaussian symmetric matrices, Ann. Statist. 36 (6) (2008) 2886–2919.

[16] Y.-C. Wong, Differential geometry of Grassmann manifolds, Proc. Nat. Acad. Sci. U.S.A. 57 (1967) 589–594.
