Uniqueness questions in a scaling-rotation geometry on the space of symmetric positive-definite matrices
David Groisser, Sungkyu Jung, Armin Schwartzman

Department of Mathematics, University of Florida, Gainesville, FL 32611, USA
Department of Statistics, University of Pittsburgh, Pittsburgh, PA 15260, USA
Division of Biostatistics, University of California, San Diego, CA 92903, USA
Abstract
Jung et al. (2015) introduced a geometric structure on ${\rm Sym}^+(p)$, the set of $p\times p$ symmetric positive-definite matrices, based on eigen-decomposition. Eigenstructure determines both a stratification of ${\rm Sym}^+(p)$, defined by eigenvalue multiplicities, and fibers of the “eigen-composition” map $F: M(p) := SO(p)\times{\rm Diag}^+(p)\to{\rm Sym}^+(p)$. When $M(p)$ is equipped with a suitable Riemannian metric, the fiber structure leads to notions of scaling-rotation distance between $X,Y\in{\rm Sym}^+(p)$, the distance in $M(p)$ between fibers $F^{-1}(X)$ and $F^{-1}(Y)$, and minimal smooth scaling-rotation (MSSR) curves, images in ${\rm Sym}^+(p)$ of minimal-length geodesics connecting two fibers. In this paper we study the geometry of the triple $(M(p),F,{\rm Sym}^+(p))$, focusing on some basic questions: For which $X,Y$ is there a unique MSSR curve from $X$ to $Y$? More generally, what is the set ${\cal M}(X,Y)$ of MSSR curves from $X$ to $Y$? This set is influenced by two potential types of non-uniqueness. We translate the question of whether the second type can occur into a question about the geometry of Grassmannians $G_m({\bf R}^p)$, with $m$ even, that we answer for $p\leq 4$ and $p\geq 11$. Our method of proof also yields an interesting half-angle formula concerning principal angles between subspaces of ${\bf R}^p$ whose dimensions may or may not be equal. The general-$p$ results concerning MSSR curves and scaling-rotation distance that we establish here underpin the explicit results in Groisser et al. (2017). Addressing the uniqueness-related questions requires a thorough understanding of the fiber structure of $M(p)$, which we also provide.
keywords:
eigen-decomposition, stratified spaces, scaling-rotation distance, signed-permutation group, geometric structures on quotient spaces, principal angles, geometry of Grassmannians
MSC:
[2010] 53C99, 53C15, 57R15, 53C22, 51F25, 15A18, 58A35
Journal: arXiv
This work was supported by NIH grant R21EB012177 and NSF grant DMS-1307178.
1 Introduction
In this work, we investigate a geometric structure on , the set of symmetric positive-definite (SPD) matrices, , and special curves that this structure gives rise to. Both the geometric structure and these special curves are built from eigen-decomposition of SPD matrices.
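Let ${\rm Diag}^+(p)$ denote the set of $p\times p$ diagonal matrices with positive diagonal entries. By an (orthonormal) eigen-decomposition of $X\in{\rm Sym}^+(p)$ we will mean a pair $(U,D)\in SO(p)\times{\rm Diag}^+(p)$ such that $X=UDU^{T}$. The space of such decompositions,
$$M(p) := SO(p)\times{\rm Diag}^+(p),$$
thus comes naturally equipped with a smooth surjective map $F:M(p)\to{\rm Sym}^+(p)$ defined by
$$F(U,D) = UDU^{T}.$$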
For each $X\in{\rm Sym}^+(p)$ we call the set $F^{-1}(X)$ the fiber over $X$. However, $M(p)$ is not a fiber bundle over ${\rm Sym}^+(p)$ with projection $F$; the map $F$ is not even a submersion. (Rather, the relation of $M(p)$ to ${\rm Sym}^+(p)$ is reminiscent of the notion of blow-up in algebraic geometry: $M(p)$ can be viewed as a sort of blow-up of ${\rm Sym}^+(p)$ along several subvarieties.) The natural action , , endows ${\rm Sym}^+(p)$ with a stratification by orbit-type, and the derivative of $F$ is nonsingular only on the pre-image of the “top” stratum. This stratification is identical to the stratification by “eigenvalue-multiplicity type”, in which the strata are labeled by partitions of the integer $p$. Eigenvalue multiplicities also determine a more refined stratification of the space , in which the strata are labeled by partitions of the set $\{1,\dots,p\}$. Appendix B reviews these stratifications.
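To make the notion of a fiber concrete, here is a minimal sketch for $p=2$ showing that two different eigen-decompositions map to the same SPD matrix under the eigen-composition map $F(U,D)=UDU^{T}$. The helper names (`matmul`, `transpose`, `eigen_compose`) and the numerical values are illustrative assumptions, not taken from the paper.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def eigen_compose(U, D):
    """Eigen-composition map F(U, D) = U D U^T."""
    return matmul(matmul(U, D), transpose(U))

theta = 0.4  # illustrative rotation angle
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
D = [[3.0, 0.0], [0.0, 1.0]]

# A signed-permutation matrix with determinant +1 that swaps the two axes.
P = [[0.0, -1.0], [1.0, 0.0]]

# A second eigen-decomposition of the same matrix: change the frame by P^T
# and permute the eigenvalues accordingly.
U2 = matmul(U, transpose(P))
D2 = matmul(matmul(P, D), transpose(P))

X = eigen_compose(U, D)
X2 = eigen_compose(U2, D2)
```

Here `(U, D)` and `(U2, D2)` are distinct points of $M(2)$ lying in the same fiber, since `X` and `X2` agree up to rounding.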
The fiber structure of formalizes the notion of minimal smooth scaling-rotation curves [10]. In 2006, motivated by applications to diffusion-tensor imaging, Schwartzman [14] introduced smooth scaling-rotation curves as a way of interpolating between SPD matrices in such a way that eigenvectors and eigenvalues both change at uniform speed. Minimal smooth scaling-rotation curves were defined in [10] as smooth curves of shortest length as determined by an appropriate Riemannian metric on —curves that minimize a suitable measure of the amount of scaling and rotation needed to transform one SPD matrix into another.
More precisely, each factor of is a Lie group, and for our Riemannian metric on we take a product metric determined by choosing bi-invariant metrics on the factors. We define smooth scaling-rotation (SSR) curves in to be the projections to of geodesics in . In this scaling-rotation framework, the “distance” between any two matrices is defined to be the distance between the fibers and (nonzero if since each fiber is compact). We use the term -minimal geodesic for a minimal-length geodesic connecting two fibers and , and minimal pair for the pair of endpoints of such a geodesic. A minimal smooth scaling-rotation (MSSR) curve is the image under of an -minimal geodesic.
As shown in [10], restricts to a metric on the top stratum of , but is not a metric on all of . In [8], we show that generates a true metric on and investigate features of this metric. But fully understanding the geometry of the metric relies on first understanding MSSR curves, the function , and related issues we address in the present article.
This paper is devoted primarily to uniqueness-related issues that arise in studying MSSR curves, and to some unanticipated geometric results (described in more detail below), potentially of independent interest, that were discovered as a result of studying these issues.
A thorough understanding of the fibers of is key to analyzing several features of the scaling-rotation framework, including these uniqueness-related issues. Appendix A provides a thorough picture of the fiber structure of , including its inextricable tie to the group of “even signed-permutations”, a group not to be confused with a more familiar group of the same order and similar-sounding description in terms of signs and permutations, the Weyl group of the simple Lie algebra . Some results proven in Appendix A are applied earlier in the main body of this paper, and some were previously stated without proof in [7] and applied there.
The uniqueness-related results in this paper contribute to a rigorous and systematic description of the geometry and topology of the triple , and to a firm foundation for further study of the scaling-rotation framework, such as in [7] and [8].
Some of the uniqueness issues we study are related directly to (non-)uniqueness of MSSR curves themselves, while others are related more directly to (non-)uniqueness of minimal pairs. It is easy to see that for all , at least one MSSR curve from to exists; however, such curves are not always unique. The dependence on and of the set of such curves is quite intricate, and relates strongly to the stratified nature of . Non-uniqueness issues for minimal pairs are important because computing and MSSR curves from to requires finding minimal pairs in . Even when the resulting MSSR curve from to is unique, minimal pairs in are never unique, because acts on in a nontrivial isometric, fiber-preserving fashion. This action carries minimal pairs to minimal pairs. For some , there are also minimal pairs that are not related to each other by this action.
The broad structure of this paper is as follows. Section 2 establishes notation. Sections 3 and 4 contain the statements of most of our main results, which we will describe below, and those proofs that can be given quickly. The proofs of many of our results—especially the “bonus” results that are applicable outside the scaling-rotation framework entirely—are quite long; these occupy Sections 5–7.
We devote the remainder of this introduction to a more detailed outline of the paper, and more detailed descriptions of the questions we study and the results we achieve.
In Section 3.1 we review the basics of SSR curves, before restricting attention to MSSR curves in Section 3.2 and beyond. In Section 3.2 we discuss the computational-complexity problem arising from the non-uniqueness of minimal pairs. Proposition 3.7 takes advantage of the -action by using double-cosets in to reduce the complexity of computing , of characterizing all minimal pairs, and of finding all MSSR curves from to . Proposition 3.7 was applied in [7] to help derive closed-form formulas for and MSSR curves for . Also discussed and proven in Section 3.2 is a general result about scaling-rotation curves: All such curves are either constant-maps or immersions. This result is important for an understanding of MSSR curves.
In Section 3.3, we begin to address uniqueness questions for MSSR curves, the most basic of which is: under what conditions on is there more than one MSSR curve from to ? A more refined version of this question is: for each pair , what is the set explicitly? By characterizing all minimal pairs, Proposition 3.7 provides a starting point for answering this question. Among this proposition’s outcomes is also the fact that, for each pair , every MSSR curve from to is represented by a minimal pair whose first point lies in any given connected component of . But to completely understand —or even just determine its cardinality—we still need a way to tell whether MSSR curves corresponding to two (not necessarily distinct) minimal pairs with first point in a given connected component of are the same. Proposition 3.11 gives a necessary and sufficient criterion. This result was applied in [7], where it enabled an explicit computation of the sets for when and do not both lie in the top stratum.
In Section 3.3 we also define two different ways that non-uniqueness of MSSR curves can occur. Given , for there to be more than one MSSR curve from to , there must exist distinct shortest-length geodesics from to such that . There are essentially two ways, not mutually exclusive, that this can happen: (i) there can exist such () whose endpoint-pairs are distinct minimal pairs, and (ii) there exist such whose endpoint-pairs are the same minimal pair. We call these possibilities “Type I” and “Type II” non-uniqueness, respectively. Proposition 3.11 applies to both.
The study of Type II non-uniqueness, which we begin in Section 3.4, turns out to be especially fruitful. A minimal pair has more than one minimal geodesic connecting its points if and only if the pair is geodesically antipodal (Definition 3.10), which is equivalent to being an involution. Our chief tool for determining whether such minimal pairs exist is a property we call sign-change reducibility: we say that the pair is sign-change reducible if can be reduced by multiplying or by a (positive-determinant) “sign-change matrix”, a diagonal matrix each of whose diagonal entries is $\pm 1$.
We show in Proposition 3.20 that if is not sign-change reducible, then there exist in the top stratum of such that is a minimal pair. We show in Proposition 3.18 that for $p\le 4$, every geodesically antipodal pair is sign-change reducible, and that for $p\ge 11$, there exist geodesically antipodal pairs that are not sign-change reducible. From these propositions we deduce that Type II non-uniqueness never occurs for $p\le 4$ (Corollary 3.19), and that it always occurs for some if $p\ge 11$ (Corollary 3.21). We do not believe that either of the numbers 4 and 11 above is sharp; our methods are simply not conclusive when $5\le p\le 10$.
Together, Proposition 3.20 and Corollary 3.21 show that sign-change reducibility is the only obstruction to having points in the top stratum of for which the set exhibits Type II non-uniqueness.
Even without Proposition 3.18, for it is rather trivial that all geodesically antipodal pairs are sign-change reducible, and for an independent proof relying on quaternions is also possible. However, our proof of the part of Proposition 3.18 makes no use of quaternions, and unifies these low- results.
Our proof of Proposition 3.18, completed in Section 7 after laying groundwork in Sections 4–6, takes us in unexpected directions, with unanticipated consequences. We initially introduced the notion of sign-change reducibility into our scaling-rotation-curve study as an ad hoc tool to help us determine whether Type II non-uniqueness of MSSR curves, impossible for , is ever possible. This is equivalent to answering the question “Are all geodesically antipodal pairs in sign-change reducible?” But as we show in Proposition 4.11, a refined version of the latter question is equivalent to a question purely about the geometry of Grassmannians equipped with a standard Riemannian metric: for even and positive, is every -dimensional subspace of within a certain distance of a coordinate -plane? (This question can, of course, be asked without restricting the parity of , but the above equivalence leads us to consider only even in this paper.) By constructing examples, we show that for , the answer to the Grassmannian question is no for . This, combined with the equivalence result in Proposition 4.11, yields the “” part of Proposition 3.18 mentioned above. The “” part of Proposition 3.18 is proven by other means (via the more technical Proposition 4.6).
While the possibility of Type-II non-uniqueness is what led us to the question above about Grassmannians, this question and our study of it may be of independent interest. Our study led us to investigate several related questions concerning distances between (even-dimensional) subspaces of and (even-dimensional) coordinate planes not necessarily of the same dimension. Perhaps the most unexpected of these is a half-angle relation stated in Proposition 4.10 and proven in Section 5: for any two involutions , each of the principal angles between the -eigenspaces of and is exactly half a correspondingly indexed normal-form angle of . This relationship holds whether or not the dimensions of the -eigenspaces are equal. When the dimensions are equal, we use this relationship to show that a natural correspondence between and a connected component of the set of involutions in is a metric-space isometry up to a constant factor of 2 (Proposition 4.9). This isometric relation is also deducible (and may already be known) from a purely Riemannian approach, but our proof uses essentially no Riemannian geometry (see Remark 5.5 for a more precise statement, and an additional interpretation of what our proof of Proposition 4.9 shows).
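A small numerical check can illustrate the half-angle relation in the simplest nontrivial setting, $SO(3)$. The sketch below assumes that the relation concerns the $(-1)$-eigenspaces of two involutions and the normal-form (rotation) angle of their product; the symbols are stripped in this text, so that reading is an assumption. The function names and the angle `phi` are illustrative.

```python
import math

def half_turn(axis):
    """Rotation by pi about a unit vector a: R = 2 a a^T - I, an involution in SO(3)."""
    return [[2 * axis[i] * axis[j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_angle(R):
    """Rotation angle in [0, pi] of R in SO(3), from trace(R) = 1 + 2 cos(angle)."""
    c = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    return math.acos(max(-1.0, min(1.0, c)))

phi = 0.3  # angle between the two rotation axes (illustrative)
a = (1.0, 0.0, 0.0)
b = (math.cos(phi), math.sin(phi), 0.0)

R1, R2 = half_turn(a), half_turn(b)

# The (-1)-eigenspaces of R1 and R2 are the planes orthogonal to a and b;
# their principal angles are 0 and phi. The product R1 R2 rotates by 2*phi.
product_angle = rotation_angle(matmul(R1, R2))
```

In this example the nonzero principal angle `phi` is half of `product_angle`, which is the pattern the half-angle relation describes.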
The most important results coming from our study of sign-change reducibility are stated in Section 4, with the proofs deferred to Sections 5, 6, and 7. These results include those mentioned above, and one more whose statement involves terminology not included in this Introduction: Proposition 4.8, a special case of a more general conjecture we make about sign-change reducibility (Conjecture 4.7). Key to almost all of these results is the technical Lemma 5.2, which establishes several facts concerning the product of a general involution in and a positive-determinant sign-change matrix.
We mention in passing that there is a vast body of literature devoted to defining and studying “distance-functions” (not necessarily true metrics) on different from the scaling-rotation distance and metric ; for a discussion and comparison see [7] and the references therein.
2 Notational preliminaries
In this paper, when a group acts from the left on a set in a previously specified way, we generally denote the action simply by $(g,x)\mapsto g\cdot x$.
Let denote the set of partitions of , and the set of partitions of the integer . Let denote the set of diagonal matrices. Each naturally determines an element according to “which eigenvalues are equal” (see Notation A.1). The group acts on on the left via . The stabilizer of under this action depends only on , and if . For each we may define a subgroup by declaring to be for any for which . We write for the identity component of respectively. See Appendix A for an alternative definition of and additional facts concerning these groups.
We call an element of a signed-permutation matrix if every entry of is either $0$ or $\pm 1$, and call a signed-permutation matrix even if it lies in $SO(p)$. The set of even signed-permutation matrices forms a subgroup of order $2^{p-1}p!$. As discussed in Appendix A (Section A.2), we view this subgroup as a canonical copy of an “abstract” group of even signed-permutations, an extension of the symmetric group $S_p$. We will typically denote an even signed-permutation by the letter $g$, and the corresponding matrix by $P_g$. We denote the natural epimorphism by . The group plays a critical role in understanding the fibers of (starting with Corollary 3.2 in the next section) and in simplifying computations of . This group, which is not encountered in geometry as often as another group of the same order, is discussed in greater detail in Appendix A.
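As a sanity check on the size of the group of even signed-permutation matrices, the following sketch enumerates them for small $p$ by brute force. A signed-permutation matrix is encoded as a permutation together with a sign vector, and its determinant is the permutation's sign times the product of the signs. The encoding and function names are illustrative choices, not the paper's notation.

```python
from itertools import permutations, product

def permutation_sign(perm):
    """Sign of a permutation given as a tuple, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def even_signed_permutations(p):
    """All (perm, signs) pairs whose signed-permutation matrix has determinant +1."""
    result = []
    for perm in permutations(range(p)):
        for signs in product((1, -1), repeat=p):
            det = permutation_sign(perm)
            for s in signs:
                det *= s
            if det == 1:
                result.append((perm, signs))
    return result

# Number of even signed permutations for p = 2, 3, 4.
counts = {p: len(even_signed_permutations(p)) for p in (2, 3, 4)}
```

The resulting counts are 4, 24, and 192, matching $2^{p-1}p!$ for $p=2,3,4$.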
We define ${\cal I}_{p}^{+}={\tilde{S}}_{p}^{+}\cap{\rm Diag}(p)$, and call elements of ${\cal I}_{p}^{+}$ (even) sign-change matrices. We view as a copy of a (certain) index-two subgroup of , as discussed in Appendix B. We will typically denote an element of the abstract group by the letter $\boldsymbol{\sigma}$, and the corresponding matrix by $I_{\boldsymbol{\sigma}}$.
Notation 2.1
(a) For , define (i) $\Gamma_{\sf J}={\tilde{S}}_{p}^{+}\cap G_{\sf J}$, (ii) $\Gamma_{\sf J}^{0}=\Gamma_{\sf J}\cap G_{\sf J}^{0}={\tilde{S}}_{p}^{+}\cap G_{\sf J}^{0}$, (iii) , and (iv) . Observe that , and that $K_{\sf J}=\{\pi\in S_{p}\mid\pi\cdot D=D\ \text{for some}\ D\ \text{with}\ {\sf J}_{D}={\sf J}\}=\{\pi\in S_{p}\mid\pi\cdot D=D\ \text{for all}\ D\ \text{with}\ {\sf J}_{D}={\sf J}\}$.
(b) For any and , define
[TABLE]
the connected component of containing . We write for the set of connected components of .
(c) For any Lie group and closed subgroup , we write and for the spaces of left- and right-cosets, respectively, of in .
3 The scaling-rotation framework and some results for scaling-rotation curves
The Lie groups and carry natural bi-invariant Riemannian metrics. If we endow with a product Riemannian metric constructed from these, the geodesics in are easily computed. We define smooth scaling-rotation (SSR) curves in to be the projections to of the geodesics in , i.e. curves of the form . (In [14] and [10] these were called simply “scaling-rotation curves”. In Section 3.2 we explain why we have added “smooth” to this name.)
3.1 Smooth scaling-rotation curves
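The Lie algebra $\mathfrak{so}(p)$ is the space of antisymmetric $p\times p$ matrices. The bi-invariant Riemannian metric $g_{SO}$ on $SO(p)$ we will use is defined at the identity by
$$g_{SO}|_{I}(A_1,A_2) = -\tfrac{1}{2}\,{\rm tr}(A_1A_2),\qquad A_1,A_2\in\mathfrak{so}(p). \tag{3.1}$$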
(The requirement of bi-invariance determines a Riemannian metric on up to a constant factor unless , of course, but for all the inner product (3.1) is a multiple of the Killing form.)
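Since the abelian Lie group ${\rm Diag}^+(p)$ is an open subset of the vector space ${\rm Diag}(p)$, for each $D\in{\rm Diag}^+(p)$ we will identify $T_D{\rm Diag}^+(p)$ canonically with ${\rm Diag}(p)$. With this identification understood, the invariant Riemannian metric $g_{{\cal D}^+}$ we use is defined by
$$g_{{\cal D}^+}|_{D}(L_1,L_2) = {\rm tr}(D^{-1}L_1D^{-1}L_2), \tag{3.2}$$
where $D\in{\rm Diag}^+(p)$ and $L_1,L_2\in{\rm Diag}(p)$. Up to a constant factor, $g_{{\cal D}^+}$ is the unique (bi-)invariant metric on ${\rm Diag}^+(p)$ that is also invariant under the natural action of the symmetric group $S_p$.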
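Naturally identifying $T_{(U,D)}M(p)$ with $T_USO(p)\oplus T_D{\rm Diag}^+(p)$, the Riemannian metric $g_M$ on $M(p)$ we will use is
$$g_M := k\,g_{SO} + g_{{\cal D}^+}, \tag{3.3}$$
where $k>0$ is an arbitrary parameter that can be chosen as desired for applications.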
Definition 3.1
A smooth scaling-rotation (SSR) curve is a curve $\chi$ in ${\rm Sym}^+(p)$ of the form $\chi=F\circ\gamma$, where $\gamma:I\to M(p)$ is a geodesic defined on some interval $I$.
In this paper, we use curve sometimes to mean a parametrized curve (a map with domain some interval), and sometimes to mean an equivalence class of such maps, where two maps are regarded as equivalent if one is a monotone reparametrization of the other. Also, we use the noun geodesic sometimes to mean a complete geodesic and sometimes to mean a geodesic segment. Our intended meanings should always be clear from context.
The geodesics in are exactly the curves of the form , where is a geodesic in and is a geodesic in . Since the metrics and are bi-invariant, the geodesics in and can be obtained as either left-translates or right-translates of geodesics through the identity. For agreement with [10] and [7], in this paper we use right-translates.
It is well known that in the Riemannian manifold , the cut-locus of the identity is the set of all involutions, . For every non-involution , there is a unique of smallest norm such that (see Section 4.1); we define . If is an involution, there is not a unique such , but all minimal-norm ’s with have the same norm, which we denote . (Thus is a well-defined real number for all , even when there is no uniquely defined element “” in .) With this understood, the geodesic-distance function on is given by
[TABLE]
where in (3.5) and for the rest of this paper, denotes the Frobenius norm on matrices: for any matrix .
The invariances of the metrics and lead to the following proposition, key to many of our results (e.g. Proposition 3.7, Proposition A.6, and Corollary A.7).
Proposition 3.2
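The map ${\tilde{S}}_{p}^{+}\times M(p)\to M(p)$ defined by
$$(g,(U,D))\mapsto g\cdot(U,D) := (UP_{g}^{-1},\ \pi_{g}\cdot D) \tag{3.6}$$
is a free, isometric, left-action of ${\tilde{S}}_{p}^{+}$ on $M(p)$ that preserves every fiber of $F$.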
3.2 Scaling-rotation distance and MSSR curves
Definition 3.3 ([10, Definition 3.10])
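For $X,Y\in{\rm Sym}^+(p)$, the scaling-rotation distance $d_{\cal SR}(X,Y)$ between $X$ and $Y$ is defined by
$$d_{\cal SR}(X,Y) := \inf_{(U,D)\in F^{-1}(X),\ (V,\Lambda)\in F^{-1}(Y)} d_M\bigl((U,D),(V,\Lambda)\bigr). \tag{3.7}$$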
Definition 3.4
Let be a piecewise-smooth curve in and let denote the length of . For , we call an -minimal geodesic (from to ) if , and . We call a pair of points a minimal pair if and for some -minimal geodesic . A minimal smooth scaling-rotation (MSSR) curve from to is a curve in of the form where is an -minimal geodesic. We say that the MSSR curve corresponds to the minimal pair formed by the endpoints of . We let denote the set of MSSR curves from to .
Obviously an -minimal geodesic is a minimal geodesic in the usual sense of Riemannian geometry: it is a curve of shortest length among all piecewise-smooth curves with the same endpoints. (From the general theory of geodesics, the image of any such curve is actually smooth.) Thus a definition equivalent to (3.7) is
[TABLE]
Thus an -minimal geodesic can alternatively be defined as a geodesic of minimal length among all geodesics starting in one given fiber and ending in another.
Every fiber of is compact (an explicit description is given in Corollary A.7), so the infimum in (3.7) is always achieved. Hence for all , there always exists an -minimal geodesic, a minimal pair in , and an MSSR curve from to .
Remark 3.5
Observe that we have not defined a Riemannian metric on ${\rm Sym}^+(p)$, so there is no “automatic” meaning attached to the phrase length of a smooth curve in ${\rm Sym}^+(p)$. However, for an SSR curve $\chi$ in ${\rm Sym}^+(p)$ we define the length of $\chi$ to be $\ell(\chi):=\inf\{\ell(\gamma):\gamma\ \text{is a geodesic in}\ M(p)\ \text{and}\ F\circ\gamma=\chi\}$. With this definition, (3.8) becomes
[TABLE]
A priori, given , a concrete computation of involves computing the distance in between each connected component of and each connected component of , then taking the minimum over all component-pairs. For , the number of connected components of is (see Proposition A.6 in Appendix A), which tends to be a rather large number (see Corollary A.7). It is obvious (from Propositions 3.2 and A.6) that computing all the distances between fiber-components is redundant. It is not so obvious exactly how much redundancy there is (more than one might guess just from looking at these two propositions). As a practical matter, it is desirable to reduce the number of component-pair computations as much as possible, taking advantage of less-obvious redundancy. We will do this in Proposition 3.7 below. This proposition plays a crucial role in [7], where for we apply it to compute all scaling-rotation distances, and to help compute and classify all MSSR curves. The proof of Proposition 3.7 (which is given only in the present paper, not in [7]) relies on the characterization of fibers given in Appendix A as Corollary A.4.
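The brute-force computation described above can be sketched for $p=2$ and matrices with distinct eigenvalues, where each fiber consists of four points. The sketch encodes a point of $M(2)$ as a rotation angle plus a pair of eigenvalues, and it assumes the normalization in which the distance in $SO(2)$ equals the absolute rotation angle, the distance in ${\rm Diag}^+(2)$ is the Euclidean distance between log-eigenvalues, and the two are combined as $d_M^2 = k\,d_{SO}^2 + d_{{\cal D}^+}^2$. All function names and numerical values are illustrative.

```python
import math

def fiber(theta, d):
    """The four eigen-decompositions (angle, eigenvalues) of F(R(theta), diag(d)),
    valid only when the two eigenvalues are distinct."""
    d1, d2 = d
    return [(theta, (d1, d2)),
            (theta + math.pi, (d1, d2)),
            (theta + math.pi / 2, (d2, d1)),
            (theta - math.pi / 2, (d2, d1))]

def wrap(angle):
    """Map an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def d_M(point_a, point_b, k=1.0):
    """Product-metric distance in M(2) under the ASSUMED normalization."""
    (alpha, d), (beta, lam) = point_a, point_b
    rotation = abs(wrap(beta - alpha))
    scaling_sq = sum((math.log(l) - math.log(x)) ** 2 for l, x in zip(lam, d))
    return math.sqrt(k * rotation ** 2 + scaling_sq)

def d_SR(point_a, point_b, k=1.0):
    """Minimum of d_M over all pairs of fiber points (16 pairs for p = 2)."""
    return min(d_M(x, y, k) for x in fiber(*point_a) for y in fiber(*point_b))

# X = diag(2, 1) and Y = diag(1, 2), given via one eigen-decomposition each.
x_point = (0.0, (2.0, 1.0))
y_point = (math.pi / 2, (2.0, 1.0))

full_search = d_SR(x_point, y_point)
# Fixing the first fiber point and searching only the second fiber.
reduced_search = min(d_M(x_point, y) for y in fiber(*y_point))
```

In this example pure scaling (swapping the eigenvalues) is cheaper than rotating by a quarter turn, and the reduced search over a single fixed first point returns the same value as the full search, in the spirit of the redundancy discussed above.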
Definition 3.6
Recall that given any group and subgroups , an double-coset is an equivalence class under the equivalence relation on defined by declaring if there exist such that . The set of equivalence classes under this relation is denoted . By a set of representatives of we mean a subset of consisting of exactly one element from each double-coset. Since every left or right coset is also a double-coset, this defines “set of representatives” for ordinary cosets as well.
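The definition of double-cosets and of a set of representatives can be illustrated on a small finite group. The sketch below uses the symmetric group on three letters with both subgroups generated by a single transposition; this group and these subgroups are chosen only for illustration and are not the groups used in the paper.

```python
from itertools import permutations

def compose(a, b):
    """Composition of permutations as tuples: (a o b)(i) = a[b[i]]."""
    return tuple(a[i] for i in b)

def double_cosets(G, H, K):
    """Partition G into (H, K) double-cosets {h g k : h in H, k in K}."""
    remaining = set(G)
    cosets = []
    while remaining:
        g = min(remaining)  # deterministic choice of the next element
        coset = frozenset(compose(compose(h, g), k) for h in H for k in K)
        cosets.append(coset)
        remaining -= coset
    return cosets

G = list(permutations(range(3)))   # symmetric group on {0, 1, 2}
H = [(0, 1, 2), (1, 0, 2)]         # subgroup generated by the transposition (0 1)
K = H

cosets = double_cosets(G, H, K)
# One representative from each double-coset.
representatives = [min(c) for c in cosets]
```

Here the six group elements split into two double-cosets, of sizes 2 and 4, so a set of representatives has two elements rather than six.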
Proposition 3.7
Let and let . Let be any set of representatives of . Then the scaling-rotation distance from to is given by
where
[TABLE]
Every MSSR curve from to corresponds to some minimal pair whose first element lies in the connected component of .
Proof: From Corollary A.4 we have
[TABLE]
By Proposition 3.2, for all we have
[TABLE]
Proposition 3.2 implies that the action of on carries a geodesic with endpoints $g_{1}\cdot(UR_{U},D)$, $g_{2}\cdot(VR_{V},\Lambda)$ into a geodesic with endpoints $(UR_{U},D)$, $(g_{1}^{-1}g_{2})\cdot(VR_{V},\Lambda)$ and that satisfies . Hence, every smooth scaling-rotation (SSR) curve from to is of the form where is a geodesic with and .
Suppose are two such geodesics, with $\gamma_{i}(1)=(VR_{V}P_{g_{i}}^{-1},\pi_{g_{i}}\cdot\Lambda)$, . If , with and , then
[TABLE]
where and . The same argument as in the preceding paragraph shows that the SSR curve determined by the pair $((UR_{U},D),(VR_{V}P_{g_{2}}^{-1},\pi_{g_{2}}\cdot\Lambda))$ is the same as the SSR curve determined by the pair $((UR_{U,1},D),(VR_{V,1}\,P_{g_{1}}^{-1},\pi_{g_{1}}\cdot\Lambda))$. Hence any representative of a given double-coset determines the same set of SSR curves as does any other representative of that double-coset. The Proposition now follows.
We end this subsection with a discussion and results that motivate our inclusion of the word smooth in “smooth scaling-rotation curve”. By its definition, every SSR curve is a smooth map, but it is not clear whether the image of is “geometrically smooth”, i.e. locally (in ) a smooth submanifold or submanifold-with-boundary of . For the image of to be geometrically smooth in this sense, must admit a regular parametrization, one that is an immersion. It turns out that all SSR curves do, except for those whose images are single points:
Proposition 3.8
If $\gamma:I\to M(p)$ is a non-constant geodesic, then $F\circ\gamma$ is either an immersion or a constant map.
Proof: Let be a non-constant -minimal geodesic and let .
Let and let . Since is a geodesic there exist unique such that . Non-constancy implies . Direct computation yields
[TABLE]
where and denotes matrix commutator.
Suppose that is such that . Then
[TABLE]
Multiplying on left by and on the right by yields where . But because is diagonal, the diagonal entries of any commutator are zero. Since is a diagonal matrix, this implies that . But is invertible, so the second equality implies . Thus for all , and plugging this into (3.13) with we find . It follows that commutes with for every . Hence for all .
Thus either is nonzero for every or is constant.
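The computation behind this argument can be sketched as follows. The sketch assumes the right-translate parametrization $\gamma(t)=(e^{tA}U,\ e^{tL}D)$ with $A$ antisymmetric and $L$ diagonal; the symbols $A$, $L$, $\tilde A$ and $t_0$ are assumed notation, since the displayed equations of the proof are not reproduced in this text.

```latex
\begin{align*}
U(t) &= e^{tA}U, \qquad D(t) = e^{tL}D, \qquad \chi(t) = U(t)\,D(t)\,U(t)^{T},\\[2pt]
\chi'(t) &= A\,\chi(t) - \chi(t)\,A + U(t)\,L\,D(t)\,U(t)^{T}\\
         &= U(t)\,\bigl([\tilde A, D(t)] + L\,D(t)\bigr)\,U(t)^{T},
            \qquad \tilde A := U^{-1}AU .\\[2pt]
\chi'(t_0)=0 \;&\Longrightarrow\; [\tilde A, D(t_0)] + L\,D(t_0) = 0
   \;\Longrightarrow\; L\,D(t_0)=0 \ \text{and}\ [\tilde A, D(t_0)]=0\\
   &\Longrightarrow\; L = 0,\quad D(t)\equiv D,\quad [\tilde A, D]=0
   \;\Longrightarrow\; \chi'(t)=0 \ \text{for all } t .
\end{align*}
```

The second implication uses the fact that a commutator with a diagonal matrix has zero diagonal while $L\,D(t_0)$ is diagonal, so the two terms must vanish separately; the third uses invertibility of $D(t_0)$.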
As noted in [10], the “scaling-rotation distance” is not a metric on ; it does not satisfy the triangle inequality. In [8], we show that the pseudometric generated by the semimetric is a true metric on . (It is not trivial to show that for .) Effectively, the construction enlarges the class of scaling-rotation (SR) curves considered in (3.9) from smooth maps to piecewise-smooth maps (with redefined correspondingly). This definition of the scaling-rotation metric is analogous to the definition of “distance between two points in a Riemannian manifold”: the infimum of the lengths of piecewise-smooth curves joining the points. But some minimal-length SR curves are geometrically non-smooth (having corners); an MSSR curve from to has minimal length only among smooth scaling-rotation curves from to . (This phenomenon does not occur in Riemannian geometry; in a Riemannian manifold, minimal piecewise-smooth curves between two points are always geometrically smooth.) It is for this reason we have made “smooth” part of the terminology used in Definition 3.1.
Remark 3.9
It seems likely that a non-constant MSSR curve is actually an embedding (for this, it suffices that be injective, since is compact), but we have not proven this. There do exist non-minimal non-constant SSR curves that are not one-to-one. One example is any nonconstant periodic SSR curve: where and is any nonzero element for which there exists such that . (For , the latter condition is redundant.) The restriction of this curve to is an SSR curve of positive length from to . A nonperiodic example with is the following. Let $J=\begin{pmatrix}0&-1\\ 1&0\end{pmatrix}$, , $D(t)=\begin{pmatrix}e^{1-t}&0\\ 0&e^{t}\end{pmatrix}$. Then the curve is a geodesic in . Let be the SSR curve . Then, as the reader may check, if we have if (and only if) for some integer we have and . Now let be non-negative integers, let and let , and . Then is an SSR curve from to with self-crossings. Note that the presence of self-crossings does not directly imply that is not an MSSR curve: if we remove the closed curve from , the piecewise-smooth curve from to that remains is not an SSR curve. (As the reader may check, the set is linearly independent, so cannot be reparametrized as an immersion. Hence, by Proposition 3.8, there is no geodesic in such that can be reparametrized as .) Hence is not a candidate for an SSR curve from to that is shorter than . However, with a little effort one can check by direct computation that there is an -minimal geodesic from to that is shorter than . (One can compute the length of the minimal geodesic from any of the four points in to any of the four points in , and see that each of these lengths is less than .)
3.3 Geodesic antipodality and two types of non-uniqueness
As noted in Section 3.1, for all there always exists an MSSR curve from to , the projection of some -minimal geodesic. A priori, different -minimal geodesics could project to the same MSSR curve or to different MSSR curves. It is natural to ask: Under what conditions on is there a unique MSSR curve from to ? When uniqueness fails, how does it fail, and what can we say about the set ?
For uniqueness to fail for given , there must be distinct -minimal geodesics , whose endpoints are minimal pairs , such that . The “how” question above concerns the following two possibilities (not mutually exclusive):
1. “Type I non-uniqueness”: There exist such whose endpoints are distinct minimal pairs .
2. “Type II non-uniqueness”: There exist such whose endpoints are the same minimal pair .
Since for any the minimal geodesic from to is unique, Type II non-uniqueness with minimal pair is equivalent to the existence of two or more minimal geodesics from to , which is equivalent to each of being in the cut-locus (in ) of the other. It will be convenient for us to have some other terminology for such pairs:
Definition 3.10
Call a pair of points in geodesically antipodal if one point is in the cut-locus of the other (equivalently, if each point is in the cut-locus of the other) and geodesically non-antipodal otherwise. Call a pair of points in geodesically antipodal if is a geodesically antipodal pair in , and geodesically non-antipodal otherwise.
As mentioned earlier, the cut-locus of the identity is precisely the set of all involutions in . Furthermore, because of the invariance of the Riemannian metric , an element is in the cut-locus of element if and only if is in the cut-locus of . Note that, as would be true in any group, if any of the elements is an involution, so are all the others.
Note that a pair in can be geodesically antipodal without either point being maximally remote from the other. (For example, with , the matrix is an involution, but is closer to the identity than is the involution .) However, if is geodesically antipodal, then there exists a (not necessarily unique) closed geodesic in containing and , isometric to a circle of some radius, such that and are antipodal points of this circle in the usual sense.
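The remark above, that one involution can be closer to the identity than another, can be illustrated numerically. The following is a minimal sketch under stated assumptions: the distance is taken to be the Frobenius norm of the principal matrix logarithm (the paper's normalization of the distance may differ by a constant factor), and the two 4 x 4 matrices are illustrative choices, not necessarily the ones used in the text. The helper name `dist_to_identity` is hypothetical.

```python
import numpy as np

def dist_to_identity(R):
    """Frobenius norm of the principal logarithm of R in SO(p).

    Assumption: d(R, I) = ||log R||_F with no extra constant factor; the
    paper's distance may differ from this by a fixed normalization constant.
    """
    # Eigenvalues of a rotation lie on the unit circle; their arguments are
    # the rotation angles (each nonzero angle appears as a +/- pair).
    angles = np.angle(np.linalg.eigvals(R).astype(complex))
    return float(np.sqrt(np.sum(angles ** 2)))

R_level2 = np.diag([-1.0, -1.0, 1.0, 1.0])   # involution with two -1 entries
R_level4 = -np.eye(4)                         # involution with four -1 entries

d2 = dist_to_identity(R_level2)   # pi * sqrt(2) under this normalization
d4 = dist_to_identity(R_level4)   # 2 * pi under this normalization
```

Both matrices are involutions, hence each forms a geodesically antipodal pair with the identity, yet the first is strictly closer to the identity than the second.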
Proposition 3.7 is a starting-point for understanding the set for all and all : it assures us that, for any , every MSSR curve from to corresponds to some minimal pair whose first element lies in the connected component of . But even once we know all the minimal pairs, to completely understand , or even just determine its cardinality, we need a way to tell whether MSSR curves corresponding to two (not necessarily distinct) minimal pairs with first point in are the same. (This is true whether the non-uniqueness, if any, in is of Type I, Type II, or a mixture of both.) Proposition 3.11 below provides such a tool. This proposition, like Proposition 3.7, plays a crucial role in [7] (where it is stated without proof), enabling an explicit computation of the sets for .
Proposition 3.11
Let . For assume that is a minimal smooth scaling-rotation curve from to corresponding to the minimal pair where , , $\Lambda_{i}=\pi_{g_{i}}\cdot\Lambda$, and is a geodesic. (We do not assume that the two minimal pairs are distinct.) Then if and only if the following two conditions hold.
- (i)
Both pairs are geodesically non-antipodal and
[TABLE]
or both pairs are geodesically antipodal and
[TABLE]
where for any , denotes the natural projection .
- (ii)
There exist such that
[TABLE]
Equation (3.15) implies equation (3.14), so (3.14) is always a necessary condition for the equality .
In Proposition 3.11, in the geodesically non-antipodal case we use endpoint data to tell whether the projections to of two minimal geodesics from to are equal. We will deduce this proposition from the following theorem, proven in [10], that gives a criterion based on initial-value data to tell whether the projections of two geodesics emanating from are equal. In this theorem, $G_{D,L}:=G_{D}\cap G_{L}$, $\mathfrak{g}_{D,L}:=\mathfrak{g}_{D}\cap\mathfrak{g}_{L}$ (the Lie algebra of ), and for , is the linear map defined by .
Notation 3.12
For , , , and any interval containing [math], we write for the geodesic defined by .
Theorem 3.13 ([10, Theorem 3.8])
For let , , , and let . Let be a positive-length interval containing [math]. Then the smooth scaling-rotation curves are identical if and only if (i) , (ii) for all , and (iii) there exist and , such that , $D_{2}=\pi_{g}\cdot D_{1}$, and $L_{2}=\pi_{g}\cdot L_{1}$. [Footnote 1: In [10, Theorem 3.8], was actually required to be a particular pre-image of in , but the same argument as in the proof of Proposition A.3 of the present paper shows that this restriction can be removed.]
To deduce Proposition 3.11 from Theorem 3.13, we first prove two lemmas. Beyond helping us to prove the Proposition, these lemmas may be useful in future analysis of MSSR curves. In these lemmas, for any we write for the Lie algebra of the stabilizer ; thus . (Observe that the notation is consistent with the notation introduced earlier for diagonal matrices.)
Lemma 3.14
Let and suppose that is a minimal smooth scaling-rotation curve with . Let be a geodesic for which . Then $A\in(\mathfrak{g}_{X})^{\perp}\cap(\mathfrak{g}_{Y})^{\perp}$, where the orthogonal complements are taken in .
Proof: Since is a smooth curve of minimal length connecting the submanifolds and of , the velocity vectors must be perpendicular to the tangent spaces , respectively ([2, Proposition 1.5]). Making natural tangent-space identifications, we have , where . Let . Since , and the Riemannian metric we are using on is left-invariant, the condition is equivalent to , hence to . Using additionally the right-invariance of the metric on , we have . From general group-action properties, it is easily seen that . Since , it follows that . A similar argument at the point shows that .
Lemma 3.15
In the setting of Theorem 3.13, assume that the smooth scaling-rotation curve is minimal. Then conditions (i) and (ii) in the theorem can be replaced by the single condition .
Proof: With notation as in Theorem 3.13, assume that . Then the Theorem implies that implying that (as in the proof of Lemma 3.14). But since is minimal, Lemma 3.14 implies that both and lie in , hence that . Hence , i.e. .
Conversely, assume that . Then conditions (i) and (ii) are satisfied trivially.
Proof of Proposition 3.11: For let and $\Lambda_{i}=\pi_{g_{i}}\cdot\Lambda_{i}$.
By hypothesis , where (for some ) is a minimal geodesic from to . Hence and (we write “” rather than “” since if is an involution, “ ”, as we have defined it, is a set with more than one element; see Section 3.1).
It is straightforward to show that . From Lemma 3.15, the conditions (i) and (ii) in Theorem 3.13 in the equality-conditions for and can be replaced by the single condition .
If then , implying that either both pairs are geodesically antipodal or both are geodesically non-antipodal. In the converse direction, suppose that the pairs are geodesically non-antipodal and that . Then . Whether or not the pairs are geodesically antipodal, by definition , so if (3.15) holds then . Hence the condition is equivalent to condition (i) in Proposition 3.11.
Next, letting play the role of in Theorem 3.13, condition (iii) in the Theorem is equivalent to the existence of such that $D=\pi_{g}\cdot D$, $L_{2}=\pi_{g}\cdot L_{1}$, and . But for all such , we have for some and with . Furthermore, for any , if $\pi\cdot D=D$ then $L_{2}=\pi\cdot L_{1}\iff\Lambda_{2}=\pi\cdot\Lambda_{1}$. Hence, under the hypotheses of Proposition 3.11, condition (iii) in Theorem 3.13 is equivalent to condition (ii) stated in the Proposition.
This establishes the “if and only if” statement in the Proposition. The final statement of the proposition follows from the fact that, in the notation of this proof, (3.15) is the equality (after multiplying both sides of (3.15) on the right by ), an equality that implies .
3.4 Type I and Type II non-uniqueness
Within the scaling-rotation framework, the motivation to understand Type II non-uniqueness is its effect on a true scaling-rotation metric on , mentioned earlier, that we construct from in [8]. Various constructions and assertions concerning this metric are simplified when we know that Type II non-uniqueness does not occur. But, as we shall see, the study of Type II non-uniqueness also leads to geometric results outside the scaling-rotation framework.
For small enough values of , Type II non-uniqueness never occurs; for large enough , it always occurs (see Corollaries 3.19 and 3.21 below). Our main tool for ruling out Type II non-uniqueness is based on a property we call sign-change reducibility (for want of a better term), defined shortly.
To motivate the definition, let and let be a minimal pair. Then one minimizer of the expression in brackets on the right-hand side of (LABEL:fibdist8a) is the triple , where is the identity element of . Hence for all with $\pi_{g}\cdot\Lambda=\Lambda$ (i.e. for all ; see Notation 2.1) we must have But for all , so, in particular, we must have $d_{SO}(UI_{\boldsymbol{\sigma}},V)\geq d_{SO}(U,V)$ for all $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$.
Definition 3.16
Call a pair of points sign-change reducible if $d_{SO}(UI_{\boldsymbol{\sigma}},V)<d_{SO}(U,V)$ for some $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$.
From the discussion preceding Definition 3.16, we have the following:
Corollary 3.17
Let . If is sign-change reducible, then is not a minimal pair.
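For small dimensions, Definition 3.16 can be tested by brute force. The sketch below is illustrative only and rests on two labeled assumptions: the distance on the rotation group is taken as the Frobenius norm of the principal logarithm of `U.T @ V` (a constant normalization factor would not affect the comparison), and the sign-change matrices are taken to be the diagonal matrices with entries +1 or -1 and an even number of -1 entries. The function names are hypothetical.

```python
import itertools
import numpy as np

def d_so(U, V):
    """Bi-invariant distance on SO(p), taken here as ||log(U^T V)||_F (assumed normalization)."""
    angles = np.angle(np.linalg.eigvals(U.T @ V).astype(complex))
    return float(np.sqrt(np.sum(angles ** 2)))

def even_sign_changes(p):
    """Yield diagonal +/-1 matrices with an even number of -1 entries (determinant 1)."""
    for signs in itertools.product([1.0, -1.0], repeat=p):
        if signs.count(-1.0) % 2 == 0:
            yield np.diag(signs)

def is_sign_change_reducible(U, V, tol=1e-9):
    """True if d(U I_sigma, V) < d(U, V) for some even sign-change matrix I_sigma."""
    base = d_so(U, V)
    return any(d_so(U @ S, V) < base - tol for S in even_sign_changes(U.shape[0]))
```

For instance, `is_sign_change_reducible(-np.eye(2), np.eye(2))` is true, since multiplying by the sign-change matrix `-np.eye(2)` brings the first point onto the second. The enumeration has 2^(p-1) terms, so this check is only practical for small p.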
Sign-change reducibility is studied in more detail in Sections 4–7; a long digression from the topic of scaling-rotation distance and MSSR curves is needed (but has bonuses). Below, we summarize some results proven there, and their consequences. Two of the main results are given in the following Proposition (proven in Section 7):
Proposition 3.18
(a) For , every geodesically antipodal pair in is sign-change reducible. (b) For , there exist geodesically antipodal pairs in that are not sign-change reducible.
Thus the largest dimension for which every geodesically antipodal pair in is sign-change reducible satisfies . A combination of theory and numerical evidence leads the authors to believe that is closer to 10 than to 4.
An immediate consequence of Proposition 3.18 (a) is the following. (Again, we do not believe the number “4” here is sharp.)
Corollary 3.19
For , every minimal pair in is geodesically non-antipodal. Hence for , for all for which , the non-uniqueness is purely of Type I.
Part of the importance of sign-change reducibility comes from the following:
Proposition 3.20
Suppose that is a pair in that is not sign-change reducible. Then there exist such that the pair is minimal.
We will prove this below. But first note that an immediate corollary of Propositions 3.18(b) and 3.20 is:
Corollary 3.21
For , there exist geodesically antipodal, minimal pairs . Hence, for , there exist for which the set exhibits Type II non-uniqueness.
Thus sign-change reducibility is more than an ad hoc criterion for ruling out Type II non-uniqueness for small enough . Proposition 3.20 and Corollary 3.21 show that, in some sense, sign-change reducibility is the only obstruction to having points in the top stratum of for which exhibits Type II non-uniqueness.
For or not in the top stratum of , the relationship between Type II non-uniqueness and sign-change reducibility of minimal pairs in situation is more complicated to analyze. We do not investigate this relationship further in this paper.
To prove Proposition 3.20 we start with a lemma:
Lemma 3.22
Let . There exist such that $\|\log(D^{-1}(\pi\cdot\Lambda))\|^{2}>\|\log(D^{-1}\Lambda)\|^{2}+c$ for all non-identity .
Proof: Let and let be a sequence of numbers satisfying for . Then for all . Let and let . Then and
Let and let be such that . Then
[TABLE]
Proof of Proposition 3.20. Let be such that
[TABLE]
for all non-identity ; such exist by Lemma 3.22. Let . The subgroups of are trivial, as are the subgroups and of . Hence in Proposition 3.7 we have and
[TABLE]
For all non-identity and all with and , using (3.19) we then have
[TABLE]
Hence the identity permutation is the only element of for which the expression inside the outer braces in (3.4) achieves the minimum over all . But is precisely the sign-change subgroup , and by hypothesis is not sign-change reducible. Hence
[TABLE]
Thus is a minimal pair.
4 Involutions, sign-change reducibility, and distance between subspaces of
In this section we begin our study of sign-change reducibility. This culminates in Section 7 with the proof of Proposition 3.18 (which, as we have seen, implies Corollary 3.21, our main result concerning Type II non-uniqueness), but we discover some other interesting facts along the way. As we shall see, questions concerning the seemingly ad hoc notion of sign-change reducibility can be translated into questions about distances between subspaces of ; for example, Proposition 4.11 states the equivalence between a sign-change-reducibility question and a question purely about the geometry of the Grassmannian (endowed with a standard metric). Thus, some unexpected benefits of our investigation of Type II non-uniqueness are results, possibly of independent interest, concerning the geometry of Grassmannians and, more generally, principal angles between subspaces of .
Since for , the set of distances between geodesically antipodal points in is the same as the set of distances between the identity and involutions. Thus to understand which (if any) geodesically antipodal pairs in are sign-change reducible, it suffices to study the case , where is an involution.
Definition 4.1
1. Call sign-change reducible if $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$ for some $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$ (equivalently, if the pair is sign-change reducible). Note that sign-change reducibility of the pair , as previously defined in Definition 3.16, is equivalent to sign-change reducibility of .
2. For $\boldsymbol{\sigma}=(\sigma_{1},\dots,\sigma_{p})\in\mathcal{I}_{p}$, define the level of , written ${\rm level}(\boldsymbol{\sigma})$, to be .
3. For any involution , define the level of , written , to be where is the -eigenspace of . We write for the set of involutions in , and for we write for the set of involutions in of level . Note that is even for any , so is empty unless is even and at least 2. Thus (a disjoint union).
4. Let be an involution. We say that is reducible by a sign-change of level if there exists $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$ of level such that $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$.
Observe that for non-identity $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$, the matrix $I_{\boldsymbol{\sigma}}$ is an involution in , and ${\rm level}(\boldsymbol{\sigma})={\rm level}(I_{\boldsymbol{\sigma}})$.
Remark 4.2 (Involutions and Grassmannians)
The space can be naturally identified with a disjoint union of Grassmannians, because an involution is completely determined by its -eigenspace . Let denote the Grassmannian of -planes in , and for even define to be the map carrying to the involution in whose -eigenspace is . (Thus for all .) Concretely, letting denote orthogonal projection onto any subspace and letting denote the matrix of with respect to the standard basis of , the map is given by
[TABLE]
reflection about the -plane . It is not hard to show that is a submanifold of and that is a diffeomorphism from to this submanifold.
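The identification in Remark 4.2 can be made concrete with a short sketch. It assumes the convention that the involution attached to a plane W acts as -1 on W and as +1 on the orthogonal complement of W, so that its matrix is I - 2P, where P is the matrix of orthogonal projection onto W; the paper's displayed formula (4.1) should be consulted for the exact convention. The function name and the random test plane are hypothetical.

```python
import numpy as np

def involution_from_plane(Q):
    """Return I - 2 P_W, where the columns of Q are an orthonormal basis of W.

    The result acts as -1 on W and as +1 on the orthogonal complement of W.
    """
    p = Q.shape[0]
    return np.eye(p) - 2.0 * (Q @ Q.T)

rng = np.random.default_rng(0)
p, m = 5, 2                                       # m even, so det(R) = +1
Q, _ = np.linalg.qr(rng.standard_normal((p, m)))  # orthonormal basis of a random m-plane
R = involution_from_plane(Q)
```

Under this convention the matrix `R` is symmetric, squares to the identity, has determinant 1 when m is even, and has trace p - 2m, so the dimension of W can be read off from the trace.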
Our study of sign-change reducibility of involutions will make frequent use of the normal form of an element of , so we review this before proceeding.
4.1 Normal form and distance to the identity in
Let . Recall that every has a normal form: a block-diagonal matrix that, for even, is of the form
[TABLE]
where
[TABLE]
and where . (This can be derived quickly from the normal form of an antisymmetric matrix, since the compactness of guarantees that the exponential map is onto.) For the odd-$p$ case, the normal-form matrix is the matrix (4.2) with one more row and column appended, and with a 1 in the lower right-hand corner (and zeroes everywhere else in the last row and column). In this case we define , so that for both even and odd we can use the notation for the normal form.
Note that
[TABLE]
For each there exists an orthonormal basis of with respect to which the linear transformation , , has matrix . Thus there exists such that
[TABLE]
The normal form of a given is unique up to ordering of the blocks; the multi-set is uniquely determined by . From (4.3) and (4.5) we have
[TABLE]
where is the block-diagonal matrix obtained by replacing by in (4.2), , and, in the odd-$p$ case, replacing the 1 in the lower right-hand corner by 0. Since the normal form is unique up to block-ordering, it follows that
[TABLE]
Furthermore, from (4.5) and (4.3) it follows that
[TABLE]
if is even; for odd we again just append one more row and column of the middle matrix, with a 1 in the lower right-hand corner. Hence the values (and therefore the values ) can be recovered from as the eigenvalues of , with the multiplicity of an eigenvalue of equal to twice the multiplicity of in the list in the even-$p$ case; for odd the only difference is that the multiplicity of the eigenvalue 1 of is .
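The recovery of normal-form angles from the symmetric part of a rotation matrix can be sketched numerically as follows. This is a minimal illustration, not the authors' procedure: it assumes that the symmetric part is (R + R^T)/2 and that its eigenvalues are the cosines of the normal-form angles, each occurring twice (plus one extra eigenvalue 1 in odd dimension). The helper names and the test matrix are hypothetical.

```python
import numpy as np

def normal_form_angles(R):
    """Recover |theta_1|, ..., |theta_k| (k = floor(p/2)) from R in SO(p)."""
    p = R.shape[0]
    R_sym = 0.5 * (R + R.T)
    cosines = np.clip(np.linalg.eigvalsh(R_sym), -1.0, 1.0)  # ascending order
    angles = np.arccos(cosines)                               # descending order
    if p % 2 == 1:
        angles = angles[:-1]      # drop the extra eigenvalue 1 present for odd p
    return np.sort(angles[::2])   # each cosine occurs twice; keep one copy

def rotation_block(theta):
    """2 x 2 rotation by the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Build R = Q * blockdiag(rot(0.3), rot(1.2), 1) * Q^T in SO(5).
N = np.zeros((5, 5))
N[0:2, 0:2] = rotation_block(0.3)
N[2:4, 2:4] = rotation_block(1.2)
N[4, 4] = 1.0
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
R = Q @ N @ Q.T

recovered = normal_form_angles(R)
```

Only the absolute values of the angles are recovered this way, which is consistent with the statement that the symmetric part determines the cosines.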
Remark 4.3 (Normal form, involutions, and distances to identity)
Writing in the form (4.5), it is easily seen that is an involution if and only if (i) for each , is either [math] or , and (ii) for at least one . For such , if for exactly values of , then . Hence if is an involution of level , then
[TABLE]
Thus
[TABLE]
Using (4.6) it can also be shown that for every non-involution , there is a unique of smallest norm such that .
Notation 4.4
- 1.
Given and angles for which is a normal form of , we define “redundant normal-form angles” , , by
[TABLE]
- 2.
For any square matrix we write for the -eigenspace of .
Note that (4.7) can now be written as
[TABLE]
4.2 Sign-change reducibility, distances in Grassmannians, and a half-angle relation
In this section we state and discuss several results, but defer their proofs to later sections.
For one can show, without appealing to Proposition 4.6 below, that every involution in is sign-change reducible. (This sign-change reducibility holds for trivial reasons for ; holds for slightly less trivial reasons, mentioned later in Remark 4.12, for ; and can be shown to hold for using a quaternionic approach.) It is reasonable to wonder whether this holds for all :
Question 4.5
Let . Is every involution in sign-change reducible?
Our motivation for this question is not just generalization for its own sake, however. Potential Type II non-uniqueness complicates several aspects of the analysis of scaling-rotation distance and the associated metric studied in [8]. To understand whether the “Type II non-uniqueness” defined in Section 3.4 can occur, we need to know whether a geodesically antipodal pair in can be minimal. (As discussed in Section 3.4, a geodesically non-antipodal minimal pair in uniquely determines an MSSR curve in .) A sufficient condition for any pair in to be non-minimal is that the pair be sign-change reducible. Since sign-change reducibility of involutions rules out the possibility of Type II non-uniqueness, and all involutions are sign-change reducible for , it is natural to ask Question 4.5 and wish for the answer to be yes.
The answer, however, is more complicated. We shall see that the answer to Question 4.5 is yes for and no for (we do not know the answer for ), but that for all , involutions of high enough level are sign-change reducible; moreover, by a sign-change of the same level:
Proposition 4.6
Let be an involution for which . Then there exists $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$, with ${\rm level}(\boldsymbol{\sigma})={\rm level}(R)$, such that $d_{SO}(RI_{\boldsymbol{\sigma}},I)<d_{SO}(R,I)$.
We defer the proof to Section 6.
Since for every involution , Proposition 4.6 (once proved) immediately establishes Proposition 3.18(a) and Corollary 3.19: for , all involutions are sign-change reducible, and hence all minimal pairs in are geodesically non-antipodal.
We shall see below (Proposition 4.11) that sign-change reducibility by a sign-change of the same level is equivalent to a statement purely about the geometry of Grassmannians. For reasons given shortly, it seems likely to the authors that the “same level” condition appearing in Proposition 4.6 is optimal (even without the “” restriction) in the sense that $\min_{\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}}\{d_{SO}(RI_{\boldsymbol{\sigma}},I)\}$ is achieved by a sign-change matrix $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$ for which ${\rm level}(\boldsymbol{\sigma})={\rm level}(R)$. If this is true, then the analysis of whether an involution is sign-change reducible simplifies; we need only consider $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$ of the same level as . This (potential) simplification is actually of greater value to us than knowing, for a given , whether all minimizers of $d_{SO}(RI_{\boldsymbol{\sigma}},I)$ have the same level as , so we state only the following weaker conjecture:
Conjecture 4.7
Let be even, and let be an involution of level . If is sign-change reducible, then it is reducible by a sign-change of level .
In Section 6 we will prove the following special case of this conjecture:
Proposition 4.8
Conjecture 4.7 is true for .
The reason we expect more generally that $\min_{\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}}\{d_{SO}(RI_{\boldsymbol{\sigma}},I)\}$ is achieved by a $\boldsymbol{\sigma}$ for which ${\rm level}(\boldsymbol{\sigma})={\rm level}(R)$ is as follows. Every sign-change matrix $I_{\boldsymbol{\sigma}_{1}}\in\mathcal{I}_{p}^{+}$ is itself an involution, and satisfies
[TABLE]
Thus for sufficiently close to $I_{\boldsymbol{\sigma}_{1}}$, we have
[TABLE]
The function carrying an involution in to is continuous, so for sufficiently close to $I_{\boldsymbol{\sigma}_{1}}$ we also have ${\rm level}(R)={\rm level}(\boldsymbol{\sigma}_{1})$. Hence for every sufficiently close to a sign-change matrix, $\min_{\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}}\{d_{SO}(RI_{\boldsymbol{\sigma}},I)\}$ is achieved by a sign-change matrix having the same level as . It seems plausible that this remains true even without the “sufficiently close to a sign-change matrix” restriction.
As noted in Remark 4.2, for even the space is diffeomorphic to the Grassmannian . This Grassmannian carries a Riemannian metric induced by Riemannian submersion from . It is known that the associated squared geodesic-distance between two points is, up to a constant factor, simply the sum of squares of the principal angles between the two -planes . [Footnote 2: This fact follows from Wong’s results on geodesics in [16], and has been cited elsewhere in the literature (e.g. [4, p. 337]), though the explicit statement does not appear in [16].] Choosing the normalization in which the squared geodesic distance equals the sum of squares of the principal angles (equation (5.1) below), we will prove the following in Section 5:
Proposition 4.9
The map (see (4.1)) is an isometry, up to a constant factor of 2:
[TABLE]
for all .
We derive Proposition 4.9 from a general half-angle relation proven in Section 5:
Proposition 4.10
Let be involutions in . For let , and let . Let be angles for which is a normal form of the product , and let be as defined in (4.11). Then for some injective map , the principal angles between and satisfy
[TABLE]
For every , the angle is either [math] or .
In other words, as stated in the introduction: for any two involutions , each of the principal angles between and is exactly half a correspondingly indexed normal-form angle of .
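The half-angle relation can be checked numerically in a generic case. The sketch below is an illustration under stated assumptions: the subspaces attached to the two involutions are taken to be their (-1)-eigenspaces, the dimensions p = 6 and m = 2 are arbitrary choices, and the normal-form angles of the product are read off as arguments of its eigenvalues. Principal angles are computed as arccosines of the singular values of `Q1.T @ Q2`, where `Q1` and `Q2` are orthonormal bases.

```python
import numpy as np

rng = np.random.default_rng(2)
p, m = 6, 2

def random_plane(p, m):
    """Orthonormal basis (p x m) of a random m-plane in R^p."""
    Q, _ = np.linalg.qr(rng.standard_normal((p, m)))
    return Q

Q1, Q2 = random_plane(p, m), random_plane(p, m)
R1 = np.eye(p) - 2.0 * Q1 @ Q1.T    # involution with -1-eigenspace span(Q1)
R2 = np.eye(p) - 2.0 * Q2 @ Q2.T    # involution with -1-eigenspace span(Q2)

# Principal angles between the two planes: arccos of singular values of Q1^T Q2.
cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
phi = np.sort(np.arccos(np.clip(cosines, -1.0, 1.0)))

# Rotation angles of R1 R2: arguments of its eigenvalues, each occurring as +/- theta.
theta_all = np.sort(np.abs(np.angle(np.linalg.eigvals(R1 @ R2))))[::-1]
theta = np.sort(theta_all[::2][:m])   # one representative per +/- pair, largest m
```

In this generic example the array `phi` agrees with `theta / 2` up to rounding error. The two-dimensional picture behind this is that the product of two reflections across lines meeting at an angle phi is a rotation by 2 phi.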
Proposition 4.9 can also be proven by purely Riemannian methods, but the proof we give, via Proposition 4.10, is independent in the sense that it does not make any use of a Riemannian metric on ; see Remark 5.5.
In Section 5, after proving Proposition 4.9 we will use it to deduce the following:
Proposition 4.11
Let be integers with even and . Then the following two statements are equivalent:
1. Every involution of level is sign-change reducible by a sign-change of level .
2. For every , there exists a coordinate -plane (see Notation 5.1) such that
[TABLE]
In other words, the sign-change reducibility asserted in Statement 1 of the Proposition is equivalent to a statement purely about the geometry of Grassmannians (with the metric ), namely that the coordinate -planes in form a “lattice” of points in such that every point in is within distance of some lattice-point. This gives us a geometric way to tackle Question 4.5, at least for sign-change reducibility of an involution by a sign-change matrix of the same level. However, the authors do not know a formula for for general , or (more importantly), a formula for .
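In the absence of a closed formula, the distance from a given plane to the nearest coordinate plane can at least be computed by enumeration in small dimensions. The following sketch assumes the normalization in which the squared Grassmannian distance is the sum of squared principal angles; the function names are hypothetical, and index sets are 0-based here, unlike the 1-based index sets in the text. It does not test the inequality in Statement 2, whose right-hand side should be taken from the paper.

```python
import itertools
import numpy as np

def sq_dist_to_coordinate_plane(Q, J):
    """Sum of squared principal angles between span(Q) and the coordinate plane R^J."""
    # Rows J of Q equal E_J^T Q, where E_J has the standard basis vectors e_j, j in J, as columns.
    cosines = np.linalg.svd(Q[list(J), :], compute_uv=False)
    return float(np.sum(np.arccos(np.clip(cosines, -1.0, 1.0)) ** 2))

def nearest_coordinate_plane(Q):
    """Return (J, squared distance) minimizing the distance over all coordinate m-planes."""
    p, m = Q.shape
    return min(
        ((J, sq_dist_to_coordinate_plane(Q, J)) for J in itertools.combinations(range(p), m)),
        key=lambda pair: pair[1],
    )

# A line in R^2 at 45 degrees is equidistant (angle pi/4) from both coordinate axes.
Q = np.array([[np.cos(np.pi / 4)], [np.sin(np.pi / 4)]])
J_best, d2_best = nearest_coordinate_plane(Q)
```

The enumeration visits all p-choose-m coordinate planes, so it is a check for small cases rather than a substitute for the missing formula.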
Note that Proposition 4.6 asserts that statement 1 of Proposition 4.11 is true whenever . To put into perspective the number appearing in statement 2 of Proposition 4.11, and better understand the relevance of the comparison between and , note that the squared diameter of is . So for , (4.15) is equivalent to
[TABLE]
For , the right-hand side of (4.15) is a greater fraction of , so it is “easier” for statement 2 of Proposition 4.11 to be true for than for .
Remark 4.12
It is relatively easy to show that for any involution , there exists for which $RI_{\boldsymbol{\sigma}}$ is not an involution. For , we have for every , and for every involution , so any non-involution is closer to the identity than is any involution. Hence for these values of , Proposition 4.11 is easy to prove. However, for , given an involution and a $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$ for which $RI_{\boldsymbol{\sigma}}$ is not an involution, (4.10) shows that we cannot immediately deduce that $d_{SO}(I,RI_{\boldsymbol{\sigma}})<d_{SO}(I,R)$.
5 Proofs of the half-angle relation and results related to Grassmannians: Propositions 4.9, 4.10, and 4.11
The half-angle relation in Proposition 4.10 underlies our proofs of most of the other results stated in Section 4.2 (all but Proposition 4.8). When the dimensions of the eigenspaces in Proposition 4.10 are equal, the half-angle relation leads to the elegant distance-relation (4.13). This equidimensional case is actually the only one we need for the application to Type-II non-uniqueness of MSSR curves. However, the half-angle relation (4.14) holds whether or not . Since this fact may be of interest outside the scope of this paper, and is not much harder to prove without the equal-dimensions restriction, we have stated (and will prove) the more general relation.
Section 5.1 is devoted to establishing Proposition 4.10. In Section 5.2, we apply this proposition to establish Propositions 4.9 and 4.11.
5.1 The half-angle relation
We start with some notation.
Notation 5.1
- 1.
For let denote the standard basis vector of .
- 2.
For , let denote the collection of -element subsets of .
- (a)
For and , define $\boldsymbol{\sigma}^{J}=(\sigma_{1},\dots,\sigma_{p})\in\mathcal{I}_{p}$ by for and for . Similarly, for $\boldsymbol{\sigma}=(\sigma_{1},\dots,\sigma_{p})\in\mathcal{I}_{p}$, define $J^{\boldsymbol{\sigma}}=\{i\in\{1,\dots,p\}:\sigma_{i}=-1\}$. (The maps $J\mapsto\boldsymbol{\sigma}^{J}$ and $\boldsymbol{\sigma}\mapsto J^{\boldsymbol{\sigma}}$ are inverse to each other.)
- (b)
If and , with , let denote the matrix whose column is , .
- (c)
For and , define .
The collection is the set of “coordinate -planes” in .
- 3.
For any , let denote the complement of in .
- 4.
For , , and , writing ,
- (a)
let , denote the principal angles between the -plane and the -plane (see [6, Section 12.4.3]), and
- (b)
let , .
- 5.
For define by
[TABLE]
As noted earlier, is the distance-function defined by the standard -invariant Riemannian metric on (up to a constant factor).
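The correspondence between index subsets and sign vectors in Notation 5.1 can be written out in a few lines. This sketch assumes that the sign vector attached to a subset J has entry -1 exactly at the indices in J, which is what the stated formula for the inverse map forces; indices are 1-based as in the text, and the function names are hypothetical.

```python
def sigma_from_subset(J, p):
    """sigma^J: entry i is -1 if i is in J, and +1 otherwise (indices 1..p)."""
    return tuple(-1 if i in J else 1 for i in range(1, p + 1))

def subset_from_sigma(sigma):
    """J^sigma: the set of indices i (1-based) with sigma_i = -1."""
    return frozenset(i for i, s in enumerate(sigma, start=1) if s == -1)

J = frozenset({2, 5})
sigma = sigma_from_subset(J, 6)   # (1, -1, 1, 1, -1, 1)
```

Applying `subset_from_sigma` to `sigma` returns `J` again, reflecting that the two maps are mutually inverse; the number of elements of J is the number of -1 entries of the sign vector.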
The following long but far-reaching technical lemma, giving several detailed relations between a general involution in and its product with a sign-change matrix, is our key tool for establishing the results stated in Section 4.2. It is best thought of as a series of lemmas, all with the same hypotheses, that have been rolled into one long lemma in order to avoid restating hypotheses and notational definitions. After proving the lemma, we build on it with two corollaries, completing the groundwork for the proofs (in later sections) of the Section 4.2 propositions.
Lemma 5.2
Let be an involution, let $\boldsymbol{\sigma}\in\mathcal{I}_{p}^{+}$, assume $0<m_{\boldsymbol{\sigma}}:={\rm level}(\boldsymbol{\sigma})<p$, and let $J=J^{\boldsymbol{\sigma}}$ (see Notation 5.1). Viewing as , below we write every matrix in the block form $\begin{bmatrix}A_{1}&A_{2}\\ A_{3}&A_{4}\end{bmatrix}$, where is $(p-m_{\boldsymbol{\sigma}})\times(p-m_{\boldsymbol{\sigma}})$, is $(p-m_{\boldsymbol{\sigma}})\times m_{\boldsymbol{\sigma}}$, is $m_{\boldsymbol{\sigma}}\times(p-m_{\boldsymbol{\sigma}})$, and is $m_{\boldsymbol{\sigma}}\times m_{\boldsymbol{\sigma}}$. Then:
(i) In this block form,
[TABLE]
where is a symmetric $(p-m_{\boldsymbol{\sigma}})\times(p-m_{\boldsymbol{\sigma}})$ matrix, is a symmetric $m_{\boldsymbol{\sigma}}\times m_{\boldsymbol{\sigma}}$ matrix, and is $(p-m_{\boldsymbol{\sigma}})\times m_{\boldsymbol{\sigma}}$.
(ii) In the same block form,
[TABLE]
(iii) All eigenvalues of and lie in the interval .
(iv) For every , if is an eigenvalue of (respectively, ), then is an eigenvalue of (resp. ) with the same multiplicity.
(v) Let denote the number of eigenvalues of , counted with multiplicity, lying in the interval . Then is also the number of eigenvalues of , counted with multiplicity, lying in , and l\leq\min\{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\}.
(vi) The inclusion map defined by v\mapsto\left[\begin{array}[]{l}v\\ 0\end{array}\right] restricts to isomorphisms E_{\pm 1}(R_{1})\to E_{\pm 1}(R)\mbox{\small\ \bigcap\ }{\bf R}^{J^{\prime}}. Similarly the inclusion map defined by w\mapsto\left[\begin{array}[]{l}0\\ w\end{array}\right]restricts to isomorphisms E_{\pm 1}(R_{4})\to E_{\pm 1}(R)\mbox{\small\ \bigcap\ }{\bf R}^{J}.
(vii) Let l_{-}=\dim(E_{1}(R)\mbox{\small\ \bigcap\ }{\bf R}^{J}),\ l_{+}=\dim(E_{-1}(R)\mbox{\small\ \bigcap\ }{\bf R}^{J^{\prime}}). (Footnote 3: The subscripts are chosen according to the eigenspaces of I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}} rather than : {\bf R}^{J}=E_{-1}(I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}), {\bf R}^{J^{\prime}}=E_{1}(I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}).) Then and . (Thus is the multiplicity of as an eigenvalue of (RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}})_{\rm sym} in (5.3), hence of RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}} itself, and therefore yields a lower bound on d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I).) Furthermore,
[TABLE]
and
[TABLE]
(viii) There exist an orthonormal -eigenbasis \{v_{i}\}_{i=1}^{p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}} of {\bf R}^{p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}} (i.e. an orthonormal basis of {\bf R}^{p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}} consisting of eigenvectors of ) and an -eigenbasis \{w_{i}\}_{i=1}^{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}} of {\bf R}^{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}}. For any such bases of {\bf R}^{p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}}, of {\bf R}^{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}}, let be the corresponding eigenvalues (i.e. and ), and define
[TABLE]
Then
[TABLE]
(ordered arbitrarily) is an orthonormal basis of , and the set
[TABLE]
(ordered arbitrarily) is an orthonormal basis of . Note that the cardinality of the second set in (5.22) (respectively (5.23)) is (resp. ).
Proof: To simplify notation in this proof, we let m=m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}.
Since is an involution, . Hence is symmetric, implying assertion (i), and is the orthogonal direct sum of and (since the only possible eigenvalues of an involution are ).
For (ii), observe that in the block-form decomposition we are using,
[TABLE]
A simple calculation then yields (5.3).
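For readers who want the "simple calculation" written out, the following is a sketch under conventions inferred from the surrounding text rather than quoted from it: the summand {\bf R}^{J^{\prime}} is listed first, so that I_\sigma is block diagonal with blocks I and -I; R is symmetric with blocks R_1, R_2, R_2^T, R_4; and the subscript "sym" is assumed to denote the symmetric part (A+A^T)/2. Plain \sigma is used in place of the bold symbol.

```latex
I_{\sigma}=\begin{bmatrix} I_{p-m_\sigma} & 0\\ 0 & -I_{m_\sigma}\end{bmatrix},
\qquad
R=\begin{bmatrix} R_1 & R_2\\ R_2^{T} & R_4\end{bmatrix}
\quad\Longrightarrow\quad
RI_{\sigma}=\begin{bmatrix} R_1 & -R_2\\ R_2^{T} & -R_4\end{bmatrix},
\qquad
(RI_{\sigma})_{\rm sym}
=\tfrac12\bigl(RI_{\sigma}+(RI_{\sigma})^{T}\bigr)
=\begin{bmatrix} R_1 & 0\\ 0 & -R_4\end{bmatrix}.
```

Under these assumptions the off-diagonal blocks cancel in the symmetric part, which is consistent with the later statement that the eigenvalues of (RI_\sigma)_{\rm sym} are those of R_1 together with the negatives of those of R_4.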
Next, because , we have the following relations:
[TABLE]
From (5.24) and (5.27), for any , we have
[TABLE]
It follows from (5.28)–(5.29) that if is an eigenvalue of or , then , yielding (iii).
To obtain (iv), consider the operators and defined by and . Suppose that has an eigenvalue with , and let . Let ; note that (5.28) implies . Using (5.26),
[TABLE]
Hence maps injectively to . Similarly, if has an eigenvalue with , and maps injectively to .
It follows that, for any with , is an eigenvalue of if and only if is an eigenvalue of , and that the maps
[TABLE]
are isomorphisms. This establishes (iv). Statement (v) is an immediate corollary of (iv).
For (vi), let be the first inclusion map in the lemma. Note that R\left[\begin{array}[]{l}v\\ 0\end{array}\right]=\left[\begin{array}[]{l}R_{1}v\\ R_{2}^{T}v\end{array}\right]. If with , equation (5.28) implies that , hence that . Conversely, if , then (and ). Hence carries isomorphically to E_{\lambda}(R)\mbox{\small\ \bigcap\ }{\bf R}^{J^{\prime}}. The argument for the inclusion map is essentially identical. This establishes (vi).
Part (vi) implies that \dim(E_{1}(R_{4}))=\dim(E_{1}(R)\mbox{\small\ \bigcap\ }{\bf R}^{J})=l_{-} and that \dim(E_{-1}(R_{1}))=\dim(E_{-1}(R)\mbox{\small\ \bigcap\ }{\bf R}^{J^{\prime}})=l_{+}, the first assertion in (vii). To obtain (5.4)–(5.5), note that for any subspaces of , we have
[TABLE]
(The proof of (5.32) is straightforward linear algebra.) Applying this to the case V=E_{-1}(R),V^{\perp}=E_{1}(R),W=E_{-1}(I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}})={\bf R}^{J},W^{\perp}=E_{1}(I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}})={\bf R}^{J^{\prime}}, we have l_{-}=\dim(V^{\perp}\mbox{\small\ \bigcap\ }W) and l_{+}=\dim(V\mbox{\small\ \bigcap\ }W^{\perp}), so (5.5) follows from (5.32). The inequalities in (5.4) follow directly from (5.5).
(viii) Since (respectively ) is symmetric, an orthonormal -eigenbasis of (resp., orthonormal -eigenbasis of ) exists. Select such eigenbases, and let , be eigenvalues as defined in the Lemma. Note that the second set in (5.22) is a basis of , which by (vi) is isomorphic to E_{1}(R)\mbox{\small\ \bigcap\ }{\bf R}^{J}. Hence the cardinality of this set is \dim(E_{1}(R)\mbox{\small\ \bigcap\ }{\bf R}^{J}), i.e. . Similarly, the second set in (5.23) is a basis of and has cardinality .
Without loss of generality, we may assume that the eigenvectors with eigenvalue , if any, are the last , and that the eigenvectors with eigenvalue 1, if any, are the last . Using (5.27), for we have , while using (5.25) we find . Then, using (5.2), a simple calculation shows that . Hence for , while from part (vi), for .
Let denote the standard inner product on for any . As seen in the proof of part (vi), implies . Hence for and , while for we have . Finally, for , using the fact that , a simple computation yields Thus \{\sqrt{\frac{1-\lambda_{i}}{2}}{\bf w}_{i}:1\leq i\leq m-l_{-}\}\mbox{\small\bigcup}\{{\bf v}_{i}:p-m-l_{+}<i\leq m\} is an orthonormal subset of . Using (5.5), the cardinality of this subset is m-l_{-}+l_{+}={\rm level}({\mbox{\boldmath\sigma \unboldmath}\mbox{}})-({\rm level}({\mbox{\boldmath\sigma \unboldmath}\mbox{}})-{\rm level}(R))={\rm level}(R)=\dim(E_{-1}(R)). Hence (5.23) is an orthonormal basis of .
The proof that (5.22) is an orthonormal basis of is similar.
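Parts (iii) and (iv) of Lemma 5.2 can be sanity-checked numerically. The sketch below is illustrative only: it assumes the block convention in which the first p-m_\sigma coordinates span {\bf R}^{J^{\prime}} and the last m_\sigma span {\bf R}^{J}, and it reads part (iv) as saying that an eigenvalue \lambda of R_1 with |\lambda|<1 corresponds to the eigenvalue -\lambda of R_4. The helper names `random_involution` and `block_spectra` are hypothetical, not from the paper.

```python
import numpy as np

def random_involution(p, m_R, rng):
    """Random symmetric orthogonal involution whose (-1)-eigenspace has dimension m_R."""
    Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
    signs = np.concatenate([-np.ones(m_R), np.ones(p - m_R)])
    return Q @ np.diag(signs) @ Q.T

def block_spectra(R, m_sigma):
    """Eigenvalues of the diagonal blocks R_1 (upper left) and R_4 (lower right)."""
    k = R.shape[0] - m_sigma
    return np.linalg.eigvalsh(R[:k, :k]), np.linalg.eigvalsh(R[k:, k:])

rng = np.random.default_rng(0)
p, m_R, m_sigma = 7, 4, 2            # illustrative sizes (assumed, not from the paper)
R = random_involution(p, m_R, rng)
ev1, ev4 = block_spectra(R, m_sigma)

tol = 1e-8
# Eigenvalues strictly inside (-1, 1): those of R_1, and the negatives of those of R_4.
interior1 = np.sort(ev1[np.abs(ev1) < 1 - tol])
interior4 = np.sort(-ev4[np.abs(ev4) < 1 - tol])
print(interior1, interior4)
```

For a generic involution the two printed arrays agree, and their common length is at most min(m_\sigma, p-m_\sigma), in line with part (v).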
Corollary 5.3
Hypotheses and notation as in Lemma 5.2. Let and (as in Lemma 5.2(vii)). In addition let be angles for which is a normal form of RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}, and let be as defined in (4.11). Let . Then |J_{*}|\leq\min\{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\}, and
[TABLE]
If {\rm level}({\mbox{\boldmath\sigma \unboldmath}\mbox{}})={\rm level}(R), then
[TABLE]
Proof: Let \beta^{\prime}:J^{\prime}\to\{1,\dots,p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\}, \beta:J\to\{1,\dots,m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\}, be order-preserving bijections. By (4.8), the eigenvalues of (RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}})_{\rm sym}, counted with multiplicity, are . But from Lemma 5.2(ii), we can read off the eigenvalues of (RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}})_{\rm sym} from (5.3); they are \lambda_{1}^{\prime},\dots,\lambda_{p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}}^{\prime},-\lambda_{1},\dots,-\lambda_{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}} (ordered arbitrarily). Thus, reordering the and the appropriately, for we have
[TABLE]
Define Observe that can also be characterized as . Similarly, define By part (v) of Lemma 5.2, |J^{\prime}_{*}|=|J_{*}|=l\leq\min\{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\}, and by part (iv) of the Lemma there is a bijection such that for all Hence
[TABLE]
In particular,
[TABLE]
Next, note that
[TABLE]
and similarly From (4.12) we therefore have
[TABLE]
establishing (5.33).
If {\rm level}({\mbox{\boldmath\sigma \unboldmath}\mbox{}})={\rm level}(R), then equation (5.5) implies that , so (5.33) implies the first equality in (5.34). For the second equality, observe that if and only if is 0 or . The number of ’s in for which is exactly , while the ’s in for which have no effect on . Hence the second equality in (5.34) holds.
Corollary 5.4
Hypotheses and notation as in Lemma 5.2, except that we additionally write and m=\min\{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},m_{R}\}. Let be angles for which is a normal form of RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}, let be as defined in (4.11), let the elements of be i_{1}<i_{2}<\dots<i_{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}}, and let . Then:
(i) Up to ordering,
[TABLE]
(ii) If m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}=m_{R} then
[TABLE]
Proof: (i). Let be the matrix formed by the columns of the basis (5.23) of , with the elements of the first set in (5.23) comprising the first m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-l_{-} columns , and the elements of the second set comprising the last columns. (Here are defined as in Lemma 5.2(vii).) Without loss of generality we order the -eigenvectors such that the first m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-l_{-} are the ones for which .
Since the columns of form an orthonormal basis of , the numbers are the singular values of the m_{R}\times m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}} matrix . (This is true whether m_{R}\leq m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}} or m_{R}>m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}.) But, relative to the block-decomposition of matrices used in Lemma 5.2, the upper (p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}})\times m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}} block of is [math], and the lower m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\times m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}} block is I_{m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\times m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}}. Hence, writing for the m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\times(m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-l_{-}) matrix formed by the last m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}} rows of the first m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-l_{-} columns of , and noting that m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-l_{-}=m_{R}-l_{+} (by (5.5)), we have \widetilde{W}^{T}{\sf E}_{J}=\left[\begin{array}[]{c}\widetilde{W}_{*}^{T}\\ 0_{l_{+}\times m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}}\end{array}\right], where the row of the (m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-l_{-})\times(p-m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}) matrix is a multiple of . Hence for i,j\leq m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-l_{-}=m_{R}-l_{+},
[TABLE]
and all other entries of the matrix are 0. But for m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-l_{-}<i\leq m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}, we have , so for all i,j\leq m=\min\{m_{R},m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\}. Thus the upper left-hand block of (the entire matrix if m_{R}\leq m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}) is , so the numbers are the singular values of . Thus, up to ordering, the principal angles are given by
[TABLE]
The bijection \beta:J\to\{1,\dots,m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}\} used in the proof of Corollary 5.3 is simply the inverse of the map . Thus from (5.35), we have
[TABLE]
Combining (5.42) with (5.43),
[TABLE]
But , so both and lie in . Hence (5.44) implies that , .
(ii) Assume m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}=m_{R}; then both equal . Corollary 5.3 then implies that
[TABLE]
But from part (i) we have for , so, using (5.34), d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I)^{2}=4d_{Gr}(E_{-1}(R),{\bf R}^{J})^{2}, implying (5.40).
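The relation d_{SO}(RI_\sigma,I)^{2}=4d_{Gr}(E_{-1}(R),{\bf R}^{J})^{2} from part (ii) can be checked numerically in the equal-level case. The sketch below makes two normalization assumptions that are not quoted from the paper: d_{SO}(Q,I)^2 is taken to be the sum of the squared normal-form angles of Q, and d_{Gr}^2 is taken to be the sum of the squared principal angles. The function names are hypothetical.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles between the column spans of A and B (both with orthonormal columns)."""
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

def eigen_angles(Q):
    """Arguments of the eigenvalues of an orthogonal matrix; each rotation angle appears as +/- theta."""
    return np.angle(np.linalg.eigvals(Q))

rng = np.random.default_rng(1)
p, m = 6, 2                                   # illustrative sizes, level(sigma) = level(R) = m
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
W = Q[:, :m]                                  # orthonormal basis of E_{-1}(R)
R = np.eye(p) - 2.0 * W @ W.T                 # involution of level m
E_J = np.eye(p)[:, p - m:]                    # coordinate plane R^J, J = last m indices
I_sigma = np.eye(p) - 2.0 * E_J @ E_J.T       # sign-change matrix with -1 on J

phi = principal_angles(W, E_J)
angles = eigen_angles(R @ I_sigma)
theta = np.sort(angles[angles > 1e-6])        # positive normal-form angles
lhs = 0.5 * np.sum(angles ** 2)               # assumed d_SO(R I_sigma, I)^2
rhs = 4.0 * np.sum(phi ** 2)                  # 4 * assumed d_Gr(E_{-1}(R), R^J)^2
print(lhs, rhs, theta, np.sort(2.0 * phi))
```

In this generic example the positive normal-form angles of RI_\sigma coincide with twice the principal angles, which is the half-angle relation of part (i).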
We are now ready to establish the general half-angle relation:
Proof of Proposition 4.10. Let and let be the corresponding orthogonal transformation. For any even and any , we have
[TABLE]
Now let be an orthogonal transformation carrying to a coordinate plane , and let be the matrix for which . Then UR_{2}U^{-1}=I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}, where {\mbox{\boldmath\sigma \unboldmath}\mbox{}}={\mbox{\boldmath\sigma \unboldmath}\mbox{}}^{J}. For let . Since is an orthogonal transformation, the (multi-)set of principal angles between and is identical to the (multi-)set of principal angles between and . But R_{1}^{\prime}I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}=R_{1}^{\prime}R_{2}^{\prime}=UR_{1}R_{2}U^{-1}, so is a normal form of R_{1}^{\prime}I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}} as well as of . The result now follows from Corollary 5.4(i) and equation (5.36) (the latter being needed only for the final statement of the result).
5.2 The proofs of Propositions 4.9 and 4.11
Proof of Proposition 4.9. Since d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I)=d_{SO}(R,I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}^{-1})=d_{SO}(R,I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}), conclusion (ii) of Corollary 5.4 can be written equivalently as:
[TABLE]
Fix any . Letting “{\mbox{ \raisebox{-1.084pt}{\bf{\Large\cdot}}}}” denote the natural left-action of on , observe that (in the notation of the proof of Proposition 4.10) for all and we have \Phi(U{\mbox{ \raisebox{-1.084pt}{\bf{\Large\cdot}}}}W)=U\Phi(W)U^{-1} (simply another way of writing (5.46)). Clearly is invariant under this action, and is both left- and right-invariant, so (5.47) implies that
[TABLE]
Now let . Since the action of on is transitive, there exists such that U{\mbox{ \raisebox{-1.084pt}{\bf{\Large\cdot}}}}{\bf R}^{J}=V. Using any such , we then have
[TABLE]
Remark 5.5
Of course, Proposition 4.9 can be deduced from computations with the principal fibration
[TABLE]
the standard Riemannian metric on (for which is the geodesic-distance function) is defined so as to make a Riemannian submersion up to a normalization constant. Our proof of Proposition 4.9 is independent of this Riemannian proof in the sense that it establishes equality between the left-hand side of (5.47) and the right-hand side as defined by equation (5.1). Without the a priori knowledge that is a geodesic-distance function, it is not obvious that satisfies the triangle inequality, hence whether is a metric. Thus Proposition 4.9 actually provides an independent proof that is a metric on . The only use of Riemannian geometry in this proof is through the knowledge that is, in fact, a metric (because it is a geodesic-distance function).
Proof of Proposition 4.11. Let “Statement 1” and “Statement 2” be the statements listed as 1 and 2 in the Proposition. As noted in the proof of Proposition 4.9, d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I)=d_{SO}(R,I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}), so the inequality d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I)<d_{SO}(R,I) can be rewritten as
[TABLE]
Assume first that Statement 1 is true. Let . Then is an involution of level , so there exists {\mbox{\boldmath\sigma \unboldmath}\mbox{}}\in{\cal I}_{p}^{+} of level such that d_{SO}(\Phi_{m,p}(W),I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}})^{2}<\frac{m\pi^{2}}{2}. Select such a {\mbox{\boldmath\sigma \unboldmath}\mbox{}} and let J=J^{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}. Then I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}=\Phi_{m,p}({\bf R}^{J}), so
[TABLE]
Hence Statement 2 is true.
Conversely, assume that Statement 2 is true. Let . Then there exists such that . Select such a and let {\mbox{\boldmath\sigma \unboldmath}\mbox{}}={\mbox{\boldmath\sigma \unboldmath}\mbox{}}^{J}. Then I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}=\Phi_{m,p}({\bf R}^{J}), so
[TABLE]
Hence Statement 1 is true.
6 Proofs of sign-change reducibility results, part I: Propositions 4.6 and 4.8
We are now ready to attack the question of sign-change reducibility: given , can we find {\mbox{\boldmath\sigma \unboldmath}\mbox{}}\in{\cal I}_{p}^{+} such that d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I)<d_{SO}(R,I)? Equations (4.9) and (5.33) tell us that this inequality is satisfied if and only if
[TABLE]
where l_{\pm}=l_{\pm}(R,{\mbox{\boldmath\sigma \unboldmath}\mbox{}}) are as in Lemma 5.2(vii). Since is the largest possible value for a normal-form angle in (4.2), it is reasonable to look for a {\mbox{\boldmath\sigma \unboldmath}\mbox{}} such that and are as small as possible. However, to achieve (6.1), we have to make sure that we do not make too large while we are making small. We next prove a lemma that, via its subsequent corollary, will help us show that for , we can choose to make d_{SO}(RI_{{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}^{J}},I) as small as is needed to prove Proposition 4.6.
Lemma 6.1
For ,
[TABLE]
Proof: For , we have
[TABLE]
Hence when and when , the left-hand side of (6.2) reduces to , which is also true of the right-hand side.
We proceed by induction on . For each , consider the statement
[TABLE]
We have already established that (6.2) holds for , hence that statement is true. Now suppose that is true for some given . To consider , let denote the standard bases of respectively. For with we write for the matrix whose column is , . Note that
[TABLE]
Hence for ,
[TABLE]
Hence (6.1) holds with replaced by , as long as . But we have already established that (6.1) holds whenever ; hence if is replaced by , the equality holds for . Thus (6.1) holds for all with ; i.e. statement is true. By induction, is true for all , which is exactly what the Lemma asserts.
Corollary 6.2
Let and let . There exists such that
[TABLE]
Furthermore, the inequality in (6.13) is strict for some unless equality holds in (6.13) for all .
Proof: Let be any matrix whose columns are an orthonormal basis of . Using Lemma 6.1,
[TABLE]
since .
Since , the average of over all is . Hence for at least one , and the inequality is strict for some unless it is an equality for all . But for any , the principal angles between and are the numbers in for which are the singular values of the matrix , where is any matrix whose columns are an orthonormal basis of . Since for any the columns of are an orthonormal basis of , it follows that . Thus, for some , , and the inequality is strict for some unless it is an equality for all . But for any given ,
[TABLE]
and the first inequality in (6.14) is strict if and only if the second is strict. Thus (6.13) holds for some , and the inequality in (6.13) is strict for some unless it is an equality for all .
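The averaging step in this proof can be illustrated numerically. The sketch below assumes (the displayed formulas are not reproduced here) that the averaged quantity is the squared Frobenius norm of E_J^T W, which equals the sum of the squared cosines of the principal angles between the span of W and {\bf R}^{J}, and that J ranges over all m-element subsets of \{1,\dots,p\}. Since each index lies in \binom{p-1}{m-1} of the \binom{p}{m} subsets, the average works out to m^2/p for a p\times m matrix W with orthonormal columns. The function name is hypothetical, and indices are 0-based in the code.

```python
import numpy as np
from itertools import combinations
from math import comb

def average_cos_squared(W):
    """Average over all m-element index sets J of ||E_J^T W||_F^2 (rows J of W)."""
    p, m = W.shape
    total = sum(np.sum(W[list(J), :] ** 2) for J in combinations(range(p), m))
    return total / comb(p, m)

rng = np.random.default_rng(2)
p, m = 6, 2                                          # illustrative sizes
W, _ = np.linalg.qr(rng.standard_normal((p, m)))     # orthonormal basis of a random m-plane
avg = average_cos_squared(W)
best = max(np.sum(W[list(J), :] ** 2) for J in combinations(range(p), m))
print(avg, m * m / p, best)
```

Because the maximum is at least the average, some coordinate plane has sum of squared cosines at least m^2/p, which is the pigeonhole step the proof relies on.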
Proof of Proposition 4.6.
If then is even, , and for {\mbox{\boldmath\sigma \unboldmath}\mbox{}}=(-1,-1,\dots,-1) we have I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}=-I and d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I)=0<d_{SO}(R,I). Henceforth we assume .
Let and let . Note that
[TABLE]
Let be such that . By Corollary 6.2, inequality (6.13) holds, and the inequality is strict unless
[TABLE]
for all . Let {\mbox{\boldmath\sigma \unboldmath}\mbox{}}={\mbox{\boldmath\sigma \unboldmath}\mbox{}}^{J}. By Corollary 5.4,
[TABLE]
where .
The function is strictly decreasing on the interval . Hence for all we have , with equality only if ; thus for we have , with equality only if or . Hence
[TABLE]
Hence d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I)\leq d_{SO}(R,I), and this inequality is strict if any of the inequalities (6.18), (6.19), (6) is strict. Inequality (6.18) is strict if for some , and, by our choice of , (6.19) is strict unless equality holds in (6.16) for all .
We claim that at least one of the inequalities (6.18), (6.19) is strict. Assume this is not so. Then, since equality holds in (6.18) with replaced by any it follows that for all and the angle is either 0 or , and that for all . But for any , there always exists for which none of the principal angles is . Choosing such for our -plane all of the principal angles must therefore be 0 (since they are all either 0 or ). But then , a contradiction.
Thus at least one of the inequalities (6.18), (6.19) is strict, so d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I)<d_{SO}(R,I).
We will establish Proposition 4.8 (a weak version of Conjecture 4.7) as a consequence of a different weakened version of Conjecture 4.7:
Proposition 6.3
Let be even, let be an involution of level , and let {\mbox{\boldmath\sigma \unboldmath}\mbox{}}\in{\cal I}_{p}^{+}. If d_{SO}(RI_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}},I)<d_{SO}(R,I), then {\rm level}({\mbox{\boldmath\sigma \unboldmath}\mbox{}})<2m. (Hence if is sign-change reducible, then it is reducible by a sign-change of level less than .)
This proposition, which we will prove shortly, reduces Proposition 4.8 to a triviality:
Proof of Proposition 4.8, assuming Proposition 6.3: The only positive even integer less than is 2.
Proof of Proposition 6.3. Let m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}={\rm level}({\mbox{\boldmath\sigma \unboldmath}\mbox{}}). Define as in Lemma 5.2. From (5.33),
[TABLE]
so
[TABLE]
But by (5.5) we have l_{-}=l_{+}+m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-m_{R}, so substituting into (6.21), we have 2l_{+}+m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}-m_{R}<m_{R}; equivalently,
m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}<2(m_{R}-l_{+}).
Since , we must have m_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}<2m_{R}.
7 Proofs of sign-change reducibility results, part II: Proposition 3.18
As noted in Section 4.2, Proposition 4.6 proves part (a) of Proposition 3.18. Thus it remains only to prove part (b) of this Proposition.
The combination of Proposition 4.8 and Proposition 4.11 is what will guide our proof of part (b). To establish the result, it suffices to prove that for , the answer to Question 4.5 is no, i.e. that there exist involutions in that are not sign-change reducible. Hence it suffices to prove that there exist such involutions of level 2. By Proposition 4.8, it therefore suffices to establish (for ) the existence of involutions that are not sign-change reducible by a sign-change of level 2; thus it suffices to show that Statement 2 of Proposition 4.11 is false when and . For this, we need only produce planes in for which we can show that (4.15) is false for all . Towards this end, we examine two (families of) examples in which and .
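The search over coordinate planes described above is easy to carry out numerically for any given plane. The sketch below is not the paper's computation (the vectors of Examples 7.1 and 7.2 are not reproduced here); it uses a hypothetical 2-plane in {\bf R}^{4}, assumes that the squared Grassmannian distance is the sum of the squared principal angles, and obtains those angles as arc-cosines of the singular values of E_J^T W. The threshold appearing in (4.15) is deliberately left out. Indices are 0-based in the code.

```python
import numpy as np
from itertools import combinations

def grassmann_dist_sq(W, J):
    """Assumed squared distance from span(W) to the coordinate plane R^J: sum of squared principal angles."""
    s = np.linalg.svd(W[list(J), :], compute_uv=False)   # singular values of E_J^T W
    phi = np.arccos(np.clip(s, 0.0, 1.0))
    return float(np.sum(phi ** 2))

def closest_coordinate_plane(W):
    """Minimize the squared distance over all m-element index sets J."""
    p, m = W.shape
    return min((grassmann_dist_sq(W, J), J) for J in combinations(range(p), m))

# Hypothetical example: the plane spanned by (e1+e2)/sqrt(2) and (e3+e4)/sqrt(2) in R^4.
a = 1.0 / np.sqrt(2.0)
W = np.array([[a, 0.0], [a, 0.0], [0.0, a], [0.0, a]])
d2, J = closest_coordinate_plane(W)
print(d2, J)
```

For this toy plane the closest coordinate planes have both principal angles equal to \pi/4, giving squared distance \pi^{2}/8, while the planes indexed by (0, 1) and (2, 3) have principal angles 0 and \pi/2.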
Example 7.1
Let or , where . Define vectors by
[TABLE]
The set is orthonormal. Let , a 2-plane in . We will compute the principal angles between and for all . Write , where . Let be the matrix whose first column is and whose second column is . Since the columns of are an orthonormal basis of , the principal angles between and are the arc-cosines of the singular values of .
First suppose that is even. We divide the elements into two cases: Case I= ; Case II= . The principal values of the matrix are easily computed to be [math] and in Case I, and (with multiplicity 2) in Case II. Hence the principal angles are
[TABLE]
so
[TABLE]
We will return to (7.1) shortly, but first let us do the analogous computation for odd. For , we divide the computation into three cases: Case I= ; Case II= ; and Case III= . The principal values of the matrix are [math] and in Case I, and in Case II, and in Case III. Hence the principal angles are
[TABLE]
Clearly is larger in Case III than in Case II, so
[TABLE]
It follows from (7.1) and (7.2) that
[TABLE]
since in Example 7.1. Hence for large enough , Statement 2 in Proposition 4.11 is false, and therefore so is Statement 1. This already shows that for all sufficiently large, there exist geodesically antipodal pairs in that are not sign-change reducible. However, to get the quantitative statement in Proposition 3.18(b), we have to continue working.
It can be shown (footnote 4: the authors did not find this exercise in Calculus 1 entirely trivial, but are nonetheless leaving it to the reader) that for ,
[TABLE]
hence that in (7.1) and in (7.2), the second of the two expressions being compared is the smaller. Thus
[TABLE]
Since in Example 7.1, , so equation (7.5) shows that (4.15) (with ) is false for all if ; equivalently, if This translates to . Hence the answer to Question 4.5 is definitely “no” for all . To complete the proof of Proposition 3.18(b), it remains only to show that this “12” can be reduced to “11”. We will accomplish this with the next example.
Example 7.2
Let , where Define vectors by
[TABLE]
As in the previous example, is an orthonormal basis of a plane . Just as in Example 7.1, we can compute the principal angles between and for all . We define Cases I and II and III just as in the odd- case of the previous example. The principal values of the relevant matrices are [math] and in Case I, and in Case II, and
[TABLE]
in Case III. Hence
[TABLE]
Numerically, we find that for , the middle line of the preceding display is the smallest of the three lines, so
[TABLE]
Since this number is larger than , the answer to Question 4.5 is no for . This completes the proof of Proposition 3.18.
Remarks 7.3
(1) We considered Example 7.2 only for odd because for even , the principal angles turn out to be the same as for in Example 7.1. In Example 7.2, we can also compute numerically that for , and , we have . However, we cannot conclude that the answer to Question 4.5 is “yes” for , since we have not proven that this example represents the worst case, i.e. that for all . Thus Question 4.5 remains open for . However, based on computations, it seems likely to the authors that the largest for which the answer to Question 4.5 is yes is closer to 10 than to 4.
(2) The number in (7) is exactly the squared diameter of for all . Thus, (7) shows that as , the distance between and the closest coordinate plane(s) approaches the largest possible distance between two points in .
Appendix A Partitions and Fibers
A.1 Partitions and eigenstructure
The strata of each of the stratified spaces in this paper are labeled naturally either by or by
The natural left-action of the symmetric group on induces left-actions of on and . There is a canonical bijection between the quotient and the set , so we implicitly regard these as the same set. For , we write for the image of in under the quotient map.
The sets and are partially ordered by the refinement relation. For , we write if refines . Similarly, for we write if refines . In each of these partially ordered sets there is a well-defined “highest” (most refined) and “lowest” (least refined) element; we denote these with the subscripts “top” and “bot” respectively.
Notation A.1
- For , let denote the partition of determined by the equivalence relation .
- For , let denote the subspace . For a partition of (where the are the blocks of ), let denote the corresponding subspaces of ; note that we have an orthogonal decomposition . Define the subgroup by
[TABLE]
We write for the identity component of .
As the reader may check, the above definition of agrees with the definition in Section 2: for all we have .
For any subgroup , we write for H\mbox{\small\ \bigcap\ }SO(p). Note that
[TABLE]
where denotes the orthogonal group of the subspace , which we identify with a subgroup of . Hence, writing , we have
[TABLE]
A.2 Signed permutations and signed-permutation matrices
Let . The role of will be as the group of signs, so we write its elements as . We write the identity element of as . For and {\mbox{\boldmath\sigma \unboldmath}\mbox{}}=(\sigma_{1},\sigma_{2}\dots,\sigma_{p})\in{\cal I}_{p} we define \epsilon{\mbox{\boldmath\sigma \unboldmath}\mbox{}}=(\epsilon\sigma_{1},\epsilon\sigma_{2},\dots,\epsilon\sigma_{p}).
Both and have natural representations on via sign-changes and permutations of coordinates, respectively. These representations, which we denote respectively as {\mbox{\boldmath\sigma \unboldmath}\mbox{}}\mapsto I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}} and , embed and as subgroups of , together generating the group of “signed-permutation matrices”. Abstractly, this group is a semidirect product , a split extension of by , embedded naturally in via ({\mbox{\boldmath\sigma \unboldmath}\mbox{}},\pi)\mapsto I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}P_{\pi}. Defining homomorphisms and by and \widetilde{{\rm sgn}}({\mbox{\boldmath\sigma \unboldmath}\mbox{}},\pi)={\rm sgn}({\mbox{\boldmath\sigma \unboldmath}\mbox{}}){\rm sgn}(\pi) (where is the sign of the permutation ), we have \widetilde{{\rm sgn}}({\mbox{\boldmath\sigma \unboldmath}\mbox{}},\pi)=\det(I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}P_{\pi}). Thus the group of even signed-permutations, defined in Section 2, is simply the kernel of , and we have a short exact sequence
[TABLE]
Since (non-canonically), is an extension of by , and .
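The determinant identity \widetilde{{\rm sgn}}(\sigma,\pi)=\det(I_\sigma P_\pi) and the index-two count can be verified by brute force for small p. The sketch below assumes that {\rm sgn}(\sigma) is the product of the entries of \sigma and that P_\pi sends e_j to e_{\pi(j)}; both conventions are assumptions made for illustration, and the determinant does not depend on the second one. Plain \sigma is used in place of the bold symbol.

```python
import numpy as np
from itertools import permutations, product

def perm_sign(pi):
    """Sign of a permutation given as a tuple, computed from its inversion count."""
    n = len(pi)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if pi[i] > pi[j])
    return -1 if inversions % 2 else 1

def signed_perm_matrix(sigma, pi):
    """I_sigma @ P_pi, with P_pi e_j = e_{pi(j)} (assumed convention)."""
    p = len(pi)
    P = np.zeros((p, p))
    for j, i in enumerate(pi):
        P[i, j] = 1.0
    return np.diag(sigma) @ P

p = 3
all_match = True
count_even = 0
for sigma in product([1, -1], repeat=p):
    for pi in permutations(range(p)):
        expected = int(np.prod(sigma)) * perm_sign(pi)        # sgn(sigma) * sgn(pi)
        det = int(round(np.linalg.det(signed_perm_matrix(sigma, pi))))
        all_match = all_match and (det == expected)
        count_even += (expected == 1)
print(all_match, count_even)
```

For p=3 there are 2^{3}\cdot 3!=48 signed permutations, of which half have determinant +1, matching the description of the even signed-permutation group as the kernel of a surjective homomorphism onto a group of order two.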
The group is a well-studied group encountered in other settings (rather different from this paper’s) as , the Weyl group of the simple Lie algebra [13]. Thus is an index-two subgroup of . The application to eigenstructure motivates viewing as an extension of : an element of determines an element of up to the action of , but this action does not lift canonically to a fiber-preserving action of on (at least not for even; see below); we need to extend to a larger group to obtain such an action. For each , the fiber can be identified with positively oriented orthonormal -eigenbases of ; the action of sends one such -eigenbasis to another.
A familiar index-two subgroup of different from is the kernel of the map ({\mbox{\boldmath\sigma \unboldmath}\mbox{}},\pi)\mapsto{\rm sgn}({\mbox{\boldmath\sigma \unboldmath}\mbox{}}). For , the latter subgroup is the Weyl group of the simple Lie algebra . However, the analog of (A.5) for splits for all , while (A.5) splits if and only if is odd. For odd, the map defined by ({\mbox{\boldmath\sigma \unboldmath}\mbox{}},\pi)\mapsto({\rm sgn}({\mbox{\boldmath\sigma \unboldmath}\mbox{}}){\mbox{\boldmath\sigma \unboldmath}\mbox{}},\pi) is an isomorphism, but it is known that for even, is not isomorphic to [12, p. 151].
Remark A.2
For a subspace and , let denote the set of orthogonal transformations with determinant . In the setting of (A.2), the connected components of are , subject to the restriction . Thus a labeling of the blocks of an -block partition yields a 1-1 correspondence between and the set of connected components of . In particular, the number of connected components is .
Identifying with , the natural left-action of on yields a left-action of on . For , we will write for its image in the quotient space .
Note that the action of on lifts to an action of on ,
[TABLE]
It is easily seen that P_{g}DP_{g}^{-1}=\pi_{g}{\mbox{ \raisebox{-1.084pt}{\bf{\Large\cdot}}}}D for all .
A.3 Structure of the fibers
The starting point for a systematic description of the fibers of is the following proposition. The group-action notation is as in (3.6).
Proposition A.3
Let and . Then
[TABLE]
Proof: This is a simple corollary of [10, Theorem 3.3]. Details are left to the reader.
Corollary A.4
Let and . Then
[TABLE]
Proof: Clearly the right-hand side of (A.8) is contained in the right-hand side of (A.7), so it suffices to prove the opposite inclusion.
Let . Enumerate the blocks of as and let be as in Notation A.1. As noted in Remark A.2, the enumeration of the blocks of yields a 1-1 correspondence between and the connected components of . Let lie in the component of labeled by . The cardinality of is some even number . Let {\mbox{\boldmath\sigma \unboldmath}\mbox{}}=(\sigma_{1},\dots,\sigma_{p})\in{\cal I}_{p}, where for we set
[TABLE]
Then R_{1}:=I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}R\in G_{\sf J}^{0}. But also , so {\mbox{\boldmath\sigma \unboldmath}\mbox{}}\in{\cal I}_{p}^{+}\subset{\tilde{S}}_{p}^{+}, and P_{g}I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}=P_{g_{1}} for some with . Hence P_{g}R=(P_{g}I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}})(I_{\mbox{\scriptsize\boldmath\sigma \unboldmath}\mbox{}}R)=P_{g_{1}}R_{1}, so (U(P_{g}R)^{-1},\pi_{g}{\mbox{ \raisebox{-1.084pt}{\bf{\Large\cdot}}}}D)=(U(P_{g_{1}}R_{1})^{-1},\pi_{g_{1}}{\mbox{ \raisebox{-1.084pt}{\bf{\Large\cdot}}}}D), which lies in the right-hand side of (A.8). The desired inclusion follows.
To complete our characterization of the fibers of , we introduce one more bit of notation:
Notation A.5
For , define
[TABLE]
(a subgroup of ).
The groups generalize ; we have . Observe that an equivalent definition of the group defined in Notation 2.1 is $\Gamma_{\mathsf{J}}^{0}=\{(\boldsymbol{\sigma},\pi)\in\tilde{S}_{p}^{+}:\boldsymbol{\sigma}\in\mathcal{I}_{\mathsf{J}}^{+},\ \pi\in K_{\mathsf{J}}\}$. Thus, analogously to (A.5), we have a short exact sequence
[TABLE]
Next, observe that the action (3.6) of on induces, for each , an action of on , given by
[TABLE]
This leads us to:
Proposition A.6
Let . Then every determines a bijection between and the set .
Proof: Two elements lie in the same component of if and only if and for some . Thus it is clear from (A.8) that the action (A.11) of on is transitive. Therefore for any , the map $\tilde{S}_{p}^{+}\to{\rm Comp}(\mathcal{E}_{X})$, $g\mapsto g\cdot[(U,D)]$, induces a bijection , where is the stabilizer of under the action (A.11). But, as is easily checked, is exactly the group .
An important special case of Proposition A.6 is the case in which all eigenvalues of are distinct. In this case, and . Thus the action of on is free as well as transitive. Furthermore , so each connected component of is a single point; . Thus itself is an orbit of , and any choice of yields a bijection , $g\mapsto g\cdot(U,D)$.
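The distinct-eigenvalue case can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes that the acting group is realized by the $p\times p$ signed permutation matrices $P$ of determinant $+1$, and that such a $P$ sends $(U,D)$ to $(UP^{-1},PDP^{-1})$, in line with the pairs $(U(P_{g}R)^{-1},\pi_{g}\cdot D)$ used in the proof of Corollary A.4. The function names, the test matrix, and the choice $p=3$ are hypothetical.

```python
import itertools
import math

import numpy as np


def signed_permutations_det_plus(p):
    """Yield every p x p signed permutation matrix with determinant +1."""
    for perm in itertools.permutations(range(p)):
        for signs in itertools.product((1.0, -1.0), repeat=p):
            P = np.zeros((p, p))
            for row, col in enumerate(perm):
                P[row, col] = signs[row]
            if np.linalg.det(P) > 0:
                yield P


def enumerate_fiber(U, d):
    """Return the pairs (U P^{-1}, P D P^{-1}); P^{-1} = P^T for signed permutations."""
    D = np.diag(d)
    return [(U @ P.T, P @ D @ P.T) for P in signed_permutations_det_plus(len(d))]


# Hypothetical example: a rotation U in SO(3) and three distinct eigenvalues.
theta = 0.3
U = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
d = np.array([1.0, 2.0, 5.0])
X = U @ np.diag(d) @ U.T

fiber = enumerate_fiber(U, d)

# Every pair (V, L) should recompose to the same matrix X = V L V^T.
all_map_to_X = all(np.allclose(V @ L @ V.T, X) for V, L in fiber)

# Count the pairs that are pairwise different (rounded to absorb float noise).
distinct = {
    tuple(np.round(np.concatenate([V.ravel(), np.diag(L)]), 10))
    for V, L in fiber
}

# Order of the group of signed permutations with determinant +1.
expected_size = 2 ** (len(d) - 1) * math.factorial(len(d))
print(len(fiber), len(distinct), expected_size, all_map_to_X)
```

If the enumerated pairs are pairwise distinct and their number equals the order of the group, the action on this fiber is free and transitive, which is what the paragraph above asserts.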
Corollary A.7
Let , , and let be the parts of the partition of . Then is diffeomorphic to a disjoint union of copies of .
Proof: Let . It is clear from (2.1) that each connected component of is a submanifold of diffeomorphic to , which from (A.4) is isomorphic (hence diffeomorphic) to . From Proposition A.6, the number of connected components is . As noted earlier, , while from (A.10) we have . It is easily seen that is isomorphic to , and that is isomorphic to , and hence that . The result follows.
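The counting step in this proof can be illustrated with a short computation. This is a sketch under stated assumptions, since the explicit orders are not legible in this copy of the text: it assumes that the group of signed permutations with determinant $+1$ has order $2^{p-1}p!$, and that the stabilizer $\Gamma_{\mathsf{J}}^{0}$ has order $2^{p-r}\prod_{i}k_{i}!$ for a partition with block sizes $k_{1},\dots,k_{r}$ (an even number of sign changes inside each block, times the permutations preserving the blocks). The function names are hypothetical.

```python
import math


def fiber_component_count(block_sizes):
    """Number of fiber components, as a quotient of the two assumed group orders."""
    p = sum(block_sizes)
    r = len(block_sizes)
    # Assumed order of the signed permutations with determinant +1.
    order_group = 2 ** (p - 1) * math.factorial(p)
    # Assumed order of the stabilizer: even sign changes per block, block-preserving permutations.
    order_stabilizer = 2 ** (p - r) * math.prod(math.factorial(k) for k in block_sizes)
    count, remainder = divmod(order_group, order_stabilizer)
    if remainder != 0:
        raise ValueError("stabilizer order does not divide group order")
    return count


def closed_form_count(block_sizes):
    """Same count written as 2^(r-1) times a multinomial coefficient."""
    p = sum(block_sizes)
    r = len(block_sizes)
    multinomial = math.factorial(p) // math.prod(math.factorial(k) for k in block_sizes)
    return 2 ** (r - 1) * multinomial


for sizes in [(1, 1, 1), (2, 1), (3,), (2, 2), (3, 1, 1)]:
    print(sizes, fiber_component_count(sizes), closed_form_count(sizes))
```

Under these assumptions the two functions agree, and the closed form matches the description in Remark A.8: a multinomial number of copies of the block-orthogonal group, each consisting of $2^{r-1}$ copies of its identity component.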
Remark A.8
An alternate, instructive route to Corollary A.7 is the following. (We merely sketch the ideas; the reader may fill in the details.) For , define . Thus the set is a finite union of left-cosets of , each of which is diffeomorphic to the compact submanifold . If , and , the map , , is an embedding with image . Hence is a submanifold of diffeomorphic to . But for any closed subgroups of a compact Lie group , the set is a submanifold of and a principal -bundle over $H_{1}/(H_{1}\cap H_{2})$, with projection map given by $h_{1}h_{2}\mapsto h_{1}(H_{1}\cap H_{2})$. Applying this to the case , , we have $H_{1}\cap H_{2}=\Gamma_{\mathsf{J}}$, so is a principal -bundle over the finite set . But the natural map (where is as in (A.5)), is a bijection, so may be viewed as a principal -bundle over . The cardinality of this base-space is , which is simply the multinomial coefficient if . Thus is diffeomorphic to copies of , and each copy of is diffeomorphic to copies of .
Appendix B Stratification of , , and related spaces
We provide here a brief outline of the stratifications relevant to this paper. For a more detailed discussion, see [7, Section 2.7].
As noted in Section 2, acts on via As with any group-action, elements are said to have the same orbit type if their stabilizers are conjugate; in this case the fibers are diffeomorphic. The orbit-type stratification of any manifold under the action of a compact Lie group is known to be a Whitney stratification ([5, p. 21]).
We use to define stratifications of the spaces and , and use to define stratifications of and . The commutative diagram in Figure 1 indicates the relationships among these spaces and label-sets. We define strata as the diagram suggests: for and , (i) , (ii) , (iii) and (iv) . The maps label elements of by partitions of the set and the integer , respectively; is projection onto the second factor; and is the map induced by on the indicated quotients.
For we may call the partition the eigenvalue-multiplicity type of . The stratification of by eigenvalue-multiplicity type is identical to the orbit-type stratification.
In any stratified space, there is a natural partial ordering on the set of strata defined by declaring if . Using this partial ordering of strata for the spaces in the left-hand square in Figure 1, it is easily checked that all the maps in Figure 1 are either order-preserving themselves (in the case of ) or induce order-preserving maps on the corresponding sets of strata (in the case of all the other maps). In particular, each of the stratified spaces in the left-hand square in Figure 1 has a top stratum and a bottom stratum.
References
- [1] L. J. Billera, S. P. Holmes, K. Vogtmann, Geometry of the space of phylogenetic trees, Adv. in Appl. Math. 27 (4) (2001) 733–767.
- [2] J. Cheeger, D. G. Ebin, Comparison Theorems in Riemannian Geometry, North Holland/American Elsevier, Amsterdam, 1975.
- [3] J. Damon, J. Marron, Backwards principal component analysis and principal nested relations, J. Math. Imaging and Vision 50 (1) (2014) 107–114.
- [4] A. Edelman, T. A. Arias, S. T. Smith, The geometry of algorithms with orthogonality constraints, SIAM J. Matrix Anal. Appl. 20 (2) (1998) 303–353.
- [5] C. G. Gibson, K. Wirthmüller, A. A. du Plessis, E. J. N. Looijenga, Topological Stability of Smooth Mappings, Lecture Notes in Mathematics, Vol. 552, Springer-Verlag, Berlin, 1976.
- [6] G. H. Golub, C. F. Van Loan, Matrix Computations, 2nd edition, The Johns Hopkins University Press, 1989.
- [7] D. Groisser, S. Jung, A. Schwartzman, Geometric foundations for scaling-rotation statistics on symmetric positive definite matrices: minimal smooth scaling-rotation curves in low dimensions, Electronic J. Stat. 11 (1), 1092–1159.
- [8] D. Groisser, S. Jung, A. Schwartzman, A scaling-rotation metric on the space of symmetric positive-definite matrices, in preparation.
- [9] T. Hotz, S. Huckemann, H. Le, J. S. Marron, J. C. Mattingly, E. Miller, J. Nolen, M. Owen, V. Patrangenaru, S. Skwerer, Sticky central limit theorems on open books, Ann. Appl. Prob. 23 (6) (2013) 2238–2258.
- [10] S. Jung, A. Schwartzman, D. Groisser, Scaling-rotation distance and interpolation of symmetric positive-definite matrices, SIAM J. Matrix Anal. Appl. 36 (3) (2015) 1180–1201.
- [11] D. G. Kendall, D. Barden, T. K. Carne, H. Le, Shape and Shape Theory, Wiley Series in Probability and Statistics, John Wiley & Sons Ltd., Chichester, 1999.
- [12] H. Pahlings, Characterization of groups by their character tables, Comm. Alg. 4 (2) (1976) 111–153.
- [13] H. Samelson, Notes on Lie Algebras, Van Nostrand Reinhold Company, 1969.
- [14] A. Schwartzman, Random ellipsoids and false discovery rates: statistics for diffusion tensor imaging data, Ph.D. thesis, Stanford University (2006).
- [15] A. Schwartzman, W. F. Mascarenhas, J. E. Taylor, Inference for eigenvalues and eigenvectors of Gaussian symmetric matrices, Ann. Statist. 36 (6) (2008) 2886–2919.
- [16] Y.-C. Wong, Differential geometry of Grassmann manifolds, Proc. Nat. Acad. Sci. U.S.A. 57 (1967) 589–594.
