Multidimensional Scaling on Metric Measure Spaces
Henry Adams, Mark Blumstein, Lara Kassab

TL;DR
This paper extends classical multidimensional scaling (MDS) theory to infinite metric measure spaces, exploring optimality, embeddings of spheres, and convergence properties, thereby broadening the understanding of MDS in more general geometric contexts.
Contribution
It introduces a generalized notion of MDS for infinite metric measure spaces, analyzing embeddings of spheres and convergence behavior, which advances the theoretical framework of MDS.
Findings
Generalization of MDS to infinite metric measure spaces
Analysis of MDS embeddings of spheres like $S^1$ and $S^n$
Results on convergence of MDS embeddings under space convergence
Abstract
Multidimensional scaling (MDS) is a popular technique for mapping a finite metric space into a low-dimensional Euclidean space in a way that best preserves pairwise distances. We overview the theory of classical MDS, along with its optimality properties and goodness of fit. Further, we present a notion of MDS on infinite metric measure spaces that generalizes these optimality properties. As a consequence we can study the MDS embeddings of the geodesic circle $S^1$ into $\mathbb{R}^m$ for all $m$, and ask questions about the MDS embeddings of the geodesic $n$-spheres $S^n$ into $\mathbb{R}^m$. Finally, we address questions on convergence of MDS. For instance, if a sequence of metric measure spaces $X_n$ converges to a fixed metric measure space $X$, then in what sense do the MDS embeddings of these spaces converge to the MDS embedding of $X$?
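| Elements | Classical MDS | Infinite MDS |
|---|---|---|
| Data | $x_1, \ldots, x_n$ with $d$ | $(X, d, \mu)$ |
| Distance Representation | $D = (d_{ij})$ | $K_D(x,s) = d(x,s)$ |
| Linear Operator | $B = HAH$ | $T_{K_B}$ |
| Eigenvalues | $\lambda_1 \ge \cdots \ge \lambda_m$ | $\lambda_1 \ge \lambda_2 \ge \cdots$ |
| Eigenvectors | $v_1, \ldots, v_m \in \mathbb{R}^n$ | $\phi_1, \phi_2, \ldots \in L^2(X)$ |
| Embedding in $\mathbb{R}^m$ or $\ell^2$ | $f(x_i) = (\sqrt{\lambda_1}\, v_1(i), \ldots, \sqrt{\lambda_m}\, v_m(i))$ | $f(x) = (\sqrt{\hat\lambda_1}\, \phi_1(x), \sqrt{\hat\lambda_2}\, \phi_2(x), \ldots)$ |
| Strain Minimization | $\sum_{i,j} (b_{ij} - \hat b_{ij})^2$ | $\int\int (K_B(x,t) - K_{\hat B}(x,t))^2 \mu(dt)\mu(dx)$ |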
1. Introduction
Given objects and a notion of dissimilarity between them, the classical multidimensional scaling (MDS) algorithm extracts a configuration of points in Euclidean space whose pairwise distances “best” approximate the given dissimilarities. A typical source of dissimilarity data is the distance between high-dimensional objects, in which case MDS serves as a non-linear dimensionality reduction and visualization technique. As such, the MDS algorithm is a popular technique for pattern recognition problems. In this paper, we survey the classical algorithm, and describe an extension to (possibly infinite) metric measure spaces.
The coordinates extracted from an MDS embedding satisfy a least squares optimization problem. While there are several popular choices of MDS loss function (metric or non-metric), we primarily focus on the classical algorithm which minimizes a form of loss function known as strain. The classical algorithm is algebraic and not iterative, simple to implement, and guaranteed to discover a configuration which optimizes the strain function. Furthermore, if the input dissimilarities can be realized as distances in a Euclidean space, then classical MDS is guaranteed to recover such a configuration (unique up to translation and orthogonal transformation). However, not all dissimilarity data admits a Euclidean realization. In this case MDS produces a mapping into Euclidean space that distorts the inter-point pairwise distances as little as possible. We make these ideas precise in Section 2.
The classical story is told using finite samples of points, finite dissimilarity matrices, and finite embedding coordinates. Our goal is to extend to an infinite setting, where our input dissimilarity data is replaced by a metric measure space: a metric space (with possibly infinitely many points) equipped with some probability measure. This allows us to consider spaces whose points are weighted unequally, along with notions of convergence as more and more points are sampled from an infinite shape.
In more detail, a metric measure space is a triple $(X, d, \mu)$ where $(X, d)$ is a compact metric space, and $\mu$ is a Borel probability measure on $X$. In Section 4 we generalize the classical MDS algorithm to metric measure spaces, and we show that this generalization minimizes the infinite analogue of strain. As a motivating example, we consider the MDS embedding of the circle $S^1$ with the (non-Euclidean) geodesic metric, and equipped with the uniform measure. By using the properties of circulant matrices, we identify the MDS embeddings of evenly-spaced points from the geodesic circle into $\mathbb{R}^m$, for all $m$. As the number of points tends to infinity, these embeddings lie along the curve
$$\sqrt{2}\left(\cos\theta,\ \sin\theta,\ \tfrac{1}{3}\cos 3\theta,\ \tfrac{1}{3}\sin 3\theta,\ \tfrac{1}{5}\cos 5\theta,\ \tfrac{1}{5}\sin 5\theta,\ \ldots\right) \in \mathbb{R}^m.$$
As this example illustrates, it is useful to consider the situation where a sequence of metric measure spaces $X_n$ converges to a fixed metric measure space $X$ as $n \to \infty$. We survey various notions of convergence in Section 6.
Convergence is well-understood when each metric space has the same finite number of points, for example by Sibson’s perturbation analysis [22]. However, we are also interested in convergence when the number of points varies and is possibly infinite. We survey results of [1, 14] on the convergence of MDS when $n$ points are sampled from a metric space according to a probability measure $\mu$, in the limit as $n \to \infty$. The law of large numbers describes how the finite measures $\mu_n$ converge to $\mu$ as $n \to \infty$. In [13], we reprove these results when instead we are given an arbitrary sequence of probability measures $\mu_n \to \mu$. The measures $\mu_n$ may now be unequally weighted, or have infinite support, for example.
Organization
We present an overview of the theory of classical MDS in Section 2. In Section 3, we present necessary background information on operator theory and infinite-dimensional linear algebra. We define a notion of MDS for infinite metric measure spaces in Section 4. In Section 5, we identify the MDS embeddings of the geodesic circle into $\mathbb{R}^m$, for all $m$, as a motivating example. Lastly, in Section 6, we survey different notions of convergence of MDS.
Related Work
The reader is referred to the introduction of [25] and to [10, 12] for some aspects of the history of MDS. There are a variety of papers that study some notion of robustness or convergence of MDS, including [1, 21, 22, 23]. Furthermore, [19, Section 3.3] considers embedding new points in pseudo-Euclidean spaces, [11, Section 3] considers infinite MDS in the case where the underlying space is an interval (equipped with some metric), and [7, Section 6.3] discusses MDS on large numbers of objects.
2. Classical Scaling
Multidimensional scaling (MDS) is a set of statistical techniques concerned with the problem of using information about the dissimilarities between objects in order to construct a configuration of points in Euclidean space. The input dissimilarities between the objects need not be based on Euclidean distances.
Definition 2.1.
An $n \times n$ matrix $D = (d_{ij})$ is called a dissimilarity matrix if it is symmetric and
$$d_{ii} = 0, \quad \text{with } d_{ij} \ge 0 \text{ for } i \ne j.$$
The first property above is called reflexivity (the dissimilarity between an object and itself is zero), and the second property is called nonnegativity. Symmetry requires that the dissimilarity from object $i$ to $j$ is the same as that from $j$ to $i$. Note that there is no need to satisfy the triangle inequality. A dissimilarity matrix $D$ is called Euclidean if there exists a configuration of points in some Euclidean space whose interpoint distances are given by $D$.
The goal of MDS is to map the objects $x_1, \ldots, x_n$ to a configuration (or embedding) of points $f(x_1), \ldots, f(x_n)$ in $\mathbb{R}^m$ so that the given dissimilarities $d_{ij}$ are well-approximated by the Euclidean distances $\|f(x_i) - f(x_j)\|_2$. The different notions of approximation give rise to the different types of MDS.
If the dissimilarity matrix can be realized exactly as the distance matrix of some set of points in $\mathbb{R}^m$ (i.e. if the dissimilarity matrix is Euclidean), then MDS will find such a realization. Furthermore, MDS can be used to identify the minimum Euclidean dimension $m$ admitting such an isometric embedding. However, some dissimilarity matrices or metric spaces are inherently non-Euclidean (cannot be embedded into $\mathbb{R}^m$ for any $m$). When a dissimilarity matrix is not Euclidean, then MDS produces a mapping into $\mathbb{R}^m$ that distorts the interpoint pairwise distances as little as possible. Though we introduce MDS below, the reader is also referred to [2, 8, 12] for more complete introductions.
Classical multidimensional scaling (cMDS) is also known as Principal Coordinates Analysis (PCoA), Torgerson Scaling, or Torgerson–Gower scaling. The cMDS algorithm minimizes a loss function called strain, and one of the main advantages of cMDS is that its algorithm is algebraic and not iterative. Therefore, it is simple to implement, and it is guaranteed to discover the optimal configuration in $\mathbb{R}^m$. In this section, we describe the cMDS algorithm, and then discuss some of its optimality properties and goodness of fit.
As an illustrative example, we consider ten U.S. cities equipped with the road distance between them, which is a non-Euclidean distance. The classical MDS algorithm produces a two dimensional configuration of points (see Figure 1), where the points represent the different cities. The Euclidean pairwise distances (distances as the crow flies) between the cities in the MDS embedding are the Euclidean distances that best approximate the road distances between them.
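Let $D = (d_{ij})$ be an $n \times n$ dissimilarity matrix. Let $A = (a_{ij})$, where $a_{ij} = -\frac{1}{2} d_{ij}^2$. Define the matrix $B = (b_{ij})$ to be the double mean-centering of $A$, with entries given by
$$b_{ij} = a_{ij} - \frac{1}{n}\sum_{k=1}^n a_{ik} - \frac{1}{n}\sum_{k=1}^n a_{kj} + \frac{1}{n^2}\sum_{k,l=1}^n a_{kl}. \tag{1}$$
Equivalently, $B = HAH$, where $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$ is the centering matrix. Since $D$ is a symmetric matrix, it follows that $A$ and $B$ are each symmetric, and therefore $B$ has real eigenvalues.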
Assume for convenience that there are at least $m$ positive eigenvalues for matrix $B$, where $m \le n$. By the spectral theorem of symmetric matrices, let $B = \Gamma \Lambda \Gamma^\top$, with $\Gamma$ containing unit-length eigenvectors of $B$ as its columns, and with the diagonal matrix $\Lambda$ containing the eigenvalues of $B$ in decreasing order along its diagonal. Let $\Lambda_m$ be the $m \times m$ diagonal matrix of the largest $m$ eigenvalues sorted in descending order, and let $\Gamma_m$ be the $n \times m$ matrix of the corresponding $m$ eigenvectors in $\Gamma$. The coordinates of the MDS embedding into $\mathbb{R}^m$ are then given by the $n \times m$ matrix $X = \Gamma_m \Lambda_m^{1/2}$. More precisely, the MDS embedding consists of the $n$ points in $\mathbb{R}^m$ given by the $n$ rows of $X$. The procedure for classical MDS can be summarized in the following algorithm.
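The steps just described (square and negate, double mean-center, eigendecompose, scale the top eigenvectors) can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's Algorithm 1: the names `classical_mds` and `pairwise_distances` are ours, and clipping tiny negative eigenvalues to zero is an added safeguard against round-off.

```python
import numpy as np

def classical_mds(D, m):
    """Classical MDS of an n x n dissimilarity matrix D into R^m (illustrative sketch)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    A = -0.5 * D**2                          # a_ij = -d_ij^2 / 2
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix H = I - (1/n) 1 1^T
    B = H @ A @ H                            # double mean-centering of A
    eigvals, eigvecs = np.linalg.eigh(B)     # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort in decreasing order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    top = np.clip(eigvals[:m], 0.0, None)    # assumption: guard against round-off negatives
    X = eigvecs[:, :m] * np.sqrt(top)        # X = Gamma_m Lambda_m^{1/2}; rows are points
    return X, eigvals

def pairwise_distances(X):
    """Euclidean distance matrix of the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff**2).sum(axis=-1))

# Usage: distances that come from points in R^2 are recovered by the embedding.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 2))
D = pairwise_distances(points)
X, eigvals = classical_mds(D, m=2)
```

Because the input here is Euclidean, the rows of `X` reproduce the original pairwise distances up to a rigid motion, in line with the recovery guarantee stated in the introduction.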
We give a small example.
Example 2.2.
We implement Algorithm 1 on the following dissimilarity matrix $D$.
[TABLE]
The eigenvalues of are , , [math], and , and the MDS embedding of in is drawn in Figure 2.
This dissimilarity matrix is not Euclidean. Indeed, label the points $x_1$, $x_2$, $x_3$, $x_4$ in order of their row/column in $D$. In any isometric embedding in $\mathbb{R}^m$, the points $x_1$, $x_2$, $x_3$ would be mapped to an equilateral triangle. The point $x_4$ would need to get mapped to the midpoint of each edge of this triangle, which is impossible in Euclidean space.
The following fundamental criterion determines algebraically whether a dissimilarity matrix is Euclidean or not.
Theorem 2.3.
[2, Theorem 14.2.1] Let $D$ be a dissimilarity matrix, and define $B$ by equation (1). Then $D$ is Euclidean if and only if $B$ is a positive semi-definite matrix.
Moreover, if $B$ is positive semi-definite of rank $m$, then a perfect realization of the dissimilarities can be found by a collection of points in $m$-dimensional Euclidean space.
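The criterion of Theorem 2.3 is easy to test numerically: form the doubly centered matrix and inspect its smallest eigenvalue. The sketch below uses a hypothetical $4 \times 4$ matrix chosen by us to match the verbal description in Example 2.2 (three points pairwise at distance 2, and a fourth point at distance 1 from each); it is not necessarily the matrix of that example. The helper names and the tolerance `tol` are also our own choices.

```python
import numpy as np

def centered_gram(D):
    """B = H A H with A = -D^2 / 2, as in equation (1)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ (-0.5 * D**2) @ H

def is_euclidean(D, tol=1e-9):
    """D is Euclidean iff B is positive semi-definite (up to a numerical tolerance)."""
    return bool(np.linalg.eigvalsh(centered_gram(D)).min() >= -tol)

# Hypothetical non-Euclidean example: an "equilateral triangle" of side 2 whose
# fourth point would have to sit at the midpoint of every edge simultaneously.
D_tri = np.array([[0, 2, 2, 1],
                  [2, 0, 2, 1],
                  [2, 2, 0, 1],
                  [1, 1, 1, 0]], dtype=float)

# Euclidean example: the four corners of a unit square.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
D_sq = np.sqrt(((square[:, None] - square[None]) ** 2).sum(-1))

eig_tri = np.linalg.eigvalsh(centered_gram(D_tri))   # ascending order
```

For the hypothetical matrix `D_tri` the spectrum of $B$ contains a negative eigenvalue, so no isometric embedding into any $\mathbb{R}^m$ exists, whereas the square passes the test.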
Let $D$ be a dissimilarity matrix, and define $B$ via (1). A measure of the goodness of fit of MDS, even in the case when $D$ is not Euclidean, can be obtained as follows. If $\hat{X}$ is a fitted configuration in $\mathbb{R}^m$ with centered inner-product matrix $\hat{B}$, then a measure of the discrepancy between $B$ and $\hat{B}$ is the following strain function [16],
$$\operatorname{tr}\big((B - \hat{B})^2\big) = \sum_{i,j=1}^n (b_{ij} - \hat{b}_{ij})^2. \tag{2}$$
Theorem 2.4.
[2, Theorem 14.4.2] Let $D$ be a dissimilarity matrix. Then for fixed $m$, the strain function in (2) is minimized over all configurations $\hat{X}$ in $m$ dimensions when $\hat{X}$ is the classical solution to the MDS problem.
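As a rough numerical illustration of Theorem 2.4, one can compare the strain of the classical solution with the strain of randomly drawn configurations. The sketch below assumes the same hypothetical $4 \times 4$ dissimilarity matrix as a toy input (our choice, not taken from the paper); `centered_gram` and `strain` are illustrative helper names. A random search is of course only a sanity check, not a proof of optimality.

```python
import numpy as np

def centered_gram(D):
    """B = H A H with A = -D^2 / 2."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ (-0.5 * D**2) @ H

def strain(B, X_hat):
    """Sum of squared differences between B and the centered Gram matrix of X_hat."""
    Xc = X_hat - X_hat.mean(axis=0)          # center the configuration
    B_hat = Xc @ Xc.T                        # centered inner-product matrix
    return float(np.sum((B - B_hat) ** 2))

# Hypothetical toy input (not from the paper).
D = np.array([[0, 2, 2, 1],
              [2, 0, 2, 1],
              [2, 2, 0, 1],
              [1, 1, 1, 0]], dtype=float)
B = centered_gram(D)
m = 2

eigvals, eigvecs = np.linalg.eigh(B)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
X_cmds = eigvecs[:, :m] * np.sqrt(np.clip(eigvals[:m], 0.0, None))

best = strain(B, X_cmds)                     # strain of the classical solution
rng = np.random.default_rng(0)
trials = [strain(B, rng.normal(size=(4, m))) for _ in range(500)]
```

Here the strain of the classical solution equals the sum of squares of the eigenvalues of $B$ that were discarded, and none of the random configurations does better.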
The reader is referred to [8, Section 2.4] for a summary of a related optimization procedure with a different normalization, due to Sammon [20].
3. Preliminaries
We are interested in studying the MDS embeddings of spaces with possibly infinitely many points, and distance matrices aren’t enough to store infinitely many pairwise distances. Instead, we use kernels, which roughly speaking are distance functions that compute the pairwise distance between any two points in the space. For example, the kernel corresponding to the geodesic distance on a circle is illustrated in Figure 3.
This section introduces the reader to concepts in infinite-dimensional linear algebra and operator theory used throughout the paper.
Kernels and Operators
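Let $(X, d)$ be a metric space equipped with a measure $\mu$. We denote by $L^2(X, \mu)$ the set of square-integrable real-valued $L^2$-functions with respect to the measure $\mu$. We note that $L^2(X, \mu)$ is furthermore a Hilbert space, after equipping it with the inner product given by $\langle f, g \rangle = \int_X f g \, d\mu$.
A real-valued $L^2$-kernel $K \colon X \times X \to \mathbb{R}$ is a continuous measurable square-integrable function. The kernels that we consider in this paper are symmetric, meaning that $K(x, s) = K(s, x)$ for all $x, s \in X$. A symmetric kernel $K$ is positive semi-definite if
$$\sum_{i,j=1}^n c_i c_j K(x_i, x_j) \ge 0$$
holds for any $n \in \mathbb{N}$, any $x_1, \ldots, x_n \in X$, and any $c_1, \ldots, c_n \in \mathbb{R}$. At least in the case when $X$ is a compact subspace of $\mathbb{R}^m$ (and probably more generally), a symmetric kernel $K$ is positive semi-definite if
$$\int_X \int_X K(x, s) f(x) f(s) \, \mu(dx) \mu(ds) \ge 0$$
for any $f \in L^2(X, \mu)$.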
Definition 3.1 (Hilbert–Schmidt Integral Operator).
Let $(X, \Omega, \mu)$ be a $\sigma$-finite measure space, and let $K$ be an $L^2$-kernel on $X \times X$. Then the integral operator
$$[T_K \phi](x) = \int_X K(x, s) \phi(s) \, \mu(ds),$$
which defines a linear mapping from the space $L^2(X, \mu)$ into itself, is called a Hilbert–Schmidt integral operator.
Hilbert–Schmidt integral operators are both continuous (and hence bounded) and compact operators.
Definition 3.2.
A Hilbert–Schmidt integral operator $T_K$ is a self-adjoint operator if $K(x, s) = K(s, x)$ holds for almost all $(x, s) \in X \times X$ (with respect to $\mu \otimes \mu$).
Definition 3.3.
A bounded self-adjoint operator $A$ on a Hilbert space $\mathcal{H}$ is a positive semi-definite operator if $\langle Ax, x \rangle \ge 0$ for any $x \in \mathcal{H}$.
It follows that the eigenvalues of a positive semi-definite operator $A$, when they exist, are real.
The Spectral Theorem
Classical MDS relies on the fact that symmetric matrices are orthogonally diagonalizable with real eigenvalues. Furthermore, positive semi-definite matrices (having nonnegative eigenvalues) may be represented as matrices of Euclidean inner products. The following two theorems give analogues of these results for kernels instead of matrices.
Theorem 3.4 (Spectral theorem on compact self-adjoint operators).
Let $\mathcal{H}$ be a Hilbert space, and suppose $T \colon \mathcal{H} \to \mathcal{H}$ is a bounded compact self-adjoint operator. Then $T$ has at most a countable number of nonzero eigenvalues $\lambda_n \in \mathbb{R}$, with a corresponding orthonormal set $\{e_n\}$ of eigenvectors, such that
$$T(\cdot) = \sum_{n} \lambda_n \langle e_n, \cdot \rangle e_n.$$
Furthermore, the multiplicity of each nonzero eigenvalue is finite, zero is the only possible accumulation point of $\{\lambda_n\}$, and if the set of nonzero eigenvalues is infinite then zero is an accumulation point.
A fundamental theorem that characterizes positive semi-definite kernels is the Generalized Mercer’s Theorem.
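Theorem 3.5 (Generalized Mercer’s Theorem).
[15, Lemma 1] Let $X$ be a compact topological Hausdorff space equipped with a finite Borel measure $\mu$, and let $K \colon X \times X \to \mathbb{R}$ be a continuous positive semi-definite kernel. Then, there exists a scalar sequence $\{\lambda_n\} \in \ell^1$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$, and an orthonormal system $\{\phi_n\}$ of continuous square-integrable functions with respect to $\mu$, such that the expansion
$$K(x, s) = \sum_{n=1}^{\infty} \lambda_n \phi_n(x) \phi_n(s), \qquad x, s \in \operatorname{supp}(\mu),$$
converges uniformly, where $\operatorname{supp}(\mu)$ denotes the support of a measure $\mu$.
Therefore, given $K$ and $\mu$ as in Theorem 3.5, the associated Hilbert–Schmidt integral operator
$$[T_K \phi](x) = \int_X K(x, s) \phi(s) \, \mu(ds)$$
is also positive semi-definite. Moreover, the eigenvalues of $T_K$ can be arranged in non-increasing order $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$, indexed according to their algebraic multiplicities, and the orthonormal system $\{\phi_n\}$ gives the corresponding eigenfunctions of $T_K$.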
4. MDS of Infinite Metric Measure Spaces
Classical multidimensional scaling (cMDS) can be described either as a strain-minimization problem, or as a linear algebra algorithm involving eigenvalues and eigenvectors. Indeed, one of the main theoretical results for cMDS is that the linear algebra algorithm solves the corresponding strain-minimization problem (see Theorem 2.4). In this section, we describe how to generalize both of these formulations to (possibly infinite) metric measure spaces.
This will allow us to discuss the MDS embedding of the circle, for example, without needing to restrict attention to finite subsets thereof.
Definition 4.1.
A metric measure space is a triple $(X, d, \mu)$ where
- $(X, d)$ is a compact metric space, and
- $\mu$ is a Borel probability measure on $X$, i.e. $\mu(X) = 1$.
Given a metric space $(X, d)$, by a measure on $X$ we mean a measure on the Borel $\sigma$-algebra of $X$. When it is clear from the context, the triple $(X, d, \mu)$ will be denoted by only $X$. The reader is referred to [17, 18] for details on metric measure spaces, and for interpretations of these concepts in the context of object matching.
Let $(X, d, \mu)$ be a metric measure space, with $d$ an $L^2$-function on $X \times X$. We say that $X$ is Euclidean if it can be isometrically embedded into $\ell^2$. $X$ is furthermore Euclidean in the finite-dimensional sense if there is an isometric embedding $X \to \mathbb{R}^m$.
MDS on Infinite Metric Measure Spaces
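Let $(X, d, \mu)$ be a metric measure space, where $d$ is an $L^2$-function on $X \times X$.
We propose the following MDS method on infinite metric measure spaces:
- (i) From the metric $d$, construct the kernel $K_A \colon X \times X \to \mathbb{R}$ defined as
$$K_A(x, s) = -\tfrac{1}{2} d^2(x, s).$$
- (ii) Obtain the kernel $K_B \colon X \times X \to \mathbb{R}$ via
$$K_B(x, s) = K_A(x, s) - \int_X K_A(w, s)\, \mu(dw) - \int_X K_A(x, z)\, \mu(dz) + \int_X \int_X K_A(w, z)\, \mu(dw) \mu(dz).$$
Assume $K_B \in L^2(X \times X)$. Define $T_{K_B} \colon L^2(X) \to L^2(X)$ as
$$[T_{K_B} \phi](x) = \int_X K_B(x, s) \phi(s)\, \mu(ds).$$
- (iii) Let $\lambda_1 \ge \lambda_2 \ge \cdots$ denote the eigenvalues of $T_{K_B}$, with corresponding eigenfunctions $\phi_1, \phi_2, \ldots$ forming an orthonormal system in $L^2(X)$.
- (iv) Define $K_{\hat{B}}(x, s) = \sum_{i=1}^{\infty} \hat{\lambda}_i \phi_i(x) \phi_i(s)$, where $\hat{\lambda}_i = \lambda_i$ if $\lambda_i \ge 0$, and otherwise $\hat{\lambda}_i = 0$. Let $T_{K_{\hat{B}}}$ be the Hilbert–Schmidt integral operator associated to the kernel $K_{\hat{B}}$. The eigenfunctions $\phi_i$ for $T_{K_B}$ (with eigenvalues $\lambda_i$) are also the eigenfunctions for $T_{K_{\hat{B}}}$ (with eigenvalues $\hat{\lambda}_i$). By Mercer’s Theorem (Theorem 3.5), $K_{\hat{B}}$ converges uniformly.
- (v) Define the MDS embedding of $X$ into $\ell^2$ via the map $f \colon X \to \ell^2$ given by
$$f(x) = \left(\sqrt{\hat{\lambda}_1}\, \phi_1(x),\ \sqrt{\hat{\lambda}_2}\, \phi_2(x),\ \sqrt{\hat{\lambda}_3}\, \phi_3(x),\ \ldots\right).$$
Similarly, define the MDS embedding of $X$ into $\mathbb{R}^m$ via the map $f_m \colon X \to \mathbb{R}^m$ given by
$$f_m(x) = \left(\sqrt{\hat{\lambda}_1}\, \phi_1(x),\ \sqrt{\hat{\lambda}_2}\, \phi_2(x),\ \ldots,\ \sqrt{\hat{\lambda}_m}\, \phi_m(x)\right).$$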
The procedure for infinite classical MDS can be summarized in the following algorithm.
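When the metric measure space has finitely many points with unequal weights, the integrals in this procedure become weighted sums and the eigenproblem for the integral operator can be solved with an ordinary symmetric eigensolver. The following is a minimal sketch under that assumption: the function name `weighted_mds`, the requirement that all weights be strictly positive, and the symmetrization by the square roots of the weights are our own choices for illustration, not the paper's algorithm.

```python
import numpy as np

def weighted_mds(D, w, m):
    """MDS of a finite metric measure space with point masses w (w > 0, sum(w) == 1)."""
    D = np.asarray(D, dtype=float)
    w = np.asarray(w, dtype=float)
    n = len(w)
    K_A = -0.5 * D**2                                  # kernel K_A = -d^2 / 2
    C = np.eye(n) - np.outer(np.ones(n), w)            # (C g)_i = g_i - sum_j w_j g_j
    K_B = C @ K_A @ C.T                                # weighted double mean-centering
    sqrt_w = np.sqrt(w)
    # The operator phi -> sum_j K_B[i, j] phi_j w_j is similar to the symmetric matrix S.
    S = sqrt_w[:, None] * K_B * sqrt_w[None, :]
    lam, U = np.linalg.eigh(S)
    order = np.argsort(lam)[::-1]
    lam, U = lam[order], U[:, order]
    phi = U / sqrt_w[:, None]                          # eigenfunctions, orthonormal in L^2(mu)
    lam_hat = np.clip(lam, 0.0, None)                  # discard negative eigenvalues
    F = phi[:, :m] * np.sqrt(lam_hat[:m])              # f_m(x_i) = (sqrt(lam_k) phi_k(x_i))_k
    return F, lam, phi

# Usage: unequally weighted points in the plane.
rng = np.random.default_rng(0)
pts = rng.normal(size=(7, 2))
D = np.sqrt(((pts[:, None] - pts[None]) ** 2).sum(-1))
w = rng.uniform(0.5, 1.5, size=7)
w /= w.sum()
F, lam, phi = weighted_mds(D, w, m=2)
D_rec = np.sqrt(((F[:, None] - F[None]) ** 2).sum(-1))
```

With equal weights $w_i = 1/n$ this reduces to classical MDS up to a rescaling of the eigenvalues by $n$. In the usage example the input is Euclidean, so `D_rec` agrees with `D` even though the weights are unequal.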
Proposition 4.2.
[13, Proposition 6.3.1.] The MDS embedding map is continuous.
The following theorem generalizes Theorem 2.3 to metric measure spaces.
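Theorem 4.3.
[13, Theorem 6.3.3.] A metric measure space $(X, d, \mu)$ is Euclidean if and only if $T_{K_B}$ is a positive semi-definite operator on $L^2(X, \mu)$.
We show that MDS for metric measure spaces minimizes the loss function $\mathrm{Strain}(f)$, defined as
$$\mathrm{Strain}(f) = \int_X \int_X \big(K_B(x, t) - K_{\hat{B}}(x, t)\big)^2 \mu(dt) \mu(dx),$$
where here $K_{\hat{B}}$ denotes the centered inner-product kernel of a map $f$ from $X$ into $\ell^2$ or $\mathbb{R}^m$.
This result generalizes [2, Theorem 14.4.2], or equivalently [24, Theorem 2], to the infinite case.
Theorem 4.4.
[13, Theorem 6.4.3.] Let $(X, d, \mu)$ be a metric measure space. Then $\mathrm{Strain}(f)$ is minimized over all maps $f \colon X \to \ell^2$ or $f \colon X \to \mathbb{R}^m$ when $f$ is the MDS embedding given in Section 4.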
5. MDS of the Circle
Let $S^1$ be the unit circle equipped with arc-length distance and the uniform measure (normalized so that $\mu(S^1) = 1$). Using our definition of MDS as an integral operator, we show that MDS maps $S^1$ into an infinite dimensional sphere of radius $\frac{\pi}{2}$ sitting inside $\ell^2$. The embedded circle occupies an infinite number of dimensions in $\ell^2$, and in fact, the infinite dimensional space is needed: the embedding into $\ell^2$ is better (in the sense of strain minimization) than the MDS embedding into $\mathbb{R}^m$ for any finite $m$.
It is instructive to consider how MDS on finite samples of $S^1$ converges to the MDS integral operator on the entire circle. We start with the easiest case: let $S^1_n$ be the sample of $n$ evenly-spaced points on $S^1$.
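Proposition 5.1.
The classical MDS embedding of $S^1_n$ lies, up to a rigid motion of $\mathbb{R}^m$, on the curve $\gamma_m \colon S^1 \to \mathbb{R}^m$ defined by
$$\gamma_m(\theta) = \big(a_1(n)\cos\theta,\ a_1(n)\sin\theta,\ a_3(n)\cos 3\theta,\ a_3(n)\sin 3\theta,\ a_5(n)\cos 5\theta,\ a_5(n)\sin 5\theta,\ \ldots\big) \in \mathbb{R}^m,$$
where $\lim_{n \to \infty} a_j(n) = \frac{\sqrt{2}}{j}$ (with $j$ odd).
Figure 4 shows the MDS configuration in $\mathbb{R}^3$ of 1000 equally-spaced points on $S^1$ obtained using the three largest positive eigenvalues.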
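We sketch the outline of this computation; full details are given in [13]. Let $D$ be the arc-length distance matrix for $S^1_n$. Following the steps of classical MDS, define $A = (a_{ij})$ with $a_{ij} = -\frac{1}{2} d_{ij}^2$, and let $B = HAH$ be the doubly mean-centered version of matrix $A$. A matrix $M$ is called circulant if cyclically shifting all rows of $M$ down by one has the same effect as cyclically shifting all columns of $M$ left by one. Both $A$ and the double mean centering matrix $H$ have this property, and therefore the MDS symmetric matrix $B$ is circulant. In coordinates, it has the following form:
$$B = \begin{pmatrix} b_0 & b_1 & b_2 & \cdots & b_{n-1} \\ b_{n-1} & b_0 & b_1 & \cdots & b_{n-2} \\ b_{n-2} & b_{n-1} & b_0 & \cdots & b_{n-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b_1 & b_2 & b_3 & \cdots & b_0 \end{pmatrix}.$$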
For example, if is the distance matrix for equally-spaced points on the circle, then we compute
[TABLE]
The complex eigenvectors of such a matrix are given by the discrete Fourier modes, namely $v_k = (1, \omega^k, \omega^{2k}, \ldots, \omega^{(n-1)k})$ for $k = 0, 1, \ldots, n-1$, where $\omega = e^{2\pi i / n}$. Since the first entry of each vector $v_k$ is one, the eigenvalue of $v_k$ can be computed simply by taking the dot product of the first row of $B$ with $v_k$. Note that the vector of all ones has eigenvalue zero.
Since $B$ is symmetric, each complex eigenvector can be split into its real and imaginary part, which forms two real eigenvectors; this explains the sine and cosine representation of eigenvectors in the proposition. It turns out that the odd Fourier modes have positive eigenvalues, and the even Fourier modes have negative eigenvalues. Since MDS retains coordinates corresponding to positive eigenvalues, we are left with only the odd Fourier modes.
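The circulant structure and the sign pattern of the eigenvalues can be checked numerically. The sketch below is an illustration under our own choices ($n = 200$ sample points, NumPy's FFT to take the dot product of the first row of the centered matrix with every Fourier mode at once); it is not the computation from [13]. Dividing the matrix eigenvalues by $n$ makes them comparable across sample sizes.

```python
import numpy as np

n = 200                                              # number of evenly-spaced points (our choice)
theta = 2 * np.pi * np.arange(n) / n
gap = np.abs(theta[:, None] - theta[None, :])
D = np.minimum(gap, 2 * np.pi - gap)                 # arc-length (geodesic) distance on the circle
A = -0.5 * D**2
H = np.eye(n) - np.ones((n, n)) / n
B = H @ A @ H                                        # doubly mean-centered matrix

# Circulant: shifting rows down by one equals shifting columns left by one.
is_circulant = np.allclose(np.roll(B, 1, axis=0), np.roll(B, -1, axis=1))

# Eigenvalue of the k-th Fourier mode = dot product of the first row of B with that mode.
# The first row is real and symmetric, so the sign convention of the FFT does not matter.
lam = np.fft.fft(B[0]).real
scaled = lam / n                                     # eigenvalues normalized by n

# Coefficients a_k(n) = sqrt(2 * lam_k / n) for the odd modes k = 1, 3, 5.
a = np.sqrt(2 * lam[[1, 3, 5]] / n)
```

For this sample size the normalized eigenvalues of modes $1, 2, 3, 4$ come out close to $1, -\frac{1}{4}, \frac{1}{9}, -\frac{1}{16}$, and the coefficients `a` are close to $\sqrt{2}/k$, consistent with Proposition 5.1.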
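How does this finite MDS computation compare to the MDS integral operator on all of $S^1$? Let $S^1$ be the unit circle with arc-length distance $d$ and uniform measure. If $k \in \mathbb{Z} \setminus \{0\}$, then one may check (use integration by parts) that
$$\int_0^{2\pi} -\tfrac{1}{2}\, d(\theta, \phi)^2\, e^{ik\phi}\, \frac{d\phi}{2\pi} = \frac{(-1)^{k+1}}{k^2}\, e^{ik\theta}.$$
Despite not having performed the double mean centering step to the kernel function, this computation shows that the (complex) eigenfunctions of MDS on $S^1$ are $e^{ik\theta}$ with $k \in \mathbb{Z}$, $k \ne 0$. Indeed, the mean centering step associates the constant eigenfunction $1$ with the eigenvalue $0$, and the other Fourier basis functions remain invariant to the double mean centering since they are perpendicular to $1$. Thus, as expected from Proposition 5.1, the MDS embedding of $S^1$ is
$$f(\theta) = \sqrt{2}\left(\cos\theta,\ \sin\theta,\ \tfrac{1}{3}\cos 3\theta,\ \tfrac{1}{3}\sin 3\theta,\ \tfrac{1}{5}\cos 5\theta,\ \tfrac{1}{5}\sin 5\theta,\ \ldots\right) \in \ell^2,$$
where the $\sqrt{2}$ is a normalization factor we picked up moving from a complex to a real eigendecomposition.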
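A couple of observations:
- (1) Applying the Euclidean distance formula to the image of $f$ shows that for all $\theta$,
$$\|f(\theta)\|^2 = 2 \sum_{k \text{ odd}} \frac{1}{k^2} = \frac{\pi^2}{4}.$$
That is, the MDS embedding lies on an infinite-dimensional sphere of radius $\frac{\pi}{2}$ in $\ell^2$.
- (2) The distance between $f(\theta)$ and $f(\phi)$ gives an approximation of the arc-length distance between angles $\theta$ and $\phi$:
$$\|f(\theta) - f(\phi)\|^2 = 4 \sum_{k \text{ odd}} \frac{1 - \cos(k(\theta - \phi))}{k^2}.$$
We leave it to the reader to verify that the expression above constitutes the odd modes in the Fourier series expansion of the periodic function $(\theta - \phi)^2$. In fact, the error of MDS comes precisely from the even modes:
$$d(\theta, \phi)^2 - \|f(\theta) - f(\phi)\|^2 = -\frac{\pi^2}{6} + 4 \sum_{k \text{ even},\ k \ge 2} \frac{\cos(k(\theta - \phi))}{k^2}.$$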
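Both observations can be sanity-checked by truncating the embedding to finitely many odd Fourier modes. The sketch below assumes the coordinates $\frac{\sqrt{2}}{k}(\cos k\theta, \sin k\theta)$ for odd $k$; the function name `circle_embedding`, the truncation level, and the two test angles are our own choices. The closed form $\pi\,|\theta - \phi|$ used for comparison follows from the classical Fourier series of $|t|$ on $[-\pi, \pi]$.

```python
import numpy as np

def circle_embedding(theta, num_modes):
    """Truncated embedding using the odd modes k = 1, 3, ..., 2*num_modes - 1."""
    k = np.arange(1, 2 * num_modes, 2)                       # odd frequencies
    t = np.atleast_1d(theta).astype(float)[:, None]          # shape (T, 1)
    pairs = np.stack([np.cos(k * t), np.sin(k * t)], axis=-1)  # shape (T, K, 2)
    coords = (np.sqrt(2) / k)[:, None] * pairs               # scale mode k by sqrt(2)/k
    return coords.reshape(len(t), -1)                        # (cos, sin) pairs interleaved

angles = [0.3, 1.5]                                          # arc-length distance 1.2
F = circle_embedding(angles, num_modes=5000)
radius_sq = float((F[0] ** 2).sum())                         # approaches pi^2 / 4
dist_sq = float(((F[0] - F[1]) ** 2).sum())                  # approaches pi * 1.2
```

The squared norm approaches $\pi^2/4$ regardless of the angle, and the squared embedded distance approaches $\pi$ times the arc-length distance, which agrees with the squared arc-length distance only at the extreme separations $0$ and $\pi$.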
For this example, the issue of convergence of MDS on finite samples to MDS on the manifold is intuitively clear: the discrete Fourier modes converge (pointwise on the sample points) to the Fourier basis $e^{ik\theta}$. However, in general the issue of convergence is not as straightforward. In the next section of the paper we survey results on convergence.
The MDS embeddings of the geodesic circle are closely related to [26], which was written prior to the invention of MDS. In [26, Theorem 1], von Neumann and Schoenberg describe (roughly speaking) which metrics on the circle one can isometrically embed into the Hilbert space $\ell^2$. The geodesic metric on the circle is not one of these metrics. However, the MDS embedding of the geodesic circle into $\ell^2$ must produce a metric on $S^1$ which is of the form described in [26, Theorem 1]. See also [28, Section 5] and [3, 6, 9].
6. Convergence of MDS
We saw in the prior section how MDS on an evenly-spaced sample from the geodesic circle generalizes to the MDS integral operator on the entire circle. In this section, we address convergence questions for MDS more generally. Convergence is well-understood when each metric space has the same finite number of points [22], but we are also interested in convergence when the number of points varies and is possibly infinite.
6.1. Robustness of MDS with Respect to Perturbations
In a series of papers [21, 22, 23], the authors consider the robustness of multidimensional scaling with respect to perturbations of the underlying dissimilarity or distance matrix, as illustrated in Figure 5. In particular, [22] gives quantitative control over the perturbation of the eigenvalues and vectors determining an MDS embedding in terms of the perturbations of the dissimilarities. These results build upon the fact that if $\lambda$ and $v$ are a simple (i.e., non-repeated) eigenvalue and eigenvector of an $n \times n$ matrix $B$, then one can control the change in $\lambda$ and $v$ upon a small symmetric perturbation of the entries in $B$.
Sibson’s perturbation analysis shows that if one has a converging sequence of dissimilarity matrices, then the corresponding MDS embeddings of points into Euclidean space also converge. In the following sections, we consider the convergence of MDS when the number of points is not fixed. Indeed, we study the convergence of MDS when the number of points is finite but tending to infinity, and alternatively also when the number of points is infinite at each stage in a converging sequence of metric measure spaces.
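To get a feel for this kind of stability, one can perturb a dissimilarity matrix slightly and watch how far the eigenvalues of the doubly centered matrix move. The sketch below does not reproduce Sibson's analysis; it only checks the generic Weyl inequality for symmetric matrices, which bounds every eigenvalue shift by the spectral norm of the perturbation of $B$. The point cloud, the noise level, and the helper name `centered_gram` are our assumptions.

```python
import numpy as np

def centered_gram(D):
    """B = H A H with A = -D^2 / 2."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ (-0.5 * D**2) @ H

rng = np.random.default_rng(1)
pts = rng.normal(size=(8, 3))
D = np.sqrt(((pts[:, None] - pts[None]) ** 2).sum(-1))

# Small symmetric perturbation with zero diagonal, so D_pert is again a dissimilarity matrix.
E = 1e-3 * rng.normal(size=(8, 8))
E = (E + E.T) / 2
np.fill_diagonal(E, 0.0)
D_pert = np.abs(D + E)

B, B_pert = centered_gram(D), centered_gram(D_pert)
shift = np.abs(np.linalg.eigvalsh(B_pert) - np.linalg.eigvalsh(B)).max()
bound = np.linalg.norm(B_pert - B, 2)                # spectral norm of the perturbation
```

Weyl's inequality guarantees `shift <= bound`; controlling the eigenvectors as well requires the simple-eigenvalue assumption mentioned above.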
6.2. Convergence of MDS by the Law of Large Numbers
Whereas Sibson’s perturbation analysis was for MDS on a fixed number of points, we now survey results on the convergence of MDS when $n$ points are sampled from a metric space according to a probability measure $\mu$, in the limit as $n \to \infty$, i.e. when more and more points are sampled. In [1], Bengio et al. study converging measures which are averages of Dirac delta functions, namely $\mu_n = \frac{1}{n} \sum_{i=1}^n \delta_{x_i}$, with all of the random points weighted equally (see Figure 6). Unsurprisingly, these results rely on the law of large numbers.
Consider a data set $x_1, \ldots, x_n$ sampled independent and identically distributed (i.i.d.) from an unknown probability measure $\mu$ on $X$. To generalize MDS, Bengio et al. define a corresponding data-dependent kernel that generalizes the mean centering matrix (as defined in Section 4). Consequently, they study the convergence of eigenvalues and eigenfunctions of the integral operator associated to the kernel as the number of sampled points increases, and they show the convergence of the MDS embeddings under desirable conditions. They use a fundamental result on the convergence of eigenvalues of this type of integral operator from [14].
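A small experiment illustrates this type of convergence on the geodesic circle: sample $n$ angles i.i.d. from the uniform measure, run classical MDS on the sample, and divide the eigenvalues by $n$. This is only an empirical sketch under our own choices (the sample sizes, the seed, and the function name `sampled_circle_spectrum`); it is not the kernel construction of [1]. The reference values $1, 1, \frac{1}{9}, \frac{1}{9}$ are the limiting values $1/k^2$ for the odd modes $k = 1, 3$, each appearing twice (cosine and sine).

```python
import numpy as np

def sampled_circle_spectrum(n, rng, num=4):
    """Top `num` eigenvalues of B, divided by n, for n i.i.d. uniform points on the circle."""
    theta = rng.uniform(0.0, 2 * np.pi, size=n)
    gap = np.abs(theta[:, None] - theta[None, :])
    D = np.minimum(gap, 2 * np.pi - gap)             # arc-length distance
    H = np.eye(n) - np.ones((n, n)) / n
    B = H @ (-0.5 * D**2) @ H
    lam = np.linalg.eigvalsh(B)[::-1]                # decreasing order
    return lam[:num] / n

rng = np.random.default_rng(0)
spectra = {n: sampled_circle_spectrum(n, rng) for n in (100, 400, 1000)}
```

As $n$ grows, the normalized eigenvalues in `spectra` should fluctuate less and less around the limiting values, with deviations on the order of $1/\sqrt{n}$.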
6.3. Convergence of MDS for Arbitrary Measures
In [13], we reprove the results of the previous section under a different setting which is more general in the sense that we allow for an arbitrary sequence of convergent measures, but which is easier in the sense that this sequence is fixed (i.e. deterministic, not random).
Indeed, let $X$ be a compact metric space. Suppose $\mu_n$ is an arbitrary sequence of probability measures on $X$ for all $n \in \mathbb{N}$, such that $\mu_n$ converges to $\mu$ in total variation as $n \to \infty$. Roughly speaking, this notion of convergence of measures implies the uniform convergence of integrals against bounded measurable functions. For example, a measure $\mu_n$ in this sequence may again be a sum of Dirac delta functions, although now the weights $w_i$ (with $\sum_i w_i = 1$) need not be identically equal to $\frac{1}{n}$ (Figure 7). Much more generally, the support of any $\mu_n$ is now allowed to be infinite, as illustrated in Figure 7. Following [1, 14], we give some first results towards showing that the MDS embeddings of $(X, d, \mu_n)$ converge to the MDS embedding of $(X, d, \mu)$ [13]. We similarly define a data-dependent kernel that generalizes the mean centering matrix (as defined in Section 4). It is important to note that these kernels depend on the measure on the space. We again show convergence of eigenfunctions and consequently of MDS embeddings.
6.4. Convergence of MDS with Respect to Gromov–Wasserstein Distance
We now consider the more general setting in which $(X_n, d_n, \mu_n)$ is an arbitrary sequence of metric measure spaces, converging to $(X, d, \mu)$ in the Gromov–Wasserstein distance, as illustrated in Figure 8 for the finite case and Figure 8 for the infinite case. We remark that $X_n$ need no longer equal $X$, nor even be a subset of $X$. Indeed, the metric $d_n$ on $X_n$ is allowed to be different from the metric $d$ on $X$. Sections 6.2 and 6.3 are the particular case when $(X_n, d_n) = (X, d)$ for all $n$, and the measures $\mu_n$ are converging to $\mu$. We now want to consider the case where the metric $d_n$ need no longer be equal to $d$.
The Wasserstein (or Kantorovich–Rubinstein) metric is a distance function defined between probability distributions on a given metric space $X$. Intuitively, if each distribution is viewed as a unit amount of “dirt” piled on $X$, the distance between two distributions is the minimum amount of work required to transform one pile of dirt into the other. More generally, the Gromov–Wasserstein distance between metric measure spaces takes into account not only the variation in measures, but also the variation in metrics between these spaces. Applications of the notion of Gromov–Wasserstein distance arise in shape and data analysis [18].
Conjecture 6.1.
Let $(X_n, d_n, \mu_n)$ for $n \in \mathbb{N}$ be a sequence of metric measure spaces that converges to $(X, d, \mu)$ in the Gromov–Wasserstein distance. Then the MDS embeddings converge.
Question 6.2.
Are there other notions of convergence of a sequence of arbitrary (possibly infinite) metric measure spaces to a limiting metric measure space that would imply that the MDS embeddings converge in some sense? We remark that one might naturally try to break this into two steps: first analyze which notions of convergence imply that the corresponding operators converge, and then analyze which notions of convergence on the operators imply that their eigendecompositions and MDS embeddings converge.
7. Conclusion
MDS is concerned with the problem of mapping the objects $x_1, \ldots, x_n$ to a configuration (or embedding) of points $f(x_1), \ldots, f(x_n)$ in $\mathbb{R}^m$ in such a way that the given dissimilarities $d_{ij}$ are well-approximated by the Euclidean distances between $f(x_i)$ and $f(x_j)$. We study a notion of MDS on metric measure spaces, which can be simply thought of as spaces of (possibly infinitely many) points equipped with some probability measure. We explain how MDS generalizes to metric measure spaces. Furthermore, we describe in a self-contained fashion an infinite analogue to the classical MDS algorithm. Indeed, classical multidimensional scaling can be described either as a strain-minimization problem, or as a linear algebra algorithm involving eigenvalues and eigenvectors. We describe how to generalize both of these formulations to metric measure spaces. We show that this infinite analogue minimizes a strain function similar to the strain function of classical MDS.
As a motivating example for convergence of MDS, we consider the MDS embeddings of the circle equipped with the (non-Euclidean) geodesic metric. By using the known eigendecomposition of circulant matrices, we identify the MDS embeddings of evenly-spaced points from the geodesic circle into $\mathbb{R}^m$, for all $m$. Indeed, the MDS embeddings of the geodesic circle are closely related to [26], which was written prior to the invention of MDS.
Lastly, we address convergence questions for MDS. Convergence is understood when each metric space in the sequence has the same finite number of points, or when each metric space has a finite number of points tending to infinity. We are also interested in notions of convergence when each metric space in the sequence has an arbitrary (possibly infinite) number of points. For instance, if a sequence of metric measure spaces $X_n$ converges to a fixed metric measure space $X$, then in what sense do the MDS embeddings of these spaces converge to the MDS embedding of $X$?
Several questions remain open. In particular, we would like to have a better understanding of the convergence of MDS under the least restrictive assumptions on a sequence of arbitrary (possibly infinite) metric measure spaces converging to a fixed metric measure space. Is there a version that holds under convergence in the Gromov–Wasserstein distance, which allows for distortion of both the metric and the measure simultaneously (see Conjecture 6.1 and Question 6.2)? Despite all of the work that has been done on MDS by a wide variety of authors, many interesting questions remain open (at least to us). For example, consider the MDS embeddings of the $n$-sphere for $n \ge 2$.
Question 7.1.
What are the MDS embeddings of the $n$-sphere $S^n$, equipped with the geodesic metric, into Euclidean space $\mathbb{R}^m$?
To our knowledge, the MDS embeddings of into are not understood for all positive integers except in the case of the circle, when . The above question is also interesting, even in the case of the circle, when the -sphere is not equipped with the uniform measure. As a specific case, what is the MDS embedding of into when the measure is not uniform on all of , but instead (for example) uniform with mass on the northern hemisphere, and uniform with mass on the southern hemisphere?
We note the work of Blumstein and Kvinge [4], where a finite group representation theoretic perspective on MDS is employed. Adapting these techniques to the analytical setting of compact Lie groups may prove fruitful for the case of infinite MDS on higher dimensional spheres.
We also note the work [5], where the theory of an MDS embedding into pseudo-Euclidean space is developed. In this setting, both positive and negative eigenvalues are used to create an embedding. In the example of embedding $S^1$, positive and negative eigenvalues occur in a one-to-one fashion. We wonder about the significance of the full spectrum of eigenvalues for the higher dimensional spheres.
8. Acknowledgements
We would like to thank Bailey Fosdick, Michael Kirby, Henry Kvinge, Facundo Mémoli, Louis Scharf, the students in Michael Kirby’s Spring 2018 class, and the Pattern Analysis Laboratory at Colorado State University for their helpful conversations and support throughout this project.
References
- [1] Yoshua Bengio, Olivier Delalleau, Nicolas Le Roux, Jean-François Paiement, Pascal Vincent, and Marie Ouimet. Learning eigenfunctions links spectral embedding and kernel PCA. Neural Computation, 16(10):2197–2219, 2004.
- [2] JM Bibby, JT Kent, and KV Mardia. Multivariate analysis, 1979.
- [3] Leonard Mascot Blumenthal. Theory and applications of distance geometry. Chelsea New York, 1970.
- [4] Mark Blumstein and Henry Kvinge. Letting symmetry guide visualization: Multidimensional scaling on groups. arXiv preprint arXiv:1812.03362, 2018.
- [5] Mark Blumstein and Louis Scharf. Pseudo Riemannian multidimensional scaling. 2019.
- [6] Eugène Bogomolny, Oriol Bohigas, and Charles Schmit. Spectral properties of distance matrices. Journal of Physics A: Mathematical and General, 36(12):3595, 2003.
- [7] Andreas Buja, Deborah F Swayne, Michael L Littman, Nathaniel Dean, Heike Hofmann, and Lisha Chen. Data visualization with multidimensional scaling. Journal of Computational and Graphical Statistics, 17(2):444–472, 2008.
- [8] Trevor F Cox and Michael AA Cox. Multidimensional scaling. CRC Press, 2000.
