How can classical multidimensional scaling go wrong?
Rishi Sonthalia, Gregory Van Buskirk, Benjamin Raichel, Anna C. Gilbert

TL;DR
This paper analyzes the limitations of classical multidimensional scaling (cMDS), especially on non-Euclidean data. It shows that increasing the embedding dimension can worsen embedding quality, and it proposes a new algorithm to improve the embeddings.
Contribution
The paper provides a theoretical error formula for cMDS, reveals counterintuitive degradation with higher dimensions, and introduces an efficient algorithm for better distance approximation.
Findings
Embedding quality can degrade as dimension increases with non-Euclidean metrics.
Classification accuracy decreases with higher embedding dimensions.
Proposed algorithm produces more stable embeddings with less accuracy loss.
Abstract
Given a matrix $D$ describing the pairwise dissimilarities of a data set, a common task is to embed the data points into Euclidean space. The classical multidimensional scaling (cMDS) algorithm is a widespread method to do this. However, theoretical analysis of the robustness of the algorithm and an in-depth analysis of its performance on non-Euclidean metrics is lacking. In this paper, we derive a formula, based on the eigenvalues of a matrix obtained from $D$, for the Frobenius norm of the difference between $D$ and the metric $D_{\mathrm{cmds}}$ returned by cMDS. This error analysis leads us to the conclusion that when the derived matrix has a significant number of negative eigenvalues, then $\|D - D_{\mathrm{cmds}}\|_F$, after initially decreasing, will eventually increase as we increase the dimension. Hence, counterintuitively, the quality of the embedding degrades as we increase the…
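The quantity discussed in the abstract can be explored numerically. The following is a minimal sketch of textbook cMDS (double centering followed by an eigendecomposition), not the paper's proposed algorithm or its error formula. The function names `classical_mds`, `pairwise_distances`, and `embedding_error` are hypothetical, and the input is assumed to be a symmetric matrix of non-squared dissimilarities.

```python
import numpy as np

def classical_mds(D, r):
    """Textbook cMDS: embed n points in R^r from an n x n dissimilarity matrix D.

    Assumption: D holds non-squared dissimilarities and is symmetric.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered squared dissimilarities
    evals, evecs = np.linalg.eigh(B)          # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]           # re-sort to descending order
    evals, evecs = evals[order], evecs[:, order]
    top = np.clip(evals[:r], 0.0, None)       # negative eigenvalues contribute nothing
    return evecs[:, :r] * np.sqrt(top)        # n x r coordinate matrix

def pairwise_distances(X):
    """Euclidean distance matrix of the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def embedding_error(D, r):
    """Frobenius norm of D minus the distance matrix of the r-dimensional embedding."""
    X = classical_mds(D, r)
    return np.linalg.norm(D - pairwise_distances(X), "fro")

# Sanity check on Euclidean input: points in the plane are recovered exactly at r = 2.
rng = np.random.default_rng(0)
points = rng.normal(size=(20, 2))
D_euclid = pairwise_distances(points)
error_r2 = embedding_error(D_euclid, 2)
```

To probe the behavior the abstract describes, one could evaluate `embedding_error(D, r)` for increasing `r` on a non-Euclidean dissimilarity matrix (for example, shortest-path distances on a graph) and inspect how the error changes with dimension.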
Taxonomy
Topics: Face and Expression Recognition · Neural Networks and Applications · Blind Source Separation Techniques
