Multidimensional Scaling of Noisy High Dimensional Data

Erez Peterfreund; Matan Gavish

arXiv:1801.10229·math.ST·February 1, 2018

TL;DR

This paper analyzes the limitations of classical Multidimensional Scaling (MDS) in high-dimensional noisy data environments, introduces an improved variant called MDS+ with optimal eigenvalue shrinkage, and demonstrates its superior embedding performance.

Contribution

The paper identifies the breakdown point of MDS under noise, derives an optimal eigenvalue shrinkage method (MDS+), and proves its asymptotic optimality and improved embedding quality.

Findings

01

As the ambient noise level increases, MDS suffers a sharp breakdown at a point that depends on the data dimension and the noise level.

02

MDS+ with eigenvalue shrinkage outperforms classical MDS in noisy high-dimensional settings.

03

MDS+ automatically determines the optimal embedding dimension without external estimates.

Abstract

Multidimensional Scaling (MDS) is a classical technique for embedding data in low dimensions, still in widespread use today. Originally introduced in the 1950s, MDS was not designed with high-dimensional data in mind; while it remains popular with data analysis practitioners, no doubt it should be adapted to the high-dimensional data regime. In this paper we study MDS under a modern setting, specifically high dimensions and ambient measurement noise. We show that, as the ambient noise level increases, MDS suffers a sharp breakdown that depends on the data dimension and noise level, and derive an explicit formula for this breakdown point in the case of white noise. We then introduce MDS+, an extremely simple variant of MDS, which applies a carefully derived shrinkage nonlinearity to the eigenvalues of the MDS similarity matrix. Under a loss function measuring the embedding quality,…
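
To make the pipeline described in the abstract concrete, here is a minimal sketch of classical MDS with an optional hook for altering the eigenvalues of the similarity matrix. The function name `classical_mds` and the `shrink` parameter are illustrative assumptions, not names from the paper. The hook accepts any user-supplied function; the paper's optimal shrinkage nonlinearity is not reproduced here, and leaving `shrink` unset gives plain classical MDS.

```python
import numpy as np

def classical_mds(D, dim, shrink=None):
    """Embed n points given an n-by-n matrix D of pairwise Euclidean distances.

    `shrink` is a hypothetical hook: a function applied to the top `dim`
    eigenvalues before the embedding is formed. It is a placeholder for an
    eigenvalue shrinker, not the formula derived in the paper.
    """
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    S = -0.5 * H @ (D ** 2) @ H               # MDS similarity (Gram) matrix
    evals, evecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:dim]       # indices of the largest `dim`
    evals, evecs = evals[top], evecs[:, top]
    if shrink is not None:
        evals = shrink(evals)                 # user-supplied nonlinearity
    evals = np.clip(evals, 0.0, None)         # guard against negative values
    return evecs * np.sqrt(evals)             # n-by-dim embedding
```

For noiseless points that truly lie in `dim` dimensions, the embedding returned with `shrink=None` reproduces the input distances up to rotation and translation. A shrinker that maps an eigenvalue to zero drops the corresponding embedding coordinate.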
