Compressibility Barriers to Neighborhood-Preserving Data Visualizations
Szymon Snoeck, Noah Bergam, Nakul Verma

TL;DR
This paper investigates the fundamental limits of visualizing high-dimensional data in low dimensions by analyzing embeddings of graphs that represent data neighborhoods, showing that preserving neighborhood structure typically requires a number of dimensions that grows logarithmically with the number of data points.
Contribution
It introduces a theoretical framework for understanding the dimensionality requirements for neighborhood-preserving embeddings, highlighting the inherent complexity and limitations of low-dimensional visualizations.
Findings
Most graphs require logarithmic dimension for neighbor preservation.
Sparse regular graphs need slightly lower dimension, but the requirement still grows with data size.
Embedding into normed spaces is exponentially harder, demanding linear dimension.
Abstract
To what extent is it possible to visualize high-dimensional data in two- or three-dimensional plots? We reframe this question in terms of embedding n-vertex graphs (representing the neighborhood structure of the input points) into metric spaces of low doubling dimension in such a way that keeps neighbors close and non-neighbors far. This notion of neighbor preservation can be understood as a considerably weaker embedding constraint than near-isometry, yet it is similarly demanding in terms of how the minimum required dimension scales with the number of points. We show that for an overwhelming fraction of graphs, a dimension logarithmic in n is both necessary and sufficient for neighbor preservation. Even sparse regular graphs, which represent more restricted neighborhood connectivity structures, typically require a dimension that still grows with n. The landscape changes dramatically when…
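As a rough illustration of the "neighbors close, non-neighbors far" requirement, the sketch below checks one simple formalization of it: every adjacent pair of vertices must be embedded strictly closer together than every non-adjacent pair. This formalization, the Euclidean target space, and the function name `is_neighbor_preserving` are assumptions made for illustration; the paper's exact definition (stated for metric spaces of low doubling dimension) may differ.

```python
import itertools
import math


def is_neighbor_preserving(edges, points):
    """Check a simple neighbor-preservation condition (an assumed formalization).

    edges  : iterable of (u, v) vertex pairs, vertices numbered 0..n-1
    points : list of coordinate tuples, points[i] is the image of vertex i

    Returns True if the largest distance between adjacent vertices is
    strictly smaller than the smallest distance between non-adjacent ones.
    """
    edge_set = {frozenset(e) for e in edges}
    max_neighbor = 0.0
    min_non_neighbor = math.inf
    for u, v in itertools.combinations(range(len(points)), 2):
        d = math.dist(points[u], points[v])  # Euclidean distance
        if frozenset((u, v)) in edge_set:
            max_neighbor = max(max_neighbor, d)
        else:
            min_non_neighbor = min(min_non_neighbor, d)
    return max_neighbor < min_non_neighbor


# A 4-cycle: placing it on a line fails, placing it on a unit square works.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
on_line = [(0.0,), (1.0,), (2.0,), (3.0,)]
on_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(is_neighbor_preserving(cycle, on_line))
print(is_neighbor_preserving(cycle, on_square))
```

On the line, the edge between vertices 3 and 0 is stretched to length 3 while the non-adjacent pairs sit at distance 2, so the condition fails; on the unit square, all edges have length 1 and both diagonals have length √2, so it holds. This small case only hints at the question the abstract poses, namely how the required dimension scales as the number of vertices grows.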
Taxonomy
Topics: Data Visualization and Analytics · Topological and Geometric Data Analysis · Computer Graphics and Visualization Techniques
