The bliss of dimensionality: how an unsupervised criterion identifies optimal low-resolution representations of high-dimensional datasets
Margherita Mele, Daniel Campos Moreno, and Raffaello Potestio

TL;DR
This paper validates an unsupervised, information-theoretic method for selecting optimal low-resolution representations of high-dimensional data, showing that its optima are consistent with distribution-based (KL divergence) optimality across synthetic, image-derived, and molecular dynamics datasets.
Contribution
It systematically compares the optima of the Relevance-Resolution (Res-Rel) framework with the discretization that minimizes the KL divergence, establishing their quantitative agreement in high-dimensional settings.
Findings
The Res-Rel optimality region contains the KL-optimal discretizations.
The -1 slope criterion closely matches the KL divergence minimum in high dimensions.
Validation across synthetic, image, and molecular datasets confirms the method's effectiveness.
Abstract
Selecting the optimal resolution for discretizing high-dimensional data is a central problem in physics and data analysis, particularly in unsupervised settings where the underlying distribution is unknown. The Relevance-Resolution (Res-Rel) framework addresses this issue through an information-theoretic trade-off between descriptive detail and statistical reliability. Here we provide a systematic validation of this approach by comparing its characteristic optima--maximum relevance and the -1 slope (information-theoretic) point--with the discretization that minimizes the Kullback-Leibler divergence from a known or physically motivated ground truth distribution. Across unstructured and structured synthetic datasets, Gaussian clones of MNIST, and molecular dynamics simulations of the alanine dipeptide, we find that as the dimensionality or informative content increases the KL-optimal…
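To make the trade-off concrete, the following is a minimal sketch of how resolution and relevance can be computed for one discretization of a sample. It assumes the definitions commonly used in the Res-Rel literature, which the abstract above does not spell out: resolution H[s] is the entropy of the empirical state frequencies k_s/N, and relevance H[k] is the entropy of the distribution k·m_k/N, where m_k is the number of states observed exactly k times. The function names `discretize` and `resolution_relevance`, the uniform hypercubic binning, and the Gaussian test data are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

import numpy as np


def discretize(x, width):
    """Map each row of x to the integer index tuple of its hypercubic bin.

    Uniform binning is an illustrative assumption; other discretization
    schemes (e.g. clustering) can be plugged in instead.
    """
    bins = np.floor(np.asarray(x, dtype=float) / width).astype(int)
    return [tuple(row) for row in bins]


def resolution_relevance(labels):
    """Return (H[s], H[k]) in nats for a sequence of hashable state labels."""
    n = len(labels)
    counts = Counter(labels)                      # k_s: occurrences of each state
    k_s = np.array(list(counts.values()), dtype=float)
    h_s = -np.sum((k_s / n) * np.log(k_s / n))    # resolution

    m = Counter(counts.values())                  # m_k: states seen exactly k times
    k = np.array(list(m.keys()), dtype=float)
    m_k = np.array(list(m.values()), dtype=float)
    p_k = k * m_k / n                             # fraction of samples in states of frequency k
    h_k = -np.sum(p_k * np.log(p_k))              # relevance
    return h_s, h_k


# Hypothetical usage: scan bin widths on synthetic Gaussian data.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 5))
for width in (0.25, 0.5, 1.0, 2.0, 4.0):
    h_s, h_k = resolution_relevance(discretize(x, width))
    print(f"width={width:<5} H[s]={h_s:.3f}  H[k]={h_k:.3f}")
```

Scanning the bin width traces out a curve in the (H[s], H[k]) plane. Under these assumed definitions, the two characteristic points named in the abstract are the maximum of H[k] and the point where the slope of H[k] against H[s] equals -1; locating the latter requires a numerical derivative along the curve, which this sketch leaves out.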
Taxonomy
Topics: Protein Structure and Dynamics · Origins and Evolution of Life · Nanopore and Nanochannel Transport Studies
