Information-Theoretic Quality Metric of Low-Dimensional Embeddings
Sebastián Gutiérrez-Bernal, Hector Medel Cobaxin, Abiel Galindo González

TL;DR
This paper introduces the Entropy Rank Preservation Measure (ERPM), an information-theoretic metric for evaluating low-dimensional embeddings, which complements existing geometric and distance-based metrics by quantifying information preservation at the neighborhood level.
Contribution
The paper proposes ERPM, a novel local metric based on Shannon entropy and stable rank, to assess information loss in embeddings, enhancing evaluation beyond traditional geometric criteria.
Findings
ERPM correlates strongly with Local Procrustes but reveals discrepancies in local regimes.
Distance-based metrics show low correlation with spectral and geometric measures.
ERPM effectively identifies neighborhoods with significant information loss.
Abstract
In this work we study the quality of low-dimensional embeddings from an explicitly information-theoretic perspective. We begin by noting that classical evaluation metrics such as stress, rank-based neighborhood criteria, or Local Procrustes quantify distortions in distances or in local geometries, but do not directly assess how much information is preserved when projecting high-dimensional data onto a lower-dimensional space. To address this limitation, we introduce the Entropy Rank Preservation Measure (ERPM), a local metric based on the Shannon entropy of the singular-value spectrum of neighborhood matrices and on the stable rank, which quantifies changes in uncertainty between the original representation and its reduced projection, providing neighborhood-level indicators and a global summary statistic. To validate the results of the metric, we compare its outcomes with the Mean…
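The abstract names two spectral ingredients for ERPM: the Shannon entropy of the singular-value spectrum of a neighborhood matrix, and the stable rank. The exact ERPM formula is not given in this excerpt, so the following is only a minimal sketch of those two ingredients, not the authors' implementation. Several choices are assumptions made for illustration: neighborhoods are built by brute-force k-nearest neighbors in the original space, each neighborhood matrix is mean-centered, singular values are normalized to sum to one before the entropy is taken, and the per-neighborhood score is a plain entropy difference. The function names (`spectral_entropy`, `stable_rank`, `local_entropy_change`) are hypothetical.

```python
import numpy as np


def singular_values(X):
    """Singular values of a neighborhood matrix (rows = points), after mean-centering.

    Centering is an assumption of this sketch.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    return np.linalg.svd(Xc, compute_uv=False)


def spectral_entropy(s, eps=1e-12):
    """Shannon entropy (in nats) of singular values normalized to sum to one.

    Normalizing the singular values (rather than their squares) is an assumption.
    """
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop zero entries so that 0 * log(0) is treated as 0
    return float(-(p * np.log(p)).sum())


def stable_rank(s, eps=1e-12):
    """Stable rank: squared Frobenius norm over squared spectral norm."""
    return float((s ** 2).sum() / (s.max() ** 2 + eps))


def knn_indices(X, k):
    """Brute-force k-nearest-neighbor indices; each row includes the point itself."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.argsort(d2, axis=1)[:, : k + 1]


def local_entropy_change(X_high, X_low, k=10):
    """Per-neighborhood spectral entropy of the original data minus that of the embedding.

    Neighborhoods are defined in the original space and reused for the embedding.
    This difference is a hypothetical stand-in for the paper's local indicator.
    """
    idx = knn_indices(X_high, k)
    out = np.empty(len(X_high))
    for i, nb in enumerate(idx):
        h_high = spectral_entropy(singular_values(X_high[nb]))
        h_low = spectral_entropy(singular_values(X_low[nb]))
        out[i] = h_high - h_low
    return out
```

Given an `(n, D)` array `X_high` and an `(n, d)` embedding `X_low` with rows in the same order, `local_entropy_change(X_high, X_low, k=10)` returns one value per point. Averaging those values would be one simple way to form a global summary, though the aggregation used in the paper is not specified in this excerpt.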
Taxonomy
Topics: Statistical Mechanics and Entropy · Data Quality and Management · Data Visualization and Analytics
