A Universal Nearest-Neighbor Estimator for Intrinsic Dimensionality
Eng-Jon Ong, Omer Bobrowski, Gesine Reinert, Primoz Skraba

TL;DR
This paper introduces a universal nearest-neighbor-based estimator for intrinsic dimensionality that converges to the true ID regardless of data distribution, outperforming existing methods in accuracy.
Contribution
We propose a novel ID estimator using nearest-neighbor distance ratios that is theoretically proven to be distribution-independent and achieves state-of-the-art empirical results.
Findings
Achieves state-of-the-art accuracy on benchmark datasets
Proves convergence to true ID regardless of data distribution
Performs well on real-world high-dimensional data
Abstract
Estimating the intrinsic dimensionality (ID) of data is a fundamental problem in machine learning and computer vision, providing insight into the true degrees of freedom underlying high-dimensional observations. Existing methods often rely on geometric or distributional assumptions and can fail badly when these assumptions are violated. In this paper, we introduce a novel ID estimator based on nearest-neighbor distance ratios that involves simple calculations and achieves state-of-the-art results. Most importantly, we provide a theoretical analysis proving that our estimator is universal: it converges to the true ID independently of the distribution generating the data. We present experimental results on benchmark manifolds and real-world datasets to demonstrate the performance of our estimator.
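The abstract does not give the estimator's formula, so the following is only an illustration of the general idea of estimating ID from nearest-neighbor distance ratios. It is a minimal sketch of a well-known ratio-based maximum-likelihood estimate (in the style of the TwoNN method), not the estimator proposed in this paper, and it assumes the density is roughly uniform at the scale of the two nearest neighbors. The function name `ratio_id_estimate` and the synthetic test data are hypothetical.

```python
import numpy as np


def ratio_id_estimate(points):
    """Illustrative ID estimate from ratios of 2nd to 1st nearest-neighbor distances.

    NOT the paper's estimator: this is the classic ratio-based MLE, which
    assumes locally uniform density around each point.
    """
    x = np.asarray(points, dtype=float)
    # Brute-force pairwise Euclidean distances (fine for a few thousand points).
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbors
    # After partitioning at index 1, columns 0 and 1 hold the two smallest distances.
    two_nearest = np.partition(dist, 1, axis=1)[:, :2]
    r1, r2 = two_nearest[:, 0], two_nearest[:, 1]
    keep = r1 > 0  # drop exact duplicates, where the ratio is undefined
    log_ratio = np.log(r2[keep] / r1[keep])
    # Under the local-uniformity assumption, r2/r1 is Pareto-distributed with
    # shape equal to the dimension, so the MLE is count / sum(log ratio).
    return keep.sum() / log_ratio.sum()


# Synthetic check (assumed data): a 2-D uniform patch linearly embedded in 5-D.
rng = np.random.default_rng(0)
latent = rng.uniform(size=(1000, 2))
embedding = rng.normal(size=(2, 5))
estimate = ratio_id_estimate(latent @ embedding)
print(f"estimated ID: {estimate:.2f}")
```

On this synthetic example the estimate typically lands near 2, the dimension of the latent patch, even though the points live in a 5-dimensional ambient space. Estimators of this kind depend on the local-uniformity assumption, which is the sort of distributional assumption the abstract says the proposed estimator avoids.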
Taxonomy
Topics: Morphological variations and asymmetry · Face and Expression Recognition · Face recognition and analysis
