Intrinsic Dimension of Geometric Data Sets
Tom Hanika, Friedrich Martin Schneider, Gerd Stumme

TL;DR
This paper introduces a new geometric framework for understanding the intrinsic dimension of data sets, linking measure concentration and geometric analysis to better address the curse of dimensionality in machine learning.
Contribution
It develops an axiomatic approach to intrinsic dimension using Gromov's metric measure geometry, providing a computationally feasible and adaptable model for data analysis.
Findings
Proposes a new geometric model for data sets and their intrinsic dimension.
Establishes a concrete dimension function with desired properties.
Demonstrates the model's applicability through experiments.
Abstract
The curse of dimensionality is a phenomenon frequently observed in machine learning (ML) and knowledge discovery (KD). There is a large body of literature investigating its origin and impact, using methods from mathematics as well as from computer science. Among the mathematical insights into data dimensionality, there is an intimate link between the dimension curse and the phenomenon of measure concentration, which makes the former accessible to methods of geometric analysis. The present work provides a comprehensive study of the intrinsic geometry of a data set, based on Gromov's metric measure geometry and Pestov's axiomatic approach to intrinsic dimension. In detail, we define a concept of geometric data set and introduce a metric as well as a partial order on the set of isomorphism classes of such data sets. Based on these objects, we propose and investigate an axiomatic approach…
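The link between the dimension curse and measure concentration mentioned in the abstract can be illustrated with a small numerical sketch. The example below is not the paper's dimension function or geometric model; it is a hypothetical toy experiment, and it assumes points drawn uniformly from the unit cube. The helper name `relative_distance_spread` and its parameters are introduced here for illustration only. It measures how the spread of pairwise Euclidean distances, relative to their mean, shrinks as the ambient dimension grows.

```python
import math
import random
import statistics


def relative_distance_spread(dim, n_points=60, seed=0):
    """Return std/mean of pairwise Euclidean distances.

    Points are sampled uniformly from [0, 1]^dim (an assumption made
    for this illustration, not a setting taken from the paper).
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    # All unordered pairs (i, j) with i < j.
    dists = [
        math.dist(points[i], points[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    ]
    return statistics.pstdev(dists) / statistics.mean(dists)


if __name__ == "__main__":
    # The ratio should decrease as the dimension increases, i.e. distances
    # become nearly indistinguishable from one another.
    for dim in (2, 20, 200):
        print(dim, round(relative_distance_spread(dim), 3))
```

A small ratio means that almost all pairs of points sit at roughly the same distance, which is one common way the concentration phenomenon shows up in practice and why distance-based methods can degrade in high dimensions.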
