
TL;DR
This paper explores how the performance degradation of high-dimensional indexing schemes for similarity search relates to measure concentration phenomena and VC theory, providing a theoretical framework for understanding these effects.
Contribution
It connects the concepts of measure concentration and VC theory to explain the limitations of indexing schemes in high-dimensional spaces.
Findings
Performance degradation linked to measure concentration
Theoretical insights into high-dimensional indexing limitations
Connection between VC theory and similarity search performance
Abstract
Degrading performance of indexing schemes for exact similarity search in high dimensions has long since been linked to histograms of distributions of distances and other 1-Lipschitz functions getting concentrated. We discuss this observation in the framework of the phenomenon of concentration of measure on the structures of high dimension and the Vapnik-Chervonenkis theory of statistical learning.
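The concentration of distance histograms mentioned in the abstract can be seen in a small numerical experiment. The sketch below is an illustration only and is not taken from the paper: it assumes points drawn uniformly from the unit cube [0,1]^d and the Euclidean metric, and the function name relative_spread, the sample size, and the seed are hypothetical choices. It measures the ratio of the standard deviation to the mean of distances from a random query to a random sample. Under these assumptions the ratio shrinks as the dimension grows, which means nearly all points lie at about the same distance from the query and distance-based pruning has little to work with.

```python
import math
import random
import statistics


def relative_spread(dim, n_points=200, seed=0):
    """Return std/mean of Euclidean distances from one random query
    to n_points random points, all uniform in [0,1]^dim.

    Illustrative assumption: uniform data and Euclidean distance;
    the paper treats general metric spaces with measure.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    # Population standard deviation relative to the mean distance.
    return statistics.pstdev(dists) / statistics.mean(dists)


if __name__ == "__main__":
    # The printed ratio should decrease as the dimension increases.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_spread(dim), 3))
```

A small ratio corresponds to a narrow histogram of distances: the nearest and farthest neighbours of the query become hard to tell apart, which is the informal reason indexing schemes for exact search tend to degrade toward a linear scan in high dimensions.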
