Geometric Interpretations of the $k$-Nearest Neighbour Distributions
Kwanit Gangopadhyay, Arka Banerjee, Tom Abel

TL;DR
This paper explores the geometric interpretations of $k$-Nearest Neighbour CDFs, establishing their equivalence to sphere volumes and intersections, and compares their information content and computational efficiency with Minkowski Functionals for analyzing spatial data.
Contribution
It provides a detailed geometric interpretation of $k$NN CDFs, linking them to sphere volumes and intersections, and demonstrates that they carry information comparable to Minkowski Functionals while being faster to compute.
Findings
$k$NN CDFs are equivalent to sphere volume intersections around data points.
The derivatives of $k$NN CDFs contain geometric information similar to Minkowski Functionals.
$k$NN CDFs are computationally more efficient while retaining similar information content.
Abstract
The $k$-Nearest Neighbour Cumulative Distribution Functions ($k$NN CDFs) are measures of clustering for discrete datasets that are fast and efficient to compute. They are significantly more informative than the 2-point correlation function. Their connection to $N$-point correlation functions, void probability functions and Counts-in-Cells is known. However, the connections between the CDFs and other geometric and topological spatial summary statistics are yet to be fully explored in the literature. This understanding will be crucial to find optimally informative summary statistics to analyse data from stage 4 cosmological surveys. We explore quantitatively the geometric interpretations of the $k$NN CDF summary statistics. We establish an equivalence between the 1NN CDF at radius $r$ and the volume of spheres with the same radius around the data points. We show that higher $k$NN CDFs are equivalent…
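The equivalence stated in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' code: it assumes the CDF is measured from uniformly random query points in a non-periodic unit box, uses brute-force distances instead of a tree, and every name in it (`knn_cdf_and_coverage`, `data`, `queries`, `results`) is hypothetical. For each query point it checks two things: whether the $k$-th nearest data point lies within $r$ (the $k$NN CDF), and whether the query point falls inside at least $k$ spheres of radius $r$ centred on data points (the covered volume fraction). The two conditions are the same event, so the two estimates should coincide.

```python
import math
import random


def knn_cdf_and_coverage(data, queries, k, r):
    """Estimate the kNN CDF at radius r and the volume fraction covered
    by at least k spheres of radius r centred on the data points."""
    cdf_hits = 0
    cover_hits = 0
    for q in queries:
        dists = sorted(math.dist(q, p) for p in data)
        # kNN CDF view: is the k-th nearest data point within r of q?
        if dists[k - 1] <= r:
            cdf_hits += 1
        # Geometric view: does q lie inside at least k spheres of radius r?
        n_spheres = sum(1 for d in dists if d <= r)
        if n_spheres >= k:
            cover_hits += 1
    n = len(queries)
    return cdf_hits / n, cover_hits / n


# Assumption: a uniform random point set in the unit cube, for illustration only.
rng = random.Random(0)
data = [(rng.random(), rng.random(), rng.random()) for _ in range(200)]
queries = [(rng.random(), rng.random(), rng.random()) for _ in range(2000)]

results = {}
for k in (1, 2, 3):
    for r in (0.05, 0.10, 0.15):
        results[(k, r)] = knn_cdf_and_coverage(data, queries, k, r)
        cdf, cover = results[(k, r)]
        print(f"k={k} r={r:.2f}  kNN CDF={cdf:.3f}  covered fraction={cover:.3f}")
```

For $k = 1$ the covered region is the union of the spheres; for larger $k$ it is the region where at least $k$ spheres overlap. For realistic dataset sizes a spatial tree (for example a k-d tree) would replace the brute-force distance loop.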
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models · Data Management and Algorithms
