The emptiness inside: Finding gaps, valleys, and lacunae with geometric data analysis
Gabriella Contardo, David W. Hogg, Jason A.S. Hunt, Joshua E.G. Peek, Yen-Chi Chen

TL;DR
This paper introduces a new statistical method to identify and characterize gaps and under-dense regions in high-dimensional data, demonstrated on stellar velocity distributions in the Milky Way.
Contribution
The paper presents a density-based statistic leveraging the gradient and Hessian to detect complex-shaped gaps in data without optimization, applicable in various scientific fields.
Findings
Effective detection of gaps in stellar velocity data.
Method highlights under-dense regions of arbitrary shape.
Provides practical implementation guidance.
Abstract
Discoveries of gaps in data have been important in astrophysics. For example, there are kinematic gaps opened by resonances in dynamical systems, or exoplanets of a certain radius that are empirically rare. A gap in a data set is a kind of anomaly, but in an unusual sense: Instead of being a single outlier data point, situated far from other data points, it is a region of the space, or a set of points, that is anomalous compared to its surroundings. Gaps are both interesting and hard to find and characterize, especially when they have non-trivial shapes. We present in this paper a statistic that can be used to estimate the (local) "gappiness" of a point in the data space. It uses the gradient and Hessian of the density estimate (and thus requires a twice-differentiable density estimator). This statistic can be computed at (almost) any point in the space and does not rely on…
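To make the ingredients named in the abstract concrete, here is a minimal sketch of how a gradient and Hessian can be obtained from a twice-differentiable density estimate and combined into a local score. Everything specific in it is an assumption made for illustration, not something stated above: the isotropic Gaussian kernel, the bandwidth `h`, the function names `kde_derivatives` and `gappiness_sketch`, and in particular the choice to score a point by the largest eigenvalue of the Hessian restricted to directions orthogonal to the gradient. The paper's actual statistic may differ in definition and normalization.

```python
import numpy as np


def kde_derivatives(x, data, h):
    """Density, gradient, and Hessian of an isotropic Gaussian KDE at point x."""
    n, d = data.shape
    diff = x - data                                    # (n, d) offsets x - x_i
    norm = (2.0 * np.pi * h**2) ** (-d / 2.0)
    k = norm * np.exp(-0.5 * np.sum(diff**2, axis=1) / h**2)  # kernel values
    density = k.mean()
    grad = -(k[:, None] * diff).mean(axis=0) / h**2
    outer = np.einsum("n,ni,nj->ij", k, diff, diff) / n
    hess = outer / h**4 - density * np.eye(d) / h**2
    return density, grad, hess


def gappiness_sketch(x, data, h):
    """Illustrative score (an assumption, not the paper's definition):
    largest Hessian eigenvalue in the subspace orthogonal to the gradient.
    Assumes data of dimension d >= 2."""
    _, grad, hess = kde_derivatives(np.asarray(x, dtype=float), data, h)
    if np.linalg.norm(grad) == 0.0:
        # No gradient direction to remove; fall back to the full Hessian.
        return np.linalg.eigvalsh(hess).max()
    # Rows 1.. of vt form an orthonormal basis orthogonal to the gradient.
    _, _, vt = np.linalg.svd(grad[None, :], full_matrices=True)
    basis = vt[1:]
    restricted = basis @ hess @ basis.T
    return np.linalg.eigvalsh(restricted).max()


# Toy data: two parallel ridges at x = -2 and x = +2 with an empty strip between.
ys = np.linspace(0.0, 6.0, 61)
data = np.array([(sx, y) for sx in (-2.0, 2.0) for y in ys])

gap_score = gappiness_sketch([0.0, 1.0], data, h=1.0)    # inside the empty strip
ridge_score = gappiness_sketch([2.0, 1.0], data, h=1.0)  # on one of the ridges
print(gap_score, ridge_score)
```

On this toy data the score is positive in the empty strip, where the density curves upward across the gap, and negative on a ridge, where it curves downward. The derivatives are written out analytically so that only NumPy is needed; an automatic-differentiation library would serve equally well for other differentiable estimators.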
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Applications · Genetic and phenotypic traits in livestock
