Stability of Density-Based Clustering
Alessandro Rinaldo, Aarti Singh, Rebecca Nugent, Larry Wasserman

TL;DR
This paper investigates the stability of density-based clustering methods by analyzing how estimates of density level sets and cluster trees vary with the kernel bandwidth, providing theoretical insights into their instability.
Contribution
It introduces two measures of instability for density level set and cluster tree estimates and studies their theoretical properties as functions of the bandwidth.
Findings
Instability measures depend on bandwidth choice
Theoretical bounds for instability are derived
Guidelines for selecting stable clustering parameters
Abstract
High density clusters can be characterized by the connected components of a level set $L(\lambda)$ of the underlying probability density function $p$ generating the data, at some appropriate level $\lambda$. The complete hierarchical clustering can be characterized by a cluster tree $\mathcal{T}$. In this paper, we study the behavior of a density level set estimate $\widehat{L}(\lambda)$ and cluster tree estimate $\widehat{\mathcal{T}}$ based on a kernel density estimator with kernel bandwidth $h$. We define two notions of instability to measure the variability of $\widehat{L}(\lambda)$ and $\widehat{\mathcal{T}}$ as a function of $h$, and investigate the theoretical properties of these instability measures.
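To make the objects in the abstract concrete, the following is a minimal one-dimensional sketch in Python. It is not the paper's construction: the function names, the Gaussian kernel, the synthetic two-component data, the level `lam = 0.05`, and the `toy_instability` score (the fraction of grid points on which level set estimates from two halves of the data disagree) are all assumptions chosen for illustration. The paper's own instability measures are defined differently and should be taken from the paper itself.

```python
import numpy as np

def kde(sample, grid, h):
    """Gaussian kernel density estimate with bandwidth h, evaluated on grid."""
    z = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

def level_set(sample, grid, h, lam):
    """Boolean mask of grid points where the estimated density exceeds lam."""
    return kde(sample, grid, h) > lam

def count_components(mask):
    """Number of maximal runs of True in a 1-D mask (connected components)."""
    padded = np.concatenate(([False], mask))
    return int(np.sum(padded[1:] & ~padded[:-1]))

def toy_instability(x, y, grid, h, lam):
    """Illustrative proxy only: fraction of grid points where the level set
    estimates built from samples x and y disagree (symmetric difference)."""
    return float(np.mean(level_set(x, grid, h, lam) ^ level_set(y, grid, h, lam)))

# Hypothetical data: two well-separated Gaussian clusters, split into halves.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])
rng.shuffle(data)
x, y = data[:200], data[200:]
grid = np.linspace(-5, 5, 1001)
lam = 0.05

# Sweep the bandwidth and report component count and disagreement.
for h in (0.05, 0.2, 0.5, 1.0):
    n_comp = count_components(level_set(x, grid, h, lam))
    print(h, n_comp, round(toy_instability(x, y, grid, h, lam), 3))
```

Sweeping `h` this way shows how both the number of connected components and the disagreement between the two half-sample estimates change with the bandwidth, which is the kind of dependence the instability measures are meant to quantify.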
