Recovery Guarantees for Kernel-based Clustering under Non-parametric Mixture Models
Leena Chennuru Vankadara, Sebastian Bordt, Ulrike von Luxburg, Debarghya Ghoshdastidar

TL;DR
This paper establishes statistical guarantees for kernel-based clustering under non-parametric mixture models, identifying conditions for consistent recovery of true clusters without assuming specific distributional forms.
Contribution
It provides necessary and sufficient separability conditions for kernel clustering and links kernel clustering to density-based methods, enabling broader applicability.
Findings
Identifies separability conditions for consistent clustering
Establishes an equivalence between kernel-based data clustering and kernel density-based clustering
Provides guidelines for choosing kernel bandwidth
Abstract
Despite the ubiquity of kernel-based clustering, surprisingly few statistical guarantees exist beyond settings that consider strong structural assumptions on the data generation process. In this work, we take a step towards bridging this gap by studying the statistical performance of kernel-based clustering algorithms under non-parametric mixture models. We provide necessary and sufficient separability conditions under which these algorithms can consistently recover the underlying true clustering. Our analysis provides guarantees for kernel clustering approaches without structural assumptions on the form of the component distributions. Additionally, we establish a key equivalence between kernel-based data-clustering and kernel density-based clustering. This enables us to provide consistency guarantees for kernel-based estimators of non-parametric mixture models. Along with theoretical…
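As a concrete point of reference for the kind of algorithm the abstract discusses, the following is a minimal sketch of kernel k-means with a Gaussian kernel. It is one common instance of kernel-based clustering and is not necessarily the exact procedure analysed in the paper. The function names (`gaussian_kernel`, `kernel_kmeans`), the `init_labels` parameter, the bandwidth value, and the toy data are all illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(X, bandwidth):
    """Gaussian (RBF) kernel matrix; `bandwidth` is an assumed tuning parameter."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def kernel_kmeans(K, n_clusters, init_labels=None, n_iter=50, seed=0):
    """Lloyd-style kernel k-means on a precomputed kernel matrix K (illustrative)."""
    n = K.shape[0]
    if init_labels is None:
        rng = np.random.default_rng(seed)
        labels = rng.integers(n_clusters, size=n)
    else:
        labels = np.asarray(init_labels).copy()
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            if not mask.any():
                continue  # an empty cluster stays at infinite distance
            # Squared feature-space distance to the cluster mean:
            # K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl, with j, l in cluster c
            dist[:, c] = (
                diag
                - 2.0 * K[:, mask].mean(axis=1)
                + K[np.ix_(mask, mask)].mean()
            )
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels


# Toy data (an assumption for illustration): two well-separated 1-D groups.
group_a = np.linspace(-0.2, 0.2, 20)
group_b = 10.0 + np.linspace(-0.2, 0.2, 20)
X = np.concatenate([group_a, group_b]).reshape(-1, 1)

# Deliberately imperfect starting assignment: each cluster mixes both groups.
init = np.array([0] * 15 + [1] * 5 + [0] * 5 + [1] * 15)

K = gaussian_kernel(X, bandwidth=1.0)
labels = kernel_kmeans(K, n_clusters=2, init_labels=init)
```

On data this well separated, the iteration moves each point to the cluster that already holds most of its group. How much separation is needed in general is the kind of question the paper's separability conditions address.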
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Gaussian Processes and Bayesian Inference
