Provable Imbalanced Point Clustering
David Denisov, Dan Feldman, Shlomi Dolev, and Michael Segal

TL;DR
This paper introduces efficient, provable methods for imbalanced point clustering using coresets, providing theoretical guarantees and empirical validation across various datasets.
Contribution
It presents novel coreset-based algorithms for imbalanced clustering with provable approximation guarantees and demonstrates their effectiveness through experiments.
Findings
Effective clustering on real and synthetic data
Coreset methods achieve approximation guarantees
Choice clustering improves performance
Abstract
We suggest efficient and provable methods to compute an approximation for imbalanced point clustering, that is, fitting k-centers to a set of points in R^d, for any d, k ≥ 1. To this end, we utilize coresets, which, in the context of the paper, are essentially weighted sets of points in R^d that approximate the fitting loss for every model in a given set, up to a multiplicative factor of 1 ± ε. We provide experiments (Section 3 and Section E in the appendix) that show the empirical contribution of our suggested methods for real images (novel and reference), synthetic data, and real-world data. We also propose choice clustering, which by combining clustering algorithms yields better performance than each one separately.
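To make the coreset idea in the abstract concrete, the following minimal sketch compares a fitting loss on a full point set with the same loss on a small weighted subset. Everything in it is an illustrative assumption rather than the paper's method: the loss is taken to be the sum of squared distances to the nearest center, the subset is a plain uniform sample with weights n/m (which carries no provable guarantee), and the names `weighted_cost` and `uniform_weighted_sample` are hypothetical.

```python
import random


def weighted_cost(points, weights, centers):
    """Sum over points of weight * squared distance to the nearest center.

    Assumption: this k-means-style cost stands in for the paper's loss.
    """
    total = 0.0
    for p, w in zip(points, weights):
        nearest = min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        total += w * nearest
    return total


def uniform_weighted_sample(points, m, rng):
    """Pick m points uniformly without replacement; each gets weight n/m.

    Assumption: a naive stand-in for a coreset, NOT a provable construction.
    """
    n = len(points)
    return rng.sample(points, m), [n / m] * m


rng = random.Random(0)

# Imbalanced toy data: 950 points near (0, 0) and 50 points near (10, 10).
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(950)]
points += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(50)]

# One candidate model: a set of k = 2 centers.
centers = [(0.0, 0.0), (10.0, 10.0)]

full_cost = weighted_cost(points, [1.0] * len(points), centers)
subset, subset_weights = uniform_weighted_sample(points, 200, rng)
subset_cost = weighted_cost(subset, subset_weights, centers)

# A coreset guarantee would bound this ratio within [1 - eps, 1 + eps]
# for every model in the given set, not just for this one.
ratio = subset_cost / full_cost
print(f"full cost: {full_cost:.1f}, weighted-subset cost: {subset_cost:.1f}, ratio: {ratio:.3f}")
```

A uniform sample can miss the small cluster entirely when the data is heavily imbalanced, so its ratio can be poor for some choices of centers; a guarantee that holds for every model in the set is what separates a coreset from a sample like this one.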
Taxonomy
Topics: Data Management and Algorithms · Anomaly Detection Techniques and Applications
Methods: Sparse Evolutionary Training
