On Tight Robust Coresets for $k$-Medians Clustering
Lingxiao Huang, Zhenyu Jiang, Yi Li, Xuan Wu

TL;DR
This paper develops tight coreset constructions for robust $k$-medians clustering with outliers across various metric spaces, achieving near-optimal sizes and extending to $(k,z)$-clustering with improved bounds.
Contribution
It introduces new coreset constructions with optimal or near-optimal sizes for robust $k$-medians and $(k,z)$-clustering in multiple metric spaces, using novel dataset decomposition techniques.
Findings
Coreset size $O(m) + \tilde{O}(kd\,\text{...})$ for bounded VC/doubling spaces
Improved coreset size $O(m\,\text{...}) + \tilde{O}(\min\{k^{4/3}\,\text{...},\ k\,\text{...}\})$ for Euclidean spaces
Extensions to robust $(k,z)$-clustering with improved bounds
Abstract
This paper considers coresets for the robust $k$-medians problem with $m$ outliers, and new constructions in various metric spaces are obtained. Specifically, for metric spaces with a bounded VC or doubling dimension $d$, the coreset size is $O(m) + \tilde{O}(kd\,\text{...})$, which is optimal up to logarithmic factors. For Euclidean spaces, the coreset size is $O(m\,\text{...}) + \tilde{O}(\min\{k^{4/3}\,\text{...},\ k\,\text{...}\})$, improving upon a recent result by Jiang and Lou (ICALP 2025). These results also extend to robust $(k,z)$-clustering, yielding, for VC and doubling dimension, a coreset size with the optimal linear dependence on $m$. This extended result improves upon the earlier work of Huang et al. (SODA 2025). The techniques introduce novel dataset decompositions, enabling chaining arguments to be applied…
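For readers new to the objective, the robust $k$-medians cost can be sketched as follows. This is only the standard definition of the objective (sum of distances to the nearest center after discarding the $m$ farthest points), not the paper's coreset construction. The function name `robust_k_medians_cost` and the toy data are hypothetical, chosen for illustration, and Euclidean distance is assumed.

```python
import math

def robust_k_medians_cost(points, centers, m):
    """Robust k-medians cost: sum of each point's distance to its nearest
    center, after discarding the m points with the largest such distance.

    Assumption: Euclidean distance; the paper also covers other metric spaces.
    """
    # Distance from every point to its nearest center, in ascending order.
    dists = sorted(min(math.dist(p, c) for c in centers) for p in points)
    # Drop the m largest distances (the outliers) and sum the rest.
    keep = max(len(dists) - m, 0)
    return sum(dists[:keep])

# Toy data (hypothetical): three points near the origin and two far outliers.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (100.0, 100.0)]
centers = [(0.0, 0.0)]

cost_with_outliers_removed = robust_k_medians_cost(points, centers, m=2)
cost_plain = robust_k_medians_cost(points, centers, m=0)
```

With `m=2` the two far points are discarded, so only the three nearby points contribute to the cost. A coreset for this problem is a small weighted subset whose robust cost approximates that of the full dataset for every choice of centers.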
Taxonomy
Topics: Face and Expression Recognition · Multi-Criteria Decision Making
