TL;DR
This paper introduces a new coreset construction for the robust geometric median problem that removes the dependence of coreset size on the number of outliers, yielding smaller coresets with strong theoretical guarantees and improved empirical performance.
Contribution
The authors develop a coreset construction with size independent of outlier count, extending to robust clustering in metric spaces, and introduce a novel error analysis technique.
Findings
Coreset size is independent of the number of outliers m when the dataset size n satisfies n ≥ 4m.
Achieves an optimal coreset size of $\tilde{\Theta}\left(\varepsilon^{-1/2} + \frac{m}{n}\,\varepsilon^{-1}\right)$ in 1D.
Empirically outperforms existing methods in size-accuracy tradeoffs and runtime.
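As a rough reading of the 1D bound, ignoring the polylogarithmic factors hidden by the tilde, the outlier term $\frac{m}{n}\varepsilon^{-1}$ overtakes the $\varepsilon^{-1/2}$ term once the outlier fraction $m/n$ exceeds $\varepsilon^{1/2}$. The numbers in the comment are illustrative assumptions, not values from the paper.

```latex
\[
\frac{m}{n}\,\varepsilon^{-1} \;\ge\; \varepsilon^{-1/2}
\quad\Longleftrightarrow\quad
\frac{m}{n} \;\ge\; \varepsilon^{1/2}.
\]
% Illustrative example (assumed values): \varepsilon = 0.01 and m/n = 0.25
% give \varepsilon^{-1/2} = 10 and (m/n)\,\varepsilon^{-1} = 25,
% so the outlier term dominates in this regime.
```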
Abstract
We study the robust geometric median problem in Euclidean space $\mathbb{R}^d$, with a focus on coreset construction. A coreset is a compact summary of a dataset of size $n$ that approximates the robust cost for all centers within a multiplicative error $\varepsilon$. Given an outlier count $m$, we construct a coreset whose size is independent of $m$ when $n \geq 4m$, eliminating the dependency on $m$ present in prior work [Huang et al., 2022 & 2023]. For the special case of $d = 1$, we achieve an optimal coreset size of $\tilde{\Theta}\left(\varepsilon^{-1/2} + \frac{m}{n}\,\varepsilon^{-1}\right)$, revealing a clear separation from the vanilla case studied in [Huang et al., 2023; Afshani and Chris, 2024]. Our results further extend to robust clustering in various metric spaces, eliminating the $m$-dependence under mild data assumptions. The…
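To make the coreset guarantee concrete, the following minimal sketch evaluates a robust cost and the relative error of a weighted summary at one center. It assumes the common definition in which the robust cost is the sum of distances to the center after discarding the m farthest points, and, for weighted summaries, after discarding total weight m from the farthest points (fractionally if needed). The function names `robust_cost` and `relative_error` are hypothetical, and the summary used in the example simply merges duplicate points; it is not the paper's construction.

```python
import numpy as np


def robust_cost(points, center, m, weights=None):
    """Weighted sum of distances to `center` after discarding total
    weight `m` from the farthest points (assumed definition)."""
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    if weights is None:
        weights = np.ones(len(points))
    weights = np.asarray(weights, dtype=float)

    dists = np.linalg.norm(points - center, axis=1)
    order = np.argsort(-dists)                # farthest points first
    d, w = dists[order], weights[order]

    # Weight already consumed before each point in farthest-first order.
    removed_before = np.cumsum(w) - w
    # Portion of each point's weight that is discarded as outlier mass.
    discarded = np.clip(m - removed_before, 0.0, w)
    return float(np.sum((w - discarded) * d))


def relative_error(points, summary, summary_weights, center, m):
    """Relative gap between the summary's robust cost and the full
    dataset's robust cost at a single center."""
    full = robust_cost(points, center, m)
    approx = robust_cost(summary, center, m, summary_weights)
    return abs(approx - full) / full


# Toy 1D dataset in which every point appears twice.
P = np.array([[0.0], [0.0], [1.0], [1.0], [10.0], [10.0]])
# Summary that merges duplicates into weight-2 points (illustration only).
S = np.array([[0.0], [1.0], [10.0]])
w = np.array([2.0, 2.0, 2.0])

err = relative_error(P, S, w, center=[0.0], m=1)
print(err)
```

A summary qualifies as an ε-coreset only if this relative error stays at most ε for every candidate center, not just the single center checked here.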