$\varepsilon$-Coresets for Clustering (with Outliers) in Doubling Metrics
Lingxiao Huang, Shaofeng H.-C. Jiang, Jian Li, Xuan Wu

TL;DR
This paper introduces an efficient method for constructing small $\varepsilon$-coresets for clustering in doubling metrics, whose size is independent of the dataset size, and establishes new theoretical relations between doubling dimension and VC-dimension.
Contribution
It presents the first size-independent $\varepsilon$-coreset construction for clustering in doubling metrics and relates doubling dimension to the VC-dimension of the induced range space.
Findings
Coreset size depends only on the parameters $k$, $z$, $\varepsilon$, and the doubling dimension.
Established a relation between doubling dimension and VC-dimension with distance distortion.
Developed robust coresets and centroid sets for improved clustering and property testing.
Abstract
We study the problem of constructing $\varepsilon$-coresets for the $(k, z)$-clustering problem in a doubling metric $M(X, d)$. An $\varepsilon$-coreset is a weighted subset $S \subseteq X$ with weight function $w : S \rightarrow \mathbb{R}_{\geq 0}$, such that for any $k$-subset $C \in [X]^k$, it holds that $\sum_{x \in S} w(x) \cdot d^z(x, C) \in (1 \pm \varepsilon) \cdot \sum_{x \in X} d^z(x, C)$. We present an efficient algorithm that constructs an $\varepsilon$-coreset for the $(k, z)$-clustering problem in $M(X, d)$, where the size of the coreset only depends on the parameters $k, z, \varepsilon$ and the doubling dimension $\mathsf{ddim}(M)$. To the best of our knowledge, this is the first efficient $\varepsilon$-coreset construction of size independent of $|X|$ for general clustering problems in doubling metrics. To this end, we establish the first relation between the…
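The coreset condition in the abstract says that, for every choice of $k$ centers, the weighted clustering cost on the subset stays within a $(1 \pm \varepsilon)$ factor of the unweighted cost on the full point set. The sketch below is a brute-force check of that condition on a tiny example, not the paper's construction. The function names (`clustering_cost`, `max_relative_error`, `is_eps_coreset`) are hypothetical, and Euclidean distance is assumed as a stand-in for a general doubling metric.

```python
import itertools
import math


def clustering_cost(points, weights, centers, z):
    """Weighted (k, z)-clustering cost: sum of w(x) * d(x, C)^z."""
    return sum(
        w * min(math.dist(p, c) for c in centers) ** z
        for p, w in zip(points, weights)
    )


def max_relative_error(X, S, w, k, z):
    """Largest relative cost error over all k-subsets of X used as centers.

    Brute force over combinations, so only suitable for very small inputs.
    """
    worst = 0.0
    for centers in itertools.combinations(X, k):
        full = clustering_cost(X, [1.0] * len(X), centers, z)
        core = clustering_cost(S, w, centers, z)
        if full == 0.0:
            # Every point of X is a center; the coreset cost must also vanish.
            err = 0.0 if core == 0.0 else math.inf
        else:
            err = abs(core - full) / full
        worst = max(worst, err)
    return worst


def is_eps_coreset(X, S, w, k, z, eps):
    """True if (S, w) meets the (1 +/- eps) cost guarantee for every k-subset of X."""
    return max_relative_error(X, S, w, k, z) <= eps


# Illustrative data (assumed): duplicated points collapse into weighted points.
X = [(0.0, 0.0)] * 3 + [(1.0, 0.0)] * 2 + [(5.0, 5.0)]
S = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
w = [3.0, 2.0, 1.0]
```

Collapsing duplicates into weights preserves the cost exactly, so `is_eps_coreset(X, S, w, k=2, z=2, eps=0.1)` holds here, while keeping only `(0.0, 0.0)` with weight 6 fails the check. The paper's point is that a valid weighted subset exists whose size does not grow with $|X|$; this checker only verifies the definition.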
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Automated Road and Building Extraction · Computational Geometry and Mesh Generation
