Faster Approximation Algorithms for k-Center via Data Reduction

Arnold Filtser; Shaofeng H.-C. Jiang; Yi Li; Anurag Murty Naredla; Ioannis Psarros; Qiaoyuan Yang; Qin Zhang

arXiv:2502.05888·cs.DS·February 11, 2025

TL;DR

This paper introduces efficient data reduction techniques using coresets for the Euclidean k-Center problem, enabling faster approximation algorithms especially for large k, with practical validation on real datasets.

Contribution

The paper presents novel algorithms for constructing small coresets of size k·o(n), leading to faster approximation algorithms for large k in Euclidean k-Center.

Findings

01

On real-world datasets with large k, coresets speed up the Gonzalez algorithm by up to 4x while achieving similar clustering cost.

02

Near-linear time O(1)-approximation for k = n^c with 0 < c < 1.

03

New efficient construction of consistent hashing with competitive parameters, which underlies one of the coreset results.

Abstract

We study efficient algorithms for the Euclidean $k$-Center problem, focusing on the regime of large $k$. We take the approach of data reduction by considering an $\alpha$-coreset, which is a small subset $S$ of the dataset $P$ such that any $\beta$-approximation on $S$ is an $(\alpha + \beta)$-approximation on $P$. We give efficient algorithms to construct coresets whose size is $k \cdot o(n)$, which immediately speeds up existing approximation algorithms. Notably, we obtain a near-linear time $O(1)$-approximation when $k = n^{c}$ for any $0 < c < 1$. We validate the performance of our coresets on real-world datasets with large $k$, and we observe that the coreset speeds up the well-known Gonzalez algorithm by up to $4$ times, while still achieving similar clustering cost. Technically, one of our coreset results is based on a new efficient construction of consistent hashing with competitive…
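To make the "reduce, then solve" pipeline in the abstract concrete, here is a minimal Python sketch. It implements the Gonzalez algorithm (farthest-first traversal), runs it on a reduced subset $S$, and evaluates the resulting centers on the full dataset $P$. The reduction step shown here is a naive grid stand-in chosen for illustration only: it is not the paper's coreset construction and it carries an additive rather than a multiplicative guarantee. The names `gonzalez`, `kcenter_cost`, `grid_representatives`, and all parameter values are assumptions of this sketch.

```python
import numpy as np


def gonzalez(points, k, first=0):
    """Farthest-first traversal; returns indices of min(k, n) chosen centers."""
    centers = [first]
    # dist[i] = distance from point i to its nearest chosen center so far
    dist = np.linalg.norm(points - points[first], axis=1)
    for _ in range(1, min(k, len(points))):
        nxt = int(np.argmax(dist))  # farthest point becomes the next center
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return centers


def kcenter_cost(points, centers):
    """k-Center cost: largest distance from any point to its nearest center."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return float(d.min(axis=1).max())


def grid_representatives(points, side):
    """Illustrative stand-in reduction (NOT the paper's coreset):
    keep one input point per occupied grid cell of the given side length."""
    cells = np.floor(points / side).astype(np.int64)
    _, idx = np.unique(cells, axis=0, return_index=True)
    return points[np.sort(idx)]


rng = np.random.default_rng(0)
P = rng.normal(size=(2000, 2))  # synthetic dataset, for illustration only
k, side = 20, 0.25              # hypothetical parameter choices

S = grid_representatives(P, side)      # reduced dataset
centers_full = P[gonzalez(P, k)]       # solve on the full data
centers_reduced = S[gonzalez(S, k)]    # solve on the reduced data only

cost_full = kcenter_cost(P, centers_full)
cost_on_S = kcenter_cost(S, centers_reduced)
cost_on_P = kcenter_cost(P, centers_reduced)  # evaluate reduced solution on P

# Every point of P shares a cell with some point of S, so by the triangle
# inequality the cost on P exceeds the cost on S by at most the cell diameter.
slack = side * np.sqrt(P.shape[1])
print(len(P), len(S), cost_full, cost_on_P, cost_on_S + slack)
```

Gonzalez makes $k$ passes over its input, so its running time scales with the input size times $k$; running it on a smaller $S$ is where the speedup comes from. The final comparison shows how an error bound for the reduction transfers from $S$ back to $P$.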
