Massively Parallel and Dynamic Algorithms for Minimum Size Clustering
Alessandro Epasto, Mohammad Mahdian, Vahab Mirrokni, Peilin Zhong

TL;DR
This paper introduces scalable parallel and dynamic algorithms for the $r$-gather clustering problem, achieving constant-factor approximations with efficient update times in metric spaces and Euclidean settings.
Contribution
It presents the first scalable MPC algorithm and a dynamic algorithm for $r$-gather clustering with provable approximation guarantees and efficient complexity.
Findings
MPC algorithm computes $O(1)$-approximate solutions in $O(\text{polylog } n)$ rounds.
Dynamic algorithm maintains $O(1)$-approximate solutions with polylogarithmic update and query times.
Algorithms are scalable and handle high-dimensional Euclidean and general metric spaces.
Abstract
In this paper, we study the $r$-gather problem, a natural formulation of minimum-size clustering in metric spaces. The goal of $r$-gather is to partition $n$ points into clusters such that each cluster has size at least $r$, and the maximum radius of the clusters is minimized. This additional constraint completely changes the algorithmic nature of the problem, and many clustering techniques fail. Also previous dynamic and parallel algorithms do not achieve desirable complexity. We propose algorithms both in the Massively Parallel Computation (MPC) model and in the dynamic setting. Our MPC algorithm handles input points from the Euclidean space $\mathbb{R}^d$. It computes an $O(1)$-approximate solution of $r$-gather in $O(\log^{\varepsilon} n)$ rounds using total space $O(n^{1+\gamma} \cdot d)$ for arbitrarily small constants $\varepsilon, \gamma > 0$. In addition our algorithm is fully…
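To make the objective stated in the abstract concrete, here is a minimal Python sketch that checks the minimum-size constraint and evaluates the maximum cluster radius for a given clustering. It is not one of the paper's algorithms. The function name `r_gather_cost`, the representation of a clustering as a list of center indices, and the choice of Euclidean distance with centers drawn from the input points are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def r_gather_cost(points: Sequence[Point], assignment: Sequence[int], r: int) -> float:
    """Evaluate a candidate r-gather clustering (hypothetical helper).

    assignment[i] is the index of the point acting as the center of the
    cluster that contains points[i] (an assumed representation).
    Returns the maximum center-to-member distance over all clusters.
    Raises ValueError if any cluster has fewer than r points.
    """
    # Group point indices by the center they are assigned to.
    clusters: Dict[int, List[int]] = {}
    for i, center in enumerate(assignment):
        clusters.setdefault(center, []).append(i)

    # Minimum-size constraint: every cluster needs at least r members.
    for center, members in clusters.items():
        if len(members) < r:
            raise ValueError(
                f"cluster centered at point {center} has {len(members)} < {r} points"
            )

    # Objective: the largest radius among all clusters.
    return max(
        (math.dist(points[center], points[i])
         for center, members in clusters.items()
         for i in members),
        default=0.0,
    )


points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 12)]
assignment = [0, 0, 0, 3, 3, 3]  # two clusters, centered at points 0 and 3
print(r_gather_cost(points, assignment, r=3))  # 2.0
```

With `r=4` the same assignment is infeasible and the function raises `ValueError`, because each cluster holds only three points. An $r$-gather algorithm searches over feasible clusterings for one that minimizes this returned value.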
Taxonomy
Topics: Data Management and Algorithms · Computational Geometry and Mesh Generation · Advanced Clustering Algorithms Research
