On the Communication Complexity of Distributed Clustering
Qin Zhang

TL;DR
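This paper establishes the first communication lower bounds for the distributed clustering problems k-center, k-median, and k-means. When the input is spread across many machines and the number of clusters k is small, these bounds match the best known upper bounds up to a logarithmic factor. The proofs introduce a new composition framework for multiparty number-in-hand communication complexity.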
Contribution
It provides the first set of communication lower bounds for distributed clustering and develops a novel composition framework for multiparty communication complexity.
Findings
When the input is distributed across a large number of machines and k is small, the lower bounds match the current best upper bounds up to a logarithmic factor.
The proofs use a new composition framework for multiparty number-in-hand communication complexity, which may be of independent interest.
The results cover three distributed clustering objectives: k-center, k-median, and k-means.
Abstract
In this paper we give a first set of communication lower bounds for distributed clustering problems, in particular, for k-center, k-median and k-means. When the input is distributed across a large number of machines and the number of clusters k is small, our lower bounds match the current best upper bounds up to a logarithmic factor. We have designed a new composition framework in our proofs for multiparty number-in-hand communication complexity which may be of independent interest.
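For readers unfamiliar with the three objectives named in the abstract, the following is a minimal Python sketch of their standard textbook definitions. It is not code from the paper and says nothing about the paper's protocols or bounds. The function name `clustering_costs`, the `sites` layout (one list of points per machine), and the sample data are all assumptions made for illustration.

```python
import math

def clustering_costs(sites, centers):
    """Return (k_center, k_median, k_means) costs of `centers`.

    `sites` is a hypothetical layout: one list of points per machine.
    Each point is charged the Euclidean distance to its nearest center.
    """
    dists = [
        min(math.dist(p, c) for c in centers)
        for site in sites
        for p in site
    ]
    k_center = max(dists)                # largest distance to a nearest center
    k_median = sum(dists)                # sum of distances
    k_means = sum(d * d for d in dists)  # sum of squared distances
    return k_center, k_median, k_means

# Illustrative data: two machines, two points each, k = 2 centers.
sites = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(10.0, 0.0), (10.0, 2.0)],
]
centers = [(0.0, 0.0), (10.0, 0.0)]
k_center, k_median, k_means = clustering_costs(sites, centers)
```

In the distributed setting no single machine holds all of `sites`, so evaluating or optimizing these costs requires the machines to exchange messages. The total number of bits exchanged is the quantity the paper's lower bounds concern.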
Taxonomy
Topics: Complexity and Algorithms in Graphs · Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques
