Clustering with diversity
Jian Li, Ke Yi, Qin Zhang

TL;DR
This paper gives a 2-approximation algorithm for the clustering with diversity problem, in which every cluster must meet a minimum size and contain only points of distinct colors, and the objective is to minimize the maximum cluster radius. It also presents extensions that handle outliers. The problem is motivated by privacy-preserving data publication.
Contribution
It provides the first constant-factor approximation for clustering with diversity, shows that the factor of 2 cannot be improved unless P = NP, and extends the algorithm to handle outliers.
Findings
2-approximation algorithm for clustering with diversity
Matching lower bound proving optimality unless P=NP
Extensions of the algorithm that handle outliers
Abstract
We consider the clustering with diversity problem: given a set of colored points in a metric space, partition them into clusters such that each cluster has at least ℓ points, all of which have distinct colors. We give a 2-approximation to this problem for any ℓ when the objective is to minimize the maximum radius of any cluster. We show that the approximation ratio is optimal unless P = NP, by providing a matching lower bound. Several extensions to our algorithm have also been developed for handling outliers. This problem is mainly motivated by applications in privacy-preserving data publication.
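To make the problem statement concrete, the sketch below checks whether a candidate partition satisfies the diversity constraint and, if so, computes its objective value. It is not the paper's 2-approximation algorithm. The function names, the Euclidean metric, and the choice to pick each cluster's center among its own points are assumptions made for illustration only.

```python
from math import dist  # Euclidean distance; stands in for a general metric (assumption)


def is_diverse(cluster, colors, min_size):
    """True if the cluster has at least min_size points and no repeated color."""
    cluster_colors = [colors[i] for i in cluster]
    return len(cluster) >= min_size and len(set(cluster_colors)) == len(cluster_colors)


def radius(cluster, points):
    """Smallest max-distance to the cluster's points, over centers chosen
    inside the cluster (an assumption of this sketch)."""
    return min(
        max(dist(points[c], points[p]) for p in cluster)
        for c in cluster
    )


def evaluate(clusters, points, colors, min_size):
    """Return the maximum cluster radius, or None if the clustering is infeasible."""
    covered = sorted(i for cluster in clusters for i in cluster)
    if covered != list(range(len(points))):
        return None  # not a partition of all points
    if not all(is_diverse(cluster, colors, min_size) for cluster in clusters):
        return None  # some cluster is too small or repeats a color
    return max(radius(cluster, points) for cluster in clusters)


# Hypothetical toy instance: four points on a line, two colors.
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
colors = ["red", "blue", "red", "blue"]

feasible_value = evaluate([[0, 1], [2, 3]], points, colors, min_size=2)
infeasible_value = evaluate([[0, 2], [1, 3]], points, colors, min_size=2)
```

In this toy instance the first partition pairs nearby points of different colors and is feasible, while the second groups same-colored points and is rejected.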
Taxonomy
Topics: Facility Location and Emergency Management · Data Management and Algorithms · Bayesian Methods and Mixture Models
