Fair k-Center Clustering for Data Summarization

Matthäus Kleindessner; Pranjal Awasthi; Jamie Morgenstern

arXiv:1901.08628 · stat.ML · May 14, 2019 · 42 cites

TL;DR

This paper introduces a linear-time approximation algorithm for fair k-center clustering, ensuring demographic fairness in data summarization without sacrificing computational efficiency.

Contribution

The paper presents the first linear-time approximation algorithm for fair k-center clustering, closing the gap with existing fair algorithms, which run in time super-quadratic in the size of the data set.

Findings

01

The algorithm runs in time linear in the size of the data set and in k

02

The approximation guarantee incurs only a constant-factor overhead when the number of demographic groups is small

03

Incorporates the fairness constraint (choosing $k_{i}$ prototypes from group $i$) directly into the k-center clustering problem

Abstract

In data summarization we want to choose $k$ prototypes in order to summarize a data set. We study a setting where the data set comprises several demographic groups and we are restricted to choose $k_{i}$ prototypes belonging to group $i$. A common approach to the problem without the fairness constraint is to optimize a centroid-based clustering objective such as $k$-center. A natural extension then is to incorporate the fairness constraint into the clustering problem. Existing algorithms for doing so run in time super-quadratic in the size of the data set, which is in contrast to the standard $k$-center problem being approximable in linear time. In this paper, we resolve this gap by providing a simple approximation algorithm for the $k$-center problem under the fairness constraint with running time linear in the size of the data set and $k$. If the number of demographic groups is small,…
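To make the setting in the abstract concrete, the following minimal Python sketch shows the standard, unconstrained baseline it refers to: greedy farthest-first traversal, a well-known 2-approximation for $k$-center that needs on the order of $n \cdot k$ distance evaluations. This is not the paper's fair algorithm. The function names (`farthest_first`, `k_center_cost`, `satisfies_quota`), the choice of index 0 as the first center, and the Euclidean metric are all assumptions made for illustration. The last helper only checks whether a set of centers meets the per-group quota of $k_{i}$ prototypes from group $i$.

```python
import math
from collections import Counter


def farthest_first(points, k, dist=math.dist):
    """Greedy farthest-first traversal for UNCONSTRAINED k-center.

    Returns indices of k centers. Starting from index 0 is an arbitrary
    choice (an assumption of this sketch, not part of the source).
    """
    centers = [0]
    # d[i] = distance from point i to its nearest chosen center so far
    d = [dist(p, points[0]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=d.__getitem__)  # farthest point
        centers.append(nxt)
        d = [min(d[i], dist(points[i], points[nxt])) for i in range(len(points))]
    return centers


def k_center_cost(points, centers, dist=math.dist):
    """k-center objective: largest distance from any point to its nearest center."""
    return max(min(dist(p, points[c]) for c in centers) for p in points)


def satisfies_quota(centers, groups, quota):
    """Check the fairness constraint: exactly quota[g] centers from each group g."""
    counts = Counter(groups[c] for c in centers)
    return (len(centers) == sum(quota.values())
            and all(counts[g] == q for g, q in quota.items()))


# Toy data: four points on a line, two demographic groups (hypothetical labels).
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
groups = ["a", "a", "b", "b"]

centers = farthest_first(points, k=2)
print(centers)                                            # [0, 3]
print(k_center_cost(points, centers))                     # 1.0
print(satisfies_quota(centers, groups, {"a": 1, "b": 1})) # True
print(satisfies_quota(centers, groups, {"a": 2}))         # False
```

On this toy input the unconstrained greedy solution happens to satisfy a quota of one center per group. In general it need not, which is why the fairness constraint has to be built into the clustering problem itself.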

Code & Models

Repositories

matthklein/fair_k_center_clustering
Official

Taxonomy

Topics: Data Management and Algorithms · Data Quality and Management · Data Mining Algorithms and Applications