Coresets for Clustering with Fairness Constraints

Lingxiao Huang; Shaofeng H.-C. Jiang; Nisheeth K. Vishnoi

arXiv:1906.08484 · cs.DS · December 18, 2019 · 21 cites

TL;DR

This paper introduces scalable coreset constructions for fair clustering with multiple, non-disjoint sensitive types, enabling faster algorithms and smaller data summaries for fair $k$-median and $k$-means clustering.

Contribution

It presents the first coreset construction for fair $k$-median clustering and improves existing methods for $k$-means with multiple sensitive types, enhancing scalability.

Findings

01. Coresets are significantly smaller than the full datasets.

02. Running fair clustering algorithms on the coresets is faster than running them on the full data.

03. The approach is validated on multiple real-world datasets.

Abstract

In a recent work, [19] studied the following "fair" variants of classical clustering problems such as $k$-means and $k$-median: given a set of $n$ data points in $\mathbb{R}^{d}$ and a binary type associated with each data point, the goal is to cluster the points while ensuring that the proportion of each type in each cluster is roughly the same as its underlying proportion. Subsequent work has focused either on extending this setting to the case where each data point has multiple, non-disjoint sensitive types such as race and gender [6], or on addressing the problem that the clustering algorithms in the above work do not scale well. The main contribution of this paper is an approach to clustering with fairness constraints that handles multiple, non-disjoint types and is also scalable. Our approach is based on novel constructions of coresets: for the $k$-median objective, we construct an…
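The fairness requirement described in the abstract (each type's share within a cluster stays close to its share in the whole dataset) can be illustrated with a short check. This is only a sketch and not the authors' code: the function names `type_proportions` and `is_roughly_fair`, the additive tolerance `tol`, and the toy data are assumptions made for illustration. The paper and [6] may formalize "roughly the same" differently.

```python
from collections import defaultdict


def type_proportions(labels, types):
    """Return {cluster: {type: fraction of that cluster's points with the type}}.

    labels: list of cluster ids, one per point.
    types:  list of sets; types[i] holds every sensitive type of point i.
            Types may overlap (a point can carry several types at once).
    """
    sizes = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for cluster, point_types in zip(labels, types):
        sizes[cluster] += 1
        for t in point_types:
            counts[cluster][t] += 1
    all_types = set().union(*types)
    return {c: {t: counts[c][t] / sizes[c] for t in all_types} for c in sizes}


def is_roughly_fair(labels, types, tol):
    """Check that every type's share in every cluster is within `tol`
    (an assumed additive tolerance) of its share in the whole dataset."""
    n = len(labels)
    all_types = set().union(*types)
    overall = {t: sum(t in s for s in types) / n for t in all_types}
    per_cluster = type_proportions(labels, types)
    return all(
        abs(per_cluster[c][t] - overall[t]) <= tol
        for c in per_cluster
        for t in all_types
    )


# Toy data (hypothetical): four points, two overlapping attributes each.
toy_types = [{"a", "x"}, {"b", "y"}, {"a", "y"}, {"b", "x"}]
balanced = [0, 0, 1, 1]    # each cluster mirrors the overall proportions
unbalanced = [0, 1, 0, 1]  # cluster 0 contains only type "a" points
```

With the toy data, `is_roughly_fair(balanced, toy_types, 0.0)` holds, while `is_roughly_fair(unbalanced, toy_types, 0.1)` does not, because type "a" makes up all of cluster 0 but only half of the dataset.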

Code & Models

Repositories

sfjiang1990/Coresets-for-Clustering-with-Fairness-Constraints (official)

Taxonomy

Topics: Face and Expression Recognition · Advanced Clustering Algorithms Research · Machine Learning and Data Classification

Methods: Coresets