Coresets for Clustering with Fairness Constraints
Lingxiao Huang, Shaofeng H.-C. Jiang, Nisheeth K. Vishnoi

TL;DR
This paper introduces scalable coreset constructions for fair clustering with multiple, non-disjoint sensitive types, enabling faster algorithms and smaller data summaries for fair $k$-median and $k$-means clustering.
Contribution
It presents the first coreset construction for fair $k$-median clustering and improves existing methods for $k$-means with multiple sensitive types, enhancing scalability.
Findings
Coreset sizes are significantly smaller than full datasets.
Running fair clustering algorithms on the coresets is faster than running them on the full datasets.
The approach is validated on multiple real-world datasets.
Abstract
In a recent work, [19] studied the following "fair" variants of classical clustering problems such as $k$-means and $k$-median: given a set of $n$ data points in $\mathbb{R}^d$ and a binary type associated to each data point, the goal is to cluster the points while ensuring that the proportion of each type in each cluster is roughly the same as its underlying proportion. Subsequent work has focused on either extending this setting to when each data point has multiple, non-disjoint sensitive types such as race and gender [6], or to address the problem that the clustering algorithms in the above work do not scale well. The main contribution of this paper is an approach to clustering with fairness constraints that involve multiple, non-disjoint types, that is also scalable. Our approach is based on novel constructions of coresets: for the $k$-median objective, we construct an…
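The proportionality condition described in the abstract can be illustrated with a small check. This is a minimal sketch, not the paper's algorithm: it assumes a single binary type and a clustering that is already given, and the helper names `type_proportions` and `max_deviation` are hypothetical. It only measures how far each cluster's share of type-1 points is from the share in the whole dataset.

```python
from collections import Counter


def type_proportions(labels, types):
    """Return (overall, per_cluster) fractions of points whose type is 1.

    labels[i] is the cluster index of point i (assumed already computed);
    types[i] is the binary type (0 or 1) of point i.
    """
    overall = sum(types) / len(types)
    cluster_sizes = Counter(labels)
    # Count type-1 points per cluster; clusters with none default to 0.
    ones = Counter(c for c, t in zip(labels, types) if t == 1)
    per_cluster = {c: ones[c] / cluster_sizes[c] for c in cluster_sizes}
    return overall, per_cluster


def max_deviation(labels, types):
    """Largest gap between a cluster's type-1 share and the overall share."""
    overall, per_cluster = type_proportions(labels, types)
    return max(abs(p - overall) for p in per_cluster.values())


# Toy data (illustrative only): two clusters of four points each.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
balanced_types = [1, 0, 1, 0, 1, 1, 0, 0]    # each cluster is half type 1
unbalanced_types = [1, 1, 1, 0, 1, 0, 0, 0]  # cluster 0 is 3/4 type 1

print(max_deviation(labels, balanced_types))
print(max_deviation(labels, unbalanced_types))
```

A clustering would count as fair in this informal sense when the reported deviation stays below a tolerance chosen by the user. With multiple, non-disjoint types, the same check would be repeated once per type.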
Taxonomy
Topics: Face and Expression Recognition · Advanced Clustering Algorithms Research · Machine Learning and Data Classification
Methods: Coresets
