Scalable Fair Clustering

Arturs Backurs; Piotr Indyk; Krzysztof Onak; Baruch Schieber; Ali Vakilian; Tal Wagner

arXiv:1902.03519·cs.DS·June 12, 2019·58 cites

TL;DR

This paper introduces a practical fairlet decomposition algorithm for fair $k$-median clustering that runs in nearly linear time and gives finer control over cluster balance than earlier super-quadratic methods.

Contribution

We develop a nearly linear time fairlet decomposition algorithm that scales better than prior super-quadratic approaches and gives finer control over cluster balance.

Findings

01. Algorithm runs in nearly linear time.

02. Provides finer control over cluster fairness.

03. Empirical results validate efficiency and effectiveness.

Abstract

We study the fair variant of the classic $k$-median problem introduced by Chierichetti et al. [2017]. In the standard $k$-median problem, given an input pointset $P$, the goal is to find $k$ centers $C$ and assign each input point to one of the centers in $C$ such that the average distance of points to their cluster center is minimized. In the fair variant of $k$-median, the points are colored, and the goal is to minimize the same average distance objective while ensuring that all clusters have an "approximately equal" number of points of each color. Chierichetti et al. proposed a two-phase algorithm for fair $k$-clustering. In the first step, the pointset is partitioned into subsets called fairlets that satisfy the fairness requirement and approximately preserve the $k$-median objective. In the second step, fairlets are merged into $k$ clusters by one of the existing $k$-median…
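The two quantities the abstract refers to, the average-distance objective and the color balance of a clustering, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `kmedian_cost`, `assign`, and `balance` are hypothetical, the points are assumed to be Euclidean tuples, and the two-color balance measure (the minimum over clusters of the smaller color ratio) is assumed here as one common way to formalize "approximately equal".

```python
from collections import Counter
from math import dist


def kmedian_cost(points, centers):
    """Average distance from each point to its nearest center."""
    return sum(min(dist(p, c) for c in centers) for p in points) / len(points)


def assign(points, centers):
    """Index of the nearest center for each point (ties go to the lower index)."""
    return [
        min(range(len(centers)), key=lambda j: dist(p, centers[j]))
        for p in points
    ]


def balance(colors, labels):
    """Two-color balance (assumed definition): the minimum over non-empty
    clusters of min(#red/#blue, #blue/#red); 0.0 if a cluster has one color."""
    per_cluster = {}
    for color, label in zip(colors, labels):
        per_cluster.setdefault(label, Counter())[color] += 1
    worst = 1.0
    for counts in per_cluster.values():
        red, blue = counts["red"], counts["blue"]
        if red == 0 or blue == 0:
            return 0.0
        worst = min(worst, red / blue, blue / red)
    return worst


# Illustrative data only: four points, two centers.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
centers = [(0, 0), (10, 0)]
labels = assign(points, centers)

print(kmedian_cost(points, centers))
print(balance(["red", "blue", "red", "blue"], labels))
print(balance(["red", "red", "blue", "blue"], labels))
```

With the second coloring, nearest-center assignment puts all red points in one cluster and all blue points in the other, so the balance drops to zero even though the distance objective is unchanged. This is the situation the fairness constraint is meant to rule out.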

Code & Models

Repositories

talwagner/fair_clustering (Official)

Taxonomy

Topics: Facility Location and Emergency Management · Point processes and geometric inequalities