Fairness in Clustering with Multiple Sensitive Attributes

Savitha Sam Abraham; Deepak P; Sowmya S Sundaram

arXiv:1910.05113 · cs.LG · January 27, 2020 · 19 cites

Open Access

TL;DR

This paper introduces FairKM, a novel fair clustering method for multiple sensitive attributes, which improves both fairness and clustering quality over existing approaches, validated on real datasets.

Contribution

The paper proposes FairKM, a new fair clustering algorithm that handles multiple sensitive attributes and balances fairness with clustering quality.

Findings

1. FairKM outperforms baseline methods in fairness metrics.

2. FairKM achieves higher clustering quality while maintaining fairness.

3. Experimental results on real datasets validate the effectiveness of FairKM.

Abstract

A clustering may be considered fair on pre-specified sensitive attributes if the proportions of sensitive attribute groups in each cluster reflect those in the dataset. In this paper, we consider the task of fair clustering for scenarios involving multiple multi-valued or numeric sensitive attributes. We propose a fair clustering method, FairKM (Fair K-Means), that is inspired by the popular K-Means clustering formulation. We outline a computational notion of fairness which is used along with a cluster coherence objective to yield the FairKM clustering method. We empirically evaluate our approach, wherein we quantify both the quality and fairness of clusters, over real-world datasets. Our experimental evaluation illustrates that the clusters generated by FairKM fare significantly better on both clustering quality and fair representation of sensitive attribute groups compared…
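The fairness notion in the abstract (each cluster's sensitive-group proportions should reflect the dataset's) can be illustrated with a small sketch. This is not the paper's FairKM objective; it is one plausible way to score a clustering, assuming categorical sensitive attributes only and a size-weighted squared deviation. The function names `group_proportions`, `cluster_fairness_deviation`, and `total_deviation` are hypothetical and chosen for this example.

```python
from collections import Counter


def group_proportions(values):
    """Return {group: fraction} for a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {group: count / total for group, count in counts.items()}


def cluster_fairness_deviation(labels, sensitive):
    """Size-weighted squared deviation between each cluster's group
    proportions and the dataset-wide proportions (illustrative metric,
    an assumption of this sketch). 0.0 means every cluster mirrors the
    dataset exactly; larger values mean more skew.
    """
    overall = group_proportions(sensitive)
    n = len(labels)
    deviation = 0.0
    for cluster in set(labels):
        members = [s for l, s in zip(labels, sensitive) if l == cluster]
        local = group_proportions(members)
        # Groups absent from a cluster count as proportion 0.
        squared = sum((local.get(g, 0.0) - p) ** 2 for g, p in overall.items())
        deviation += (len(members) / n) * squared
    return deviation


def total_deviation(labels, sensitive_attrs):
    """Sum the deviation over several sensitive attributes.

    sensitive_attrs maps an attribute name to its per-point values.
    """
    return sum(
        cluster_fairness_deviation(labels, values)
        for values in sensitive_attrs.values()
    )


labels = [0, 0, 1, 1]
balanced = cluster_fairness_deviation(labels, ["f", "m", "f", "m"])
skewed = cluster_fairness_deviation(labels, ["f", "f", "m", "m"])
```

Here `balanced` is zero because both clusters have the same half-and-half split as the dataset, while `skewed` is positive because each cluster contains only one group. A score of this kind could, in principle, be added with a weight to a K-Means style coherence term, which is the general shape the abstract describes; the exact formulation used by FairKM is given in the paper itself.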

Taxonomy

Topics: Privacy-Preserving Technologies in Data

Methods: k-Means Clustering