A Computational Approach to Improving Fairness in K-means Clustering

Guancheng Zhou; Haiping Xu; Hongkang Xu; Chenyu Li; Donghui Yan

arXiv:2505.22984 · cs.LG · February 10, 2026

TL;DR

This paper introduces a two-stage optimization method to enhance fairness in K-means clustering by adjusting cluster memberships, addressing bias related to sensitive attributes with minimal impact on clustering quality.

Contribution

It proposes two efficient algorithms for identifying and adjusting unfairly biased data points, improving fairness in K-means clustering.

Findings

01 Significant fairness improvements on benchmark datasets

02 Minimal impact on clustering quality

03 Algorithms extendable to other clustering methods

Abstract

The popular K-means clustering algorithm potentially suffers from a major weakness when its output is used for further analysis or interpretation. Some clusters may have disproportionately more (or fewer) points from one of the subpopulations defined by a sensitive variable, e.g., gender or race. Such a fairness issue may cause bias and unexpected social consequences. This work attempts to improve the fairness of K-means clustering with a two-stage optimization formulation: clustering first, and then adjusting the cluster membership of a small subset of selected data points. Two computationally efficient algorithms are proposed for identifying those data points that are expensive for fairness, with one focusing on the nearest data points outside of a cluster and the other on highly 'mixed' data points. Experiments on benchmark datasets show substantial improvement in fairness with minimal impact on clustering quality.…
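The two-stage idea in the abstract (cluster first, then reassign a small number of points) can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not specify either of its two algorithms, so the code below swaps in a simple greedy heuristic loosely inspired by the "nearest data points outside of a cluster" description. The function names `kmeans`, `unfairness`, and `adjust`, the `budget` parameter, the binary sensitive attribute `s`, and the fairness measure itself (largest gap between a cluster's group proportion and the overall proportion) are all assumptions made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Stage 1: plain Lloyd's algorithm. Returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), centers

def unfairness(labels, s, k):
    """Assumed measure: the largest gap between a cluster's share of
    group s == 1 and the overall share (0 means perfectly balanced)."""
    overall = s.mean()
    return max(abs(s[labels == j].mean() - overall)
               for j in range(k) if np.any(labels == j))

def adjust(X, labels, centers, s, budget):
    """Stage 2 (hypothetical greedy heuristic): move at most `budget`
    points, each to its nearest other cluster, picking at every step the
    move that lowers unfairness most (ties: smallest extra distance)."""
    labels = labels.copy()
    k = len(centers)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    for _ in range(budget):
        current = unfairness(labels, s, k)
        best = None  # (new_unfairness, extra_cost, point, target)
        for i in range(len(X)):
            a = labels[i]
            if np.sum(labels == a) <= 1:
                continue  # never empty a cluster
            order = np.argsort(d2[i])
            b = order[0] if order[0] != a else order[1]
            trial = labels.copy()
            trial[i] = b
            u = unfairness(trial, s, k)
            cand = (u, d2[i, b] - d2[i, a], i, b)
            if u < current - 1e-12 and (best is None or cand[:2] < best[:2]):
                best = cand
        if best is None:
            break  # no single move improves fairness
        labels[best[2]] = best[3]
    return labels

# Synthetic demo: two blobs whose sensitive attribute is skewed by blob.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (40, 2)), rng.normal(3.0, 1.0, (40, 2))])
s = np.concatenate([rng.random(40) < 0.8, rng.random(40) < 0.3]).astype(float)

labels, centers = kmeans(X, k=2)
fair_labels = adjust(X, labels, centers, s, budget=8)
before = unfairness(labels, s, 2)
after = unfairness(fair_labels, s, 2)
moved = int((labels != fair_labels).sum())
print(f"unfairness before={before:.3f} after={after:.3f} points moved={moved}")
```

Capping the number of reassigned points with `budget` mirrors the abstract's emphasis on changing only a small subset of memberships, which is what keeps the effect on clustering quality small. A real implementation would also track the change in the K-means objective alongside the fairness measure.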

Taxonomy

Topics: Advanced Clustering Algorithms Research · Face and Expression Recognition · Customer churn and segmentation

Methods: k-Means Clustering