Generalizing Fair Clustering to Multiple Groups: Algorithms and Applications
Diptarka Chakraborty, Kushagra Chatterjee, Debarati Das, Tien-Long Nguyen

TL;DR
This paper extends fair clustering algorithms to handle multiple protected groups, providing NP-hardness results, efficient approximation algorithms, and improved guarantees for related fair clustering problems.
Contribution
The paper generalizes the closest fair clustering problem from two groups to multiple groups, introduces near-linear time approximation algorithms for it, and improves the state of the art for fair correlation clustering and fair consensus clustering.
Findings
Closest fair clustering is NP-hard when there are multiple groups.
The paper proposes near-linear time approximation algorithms for the problem.
The paper improves the approximation guarantees for fair correlation clustering and fair consensus clustering.
Abstract
Clustering is a fundamental task in machine learning and data analysis, but it frequently fails to provide fair representation for various marginalized communities defined by multiple protected attributes -- a shortcoming often caused by biases in the training data. As a result, there is a growing need to enhance the fairness of clustering outcomes, ideally by making minimal modifications, possibly as a post-processing step after conventional clustering. Recently, Chakraborty et al. [COLT'25] initiated the study of closest fair clustering, though in a restricted scenario where data points belong to only two groups. In practice, however, data points are typically characterized by many groups, reflecting diverse protected attributes such as age, ethnicity, gender, etc. In this work, we generalize the study of the closest fair clustering problem to settings with an…
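To make the problem statement concrete, the following is a minimal Python sketch of the two ingredients the abstract refers to: a fairness check on a clustering and a measure of how much a clustering was modified. Both definitions are assumptions made for illustration and are not stated in this summary: fairness is taken to mean that every cluster has the same group proportions as the whole dataset, and the modification cost is taken to be the number of point pairs that are grouped together in exactly one of the two clusterings. The function names and the toy data are hypothetical, and the sketch does not implement the paper's algorithms.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def is_proportionally_fair(clusters, group_of):
    """Assumed fairness notion: each non-empty cluster mirrors the
    dataset-wide proportion of every group."""
    total = Counter(group_of[p] for c in clusters for p in c)
    n = sum(total.values())
    for c in clusters:
        if not c:
            continue  # an empty cluster imposes no constraint
        counts = Counter(group_of[p] for p in c)
        for g, t in total.items():
            # Exact rational comparison avoids floating-point error.
            if Fraction(counts[g], len(c)) != Fraction(t, n):
                return False
    return True


def pair_distance(a, b):
    """Assumed distance: number of point pairs that are co-clustered
    in exactly one of the two clusterings a and b."""
    label_a = {p: i for i, c in enumerate(a) for p in c}
    label_b = {p: i for i, c in enumerate(b) for p in c}
    return sum(
        (label_a[p] == label_a[q]) != (label_b[p] == label_b[q])
        for p, q in combinations(sorted(label_a), 2)
    )


# Toy data (hypothetical): six points, three equal-size groups.
group_of = {0: "a", 1: "a", 2: "b", 3: "b", 4: "c", 5: "c"}
input_clustering = [[0, 1, 2], [3, 4, 5]]  # output of some ordinary clustering
fair_candidate = [[0, 2, 4], [1, 3, 5]]    # one point per group in each cluster

input_is_fair = is_proportionally_fair(input_clustering, group_of)
candidate_is_fair = is_proportionally_fair(fair_candidate, group_of)
modification_cost = pair_distance(input_clustering, fair_candidate)
```

Under these assumed definitions, the closest fair clustering problem asks for a fair clustering whose `pair_distance` to the input clustering is as small as possible; the sketch only evaluates a single hand-picked candidate rather than searching for the best one.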
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Ethics and Social Impacts of AI · Advanced Clustering Algorithms Research
