A Generic Framework for Fair Consensus Clustering in Streams

Diptarka Chakraborty; Kushagra Chatterjee; Debarati Das; Tien-Long Nguyen

arXiv:2602.11500·cs.LG·February 13, 2026

Open Access

TL;DR

This paper introduces a streaming algorithm for fair consensus clustering that processes sequentially arriving input clusterings under limited memory. It offers improved approximation guarantees and a fairness-agnostic framework that applies to a range of fairness definitions.

Contribution

The paper presents the first constant-factor streaming algorithm for fair consensus clustering with logarithmic memory, along with a generic, fairness-agnostic framework that improves approximation guarantees in both the offline and streaming settings.

Findings

01. First constant-factor streaming algorithm with logarithmic memory

02. Framework improves approximation guarantees in streaming and offline settings

03. Applicable to any efficiently computable fairness definition

Abstract

Consensus clustering seeks to combine multiple clusterings of the same dataset, potentially derived by different agents in a multi-agent environment considering various non-sensitive attributes, into a single partitioning that best reflects the overall structure of the underlying dataset. Recent work by Chakraborty et al. introduced a fair variant under proportionate fairness and obtained a constant-factor approximation by naively selecting the best closest fair input clustering; however, their offline approach requires storing all input clusterings, which is prohibitively expensive for most large-scale applications. In this paper, we initiate the study of fair consensus clustering in the streaming model, where input clusterings arrive sequentially and memory is limited. We design the first constant-factor algorithm that processes the stream while storing only a logarithmic number…
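To make the offline baseline concrete, here is a minimal Python sketch of the "pick the best input clustering" idea for plain consensus clustering. It rests on assumptions not stated in the abstract: clusterings are compared by counting point pairs on which they disagree, and the quality of a candidate is its total distance to all inputs. It deliberately omits the fairness constraint and the streaming setting, and the names `pair_distance` and `best_input_clustering` are hypothetical, not taken from the paper.

```python
from itertools import combinations

def pair_distance(a, b):
    """Count point pairs that are together in one clustering but apart in the other.

    Each clustering is a list of cluster labels, one label per data point.
    (Assumed distance; the abstract does not name the objective.)
    """
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )

def best_input_clustering(clusterings):
    """Return the input clustering with the smallest total distance to all inputs.

    This needs every input clustering in memory at once, which is the cost
    the streaming setting is meant to avoid.
    """
    return min(
        clusterings,
        key=lambda c: sum(pair_distance(c, other) for other in clusterings),
    )

# Three clusterings of the same four points (labels are arbitrary).
clusterings = [
    [0, 0, 1, 1],
    [0, 0, 1, 2],
    [0, 1, 1, 1],
]
consensus = best_input_clustering(clusterings)
print(consensus)  # [0, 0, 1, 1]
```

In this toy input the first clustering has total distance 4 to the others, against 5 and 7 for the second and third, so it is selected. A fair variant would first have to map each input to a clustering satisfying the chosen fairness definition, a step this sketch does not attempt.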

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Ethics and Social Impacts of AI · Privacy-Preserving Technologies in Data