A Generic Framework for Fair Consensus Clustering in Streams
Diptarka Chakraborty, Kushagra Chatterjee, Debarati Das, Tien-Long Nguyen

TL;DR
This paper introduces a novel streaming algorithm for fair consensus clustering that efficiently processes sequential input clusterings with limited memory, providing improved approximation guarantees and a versatile, fairness-agnostic framework applicable to various fairness definitions.
Contribution
The paper presents the first constant-factor streaming algorithm for fair consensus clustering with logarithmic memory, along with a generic, fairness-agnostic framework that improves approximation guarantees in both the offline and streaming settings.
Findings
First constant-factor streaming algorithm with logarithmic memory
Framework improves approximation guarantees in streaming and offline settings
Applicable to any efficiently computable fairness definition
Abstract
Consensus clustering seeks to combine multiple clusterings of the same dataset, potentially derived by different agents in a multi-agent environment considering various non-sensitive attributes, into a single partitioning that best reflects the overall structure of the underlying dataset. Recent work by Chakraborty et al. introduced a fair variant under proportionate fairness and obtained a constant-factor approximation by naively selecting the best closest fair input clustering; however, their offline approach requires storing all input clusterings, which is prohibitively expensive for most large-scale applications. In this paper, we initiate the study of fair consensus clustering in the streaming model, where input clusterings arrive sequentially and memory is limited. We design the first constant-factor algorithm that processes the stream while storing only a logarithmic number…
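To make the consensus objective concrete, here is a minimal offline sketch of the "pick the best input clustering" idea that the abstract mentions. It is an illustration under stated assumptions, not the paper's algorithm: it assumes the common pairwise-disagreement distance between clusterings (the text above does not name a distance), it ignores the fairness constraint entirely, and it stores every input, which is exactly what a streaming method avoids. The names pair_distance and best_input_clustering are hypothetical.

```python
from itertools import combinations


def pair_distance(a, b):
    """Count element pairs that one clustering puts together and the other separates.

    Each clustering is a list of cluster labels, one label per data point.
    This distance is an assumption made for illustration.
    """
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )


def best_input_clustering(clusterings):
    """Return the input clustering with the smallest total distance to all inputs.

    No fairness check is applied here; this is only the unconstrained baseline.
    """
    return min(
        clusterings,
        key=lambda c: sum(pair_distance(c, other) for other in clusterings),
    )


# Three toy clusterings of the same four points.
inputs = [
    [0, 0, 1, 1],
    [0, 0, 1, 2],
    [0, 1, 1, 1],
]
consensus = best_input_clustering(inputs)
```

On this toy input the total distances are 4, 5, and 7, so the first clustering is selected. A fair variant would additionally have to restrict or repair the candidates so that they satisfy the chosen fairness definition.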
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Ethics and Social Impacts of AI · Privacy-Preserving Technologies in Data
