Federated Active Learning Under Extreme Non-IID and Global Class Imbalance
Chen-Chen Zong, Sheng-Jun Huang

TL;DR
This paper introduces FairFAL, an adaptive federated active learning framework that selects between global and local query models and samples in a class-balanced way, improving final performance under severe global class imbalance and client heterogeneity.
Contribution
The paper systematically studies query-model selection in federated active learning and proposes FairFAL, a framework that adaptively selects between the global and local models for querying, based on inferred global imbalance and local-global divergence.
Findings
FairFAL outperforms state-of-the-art methods on five benchmarks.
Adaptive selection between global and local models enhances performance.
Class-balanced sampling improves minority class representation.
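To make the class-balanced sampling finding concrete, here is a minimal sketch of one way a client could spread its labeling budget across predicted classes instead of taking a plain top-k by uncertainty. This is an illustration only, not the paper's procedure: the function name `class_balanced_query`, the round-robin rule, and the use of model pseudo-labels as class proxies are all assumptions.

```python
from collections import defaultdict


def class_balanced_query(scores, pseudo_labels, budget):
    """Select up to `budget` sample indices, cycling over predicted classes.

    scores: per-sample uncertainty (higher = more informative), hypothetical input.
    pseudo_labels: per-sample predicted class from the query model (assumed proxy
        for the true, unknown class).
    """
    # Group (score, index) pairs by predicted class.
    by_class = defaultdict(list)
    for idx, (score, cls) in enumerate(zip(scores, pseudo_labels)):
        by_class[cls].append((score, idx))

    # Within each class, most uncertain samples come first.
    for cls in by_class:
        by_class[cls].sort(key=lambda pair: -pair[0])

    selected = []
    rank = 0  # which "rank" inside each class we are currently drawing from
    classes = sorted(by_class)
    while len(selected) < budget:
        progressed = False
        for cls in classes:
            if rank < len(by_class[cls]) and len(selected) < budget:
                selected.append(by_class[cls][rank][1])
                progressed = True
        if not progressed:  # every class is exhausted
            break
        rank += 1
    return selected


# Toy pool: class 0 dominates the high-uncertainty region.
scores = [0.9, 0.8, 0.7, 0.2, 0.1]
pseudo_labels = [0, 0, 0, 1, 2]
picked = class_balanced_query(scores, pseudo_labels, budget=3)
```

In this toy pool a plain top-3 by uncertainty would return only class-0 samples, whereas the round-robin rule takes one sample from each predicted class, so the two rare classes are represented in the query.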
Abstract
Federated active learning (FAL) seeks to reduce annotation cost under privacy constraints, yet its effectiveness degrades in realistic settings with severe global class imbalance and highly heterogeneous clients. We conduct a systematic study of query-model selection in FAL and uncover a central insight: the model that achieves more class-balanced sampling, especially for minority classes, consistently leads to better final performance. Moreover, global-model querying is beneficial only when the global distribution is highly imbalanced and client data are relatively homogeneous; otherwise, the local model is preferable. Based on these findings, we propose FairFAL, an adaptive class-fair FAL framework. FairFAL (1) infers global imbalance and local-global divergence via lightweight prediction discrepancy, enabling adaptive selection between global and local query models; (2) performs…
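The decision rule stated in the abstract (query with the global model only when the global distribution is highly imbalanced and client data are relatively homogeneous, otherwise use the local model) can be sketched as follows. The abstract does not specify how the prediction discrepancy is computed, so everything below is an assumption for illustration: the histogram-based statistics, the names `imbalance_score` and `choose_query_model`, and both threshold values.

```python
from collections import Counter


def class_histogram(predictions, num_classes):
    """Normalized frequency of each predicted class."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    counts = Counter(predictions)
    return [counts.get(c, 0) / len(predictions) for c in range(num_classes)]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def imbalance_score(p):
    """0.0 for a uniform histogram, 1.0 when some class is never predicted."""
    return 1.0 - min(p) / max(p)


def choose_query_model(global_preds, local_preds, num_classes,
                       imbalance_thresh=0.5, divergence_thresh=0.2):
    """Return "global" or "local" (hypothetical rule, hypothetical thresholds).

    global_preds / local_preds: class predictions of the global and local
    models on the same unlabeled client pool.
    """
    g = class_histogram(global_preds, num_classes)
    l = class_histogram(local_preds, num_classes)
    highly_imbalanced = imbalance_score(g) >= imbalance_thresh
    homogeneous = total_variation(g, l) <= divergence_thresh
    return "global" if (highly_imbalanced and homogeneous) else "local"


# Imbalanced global predictions, local model roughly agrees.
case_a = choose_query_model([0] * 8 + [1] * 2, [0] * 7 + [1] * 3, num_classes=2)
# Same global predictions, but the local model diverges strongly.
case_b = choose_query_model([0] * 8 + [1] * 2, [1] * 10, num_classes=2)
```

Under these assumed statistics, `case_a` selects the global model and `case_b` falls back to the local one, mirroring the condition described in the abstract.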
Taxonomy
Topics: Machine Learning and Algorithms · Privacy-Preserving Technologies in Data · Imbalanced Data Classification Techniques
