Combating Data Imbalances in Federated Semi-supervised Learning with Dual Regulators
Sikai Bai, Shuaicheng Li, Weiming Zhuang, Jie Zhang, Song Guo, Kunlin Yang, Jun Hou, Shuai Zhang, Junyu Gao, Shuai Yi

TL;DR
This paper introduces FedDure, a novel federated semi-supervised learning framework with dual regulators designed to handle non-IID data distributions across and within clients, improving model performance in realistic scenarios.
Contribution
The paper proposes FedDure, a new FSSL framework with coarse- and fine-grained regulators, and formulates client training as bi-level optimization with convergence guarantees.
Findings
FedDure outperforms existing methods by over 11% on CIFAR-10 and CINIC-10.
The dual regulators effectively address data distribution heterogeneity.
Theoretical convergence guarantees are established for the proposed framework.
Abstract
Federated learning has become a popular method to learn from decentralized heterogeneous data. Federated semi-supervised learning (FSSL) emerges to train models from a small fraction of labeled data due to label scarcity on decentralized clients. Existing FSSL methods assume independent and identically distributed (IID) labeled data across clients and consistent class distribution between labeled and unlabeled data within a client. This work studies a more practical and challenging scenario of FSSL, where data distribution is different not only across clients but also within a client between labeled and unlabeled data. To address this challenge, we propose a novel FSSL framework with dual regulators, FedDure. FedDure lifts the previous assumption with a coarse-grained regulator (C-reg) and a fine-grained regulator (F-reg): C-reg regularizes the updating of the local model by tracking…
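The setting described in the abstract, where class distributions differ across clients and also between the labeled and unlabeled data inside one client, can be illustrated with a small simulation. This is a minimal sketch, not the paper's partitioning protocol: it assumes class proportions are drawn from a symmetric Dirichlet distribution (a common choice in federated learning experiments), and the names `make_clients`, `dirichlet`, `sample_labels`, `histogram`, and all parameter values are hypothetical.

```python
import random


def dirichlet(alpha, k, rng):
    """Sample a k-dimensional symmetric Dirichlet(alpha) vector via normalized Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_labels(class_probs, n, rng):
    """Draw n class labels from the given class distribution."""
    classes = list(range(len(class_probs)))
    return rng.choices(classes, weights=class_probs, k=n)


def make_clients(num_clients=3, num_classes=10, n_labeled=50,
                 n_unlabeled=500, alpha=0.3, seed=0):
    """Build toy clients whose labeled and unlabeled sets follow different class mixes.

    Assumption: each client draws two independent Dirichlet vectors, one for its
    labeled data and one for its unlabeled data. A small alpha gives skewed mixes.
    """
    rng = random.Random(seed)
    clients = []
    for _ in range(num_clients):
        labeled_probs = dirichlet(alpha, num_classes, rng)
        unlabeled_probs = dirichlet(alpha, num_classes, rng)
        clients.append({
            "labeled": sample_labels(labeled_probs, n_labeled, rng),
            # In a real run these labels would be hidden from the client.
            "unlabeled": sample_labels(unlabeled_probs, n_unlabeled, rng),
        })
    return clients


def histogram(labels, num_classes=10):
    """Count how many samples fall into each class."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    return counts


clients = make_clients()
for i, c in enumerate(clients):
    print(f"client {i} labeled  ", histogram(c["labeled"]))
    print(f"client {i} unlabeled", histogram(c["unlabeled"]))
```

Comparing the per-class counts printed for each client shows both kinds of imbalance at once: the histograms differ from client to client, and within a client the labeled histogram need not resemble the unlabeled one. Raising `alpha` moves the toy data back toward the balanced setting that, according to the abstract, earlier FSSL methods assume.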
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Traffic Prediction and Management Techniques
