Exploring One-shot Semi-supervised Federated Learning with A Pre-trained Diffusion Model
Mingzhao Yang, Shangchao Su, Bin Li, Xiangyang Xue

TL;DR
This paper introduces FedDISC, a novel semi-supervised federated learning method that leverages pre-trained diffusion models to generate synthetic datasets, reducing communication costs and handling data heterogeneity effectively.
Contribution
FedDISC is the first method to incorporate pre-trained diffusion models into semi-supervised federated learning (semi-FL), enabling high-quality synthetic data generation with minimal communication and no local training on clients.
Findings
FedDISC achieves performance comparable to, or better than, centralized supervised training.
The synthetic datasets exhibit diversity and quality similar to original client data.
FedDISC operates effectively within a single communication round with minimal information upload.
Abstract
Recently, semi-supervised federated learning (semi-FL) has been proposed to handle the commonly seen real-world scenarios with labeled data on the server and unlabeled data on the clients. However, existing methods face several challenges such as communication costs, data heterogeneity, and training pressure on client devices. To address these challenges, we introduce the powerful diffusion models (DM) into semi-FL and propose FedDISC, a Federated Diffusion-Inspired Semi-supervised Co-training method. Specifically, we first extract prototypes of the labeled server data and use these prototypes to predict pseudo-labels of the client data. For each category, we compute the cluster centroids and domain-specific representations to signify the semantic and stylistic information of their distributions. After adding noise, these representations are sent back to the server, which uses the…
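The client-side steps named in the abstract (prototype extraction, pseudo-labeling, per-category cluster centroids, a domain-specific representation, and added noise) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes features are already extracted by some frozen pre-trained encoder, that prototypes are per-class means of normalized features, that pseudo-labels come from cosine similarity to the prototypes, that the domain-specific representation is the mean client feature, and that the noise is Gaussian. All function names and the parameters `k` and `noise_std` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def class_prototypes(server_feats, server_labels, num_classes):
    """Assumed prototype definition: mean normalized feature per class."""
    feats = l2_normalize(server_feats)
    protos = np.stack(
        [feats[server_labels == c].mean(axis=0) for c in range(num_classes)]
    )
    return l2_normalize(protos)


def pseudo_label(client_feats, protos):
    """Assign each client sample to the most cosine-similar prototype."""
    sims = l2_normalize(client_feats) @ protos.T
    return sims.argmax(axis=1)


def kmeans_centroids(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns at most k centroids."""
    rng = np.random.default_rng(seed)
    k = min(k, len(x))
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = x[assign == j].mean(axis=0)
    return centroids


def client_upload(client_feats, protos, k=2, noise_std=0.01, seed=0):
    """Build what a client would send in one round (hypothetical format)."""
    rng = np.random.default_rng(seed)
    labels = pseudo_label(client_feats, protos)
    upload = {}
    for c in np.unique(labels):
        cents = kmeans_centroids(client_feats[labels == c], k, seed=seed)
        # Gaussian noise is an assumption; the noise model is not given here.
        upload[int(c)] = cents + rng.normal(0.0, noise_std, cents.shape)
    # Assumed domain-specific representation: noisy mean of client features.
    domain_rep = client_feats.mean(axis=0) + rng.normal(
        0.0, noise_std, client_feats.shape[1]
    )
    return labels, upload, domain_rep
```

In this sketch the client uploads only a handful of noisy vectors per pseudo-labeled category plus one domain vector, which matches the summary's description of a single communication round with no local training. How the server conditions the diffusion model on these vectors is not covered here.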
Taxonomy
Topics: Privacy-Preserving Technologies in Data
Methods: Diffusion
