Distilled One-Shot Federated Learning
Yanlin Zhou, George Pu, Xiyao Ma, Xiaolin Li, Dapeng Wu

TL;DR
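DOSFL is a one-shot federated learning approach in which each client distills its private dataset into a small set of synthetic data and sends only that to the server. It cuts communication cost by up to three orders of magnitude relative to FedAvg while reaching 93% to 99% of centralized performance in a single round.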
Contribution
This work introduces DOSFL, a one-shot federated learning method that uses dataset distillation to minimize communication while maintaining high accuracy.
Findings
Achieves 93-99% of centralized model performance.
Reduces total communication cost by up to three orders of magnitude relative to FedAvg.
Effective on vision and language tasks with various models.
Abstract
Current federated learning algorithms take tens of communication rounds transmitting unwieldy model weights under ideal circumstances and hundreds when data is poorly distributed. Inspired by recent work on dataset distillation and distributed one-shot learning, we propose Distilled One-Shot Federated Learning (DOSFL) to significantly reduce the communication cost while achieving comparable performance. In just one round, each client distills its private dataset and sends the synthetic data (e.g. images or sentences) to the server, which trains a global model on the data collected from all clients. The distilled data look like noise and are useful only for the specific model weights, i.e., they become useless after the model updates. With this weight-less and gradient-less design, the total communication cost of DOSFL is up to three orders of magnitude less than FedAvg while preserving between 93% and 99% of the performance of a…
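The one-round flow described in the abstract can be illustrated with a toy sketch. It assumes a 1-D linear model y = w * x, a single synthetic point per client, one inner gradient step, and a server that takes one gradient step on the pooled synthetic points. None of these choices come from the paper, and every function name and hyperparameter below is hypothetical. The paper's actual distillation procedure, models, and server-side training may differ.

```python
import random


def make_client_data(true_w, n, seed):
    """Toy private dataset: y = true_w * x plus small Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_w * x + rng.gauss(0.0, 0.01) for x in xs]
    return xs, ys


def real_loss_grad(w, xs, ys):
    """Derivative w.r.t. w of the mean squared error of y = w * x."""
    return sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


def inner_step(w, x_s, y_s, lr):
    """One gradient step on the squared error of one synthetic point."""
    return w - lr * 2.0 * x_s * (w * x_s - y_s)


def distill(xs, ys, w0, lr=0.5, outer_lr=0.1, steps=500):
    """Client side (assumed setup): learn one synthetic point (x_s, y_s) so
    that a single gradient step from w0 on that point fits the real data."""
    x_s, y_s = 1.0, 0.0  # arbitrary starting values for the synthetic point
    for _ in range(steps):
        w1 = inner_step(w0, x_s, y_s, lr)
        g = real_loss_grad(w1, xs, ys)  # d(real loss) / d(w1)
        # Chain rule through the inner step: d(w1)/d(x_s) and d(w1)/d(y_s).
        dw1_dx = -2.0 * lr * (2.0 * w0 * x_s - y_s)
        dw1_dy = 2.0 * lr * x_s
        x_s, y_s = x_s - outer_lr * g * dw1_dx, y_s - outer_lr * g * dw1_dy
    return x_s, y_s


def server_train(w0, distilled, lr=0.5):
    """Server side (assumed setup): one gradient step from the shared w0 on
    the mean loss over all uploaded synthetic points."""
    grad = sum(2.0 * x_s * (w0 * x_s - y_s) for x_s, y_s in distilled)
    return w0 - lr * grad / len(distilled)


def run_one_shot_round(num_clients=3, w0=0.0):
    """Single communication round: every client uploads only (x_s, y_s)."""
    uploads = []
    for k in range(num_clients):
        xs, ys = make_client_data(true_w=3.0, n=50, seed=k)
        uploads.append(distill(xs, ys, w0))
    return server_train(w0, uploads)


if __name__ == "__main__":
    print("global weight after one round:", run_one_shot_round())
```

In this sketch each client uploads two floats regardless of how many private samples it holds, and no weights or gradients leave the client. The synthetic point is optimized against the shared initial weight w0 only, which mirrors the abstract's remark that distilled data are tied to specific model weights.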
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Tanh Activation · Residual Connection
