Distilled One-Shot Federated Learning

Yanlin Zhou; George Pu; Xiyao Ma; Xiaolin Li; Dapeng Wu

arXiv:2009.07999·cs.LG·June 8, 2021·74 cites

TL;DR

DOSFL is a one-shot federated learning method: each client distills its private dataset into synthetic data and sends only that to the server, cutting communication cost by up to three orders of magnitude relative to FedAvg while retaining 93% to 99% of a centralized model's performance.

Contribution

This work introduces DOSFL, a one-shot federated learning method that uses dataset distillation to minimize communication while maintaining high accuracy.

Findings

01 Achieves 93% to 99% of a centralized model's performance.

02 Reduces total communication cost by up to three orders of magnitude relative to FedAvg.

03 Effective on vision and language tasks with various models.

Abstract

Current federated learning algorithms take tens of communication rounds transmitting unwieldy model weights under ideal circumstances and hundreds when data is poorly distributed. Inspired by recent work on dataset distillation and distributed one-shot learning, we propose Distilled One-Shot Federated Learning (DOSFL) to significantly reduce the communication cost while achieving comparable performance. In just one round, each client distills its private dataset and sends the synthetic data (e.g. images or sentences) to the server, and the clients' synthetic data collectively train a global model. The distilled data look like noise and are only useful to the specific model weights, i.e., they become useless after the model updates. With this weight-less and gradient-less design, the total communication cost of DOSFL is up to three orders of magnitude less than FedAvg while preserving between 93% and 99% of the performance of a…
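The one-round flow in the abstract (distill on each client, send synthetic data, train on the server) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a linear regression model, a shared initialization `w0` that the server gives every client, a fixed inner learning rate `LR`, hand-derived gradients, and a server that averages one gradient step per client's synthetic set. All names (`distill`, `sgd_step`, `make_client`, `payloads`) are hypothetical, and the excerpt above does not specify how the server combines the distilled data.

```python
import random

LR = 1.0  # assumed fixed inner learning rate (not taken from the paper)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def loss(w, X, y):
    """Half mean squared error of the linear model x -> x.w."""
    return 0.5 * sum((dot(x, w) - t) ** 2 for x, t in zip(X, y)) / len(X)

def grad(w, X, y):
    """Gradient of loss(w, X, y) with respect to w."""
    g = [0.0] * len(w)
    for x, t in zip(X, y):
        r = dot(x, w) - t
        for k in range(len(w)):
            g[k] += r * x[k] / len(X)
    return g

def sgd_step(w, X, y, lr=LR):
    """One gradient step on (X, y) starting from w."""
    return [wk - lr * gk for wk, gk in zip(w, grad(w, X, y))]

def distill(w0, X, y, m=2, iters=300):
    """Learn m synthetic points so one step from w0 fits the real (X, y)."""
    Xs = [[random.gauss(0, 1) for _ in w0] for _ in range(m)]
    ys = [0.0] * m
    alpha, c = 1.0, LR / m
    best = loss(sgd_step(w0, Xs, ys), X, y)
    for _ in range(iters):
        # Gradient of the real-data loss at the post-step weights w1.
        g = grad(sgd_step(w0, Xs, ys), X, y)
        new_Xs, new_ys = [], []
        for xs, t in zip(Xs, ys):
            r, s = dot(xs, w0) - t, dot(xs, g)
            # Chain rule through w1 = w0 - c * sum_j r_j * xs_j.
            gx = [-c * (r * gk + s * wk) for gk, wk in zip(g, w0)]
            new_Xs.append([a - alpha * b for a, b in zip(xs, gx)])
            new_ys.append(t - alpha * c * s)
        trial = loss(sgd_step(w0, new_Xs, new_ys), X, y)
        if trial < best:   # accept only improving updates
            Xs, ys, best = new_Xs, new_ys, trial
        else:              # otherwise shrink the outer step size
            alpha *= 0.5
    return Xs, ys

random.seed(0)
w_true = [2.0, -1.0]  # ground truth used only to generate toy data
w0 = [0.1, -0.2]      # hypothetical shared initialization from the server

def make_client(n=40):
    X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
    y = [dot(x, w_true) + random.gauss(0, 0.1) for x in X]
    return X, y

clients = [make_client() for _ in range(3)]

# Single round: each client uploads only its synthetic points.
payloads = [distill(w0, X, y) for X, y in clients]

# Server (assumption): one step from w0 per synthetic set, then average.
steps = [sgd_step(w0, Xs, ys) for Xs, ys in payloads]
w_global = [sum(col) / len(steps) for col in zip(*steps)]

X_all = [x for X, _ in clients for x in X]
y_all = [t for _, y in clients for t in y]
loss_before = loss(w0, X_all, y_all)
loss_after = loss(w_global, X_all, y_all)
print("pooled loss before:", loss_before, "after:", loss_after)
```

Each synthetic set is tuned to `w0`, which mirrors the abstract's remark that distilled data are only useful for specific model weights. The toy makes no attempt to reproduce the reported communication savings: with a two-parameter model the synthetic payload is no smaller than the weights themselves.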

Code & Models

Repositories

Guang000/Awesome-Dataset-Distillation/blob/main/README.md

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Tanh Activation · Residual Connection