Reviving Stale Updates: Data-Free Knowledge Distillation for Asynchronous Federated Learning
Baris Askin, Holger R. Roth, Zhenyu Sun, Carlee Joe-Wong, Gauri Joshi, Ziyue Xu

TL;DR
This paper introduces FedRevive, an asynchronous federated learning (AFL) framework that uses data-free knowledge distillation to revive stale updates, yielding faster training and higher accuracy without sharing data.
Contribution
FedRevive is the first to integrate data-free knowledge distillation into AFL to mitigate staleness, combining parameter aggregation with pseudo-sample generation for improved convergence.
Findings
Achieves up to 38.4% faster training
Improves final accuracy by up to 16.5%
Effective across vision and text benchmarks
Abstract
Federated learning (FL) enables collaborative model training across distributed clients without sharing raw data, yet its scalability is limited by synchronization overhead. Asynchronous federated learning (AFL) alleviates this issue by allowing clients to communicate independently, thereby improving wall-clock efficiency in large-scale, hardware-heterogeneous environments. However, asynchrony introduces updates computed on outdated global models (staleness) that can destabilize optimization and hinder convergence. We propose FedRevive, an AFL framework that revives stale updates through data-free knowledge distillation (DFKD). FedRevive integrates parameter-space aggregation with a lightweight, server-side DFKD process that transfers knowledge from stale client updates to the current global model without access to data. A meta-learned generator synthesizes pseudo-samples used for…
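The abstract names two ingredients: parameter-space aggregation of asynchronous updates, and distillation from a stale client model into the current global model on generated pseudo-samples. The sketch below illustrates both ideas in plain Python. It is not FedRevive's actual rule: the polynomial staleness discount, the constants `alpha` and `a`, the temperature, and all function names are assumptions chosen for illustration, and the generator that would produce the pseudo-samples is omitted.

```python
import math

def staleness_weight(staleness, alpha=0.5, a=0.5):
    # Hypothetical polynomial discount: alpha * (1 + staleness)^(-a).
    # Older updates (larger staleness) receive a smaller mixing weight.
    return alpha * (1.0 + staleness) ** (-a)

def mix_update(global_params, client_params, staleness):
    # Parameter-space aggregation: move the global model toward the
    # client model, less so when the client update is stale.
    w = staleness_weight(staleness)
    return [(1.0 - w) * g + w * c for g, c in zip(global_params, client_params)]

def softmax(logits, temperature=1.0):
    # Numerically stable softmax with temperature scaling.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on the logits for one pseudo-sample.
    # Teacher = stale client model, student = current global model.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A fresh update (staleness 0) moves the global model further
# than an update that is three rounds old.
fresh = mix_update([0.0, 0.0], [1.0, 1.0], staleness=0)
stale = mix_update([0.0, 0.0], [1.0, 1.0], staleness=3)

# The loss is zero when the two models agree on a pseudo-sample,
# and positive when they disagree.
agree = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
differ = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

With a pure staleness discount, the knowledge in an old update is mostly thrown away. Minimizing a loss like `distillation_loss` on the server is one way the stale model's predictions could still inform the global model, which is the intuition behind "reviving" stale updates.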
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
