Optimizing Asynchronous Federated Learning: A Delicate Trade-Off Between Model-Parameter Staleness and Update Frequency
Abdelkrim Alahyane (LAAS-SARA, LAAS), Céline Comte (CNRS, LAAS-SARA, LAAS), Matthieu Jonckheere (CNRS, LAAS-SARA, LAAS), Éric Moulines (X)

TL;DR
This paper analyzes the trade-off between model parameter staleness and update frequency in asynchronous federated learning, proposing optimization methods that improve accuracy by 10-30% while balancing system speed and learning quality.
Contribution
It introduces a stochastic modeling framework for asynchronous FL, deriving a closed-form delay metric and an alternative speed-aware metric to optimize system performance.
Findings
The proposed optimizations improve accuracy by 10-30%.
The paper derives a discrete variant of Little's law to measure delay.
Balancing model-parameter staleness against update throughput improves learning efficiency.
Abstract
Synchronous federated learning (FL) scales poorly with the number of clients due to the straggler effect. Algorithms like FedAsync and GeneralizedFedAsync address this limitation by enabling asynchronous communication between clients and the central server. In this work, we rely on stochastic modeling and analysis to better understand the impact of design choices in asynchronous FL algorithms, such as the concurrency level and routing probabilities, and we leverage this knowledge to optimize loss. Compared to most existing studies, we account for the joint impact of heterogeneous and variable service speeds and heterogeneous datasets at the clients. We characterize in particular a fundamental trade-off for optimizing asynchronous FL: minimizing gradient estimation errors by avoiding model parameter staleness, while also speeding up the system by increasing the throughput of model…
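The staleness/throughput trade-off described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's model or its closed-form metric. It assumes a closed system in which a fixed number of tasks (`m_tasks`, standing in for the concurrency level) circulate among clients with exponential service rates (`rates`), each client serves its queue in FIFO order, and the server routes each new task to a client at random according to `probs`. The function name `simulate_async_fl` and all parameter names are hypothetical. Delay is counted here as the number of server updates applied between a task's dispatch and its return.

```python
import random


def simulate_async_fl(rates, probs, m_tasks, n_updates, seed=0):
    """Toy closed-system simulation (an illustrative assumption, not the paper's model).

    Returns (mean delay in number of server updates, updates per unit time).
    """
    rng = random.Random(seed)
    n = len(rates)
    queues = [[] for _ in range(n)]  # per-client FIFO of model versions at dispatch
    version = 0                      # number of updates applied by the server so far

    # Initially dispatch m_tasks tasks, all carrying model version 0.
    for _ in range(m_tasks):
        c = rng.choices(range(n), weights=probs)[0]
        queues[c].append(version)

    clock = 0.0
    total_delay = 0
    for _ in range(n_updates):
        # Only clients with a non-empty queue are working.
        busy = [i for i in range(n) if queues[i]]
        busy_rates = [rates[i] for i in busy]
        # Exponential services: time to the next completion is exponential
        # with the total busy rate, and client i finishes first with
        # probability proportional to its rate.
        clock += rng.expovariate(sum(busy_rates))
        i = rng.choices(busy, weights=busy_rates)[0]
        dispatched_at = queues[i].pop(0)
        # Staleness of the returned gradient, in server updates.
        total_delay += version - dispatched_at
        version += 1
        # The server immediately sends the new model to a random client.
        j = rng.choices(range(n), weights=probs)[0]
        queues[j].append(version)

    return total_delay / n_updates, n_updates / clock


if __name__ == "__main__":
    rates = [1.0, 2.0, 4.0]      # hypothetical heterogeneous client speeds
    probs = [1 / 3, 1 / 3, 1 / 3]  # hypothetical uniform routing
    for m in (1, 2, 4, 8):
        delay, throughput = simulate_async_fl(rates, probs, m, 20000)
        print(f"m={m}: mean delay={delay:.2f}, throughput={throughput:.2f}")
```

In this toy model, each server update makes the other `m_tasks - 1` in-flight tasks one step staler, so the long-run mean delay is close to `m_tasks - 1` regardless of the rates, while throughput grows with `m_tasks` until the slowest client saturates. Raising concurrency therefore buys speed at the cost of staleness, which is the tension the abstract describes.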
