Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
Artavazd Maranjyan, Alexander Tyurin, Peter Richtárik

TL;DR
Ringmaster ASGD is a novel asynchronous stochastic gradient descent method that achieves optimal time complexity in distributed settings with heterogeneous and fluctuating worker computation times, filling a key gap in the literature.
Contribution
We introduce Ringmaster ASGD, the first asynchronous SGD algorithm proven to attain optimal time complexity under arbitrary heterogeneity and dynamic worker performance.
Findings
Achieves optimal time complexity in heterogeneous environments
Theoretically matches lower bounds for asynchronous SGD
Addresses limitations of previous asynchronous methods
Abstract
Asynchronous Stochastic Gradient Descent (Asynchronous SGD) is a cornerstone method for parallelizing learning in distributed machine learning. However, its performance suffers under arbitrarily heterogeneous computation times across workers, leading to suboptimal time complexity and inefficiency as the number of workers scales. While several Asynchronous SGD variants have been proposed, recent findings by Tyurin & Richtárik (NeurIPS 2023) reveal that none achieve optimal time complexity, leaving a significant gap in the literature. In this paper, we propose Ringmaster ASGD, a novel Asynchronous SGD method designed to address these limitations and tame the inherent challenges of Asynchronous SGD. We establish, through rigorous theoretical analysis, that Ringmaster ASGD achieves optimal time complexity under arbitrarily heterogeneous and dynamically fluctuating worker computation…
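To make the setting in the abstract concrete, here is a minimal sketch of plain asynchronous SGD with heterogeneous worker computation times, simulated on a toy objective. It is not Ringmaster ASGD, whose mechanism this page does not describe. The function name `simulate_async_sgd`, the objective f(x) = 0.5·x², the fixed per-worker times, and all parameter values are assumptions chosen for illustration.

```python
import heapq
import random


def simulate_async_sgd(worker_times, steps=500, lr=0.02, noise=0.1, x0=5.0, seed=0):
    """Toy event-driven simulation of plain asynchronous SGD (hypothetical sketch).

    worker_times[i] is the (assumed fixed) time worker i needs per gradient.
    The objective is f(x) = 0.5 * x**2, so the exact gradient at x is x.
    """
    rng = random.Random(seed)
    x = x0
    k = 0  # number of server updates applied so far
    delays = []

    # Each event: (finish_time, worker_id, point the gradient is computed at,
    #              server iteration at which the worker started)
    events = [(t, i, x, 0) for i, t in enumerate(worker_times)]
    heapq.heapify(events)

    while k < steps:
        now, i, x_stale, k_start = heapq.heappop(events)
        # Stochastic gradient evaluated at the (possibly stale) point.
        grad = x_stale + rng.gauss(0.0, noise)
        # Delay = how many server updates happened while this worker computed.
        delays.append(k - k_start)
        # The server applies the gradient immediately, however stale it is.
        x -= lr * grad
        k += 1
        # The worker starts a new gradient at the current point.
        heapq.heappush(events, (now + worker_times[i], i, x, k))

    return x, delays


if __name__ == "__main__":
    x_final, delays = simulate_async_sgd([1.0, 2.0, 10.0])
    print("final x:", x_final)
    print("max delay:", max(delays))
```

In this sketch the slow worker's gradients arrive many server updates after they were requested, which is the staleness that heterogeneous computation times introduce; with a single worker every delay is zero and the loop reduces to ordinary SGD.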
Taxonomy
Topics: Cellular Automata and Applications · Advanced Data Storage Technologies · Algorithms and Data Compression
Methods: Stochastic Gradient Descent
