Enhancing Parallelism in Decentralized Stochastic Convex Optimization
Ofri Eisen, Ron Dorfman, Kfir Y. Levy

TL;DR
This paper introduces Decentralized Anytime SGD, an algorithm for decentralized stochastic convex optimization that raises the parallelism threshold: more machines can work together without degrading convergence, which closes the gap with centralized methods in highly connected topologies.
Contribution
We propose a novel decentralized learning algorithm that extends the parallelism threshold, improving scalability and theoretical guarantees in stochastic convex optimization.
Findings
Significantly extends the critical parallelism threshold.
Achieves better statistical guarantees for larger networks.
Closes the gap with centralized learning in highly connected topologies.
Abstract
Decentralized learning has emerged as a powerful approach for handling large datasets across multiple machines in a communication-efficient manner. However, such methods often face scalability limitations, as increasing the number of machines beyond a certain point negatively impacts convergence rates. In this work, we propose Decentralized Anytime SGD, a novel decentralized learning algorithm that significantly extends the critical parallelism threshold, enabling the effective use of more machines without compromising performance. Within the stochastic convex optimization (SCO) framework, we establish a theoretical upper bound on parallelism that surpasses the current state-of-the-art, allowing larger networks to achieve favorable statistical guarantees and closing the gap with centralized learning in highly connected topologies.
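For context, the setting described in the abstract can be illustrated with a minimal sketch of standard decentralized SGD with gossip averaging. This is the generic baseline, not the paper's Decentralized Anytime SGD, whose update rule is not given in this summary. The ring topology, the quadratic objective, the Gaussian gradient noise, and the names `ring_gossip_matrix` and `decentralized_sgd` are all assumptions made for illustration.

```python
import numpy as np


def ring_gossip_matrix(m):
    """Symmetric, doubly stochastic mixing matrix for a ring of m >= 3 machines.

    Each machine averages itself with its two ring neighbors (weight 1/3 each).
    Illustrative choice of topology, not taken from the paper.
    """
    W = np.zeros((m, m))
    for i in range(m):
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % m] = 1.0 / 3.0
        W[i, (i + 1) % m] = 1.0 / 3.0
    return W


def decentralized_sgd(W, x_star, steps=2000, lr=0.05, noise=0.1, seed=0):
    """Generic decentralized SGD: local noisy gradient step, then gossip averaging.

    Hypothetical objective: f(x) = 0.5 * ||x - x_star||^2, shared by all machines,
    with additive Gaussian noise on each stochastic gradient.
    """
    rng = np.random.default_rng(seed)
    m, d = W.shape[0], x_star.shape[0]
    X = np.zeros((m, d))  # row i holds machine i's local iterate
    for _ in range(steps):
        # Each machine draws its own noisy gradient at its own iterate.
        G = (X - x_star) + noise * rng.standard_normal((m, d))
        # Local step, then one round of averaging with ring neighbors.
        X = W @ (X - lr * G)
    return X


W = ring_gossip_matrix(8)
x_star = np.arange(5, dtype=float)
X = decentralized_sgd(W, x_star)
print("distance of network average to optimum:",
      np.linalg.norm(X.mean(axis=0) - x_star))
```

In this baseline, each machine only exchanges iterates with its neighbors, so the quality of the consensus depends on how well connected the mixing matrix `W` is. That dependence is the kind of topology effect the abstract refers to when it discusses how many machines can be used before convergence degrades.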
Taxonomy
Topics: Stochastic Gradient Optimization Techniques
