Do We Need Asynchronous SGD? On the Near-Optimality of Synchronous Methods

Grigory Begunov; Alexander Tyurin

arXiv:2602.03802 · cs.DC · February 4, 2026

Open Access

TL;DR

This paper shows theoretically that Synchronous SGD and its robust variant, $m$-Synchronous SGD, are nearly optimal in many heterogeneous distributed computation scenarios, challenging the view that asynchronous methods are required in such settings.

Contribution

The paper analyzes the time complexity of Synchronous SGD and $m$-Synchronous SGD under random computation times and adversarial partial participation of workers, and proves that both are optimal up to logarithmic factors in many practical regimes.

Findings

1. Synchronous methods are nearly optimal under random computation times.

2. Synchronous methods are optimal up to logarithmic factors with adversarial partial participation.

3. Asynchronous methods are not always necessary for modern heterogeneous tasks.

Abstract

Modern distributed optimization methods mostly rely on traditional synchronous approaches, despite substantial recent progress in asynchronous optimization. We revisit Synchronous SGD and its robust variant, called $m$-Synchronous SGD, and theoretically show that they are nearly optimal in many heterogeneous computation scenarios, which is somewhat unexpected. We analyze the synchronous methods under random computation times and adversarial partial participation of workers, and prove that their time complexities are optimal in many practical regimes, up to logarithmic factors. While synchronous methods are not universal solutions and there exist tasks where asynchronous methods may be necessary, we show that they are sufficient for many modern heterogeneous computation scenarios.
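
To make the random-computation-time setting concrete, the following is a minimal simulation sketch, not the authors' algorithm or analysis. It assumes that $m$-Synchronous SGD ends each round once the fastest $m$ of $n$ workers have returned a gradient; the abstract does not define the method, so that reading is an assumption. The exponential computation times and the names `round_time` and `simulate` are likewise hypothetical choices made for illustration.

```python
import random


def round_time(worker_times, m):
    """Wall-clock time until the m fastest workers have finished.

    Assumption: a round ends when m gradients have arrived, so its
    duration is the m-th smallest computation time.
    """
    return sorted(worker_times)[m - 1]


def simulate(n_workers=8, m=6, rounds=1000, seed=0):
    """Average per-round time of fully synchronous vs. m-synchronous rounds.

    Computation times are drawn i.i.d. from an exponential distribution,
    which is an illustrative choice, not taken from the paper.
    """
    rng = random.Random(seed)
    sync_total = 0.0
    m_sync_total = 0.0
    for _ in range(rounds):
        times = [rng.expovariate(1.0) for _ in range(n_workers)]
        # Fully synchronous: wait for every worker (the slowest one).
        sync_total += round_time(times, n_workers)
        # m-synchronous (assumed): wait only for the m fastest workers.
        m_sync_total += round_time(times, m)
    return sync_total / rounds, m_sync_total / rounds


if __name__ == "__main__":
    sync_avg, m_sync_avg = simulate()
    print(f"average round time, all workers: {sync_avg:.3f}")
    print(f"average round time, fastest m:   {m_sync_avg:.3f}")
```

Under these assumptions the $m$-synchronous round is never slower than the fully synchronous one, because it drops the wait for the slowest $n - m$ workers. The sketch measures round duration only and does not model the effect of averaging fewer gradients on convergence.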

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Cryptography and Data Security