Scalable Distributed Stochastic Optimization via Bidirectional Compression: Beyond Pessimistic Limits

Grigory Begunov; Alexander Tyurin

arXiv:2605.07795·math.OC·May 11, 2026

TL;DR

This paper introduces new compressed stochastic optimization methods that, under an additional structural assumption, go beyond recently established lower bounds and scale better with the number of workers in distributed learning.

Contribution

It proposes the Inkheart SGD and M4 algorithms, which, under an additional structural assumption, achieve state-of-the-art complexities that go beyond the previous pessimistic limits.

Findings

01

New methods outperform traditional approaches in distributed settings.

02

Achieve scaling with the number of workers $n$, going beyond previous lower bounds under an additional structural assumption.

03

Provide theoretical guarantees under specific structural assumptions.

Abstract

In centralized, distributed, and federated learning with stochastic gradients and $n$ workers, it was recently shown that it is infeasible to find an $\varepsilon$-stationary point faster than $\tilde{\Omega}\left(\min\left\{\frac{d \kappa L \Delta}{\varepsilon} + \frac{h L \Delta}{\varepsilon} + \frac{h \sigma^{2} L \Delta}{n \varepsilon^{2}}, \; \frac{h \sigma^{2} L \Delta}{\varepsilon^{2}} + \frac{h L \Delta}{\varepsilon}\right\}\right)$ seconds in both homogeneous and heterogeneous settings under standard assumptions: $L$-smoothness, $\sigma^{2}$-bounded unbiased stochastic gradients, and lower boundedness of the function, i.e., $f(x) \geq f^{*}$ for all $x \in \mathbb{R}^{d}$, where $\Delta = f(x^{0}) - f^{*}$, $h$ is the computation time, $\kappa$ is the communication speed between the workers and the server, and $d$ is the dimension of the iterates and gradients. This result is pessimistic since it…
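To get a feel for the lower bound quoted in the abstract, the following minimal Python sketch evaluates the expression inside $\tilde{\Omega}(\cdot)$ for given problem parameters. It ignores constants and logarithmic factors, so the returned value is only an order-of-magnitude illustration. The function name `lower_bound_seconds`, its argument names, and the sample values are assumptions made for this example and do not come from the paper.

```python
def lower_bound_seconds(d, kappa, h, sigma2, L, delta, eps, n):
    """Evaluate the min{...} expression from the abstract.

    Constants and log factors hidden by the tilde-Omega are ignored,
    so this is an illustration, not an exact time bound.
    """
    # Term shared by both branches: h * L * Delta / eps
    base = h * L * delta / eps
    # First branch: communication term plus variance term that shrinks with n
    first = d * kappa * L * delta / eps + base + h * sigma2 * L * delta / (n * eps**2)
    # Second branch: no communication term and no dependence on n
    second = h * sigma2 * L * delta / eps**2 + base
    return min(first, second)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    params = dict(d=1000, kappa=0.001, h=1.0, sigma2=1.0, L=1.0, delta=1.0, eps=0.01)
    for n in (10, 1000, 10**9):
        print(n, lower_bound_seconds(n=n, **params))
```

In this expression only the term $\frac{h \sigma^{2} L \Delta}{n \varepsilon^{2}}$ depends on $n$, so with the other parameters fixed the value never falls below $\min\left\{\frac{d \kappa L \Delta}{\varepsilon}, \frac{h \sigma^{2} L \Delta}{\varepsilon^{2}}\right\} + \frac{h L \Delta}{\varepsilon}$, however many workers are added.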
