Asynchronous Stochastic Optimization Robust to Arbitrary Delays
Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain

TL;DR
This paper introduces an efficient stochastic optimization algorithm that is robust to arbitrary and variable delays in gradient updates, improving convergence guarantees over previous methods that depended on maximum delay.
Contribution
The paper presents a simple, efficient algorithm for non-convex stochastic optimization whose convergence guarantee depends on the average delay rather than the maximum delay, making it more robust in asynchronous distributed systems.
Findings
Algorithm achieves $O\left( \frac{\sigma^2}{\epsilon^4} + \frac{\tau}{\epsilon^2} \right)$ steps for $\epsilon$-stationary points.
Outperforms previous methods by depending on average delay $\tau$ instead of maximum delay.
Demonstrates robustness in experiments with skewed and heavy-tailed delay distributions.
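To make the difference between the two delay dependences concrete, the sketch below evaluates the step-count expression with the constants hidden by $O(\cdot)$ dropped, once with the average delay and once with the maximum delay. The function name `step_bound`, the delay sequence, and the values of the variance and accuracy are illustrative assumptions, not figures from the paper.

```python
def step_bound(sigma2: float, eps: float, delay: float) -> float:
    """Evaluate sigma^2 / eps^4 + delay / eps^2, ignoring the constants hidden by O(.)."""
    return sigma2 / eps**4 + delay / eps**2


# Hypothetical skewed delay sequence: 990 gradients arrive 1 step late,
# 10 gradients arrive 1000 steps late.
delays = [1] * 990 + [1000] * 10

avg_delay = sum(delays) / len(delays)
max_delay = max(delays)

# Assumed gradient-noise variance and target accuracy (illustrative only).
sigma2, eps = 1.0, 0.1

bound_avg = step_bound(sigma2, eps, avg_delay)
bound_max = step_bound(sigma2, eps, max_delay)

print(f"average delay = {avg_delay:.2f}, maximum delay = {max_delay}")
print(f"expression with average delay: {bound_avg:.0f}")
print(f"expression with maximum delay: {bound_max:.0f}")
```

In this made-up sequence a handful of very late gradients set the maximum delay while barely moving the average, which is the regime where a bound in terms of $\tau$ is much smaller.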
Abstract
We consider stochastic optimization with delayed gradients where, at each time step $t$, the algorithm makes an update using a stale stochastic gradient from step $t - d_t$ for some arbitrary delay $d_t$. This setting abstracts asynchronous distributed optimization where a central server receives gradient updates computed by worker machines. These machines can experience computation and communication loads that might vary significantly over time. In the general non-convex smooth optimization setting, we give a simple and efficient algorithm that requires $O\left( \frac{\sigma^2}{\epsilon^4} + \frac{\tau}{\epsilon^2} \right)$ steps for finding an $\epsilon$-stationary point $x$, where $\tau$ is the average delay and $\sigma^2$ is the variance of the stochastic gradients. This improves over previous work, which showed that stochastic gradient descent achieves the same…
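The delayed-gradient setting described in the abstract can be simulated in a few lines. The sketch below runs plain SGD with stale gradients on a toy objective; it illustrates the setting only and is not the algorithm proposed in the paper. The objective $f(x) = \frac{1}{2}x^2$, the function name `delayed_sgd`, the step size, the noise level, and the delay distribution are all assumptions chosen for illustration.

```python
import random


def delayed_sgd(steps=2000, eta=0.01, sigma=0.1, x0=5.0, seed=0):
    """Run SGD on f(x) = 0.5 * x**2 where each update uses a stale gradient."""
    rng = random.Random(seed)
    iterates = [x0]  # iterates[t] holds x_t
    delays = []
    for t in range(steps):
        # Hypothetical skewed delays: usually 1 step, occasionally 50 steps.
        d = 50 if rng.random() < 0.1 else 1
        d = min(d, t)  # a gradient cannot predate the first iterate
        stale_x = iterates[t - d]
        # Noisy gradient of f evaluated at the stale point x_{t - d_t}.
        grad = stale_x + rng.gauss(0.0, sigma)
        # The update is applied to the current iterate x_t.
        iterates.append(iterates[t] - eta * grad)
        delays.append(d)
    return iterates, delays


iterates, delays = delayed_sgd()
print("final iterate:", iterates[-1])
print("average delay:", sum(delays) / len(delays))
print("maximum delay:", max(delays))
```

With these assumed settings the average delay sits far below the maximum, mirroring the skewed delay distributions mentioned in the findings.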
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
