A Distributed Asynchronous Generalized Momentum Algorithm Without Delay Bounds
Ellie Pond, Yichen Zhao, Matthew Hale

TL;DR
This paper introduces a distributed asynchronous generalized momentum algorithm that converges quickly without requiring delay bounds. With unbounded delays, it outperforms gradient descent, heavy ball, and Nesterov's accelerated gradient algorithm.
Contribution
The paper presents a distributed generalized momentum algorithm that converges quickly without delay bounds. The algorithm unifies several existing methods, including Nesterov's accelerated gradient algorithm and the heavy ball algorithm, and outperforms them in simulation.
Findings
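Converges at a linear rate under arbitrary delays, where the rate is measured in a function of the number of computations and communications that processors perform.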
Requires at least 71% fewer iterations than gradient descent.
Outperforms heavy ball and Nesterov's algorithms in unbounded delay scenarios.
Abstract
Asynchronous optimization algorithms often require delay bounds to prove their convergence, though these bounds can be difficult to obtain in practice. Existing algorithms that do not require delay bounds often converge slowly. Therefore, we introduce a novel distributed generalized momentum algorithm that provides fast convergence and allows arbitrary delays. It subsumes Nesterov's accelerated gradient algorithm and the heavy ball algorithm, among others. We first develop conditions on the parameters of this algorithm that ensure asymptotic convergence. Then we show its convergence rate is linear in a function of the number of computations and communications that processors perform (in a way that we make precise). Simulations compare this algorithm to gradient descent, heavy ball, and Nesterov's accelerated gradient algorithm with a classification problem on the Fashion-MNIST dataset.…
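To make the "subsumes" claim concrete, here is a minimal Python sketch of one common way to write a generalized momentum update. The abstract does not give the paper's update rule, so the form x_{k+1} = x_k - gamma * grad(x_k + lam * (x_k - x_{k-1})) + beta * (x_k - x_{k-1}) is an assumption, as are the names `gamma`, `beta`, `lam`, the quadratic test problem, and all parameter values. The sketch is centralized and synchronous; it does not model the distributed, asynchronous setting or any delays.

```python
def generalized_momentum(grad, x0, gamma, beta, lam, iters):
    """Assumed update (not taken from the paper):
    x_next = x - gamma * grad(x + lam * (x - x_prev)) + beta * (x - x_prev)
    """
    x_prev = list(x0)
    x = list(x0)
    for _ in range(iters):
        # Difference between the two most recent iterates (zero on the first step).
        diff = [xi - pi for xi, pi in zip(x, x_prev)]
        # Point at which the gradient is evaluated.
        y = [xi + lam * di for xi, di in zip(x, diff)]
        g = grad(y)
        x_next = [xi - gamma * gi + beta * di for xi, gi, di in zip(x, g, diff)]
        x_prev, x = x, x_next
    return x


# Hypothetical test problem: f(x) = 0.5 * sum(a_i * x_i^2), minimized at the origin.
A = [1.0, 10.0]


def grad_quadratic(x):
    return [a * xi for a, xi in zip(A, x)]


x0 = [1.0, 1.0]
# beta = lam = 0: plain gradient descent.
gd = generalized_momentum(grad_quadratic, x0, gamma=0.1, beta=0.0, lam=0.0, iters=200)
# lam = 0, beta > 0: heavy ball.
hb = generalized_momentum(grad_quadratic, x0, gamma=0.1, beta=0.3, lam=0.0, iters=200)
# lam = beta > 0: Nesterov-style look-ahead gradient.
nag = generalized_momentum(grad_quadratic, x0, gamma=0.1, beta=0.3, lam=0.3, iters=200)
```

Under this assumed parameterization, the three calls differ only in `beta` and `lam`, which is the sense in which a single update rule can cover gradient descent, heavy ball, and Nesterov's method. The step size and momentum values here were chosen only so that the toy quadratic converges; they are not the parameter conditions derived in the paper.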
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
