Memory and Communication Efficient Distributed Stochastic Optimization with Minibatch-Prox
Jialei Wang, Weiran Wang, Nathan Srebro

TL;DR
This paper introduces a distributed stochastic optimization method based on minibatch-prox that trades communication against memory, achieving near-linear speedups (up to logarithmic factors) and the statistically optimal rate regardless of minibatch size.
Contribution
It proposes a distributed approach built on minibatch-prox iterations, together with a new analysis that guarantees the statistically optimal rate and allows a flexible tradeoff between communication and memory.
Findings
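Achieves near-linear speedups (up to logarithmic factors) with either logarithmic communication and linear memory, or polynomial communication and a corresponding polynomial reduction in memory.
Provides a new analysis of minibatch-prox showing the statistically optimal rate regardless of minibatch size and smoothness.
Significantly improves on prior analyses of minibatch-prox, whose guarantees depended on minibatch size or smoothness.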
Abstract
We present and analyze an approach for distributed stochastic optimization which is statistically optimal and achieves near-linear speedups (up to logarithmic factors). Our approach allows a communication-memory tradeoff, with either logarithmic communication but linear memory, or polynomial communication and a corresponding polynomial reduction in required memory. This communication-memory tradeoff is achieved through minibatch-prox iterations (minibatch passive-aggressive updates), where a subproblem on a minibatch is solved at each iteration. We provide a novel analysis for such a minibatch-prox procedure which achieves the statistically optimal rate regardless of minibatch size and smoothness, thus significantly improving on prior work.
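To make the "subproblem on a minibatch" concrete, here is a minimal single-machine sketch of a minibatch-prox iteration. It is not the paper's algorithm: the squared loss, the fixed proximal weight `gamma`, the uniform averaging of iterates, and the names `minibatch_prox_step` and `run_minibatch_prox` are all assumptions made for illustration. Each step solves exactly the regularized problem of minimizing the minibatch loss plus `(gamma / 2) * ||w - w_prev||^2`; the abstract does not specify the distributed solver or the step-size schedule, so neither is shown.

```python
import numpy as np


def minibatch_prox_step(w_prev, X, y, gamma):
    """One minibatch-prox update, assuming squared loss (illustrative choice).

    Solves  argmin_w (1/(2b)) * ||X w - y||^2 + (gamma/2) * ||w - w_prev||^2
    exactly through its normal equations.
    """
    b, d = X.shape
    A = X.T @ X / b + gamma * np.eye(d)
    rhs = X.T @ y / b + gamma * w_prev
    return np.linalg.solve(A, rhs)


def run_minibatch_prox(X, y, batch_size, gamma, seed=0):
    """Single pass over the data in disjoint minibatches.

    Returns the uniform average of the iterates (an assumption here,
    not a detail taken from the abstract).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    iterates = []
    perm = rng.permutation(n)
    for start in range(0, n - batch_size + 1, batch_size):
        idx = perm[start:start + batch_size]
        w = minibatch_prox_step(w, X[idx], y[idx], gamma)
        iterates.append(w)
    return np.mean(iterates, axis=0)
```

Unlike a minibatch gradient step, the update uses the whole minibatch objective rather than a single linearization of it, which is why the iteration stays well defined for any minibatch size. A larger `gamma` keeps each iterate closer to the previous one.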
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Error Correcting Code Techniques
