Distributed Stochastic Variance Reduced Gradient Methods and A Lower Bound for Communication Complexity
Jason D. Lee, Qihang Lin, Tengyu Ma, Tianbao Yang

TL;DR
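This paper introduces a distributed stochastic variance reduced gradient algorithm for minimizing the average of convex functions stored across machines. Under certain conditions on the condition number, it simultaneously achieves the optimal parallel runtime, amount of communication, and rounds of communication among distributed first-order methods, up to constant factors. The paper also proves a lower bound on the number of communication rounds.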
Contribution
The paper presents a distributed stochastic variance reduced gradient method and an accelerated extension, together with a new lower bound on the rounds of communication needed by a broad class of distributed first-order algorithms. The summary states that the proposed method matches this bound.
Findings
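Under certain conditions on the condition number, achieves the optimal parallel runtime, amount of communication, and rounds of communication among distributed first-order methods, up to constant factors.
Outperforms existing distributed algorithms in rounds of communication as long as the condition number is not too large compared to the size of the data on each machine.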
Proves a lower bound for communication rounds and shows the proposed method attains it.
Abstract
We study distributed optimization algorithms for minimizing the average of convex functions. The applications include empirical risk minimization problems in statistical machine learning where the datasets are large and have to be stored on different machines. We design a distributed stochastic variance reduced gradient algorithm that, under certain conditions on the condition number, simultaneously achieves the optimal parallel runtime, amount of communication and rounds of communication among all distributed first-order methods up to constant factors. Our method and its accelerated extension also outperform existing distributed algorithms in terms of the rounds of communication as long as the condition number is not too large compared to the size of data in each machine. We also prove a lower bound for the number of rounds of communication for a broad class of distributed first-order…
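Example
The sketch below illustrates the variance reduction idea that the abstract builds on, using the standard single-machine SVRG update rather than the distributed algorithm of the paper. It assumes a ridge-regularized least-squares objective, and the names grad_i, full_grad, svrg, and make_data, along with all step sizes and loop lengths, are hypothetical choices made for illustration. In a distributed setting, the snapshot gradient computed at the start of each epoch is the natural place for machines to average their local gradients; how the paper schedules that communication is not described in this summary.

```python
import random


def grad_i(w, x, y, lam):
    """Gradient of 0.5*(x.w - y)^2 + 0.5*lam*||w||^2 at a single sample."""
    residual = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [residual * xj + lam * wj for wj, xj in zip(w, x)]


def full_grad(w, X, Y, lam):
    """Average of the per-sample gradients over the whole dataset."""
    n = len(X)
    total = [0.0] * len(w)
    for x, y in zip(X, Y):
        total = [t + g for t, g in zip(total, grad_i(w, x, y, lam))]
    return [t / n for t in total]


def svrg(X, Y, lam=0.1, step=0.02, epochs=40, inner=None, seed=0):
    """Plain SVRG (assumed illustration, not the paper's distributed variant)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    inner = inner if inner is not None else 5 * n
    w_snap = [0.0] * d
    for _ in range(epochs):
        # Full gradient at the snapshot point, computed once per epoch.
        mu = full_grad(w_snap, X, Y, lam)
        w = list(w_snap)
        for _ in range(inner):
            i = rng.randrange(n)
            g_cur = grad_i(w, X[i], Y[i], lam)
            g_old = grad_i(w_snap, X[i], Y[i], lam)
            # Variance-reduced direction: unbiased for the full gradient at w,
            # and its variance shrinks as w and w_snap approach the optimum.
            w = [wj - step * (a - b + m)
                 for wj, a, b, m in zip(w, g_cur, g_old, mu)]
        w_snap = w
    return w_snap


def make_data(n, d, seed=0):
    """Hypothetical synthetic regression data with features in [-1, 1]."""
    rng = random.Random(seed)
    w_true = [rng.uniform(-1, 1) for _ in range(d)]
    X = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    Y = [sum(a * b for a, b in zip(x, w_true)) + rng.gauss(0, 0.1) for x in X]
    return X, Y
```

A quick sanity check is to call svrg on data from make_data and confirm that the norm of full_grad at the returned point is close to zero, which indicates the iterate is near the minimizer of the regularized objective.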
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
