Katyusha: The First Direct Acceleration of Stochastic Gradient Methods
Zeyuan Allen-Zhu

TL;DR
Katyusha is a novel stochastic gradient method that achieves optimal accelerated convergence rates and parallel speedup by introducing a new momentum technique, overcoming limitations of Nesterov's momentum in stochastic settings.
Contribution
The paper introduces Katyusha, a primal-only stochastic gradient method with a novel negative momentum, providing optimal acceleration and parallel speedup in convex finite-sum optimization.
Findings
Achieves optimal accelerated convergence rate
Enjoys optimal parallel linear speedup
Incorporates a novel negative momentum technique
Abstract
Nesterov's momentum trick is famously known for accelerating gradient descent, and has been proven useful in building fast iterative algorithms. However, in the stochastic setting, counterexamples exist and prevent Nesterov's momentum from providing similar acceleration, even if the underlying problem is convex and finite-sum. We introduce Katyusha, a direct, primal-only stochastic gradient method to fix this issue. In convex finite-sum stochastic optimization, Katyusha has an optimal accelerated convergence rate, and enjoys an optimal parallel linear speedup in the mini-batch setting. The main ingredient is Katyusha momentum, a novel "negative momentum" on top of Nesterov's momentum. It can be incorporated into a variance-reduction based algorithm and speed it up, both in terms of sequential and parallel performance. Since variance…
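To make the idea of a "negative momentum" on top of Nesterov-style momentum inside a variance-reduced loop more concrete, here is a minimal Python sketch. It is an illustration under assumptions that this summary does not state, not the paper's reference implementation: the objective is ridge regression, the function name `katyusha_sketch` is hypothetical, and the parameter choices (`m = 2n`, `tau2 = 1/2`, and the formulas for `tau1` and `alpha`) are assumed values that should be checked against the paper itself. The term `tau2 * x_snap` is what pulls each query point back toward the last snapshot, which is the role the abstract assigns to the negative momentum.

```python
import numpy as np

def katyusha_sketch(A, b, sigma, epochs=30, seed=0):
    """Sketch: minimize (1/n) * sum_i 0.5*(a_i @ x - b_i)**2 + (sigma/2)*||x||^2."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    L = np.max(np.sum(A * A, axis=1))       # smoothness constant of each f_i
    m = 2 * n                                # inner-loop length (assumed)
    tau2 = 0.5                               # weight on the snapshot ("negative momentum")
    tau1 = min(np.sqrt(m * sigma / (3 * L)), 0.5)  # weight on the momentum sequence z
    alpha = 1.0 / (3 * tau1 * L)             # step size for z (assumed)

    x_snap = np.zeros(d)                     # snapshot point
    y = x_snap.copy()                        # gradient-step sequence
    z = x_snap.copy()                        # momentum (mirror-step) sequence
    weights = (1 + alpha * sigma) ** np.arange(m)
    for _ in range(epochs):
        full_grad = A.T @ (A @ x_snap - b) / n   # full gradient at the snapshot
        y_avg = np.zeros(d)
        for j in range(m):
            # Couple momentum, snapshot, and gradient-step iterates.
            x = tau1 * z + tau2 * x_snap + (1 - tau1 - tau2) * y
            i = rng.integers(n)
            # SVRG-style variance-reduced gradient estimate of the smooth part.
            g = full_grad + A[i] * (A[i] @ x - b[i]) - A[i] * (A[i] @ x_snap - b[i])
            # Closed-form proximal steps for the regularizer (sigma/2)*||.||^2.
            z = (z - alpha * g) / (1 + alpha * sigma)
            y = (3 * L * x - g) / (3 * L + sigma)
            y_avg += weights[j] * y
        x_snap = y_avg / weights.sum()       # weighted average becomes the new snapshot
    return x_snap
```

Setting `tau2 = 0` in this sketch removes the pull toward the snapshot and leaves a plain coupling of the `z` and `y` sequences, which is a convenient way to compare behaviour with and without the extra term on a small problem.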
