Breaking the Lower Bound with (Little) Structure: Acceleration in Non-Convex Stochastic Optimization with Heavy-Tailed Noise
Zijian Liu, Jiawei Zhang, Zhengyuan Zhou

TL;DR
This paper tightens convergence guarantees for non-convex stochastic optimization under heavy-tailed noise and shows that, with minimal problem structure, a new variance-reduced accelerated algorithm converges faster than the known lower bound for the unstructured setting allows.
Contribution
It introduces a variance-reduced accelerated algorithm for structured stochastic optimization, surpassing existing lower bounds under mild assumptions.
Findings
Achieves nearly optimal high-probability convergence without assuming that the stochastic gradient itself has a bounded moment.
Demonstrates a faster convergence rate with minimal problem structure.
Yields near-optimal rates even in finite-variance scenarios.
Abstract
We consider the stochastic optimization problem with smooth but not necessarily convex objectives in the heavy-tailed noise regime, where the stochastic gradient's noise is assumed to have bounded $p$th moment ($p \in (1,2]$). Zhang et al. (2020) is the first to prove the $\Omega(T^{\frac{1-p}{3p-2}})$ lower bound for convergence (in expectation) and provides a simple clipping algorithm that matches this optimal rate. Cutkosky and Mehta (2021) proposes another algorithm, which is shown to achieve the nearly optimal high-probability convergence guarantee $O(\log(T/\delta)\, T^{\frac{1-p}{3p-2}})$, where $\delta$ is the probability of failure. However, this desirable guarantee is only established under the additional assumption that the stochastic gradient itself is bounded in $p$th moment, which fails to hold even for quadratic objectives and centered Gaussian noise. In this work, we first…
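The abstract's remark that a bounded-gradient-moment assumption fails for quadratic objectives with centered Gaussian noise can be checked with a short calculation. The sketch below is an illustration, not taken from the paper: the objective $F(x) = \tfrac{1}{2}\|x\|^2$ on $\mathbb{R}^d$, the noise $\xi \sim \mathcal{N}(0, I_d)$, the oracle name $g$, and the range $1 < p \le 2$ for the moment order $p$ are all assumptions chosen for the example.

```latex
% Illustrative setup (assumed): F(x) = (1/2)||x||^2, stochastic gradient g(x) = x + xi,
% with xi ~ N(0, I_d) and moment order 1 < p <= 2.
\begin{align*}
  \nabla F(x) &= x, \qquad g(x) = x + \xi, \qquad \mathbb{E}[\xi] = 0. \\
  % The noise has a bounded p-th moment: t -> t^{p/2} is concave for p <= 2 (Jensen).
  \mathbb{E}\,\|g(x) - \nabla F(x)\|^p
    &= \mathbb{E}\,\|\xi\|^p
    \le \bigl(\mathbb{E}\,\|\xi\|^2\bigr)^{p/2}
    = d^{\,p/2}. \\
  % The stochastic gradient itself does not: v -> ||v||^p is convex for p >= 1 (Jensen).
  \mathbb{E}\,\|g(x)\|^p
    &\ge \bigl\|\mathbb{E}[g(x)]\bigr\|^p
    = \|x\|^p
    \xrightarrow[\;\|x\| \to \infty\;]{} \infty.
\end{align*}
```

In this example the noise moment is bounded uniformly in $x$, while no constant bounds $\mathbb{E}\,\|g(x)\|^p$ over all of $\mathbb{R}^d$. A uniform bound on the gradient's moment is therefore a strictly stronger requirement than a bound on the noise's moment.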
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Markov Chains and Monte Carlo Methods
