Near-Optimal Non-Convex Stochastic Optimization under Generalized Smoothness
Zijian Liu, Srikanth Jagabathula, Zhengyuan Zhou

TL;DR
This paper introduces a new analysis of a simple variant of the STORM algorithm for generalized smooth non-convex stochastic optimization, achieving near-optimal high-probability and expected convergence guarantees with constant batch size.
Contribution
It provides the first near-optimal high-probability sample complexity for generalized smoothness and improves expected convergence bounds, all with constant batch size requirements.
Findings
Achieves $O(\log(1/(\delta\epsilon))\,\epsilon^{-3})$ high-probability sample complexity.
Recovers the optimal $O(\epsilon^{-3})$ expected sample complexity.
Requires only a constant batch size, unlike previous methods.
Abstract
The generalized smooth condition, $(L_0, L_1)$-smoothness, has attracted interest because both empirical and theoretical evidence show it to be more realistic in many optimization problems. Two recent works established the $O(\epsilon^{-3})$ sample complexity to obtain an $\epsilon$-stationary point. However, both require a large batch size on the order of $\mathrm{poly}(\epsilon^{-1})$, which is not only computationally burdensome but also unsuitable for streaming applications. Additionally, these existing convergence bounds are established only for the expected rate, which is inadequate because it does not supply a useful performance guarantee for a single run. In this work, we solve both problems simultaneously by revisiting a simple variant of the STORM algorithm. Specifically, under $(L_0, L_1)$-smoothness and affine-type noise, we establish the first…
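To make the constant-batch-size point concrete, here is a minimal Python sketch of a STORM-style recursive-momentum loop that draws one sample per iteration. It is not the paper's algorithm: the normalized step, the fixed `lr` and `momentum` values, the function names, and the toy objective $f(x) = \sum_i x_i^4 / 4$ (whose Hessian norm grows with the gradient norm, so it is not globally smooth in the standard sense) are all assumptions chosen for illustration.

```python
import numpy as np

def storm_sketch(grad, x0, steps=2000, lr=0.01, momentum=0.1, seed=0):
    """STORM-style recursive-momentum loop using one sample per iteration.

    grad(x, xi) is a hypothetical stochastic gradient oracle; xi is the sample.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    xi = rng.standard_normal(x.shape)
    d = grad(x, xi)  # initial estimate from a single sample
    for _ in range(steps):
        # Normalized step (an assumption of this sketch): the move has length
        # at most lr no matter how large the estimate d is.
        x_next = x - lr * d / (np.linalg.norm(d) + 1e-12)
        xi = rng.standard_normal(x.shape)  # one fresh sample: batch size 1
        # Recursive-momentum correction: the SAME sample is evaluated at both
        # the new and the previous iterate.
        d = grad(x_next, xi) + (1.0 - momentum) * (d - grad(x, xi))
        x = x_next
    return x

def toy_grad(x, xi):
    # Gradient of the toy objective f(x) = sum(x**4) / 4, plus additive noise.
    return x**3 + 0.1 * xi

x_final = storm_sketch(toy_grad, x0=np.array([2.0, -1.5]))
print("final iterate:", x_final)
```

Reusing the same sample at both iterates is what lets the estimator reduce variance without a large batch, and the normalized step is one common way to keep the update bounded when the gradient can grow quickly.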
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research
