Second-Order Guarantees of Stochastic Gradient Descent in Non-Convex Optimization
Stefan Vlaski, Ali H. Sayed

TL;DR
This paper gives new second-order guarantees for stochastic gradient descent in non-convex optimization: under a relaxed, relative bound on the gradient noise variance, SGD escapes saddle points efficiently without injected noise, alternating step-sizes, or a global dispersive noise assumption.
Contribution
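The paper develops a mean-square analysis that replaces the usual requirement of almost-surely bounded or sub-Gaussian gradient noise with a relative bound on the noise variance, and shows that this weaker condition still ensures saddle-point escape for non-convex SGD, provided a gradient noise component is present.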
Findings
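A relative bound on the gradient noise variance is sufficient for efficient saddle-point escape.
Mean-square arguments offer an alternative to concentration inequalities and the high-probability results built on them.
Escape requires no injected noise, no alternating step-sizes, and no global dispersive noise assumption.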
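To make the contrast between noise conditions concrete, the block below writes out one common form of a relative variance bound next to the bounded-noise condition it relaxes. The symbols ($s_{i+1}$ for the gradient noise, $\widehat{\nabla J}$ for the stochastic gradient approximation, and the constants $\beta$, $\sigma$, $c$) are notation assumed here for illustration; the exact condition used in the paper may differ and is not quoted from it.

```latex
% Gradient noise: the gap between the stochastic and the true gradient
% (notation assumed for illustration).
s_{i+1} \triangleq \widehat{\nabla J}(w_i) - \nabla J(w_i)

% Bounded-noise condition (holds with probability one):
\| s_{i+1} \| \le c

% Relative variance bound (weaker: the noise may grow with the gradient norm):
\mathbb{E}\left[ \| s_{i+1} \|^2 \mid w_i \right]
  \le \beta^2 \, \| \nabla J(w_i) \|^2 + \sigma^2
```

Under the second condition the noise need not be bounded or sub-Gaussian; only its conditional second moment is controlled, and that control loosens where the gradient is large.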
Abstract
Recent years have seen increased interest in performance guarantees of gradient descent algorithms for non-convex optimization. A number of works have uncovered that gradient noise plays a critical role in the ability of gradient descent recursions to efficiently escape saddle-points and reach second-order stationary points. Most available works limit the gradient noise component to be bounded with probability one or sub-Gaussian and leverage concentration inequalities to arrive at high-probability results. We present an alternate approach, relying primarily on mean-square arguments and show that a more relaxed relative bound on the gradient noise variance is sufficient to ensure efficient escape from saddle-points without the need to inject additional noise, employ alternating step-sizes or rely on a global dispersive noise assumption, as long as a gradient noise component is present…
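The role of gradient noise described in the abstract can be illustrated with a toy experiment. The following is a minimal sketch, not the paper's analysis or experimental setup: the cost function, step-size, noise level, and function names are all assumptions chosen for illustration. The cost $J(w) = \tfrac{1}{2}w_1^2 - \tfrac{1}{2}w_2^2 + \tfrac{1}{4}w_2^4$ has a strict saddle at the origin and minima at $w = (0, \pm 1)$, and additive zero-mean Gaussian noise stands in for the noise of a stochastic gradient approximation.

```python
import numpy as np

def cost(w):
    # Toy non-convex cost (an assumption for illustration):
    # strict saddle at the origin, minima at (0, +1) and (0, -1).
    return 0.5 * w[0] ** 2 - 0.5 * w[1] ** 2 + 0.25 * w[1] ** 4

def grad(w):
    # Exact gradient of the toy cost above.
    return np.array([w[0], -w[1] + w[1] ** 3])

def run_sgd(mu, steps, noise_std, seed=0):
    # Constant step-size recursion w <- w - mu * (grad(w) + s), where s is
    # zero-mean Gaussian noise standing in for stochastic gradient noise.
    rng = np.random.default_rng(seed)
    w = np.zeros(2)  # initialize exactly at the saddle point
    for _ in range(steps):
        s = noise_std * rng.standard_normal(2)
        w = w - mu * (grad(w) + s)
    return w

# Without noise the gradient at the saddle is zero, so the iterate never moves.
w_gd = run_sgd(mu=0.05, steps=2000, noise_std=0.0)

# With noise the iterate is pushed along the negative-curvature direction
# and settles near one of the two minima.
w_sgd = run_sgd(mu=0.05, steps=2000, noise_std=0.1)

print("noiseless:", w_gd, "cost:", cost(w_gd))
print("noisy:    ", w_sgd, "cost:", cost(w_sgd))
```

In this sketch the noise has constant variance, so it trivially satisfies a relative variance bound; the point is only that the perturbation comes from the gradient noise already present in the recursion, with no separately injected noise or step-size schedule.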
