Escaping Saddle-Points Faster under Interpolation-like Conditions
Abhishek Roy, Krishnakumar Balasubramanian, Saeed Ghadimi, Prasant Mohapatra

TL;DR
This paper demonstrates that over-parametrized models with interpolation-like conditions enable stochastic optimization algorithms to escape saddle points more rapidly, achieving convergence rates comparable to deterministic methods.
Contribution
The paper establishes faster convergence rates for Perturbed Stochastic Gradient Descent (PSGD) and Stochastic Cubic-Regularized Newton (SCRN) under interpolation-like conditions, narrowing the gap between stochastic and deterministic oracle complexities.
Findings
PSGD reaches an $\tilde{O}(1/\epsilon^{2})$ complexity under interpolation-like assumptions.
SCRN achieves an $\tilde{O}(1/\epsilon^{2.5})$ complexity under similar conditions.
Further Hessian-based assumptions may be needed to match deterministic rates.
Abstract
In this paper, we show that under over-parametrization several standard stochastic optimization algorithms escape saddle-points and converge to local-minimizers much faster. One of the fundamental aspects of over-parametrized models is that they are capable of interpolating the training data. We show that, under interpolation-like assumptions satisfied by the stochastic gradients in an over-parametrization setting, the first-order oracle complexity of the Perturbed Stochastic Gradient Descent (PSGD) algorithm to reach an $\epsilon$-local-minimizer matches the corresponding deterministic rate of $\tilde{O}(1/\epsilon^{2})$. We next analyze the Stochastic Cubic-Regularized Newton (SCRN) algorithm under interpolation-like conditions, and show that the oracle complexity to reach an $\epsilon$-local-minimizer is $\tilde{O}(1/\epsilon^{2.5})$.…
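To make the idea of perturbation-based saddle escape concrete, here is a minimal sketch on a toy problem. It is not the paper's algorithm or experimental setup. The objective $f(x, y) = x^4/4 - x^2/2 + y^2/2$, the step size, the perturbation radius, and the multiplicative noise model are all assumptions chosen for illustration. The noise model mimics one common interpolation-like behaviour (the stochastic gradient vanishes wherever the true gradient does); the paper's exact assumptions may differ.

```python
import numpy as np

def grad(p):
    # Gradient of the toy objective f(x, y) = x^4/4 - x^2/2 + y^2/2 (an assumption).
    # The origin is a strict saddle; (+1, 0) and (-1, 0) are local minimizers.
    x, y = p
    return np.array([x**3 - x, y])

def noisy_grad(p, rng, sigma=0.5):
    # Hypothetical interpolation-like noise: multiplicative, so the stochastic
    # gradient is exactly zero wherever the true gradient is zero.
    return grad(p) * (1.0 + sigma * rng.standard_normal(2))

def run(perturb_radius, steps=500, lr=0.1, seed=0):
    # Stochastic gradient steps with an isotropic Gaussian perturbation added
    # at every iteration; perturb_radius = 0 recovers plain SGD.
    rng = np.random.default_rng(seed)
    p = np.zeros(2)  # start exactly at the saddle point
    for _ in range(steps):
        xi = perturb_radius * rng.standard_normal(2)
        p = p - lr * (noisy_grad(p, rng) + xi)
    return p

stuck = run(perturb_radius=0.0)     # no perturbation: never leaves the saddle
escaped = run(perturb_radius=1e-2)  # perturbed: drifts toward a local minimizer
print("without perturbation:", stuck)
print("with perturbation:   ", escaped)
```

Because the noise here vanishes at stationary points, the unperturbed run started at the saddle has no way to move, while the small injected perturbation is amplified along the negative-curvature direction until the iterate settles near one of the two minimizers.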
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research
