SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator
Cong Fang, Chris Junchi Li, Zhouchen Lin, Tong Zhang

TL;DR
This paper introduces SPIDER, a novel stochastic estimator that significantly reduces computational costs in non-convex optimization, enabling faster convergence for both first-order and zeroth-order methods.
Contribution
The paper presents SPIDER, a new stochastic estimator, and develops algorithms that achieve near-optimal convergence rates for non-convex stochastic optimization problems.
Findings
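Achieves a gradient computation cost of O(min(n^{1/2} ε^{-2}, ε^{-3})) for finding an ε-approximate first-order stationary point, where n is the number of component functions.
Extends to zeroth-order stochastic optimization with a cost of O(d min(n^{1/2} ε^{-2}, ε^{-3})), where d is the problem dimension, improving on existing methods.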
Provides sharp convergence bounds and nearly matches lower bounds for first-order stationary point finding.
Abstract
In this paper, we propose a new technique named Stochastic Path-Integrated Differential EstimatoR (SPIDER), which can be used to track many deterministic quantities of interest with significantly reduced computational cost. We apply SPIDER to two tasks, namely the stochastic first-order and zeroth-order methods. For stochastic first-order methods, combining SPIDER with normalized gradient descent, we propose two new algorithms, namely SPIDER-SFO and SPIDER-SFO+, that solve non-convex stochastic optimization problems using stochastic gradients only. We provide sharp error-bound results on their convergence rates. In particular, we prove that the SPIDER-SFO and SPIDER-SFO+ algorithms achieve a record-breaking gradient computation cost of O(min(n^{1/2} ε^{-2}, ε^{-3})) for finding an ε-approximate…
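The idea of tracking a gradient along the optimization path and stepping in its normalized direction can be illustrated with a minimal sketch. This is not the authors' reference implementation: the function name `spider_sfo_sketch`, the parameters `q`, `batch`, `step`, `tol`, `max_iter`, and the stopping rule are illustrative assumptions, and the toy least-squares objective is a convex stand-in chosen only so the example runs quickly.

```python
import numpy as np

def spider_sfo_sketch(grad_batch, x0, n, q, batch, step, tol, max_iter, seed=0):
    """Path-integrated gradient tracking with normalized steps (illustrative sketch).

    grad_batch(x, idx) must return the gradient averaged over components idx.
    All hyperparameters here are assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    x_prev = x.copy()
    v = np.zeros_like(x)
    for k in range(max_iter):
        if k % q == 0:
            # Periodic refresh: recompute the estimate from all n components.
            v = grad_batch(x, np.arange(n))
        else:
            # Cheap update: add the sampled gradient difference between
            # the current and previous iterates to the running estimate.
            idx = rng.integers(0, n, size=batch)
            v = v + grad_batch(x, idx) - grad_batch(x_prev, idx)
        v_norm = np.linalg.norm(v)
        if v_norm <= tol:
            # Assumed stopping rule: the tracked gradient is already small.
            break
        x_prev = x.copy()
        # Normalized step: every move has length exactly `step`.
        x = x - step * v / v_norm
    return x

# Toy finite-sum problem (assumption): f(x) = (1/2n) * sum_i (a_i^T x - b_i)^2.
rng = np.random.default_rng(1)
n, d = 100, 5
A = rng.standard_normal((n, d))
b = A @ np.full(d, 2.0)

def grad_batch(x, idx):
    residual = A[idx] @ x - b[idx]
    return A[idx].T @ residual / len(idx)

x_start = np.zeros(d)
x_hat = spider_sfo_sketch(grad_batch, x_start, n=n, q=10, batch=10,
                          step=0.05, tol=0.05, max_iter=400)
g_start = np.linalg.norm(grad_batch(x_start, np.arange(n)))
g_end = np.linalg.norm(grad_batch(x_hat, np.arange(n)))
```

Because consecutive iterates are only `step` apart, the sampled gradient differences stay small, which is what keeps the running estimate `v` close to the true gradient between full refreshes. Comparing `g_end` with `g_start` shows how far the full gradient norm has dropped on this toy problem.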
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms
