Escaping saddle points in zeroth-order optimization: the power of two-point estimators
Zhaolin Ren, Yujie Tang, and Na Li

TL;DR
This paper demonstrates that two-point zeroth-order optimization methods, with added isotropic perturbation, can efficiently escape saddle points and find second-order stationary points using fewer function evaluations than previous methods, especially in high-dimensional nonconvex problems.
Contribution
It introduces a novel two-point zeroth-order algorithm with isotropic perturbation that escapes saddle points efficiently and reduces the number of function evaluations needed.
Findings
The proposed method requires $\tilde{O}(d/(m\bar{\theta}\epsilon^{2}))$ function evaluations to find a second-order stationary point.
It can escape saddle points in high-dimensional nonconvex optimization.
The method achieves polynomial convergence to second-order stationary points.
Abstract
Two-point zeroth-order methods are important in many applications of zeroth-order optimization, such as robotics, wind farms, power systems, online optimization, and adversarial robustness to black-box attacks in deep neural networks, where the problem may be high-dimensional and/or time-varying. Most problems in these applications are nonconvex and contain saddle points. While existing works have shown that zeroth-order methods utilizing $\Omega(d)$ function evaluations per iteration (with $d$ denoting the problem dimension) can escape saddle points efficiently, it remains an open question whether zeroth-order methods based on two-point estimators can escape saddle points. In this paper, we show that by adding an appropriate isotropic perturbation at each iteration, a zeroth-order algorithm based on $2m$ (for any $1 \leq m \leq d$) function evaluations per iteration can not only find…
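The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the Gaussian search directions, the step size `eta`, the smoothing radius `u`, the perturbation scale `r`, the toy objective `saddle_toy`, and all function names are assumptions chosen for illustration. Each iteration averages `m` two-point estimates (so it costs `2*m` function evaluations) and adds an isotropic Gaussian perturbation to the step.

```python
import random


def two_point_gradient(f, x, m, u, rng):
    """Average m two-point estimates of grad f(x); costs 2*m evaluations of f."""
    d = len(x)
    g = [0.0] * d
    for _ in range(m):
        # Random Gaussian search direction (an assumption of this sketch).
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + u * zi for xi, zi in zip(x, z)]
        x_minus = [xi - u * zi for xi, zi in zip(x, z)]
        # Central finite difference along z: the "two points".
        slope = (f(x_plus) - f(x_minus)) / (2.0 * u)
        for i in range(d):
            g[i] += slope * z[i] / m
    return g


def perturbed_zo_descent(f, x0, steps, eta, m=1, u=1e-3, r=1e-2, seed=0):
    """Zeroth-order descent with isotropic Gaussian perturbation of scale r.

    Setting r = 0 disables the perturbation.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = two_point_gradient(f, x, m, u, rng)
        x = [xi - eta * (gi + r * rng.gauss(0.0, 1.0)) for xi, gi in zip(x, g)]
    return x


def saddle_toy(x):
    """Hypothetical test function: strict saddle at the origin,
    minima at x = (0, +/-sqrt(2)) with value -1."""
    return x[0] ** 2 - x[1] ** 2 + 0.25 * x[1] ** 4


if __name__ == "__main__":
    start = [0.0, 0.0]  # exactly at the saddle point
    stuck = perturbed_zo_descent(saddle_toy, start, steps=2000, eta=0.02, m=2, r=0.0)
    escaped = perturbed_zo_descent(saddle_toy, start, steps=2000, eta=0.02, m=2, r=1e-2)
    print("without perturbation:", saddle_toy(stuck))
    print("with perturbation:   ", saddle_toy(escaped))
```

Because the toy function is symmetric about the origin, the two-point difference there is exactly zero, so the unperturbed iterate never leaves the saddle. With the perturbation switched on, the iterate drifts into the negative-curvature direction and descends toward one of the minima.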
Taxonomy
Topics: Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research
