Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent
Qinbo Bai, Mridul Agarwal, Vaneet Aggarwal

TL;DR
This paper introduces a gradient estimation method for zeroth-order non-convex optimization that guarantees convergence to second-order stationary points, expanding the applicability of gradient-based techniques without explicit gradient access.
Contribution
It proposes a model-free algorithm that reaches second-order stationary points in non-convex optimization using estimated gradients, and it provides theoretical bounds on the query complexity.
Findings
Converges to an $\epsilon$-second-order stationary point.
Requires $\tilde{O}\left(\frac{d^{2+\frac{\theta}{2}}}{\epsilon^{8+\theta}}\right)$ function queries.
Applicable to general non-convex problems without gradient oracle access.
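For readers unfamiliar with the term, the following is the definition of an $\epsilon$-second-order stationary point that is standard in the saddle-point literature. It is stated here as an assumption: the summary above does not spell out the paper's exact definition or constants. The symbol $\rho$ (the Lipschitz constant of the Hessian) is introduced only for this illustration. For a twice-differentiable $f$ with a $\rho$-Lipschitz Hessian, a point $x$ qualifies if

$$
\|\nabla f(x)\| \le \epsilon
\quad \text{and} \quad
\lambda_{\min}\left(\nabla^2 f(x)\right) \ge -\sqrt{\rho\,\epsilon}.
$$

The first condition alone describes a first-order stationary point, which may be a saddle. The second condition rules out points where the Hessian has a strongly negative eigenvalue, that is, strict saddle points.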
Abstract
Gradient descent and its variants are widely used in machine learning. However, oracle access to the gradient may not be available in many applications, which limits the direct use of gradient descent. This paper proposes a method of estimating the gradient to perform gradient descent that converges to a stationary point for general non-convex optimization problems. Beyond first-order stationarity, second-order stationarity is important in machine learning applications to achieve better performance. We show that the proposed model-free non-convex optimization algorithm returns an $\epsilon$-second-order stationary point with $\tilde{O}\left(\frac{d^{2+\frac{\theta}{2}}}{\epsilon^{8+\theta}}\right)$ queries of the function for any arbitrary $\theta > 0$.
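The idea of running gradient descent on an estimated gradient can be sketched in a few lines. The example below is a minimal illustration, not the paper's algorithm. It assumes a Gaussian-smoothing finite-difference estimator, a common zeroth-order choice that the abstract does not confirm. It omits any mechanism that would back the second-order guarantee. The names `estimate_gradient`, `estimated_gradient_descent`, `toy_objective`, and all parameter values are hypothetical.

```python
import numpy as np


def estimate_gradient(f, x, num_samples=200, mu=1e-4, rng=None):
    """Estimate grad f(x) from function values only (assumed estimator).

    Averages (f(x + mu*u) - f(x)) / mu * u over Gaussian directions u.
    Uses num_samples + 1 function queries per call.
    """
    rng = np.random.default_rng() if rng is None else rng
    directions = rng.standard_normal((num_samples, x.shape[0]))
    fx = f(x)
    # Forward finite difference along each random direction.
    slopes = np.array([(f(x + mu * u) - fx) / mu for u in directions])
    return (slopes[:, None] * directions).mean(axis=0)


def estimated_gradient_descent(f, x0, step=0.05, iters=300, rng=None):
    """Plain gradient descent with the true gradient replaced by an estimate."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x - step * estimate_gradient(f, x, rng=rng)
    return x


def toy_objective(x):
    """Non-convex test function: saddle at the origin, minima at (0, +-sqrt(2))."""
    return x[0] ** 2 - x[1] ** 2 + 0.25 * x[1] ** 4


# Start close to the saddle point and query only function values.
rng = np.random.default_rng(0)
x_final = estimated_gradient_descent(toy_objective, [0.5, 0.01], rng=rng)
print(x_final, toy_objective(x_final))
```

On this toy problem the iterate moves away from the saddle at the origin because the start point already has a small component along the descent direction. The paper's contribution is a guarantee of this kind for general non-convex functions, which this sketch does not provide.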
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms
