Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent

Qinbo Bai; Mridul Agarwal; Vaneet Aggarwal

arXiv:1910.01277 · math.OC · October 7, 2019

TL;DR

This paper introduces a gradient estimation method for zeroth-order non-convex optimization that guarantees convergence to second-order stationary points, expanding the applicability of gradient-based techniques without explicit gradient access.

Contribution

It proposes a model-free algorithm that uses estimated gradients to reach a second-order stationary point of a non-convex objective, with a theoretical bound on the number of function queries.

Findings

01

Converges to an $\boldsymbol{\epsilon}$-second-order stationary point.

02

Requires $\boldsymbol{\tilde{O}\left(\frac{d^{2+\frac{\theta}{2}}}{\epsilon^{8+\theta}}\right)}$ function queries.

03

Applicable to general non-convex problems without gradient oracle access.
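This page does not spell out what an $\epsilon$-second-order stationary point is. A common definition in the saddle-point literature, assumed here rather than taken from the paper, requires a small gradient and a Hessian with no strongly negative eigenvalue. The constant $\rho$ (a Lipschitz constant of the Hessian) is an assumption of this sketch:

```latex
% Assumed standard definition; \rho is a hypothetical Hessian-Lipschitz constant.
\[
  \|\nabla f(x)\| \le \epsilon
  \qquad\text{and}\qquad
  \lambda_{\min}\!\left(\nabla^{2} f(x)\right) \ge -\sqrt{\rho\,\epsilon}.
\]
```

Under this definition, a strict saddle point (zero gradient, but a Hessian eigenvalue well below zero) fails the second condition, which is what separates it from a first-order stationary point.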

Abstract

Gradient descent and its variants are widely used in machine learning. However, oracle access to the gradient may not be available in many applications, limiting the direct use of gradient descent. This paper proposes a method of estimating the gradient to perform gradient descent that converges to a stationary point for general non-convex optimization problems. Beyond first-order stationarity, second-order stationarity is important in machine learning applications to achieve better performance. We show that the proposed model-free non-convex optimization algorithm returns an $ϵ$-second-order stationary point with $O\left(\frac{d^{2+\frac{θ}{2}}}{ϵ^{8+θ}}\right)$ queries of the function for any $θ > 0$.
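The idea of running gradient descent on an estimated gradient can be illustrated with a minimal sketch. The abstract does not give the estimator, step sizes, or perturbation rule, so everything below is an assumption made for illustration: a Gaussian-smoothing finite-difference estimator, a small random perturbation when the estimated gradient is small, and the hypothetical names `estimate_gradient`, `estimated_gradient_descent`, and `toy_objective`. It is not the paper's algorithm and carries none of its guarantees.

```python
import math
import random


def estimate_gradient(f, x, nu=1e-4, num_samples=200, rng=None):
    """Estimate the gradient of f at x using only function queries.

    Assumed estimator (Gaussian smoothing): average of
    ((f(x + nu*u) - f(x)) / nu) * u over standard normal directions u.
    """
    if rng is None:
        rng = random.Random(0)
    d = len(x)
    fx = f(x)
    grad = [0.0] * d
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_shifted = [xi + nu * ui for xi, ui in zip(x, u)]
        slope = (f(x_shifted) - fx) / nu  # directional finite difference
        for i in range(d):
            grad[i] += slope * u[i] / num_samples
    return grad


def estimated_gradient_descent(f, x0, step=0.05, iters=600, grad_tol=0.05,
                               radius=0.01, cooldown=20, seed=0):
    """Gradient descent on estimated gradients with occasional perturbation.

    All constants are illustrative choices, not values from the paper.
    """
    rng = random.Random(seed)
    x = list(x0)
    last_perturb = -cooldown
    for t in range(iters):
        g = estimate_gradient(f, x, rng=rng)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm < grad_tol and t - last_perturb >= cooldown:
            # Possibly near a saddle: jitter each coordinate uniformly
            # in [-radius, radius], then re-estimate the gradient.
            x = [xi + rng.uniform(-radius, radius) for xi in x]
            last_perturb = t
            g = estimate_gradient(f, x, rng=rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def toy_objective(x):
    """Toy function with a strict saddle at (0, 0) and minima at (0, +1) and (0, -1)."""
    return x[0] ** 2 + 0.25 * x[1] ** 4 - 0.5 * x[1] ** 2
```

Calling `estimated_gradient_descent(toy_objective, [0.5, 0.0])` starts on the line `x[1] = 0`, where exact gradient descent would slide into the saddle at the origin and stay there. In this sketch the estimator noise and the random perturbations push the iterate off that line, so it should end near one of the two minima instead.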

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms