Zeroth-order Nonconvex Stochastic Optimization: Handling Constraints, High-Dimensionality and Saddle-Points

Krishnakumar Balasubramanian; Saeed Ghadimi

arXiv:1809.06474·math.OC·January 16, 2019

TL;DR

This paper develops zeroth-order stochastic algorithms for nonconvex and convex optimization, effectively handling constraints, high-dimensionality, and saddle-points by leveraging structural sparsity and Stein's identities.

Contribution

It introduces new zeroth-order algorithms for constrained, high-dimensional, and non-convex optimization, including saddle-point avoidance, with theoretical convergence guarantees.

Findings

01

The proposed generalizations of the conditional gradient algorithm achieve rates similar to the standard stochastic gradient algorithm while using only zeroth-order information.

02

Structural sparsity assumptions are exploited to make zeroth-order optimization more efficient in high dimensions.

03

Provides a zeroth-order Hessian estimator and saddle-point avoidance method.
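
As a rough illustration of how a Hessian can be estimated from function values alone, the sketch below uses a Gaussian-smoothing estimator based on the second-order Stein identity. It is a minimal sketch under stated assumptions, not necessarily the paper's exact estimator: the function name `zo_hessian_estimate`, the smoothing parameter `nu`, the sample count, and the noiseless objective `f` are all hypothetical choices made for this example.

```python
import numpy as np

def zo_hessian_estimate(f, x, nu=1e-2, num_samples=1000, rng=None):
    """Estimate the Hessian of f at x from function evaluations only.

    Assumption for this sketch: each sample uses the form
        (f(x + nu*u) + f(x - nu*u) - 2 f(x)) / (2 nu^2) * (u u^T - I),
    with u drawn from a standard Gaussian. For a quadratic f the
    expectation of this quantity equals the true Hessian.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    fx = f(x)
    identity = np.eye(d)
    H = np.zeros((d, d))
    for _ in range(num_samples):
        u = rng.standard_normal(d)
        # Symmetric second difference along the random direction u.
        second_diff = f(x + nu * u) + f(x - nu * u) - 2.0 * fx
        H += second_diff / (2.0 * nu**2) * (np.outer(u, u) - identity)
    return H / num_samples
```

The estimate is noisy for small sample counts, so in practice one would average many directions per point; the sample sizes here are illustrative only.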

Abstract

In this paper, we propose and analyze zeroth-order stochastic approximation algorithms for nonconvex and convex optimization, with a focus on constrained optimization, the high-dimensional setting, and saddle-point avoidance. To handle constrained optimization, we first propose generalizations of the conditional gradient algorithm achieving rates similar to the standard stochastic gradient algorithm using only zeroth-order information. To facilitate zeroth-order optimization in high dimensions, we explore the advantages of structural sparsity assumptions. Specifically, (i) we highlight an implicit regularization phenomenon where the standard stochastic gradient algorithm with zeroth-order information adapts to the sparsity of the problem at hand by just varying the step-size and (ii) we propose a truncated stochastic gradient algorithm with zeroth-order information, whose rate of…
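
To make the idea of a conditional gradient method driven only by function values concrete, here is a minimal sketch. It assumes a Gaussian-smoothing gradient estimator, an l1-ball constraint set, the classic 2/(k+2) step-size schedule, and a deterministic objective `f`; the paper's algorithms handle stochastic objectives and use their own step sizes and batch sizes, so every name and parameter below (`zo_gradient`, `l1_ball_lmo`, `zo_conditional_gradient`, `nu`, `num_samples`) is a hypothetical choice for this example.

```python
import numpy as np

def zo_gradient(f, x, nu, num_samples, rng):
    """Gaussian-smoothing gradient estimate using only evaluations of f."""
    d = x.shape[0]
    fx = f(x)
    g = np.zeros(d)
    for _ in range(num_samples):
        u = rng.standard_normal(d)
        # Forward difference along the random direction u, scaled by u.
        g += (f(x + nu * u) - fx) / nu * u
    return g / num_samples

def l1_ball_lmo(g, radius):
    """Linear minimization oracle: argmin of <g, s> over ||s||_1 <= radius."""
    s = np.zeros_like(g)
    i = np.argmax(np.abs(g))
    s[i] = -radius * np.sign(g[i])
    return s

def zo_conditional_gradient(f, x0, radius, num_iters=200, nu=1e-3,
                            num_samples=50, seed=0):
    """Frank-Wolfe style iteration with a zeroth-order gradient estimate."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for k in range(num_iters):
        g = zo_gradient(f, x, nu, num_samples, rng)
        s = l1_ball_lmo(g, radius)
        gamma = 2.0 / (k + 2)  # classic schedule, assumed for this sketch
        # Convex combination keeps the iterate inside the l1 ball.
        x = (1.0 - gamma) * x + gamma * s
    return x
```

Because each update is a convex combination of feasible points, the iterates stay feasible without any projection step, which is the main appeal of the conditional gradient approach for constrained problems.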
