Lower Bounds for Finding Stationary Points I

Yair Carmon; John C. Duchi; Oliver Hinder; Aaron Sidford

arXiv:1710.11606 · math.OC · August 16, 2019 · Math. Program.

TL;DR

This paper proves tight lower bounds on the number of oracle queries any algorithm needs to find an approximate stationary point of a smooth, possibly non-convex function, showing that several common optimization methods are worst-case optimal.

Contribution

The paper provides the first lower bounds, sharp to within constants, on the oracle complexity of finding stationary points in high-dimensional non-convex optimization, confirming that existing algorithms are worst-case optimal.

Findings

01

The lower bounds match, up to constants, the complexity of gradient descent and cubic-regularized Newton's method.

02

First-order and higher-order ($p$th-order) regularization methods are shown to be worst-case optimal within their natural function classes.

03

The bounds hold for functions with Lipschitz $p$th-order derivatives and apply to any algorithm, including randomized ones.

Abstract

We prove lower bounds on the complexity of finding $\epsilon$-stationary points (points $x$ such that $\|\nabla f(x)\| \leq \epsilon$) of smooth, high-dimensional, and potentially non-convex functions $f$. We consider oracle-based complexity measures, where an algorithm is given access to the value and all derivatives of $f$ at a query point $x$. We show that for any (potentially randomized) algorithm $A$, there exists a function $f$ with Lipschitz $p$th order derivatives such that $A$ requires at least $\epsilon^{-(p+1)/p}$ queries to find an $\epsilon$-stationary point. Our lower bounds are sharp to within constants, and they show that gradient descent, cubic-regularized Newton's method, and generalized $p$th order regularization are worst-case optimal within their natural function classes.
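To make the quantities in the abstract concrete, the following Python sketch counts gradient-oracle queries until plain gradient descent reaches an $\epsilon$-stationary point, and evaluates the $\epsilon^{-(p+1)/p}$ scaling for $p = 1$ and $p = 2$. It is only an illustration under stated assumptions: the test function $f(x) = \sum_i \cos(x_i)$, the starting point, and all function names are hypothetical choices, not taken from the paper. The function is an easy instance and is not the hard instance behind the lower bound. The comparison value $2 L \Delta / \epsilon^2$ is the standard worst-case guarantee for gradient descent with step size $1/L$, where $\Delta = f(x_0) - \inf f$; it is a textbook fact rather than a statement from this page.

```python
import math


def query_lower_bound_rate(eps, p):
    """Scaling eps^{-(p+1)/p} quoted in the abstract, with constants ignored."""
    return eps ** (-(p + 1) / p)


def f(x):
    # Hypothetical smooth non-convex test function; its gradient is 1-Lipschitz.
    return sum(math.cos(xi) for xi in x)


def grad_f(x):
    return [-math.sin(xi) for xi in x]


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def gradient_descent_queries(x0, eps, lipschitz=1.0, max_iters=1_000_000):
    """Run gradient descent with step 1/L and count gradient-oracle queries."""
    x = list(x0)
    for k in range(max_iters):
        g = grad_f(x)  # one oracle query
        if norm(g) <= eps:
            return x, k + 1
        x = [xi - gi / lipschitz for xi, gi in zip(x, g)]
    raise RuntimeError("no eps-stationary point found within max_iters")


eps = 1e-3
x0 = [0.1, -0.2, 0.3]
x_final, queries = gradient_descent_queries(x0, eps)

# Delta = f(x0) - inf f; each cosine term is bounded below by -1.
delta = f(x0) + len(x0)
upper_bound = 2 * 1.0 * delta / eps ** 2

print("queries used:", queries)
print("worst-case guarantee 2*L*Delta/eps^2:", upper_bound)
print("rate for p=1:", query_lower_bound_rate(eps, 1))
print("rate for p=2:", query_lower_bound_rate(eps, 2))
```

On this benign function the query count falls far below the worst-case guarantee. The lower bound in the abstract concerns a specially constructed function on which no algorithm can improve on the $\epsilon^{-(p+1)/p}$ scaling.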
