On the Oracle Complexity of Higher-Order Smooth Non-Convex Finite-Sum Optimization

Nicolas Emmenegger; Rasmus Kyng; Ahad N. Zehmakan

arXiv:2103.05138 · math.OC · July 5, 2021 · 1 citation

TL;DR

This paper proves lower bounds for higher-order methods in smooth non-convex finite-sum optimization, covering both deterministic and randomized algorithms, and proposes a new second-order smoothness assumption to address gaps between those lower bounds and the best known upper bounds.

Contribution

It proves lower bounds for deterministic and randomized higher-order methods, and introduces a new second-order smoothness assumption that is sufficient for state-of-the-art convergence guarantees while allowing a sharper lower bound.

Findings

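01. Deterministic algorithms cannot profit from the finite-sum structure of the objective.

02. Simulating a pth-order regularized method on the whole function, by constructing exact gradient information, is optimal up to constant factors.

03. A new second-order smoothness assumption, analogous to first-order mean-squared smoothness, suffices for state-of-the-art convergence guarantees while allowing a sharper lower bound.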

Abstract

We prove lower bounds for higher-order methods in smooth non-convex finite-sum optimization. Our contribution is threefold: We first show that a deterministic algorithm cannot profit from the finite-sum structure of the objective, and that simulating a pth-order regularized method on the whole function by constructing exact gradient information is optimal up to constant factors. We further show lower bounds for randomized algorithms and compare them with the best known upper bounds. To address some gaps between the bounds, we propose a new second-order smoothness assumption that can be seen as an analogue of the first-order mean-squared smoothness assumption. We prove that it is sufficient to ensure state-of-the-art convergence guarantees, while allowing for a sharper lower bound.
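To make "simulating a method on the whole function by constructing exact gradient information" concrete, here is a minimal sketch for the simplest case p = 1, where the regularized step reduces to plain gradient descent with step size 1/L. It is not taken from the paper: the names `FiniteSumOracle` and `full_gradient_descent` are hypothetical, and the toy components f_i(x) = 0.5 (x - a_i)^2 are convex quadratics chosen only so the run is easy to check by hand. The point is the query count: every exact gradient of F(x) = (1/n) sum_i f_i(x) costs n component queries.

```python
import math


class FiniteSumOracle:
    """Hypothetical oracle that counts queries to component gradients of
    F(x) = (1/n) * sum_i f_i(x)."""

    def __init__(self, component_grads):
        self.component_grads = component_grads
        self.queries = 0

    def grad_i(self, i, x):
        # One oracle query: the gradient of a single component f_i at x.
        self.queries += 1
        return self.component_grads[i](x)

    def full_grad(self, x):
        # Exact gradient of F, built from n component queries.
        n = len(self.component_grads)
        total = [0.0] * len(x)
        for i in range(n):
            g = self.grad_i(i, x)
            total = [t + gj for t, gj in zip(total, g)]
        return [t / n for t in total]


def full_gradient_descent(oracle, x0, L, eps, max_iters=10_000):
    """First-order (p = 1) regularized step x <- x - grad F(x) / L,
    stopping once the gradient norm is at most eps."""
    x = list(x0)
    for _ in range(max_iters):
        g = oracle.full_grad(x)
        if math.sqrt(sum(gj * gj for gj in g)) <= eps:
            break
        x = [xj - gj / L for xj, gj in zip(x, g)]
    return x


# Toy instance (an assumption for illustration): f_i(x) = 0.5 * (x - a_i)^2.
offsets = [1.0, 2.0, 3.0, 6.0]
grads = [lambda x, a=a: [x[0] - a] for a in offsets]

oracle = FiniteSumOracle(grads)
x_final = full_gradient_descent(oracle, x0=[0.0], L=1.0, eps=1e-9)
print(x_final, oracle.queries)
```

Each iteration of this deterministic scheme spends n queries regardless of how the components relate to one another, which is the cost the paper's deterministic lower bound shows cannot be avoided up to constant factors.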

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Complexity and Algorithms in Graphs