OFFO minimization algorithms for second-order optimality and their complexity

S. Gratton; Ph. L. Toint

arXiv:2203.03351·math.OC·February 16, 2023·Comput. Optim. Appl.

TL;DR

This paper introduces a class of Adagrad-inspired algorithms for smooth unconstrained optimization that achieve near-optimal convergence rates for gradient norms and second-order optimality measures without evaluating the objective function.

Contribution

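The paper proposes an Adagrad-inspired class of algorithms that never evaluate the objective function, yet match the second-order convergence rate known for standard methods (such as trust-region or adaptive-regularization methods) that evaluate both the function and its derivatives.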

Findings

01

Gradient norms decrease as O(1/√k)

02

Second-order optimality measures converge as O(1/k^{1/3})

03

Related divergent stepsize method has slightly inferior convergence

Abstract

An Adagrad-inspired class of algorithms for smooth unconstrained optimization is presented in which the objective function is never evaluated and yet the gradient norms decrease at least as fast as $\mathcal{O}(1/\sqrt{k+1})$ while second-order optimality measures converge to zero at least as fast as $\mathcal{O}(1/(k+1)^{1/3})$. This latter rate of convergence is shown to be essentially sharp and is identical to that known for more standard algorithms (like trust-region or adaptive-regularization methods) using both function and derivative evaluations. A related "divergent stepsize" method is also described, whose essentially sharp rate of convergence is slightly inferior. Finally, it is discussed how to obtain weaker second-order optimality guarantees at a (much) reduced computational cost.
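To make the "objective function is never evaluated" idea concrete, here is a minimal first-order sketch of plain Adagrad-style scaling in Python. It is not the paper's algorithm and it ignores the second-order part entirely. The function name `adagrad_offo`, the parameter `eps`, and the quadratic test problem are all assumptions chosen for illustration. The only property it shares with the abstract is that the iteration calls the gradient alone and never the objective.

```python
import math

def adagrad_offo(grad, x0, steps=200, eps=1e-2):
    """Adagrad-style iteration that only calls grad(x), never f(x).

    Illustrative sketch only: `eps` (initial accumulator value) and the
    componentwise scaling are assumptions, not the paper's method.
    """
    x = list(x0)
    accum = [eps] * len(x)   # running sum of squared gradient components
    norms = []               # gradient norms, recorded for inspection only
    for _ in range(steps):
        g = grad(x)
        norms.append(math.sqrt(sum(gi * gi for gi in g)))
        for i, gi in enumerate(g):
            accum[i] += gi * gi
            # Step is scaled by the accumulated gradient history,
            # so no function value or line search is needed.
            x[i] -= gi / math.sqrt(accum[i])
    return x, norms

def quad_grad(x):
    # Gradient of the hypothetical test objective 0.5*x0^2 + 5*x1^2.
    return [1.0 * x[0], 10.0 * x[1]]

x_final, norms = adagrad_offo(quad_grad, [3.0, -2.0])
```

On this toy quadratic the recorded gradient norms shrink steadily even though no function value is ever computed. The rates quoted in the abstract are worst-case guarantees for the authors' algorithms and are not demonstrated by this sketch.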
