Local Curvature Descent: Squeezing More Curvature out of Standard and Polyak Gradient Descent

Peter Richtárik; Simone Maria Giancola; Dymitr Lubczyk; Robin Yadav

arXiv:2405.16574·math.OC·May 28, 2024

Open Access

TL;DR

This paper introduces three new local curvature descent methods that adaptively use curvature information to improve gradient descent, achieving better empirical performance without expensive second-order computations.

Contribution

The paper develops three novel local curvature descent algorithms (LCD1, LCD2, LCD3) that incorporate local curvature information into gradient descent, with theoretical guarantees and empirical improvements.

Findings

01. LCD methods outperform classical gradient descent in experiments

02. Theoretical analysis recovers known rates when curvature information is absent

03. LCD3 provides a closed-form iterative expression

Abstract

We contribute to the growing body of knowledge on more powerful and adaptive stepsizes for convex optimization, empowered by local curvature information. We do not go the route of fully-fledged second-order methods which require the expensive computation of the Hessian. Instead, our key observation is that, for some problems (e.g., when minimizing the sum of squares of absolutely convex functions), certain local curvature information is readily available, and can be used to obtain surprisingly powerful matrix-valued stepsizes, and meaningful theory. In particular, we develop three new methods – LCD1, LCD2 and LCD3 – where the abbreviation stands for local curvature descent. While LCD1 generalizes gradient descent with fixed stepsize, LCD2 generalizes gradient descent with Polyak stepsize. Our methods enhance these classical gradient descent baselines with…
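For orientation, the two classical baselines named in the abstract can be sketched as follows. This is a minimal illustration of standard gradient descent with a fixed stepsize and with the Polyak stepsize only; it is not the paper's LCD1, LCD2 or LCD3, whose update rules are not given in the text above. The function names, the quadratic test problem, and the choice of stepsize 1/L are assumptions made for the example.

```python
# Baselines only: gradient descent with a fixed stepsize and with the Polyak
# stepsize. This does NOT implement the paper's LCD1/LCD2/LCD3 methods.
# All names and the test problem below are hypothetical, for illustration.

def grad_descent_fixed(grad, x0, stepsize, iters):
    """x_{k+1} = x_k - stepsize * grad(x_k), with a constant stepsize."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - stepsize * gi for xi, gi in zip(x, g)]
    return x


def grad_descent_polyak(f, grad, x0, f_star, iters):
    """x_{k+1} = x_k - gamma_k * grad(x_k), where
    gamma_k = (f(x_k) - f_star) / ||grad(x_k)||^2 and f_star is the
    optimal value, assumed known in advance."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq == 0.0:
            break  # zero gradient: already at a stationary point
        gamma = (f(x) - f_star) / g_norm_sq
        x = [xi - gamma * gi for xi, gi in zip(x, g)]
    return x


# Assumed test problem: f(x) = 0.5 * (x1^2 + 10 * x2^2), minimized at the
# origin with optimal value 0 and smoothness constant L = 10.
def f(x):
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def grad(x):
    return [x[0], 10.0 * x[1]]


x_start = [1.0, 1.0]
x_fixed = grad_descent_fixed(grad, x_start, stepsize=1.0 / 10.0, iters=200)
x_polyak = grad_descent_polyak(f, grad, x_start, f_star=0.0, iters=200)
```

Both baselines use a scalar stepsize. According to the abstract, the LCD methods replace it with matrix-valued stepsizes built from local curvature information, so the sketch above shows only the starting points that LCD1 and LCD2 generalize.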

Taxonomy

Topics: Advanced Measurement and Metrology Techniques