A note on $R$-linear convergence of nonmonotone gradient methods

Xinrui Li; Yakui Huang

arXiv:2207.05912·math.OC·February 7, 2023

Open Access

TL;DR

This paper introduces a property that improves the theoretical convergence rate analysis of nonmonotone gradient methods, aligning it more closely with their practical performance, especially for quadratic optimization.

Contribution

It establishes a new convergence property that guarantees $R$-linear convergence for a broad class of gradient methods, improving existing theoretical rates.

Findings

01

Gradient methods with the new property converge $R$-linearly at a rate of $1-\frac{\text{smallest eigenvalue}}{\text{upper bound of inverse stepsize}}$.

02

The existing convergence rates of many nonmonotone methods can be improved to $1-1/\kappa$, where $\kappa$ is the condition number.

03

The results bridge the gap between theoretical convergence rates and practical performance of nonmonotone gradient methods.
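
The link between the first two findings can be written out in one line. This is only a sketch under stated assumptions: $\lambda_1$ and $\lambda_n$ denote the smallest and largest eigenvalues of the Hessian ($\lambda_n$ is notation introduced here, not taken from the page), $M_1$ is the upper bound of the inverse stepsize, and the inverse stepsizes are assumed to be bounded by $\lambda_n$, as happens when they are Rayleigh quotients of the Hessian.

```latex
% Assumption: the inverse stepsizes never exceed \lambda_n, so M_1 = \lambda_n is admissible.
\[
  \kappa = \frac{\lambda_n}{\lambda_1},
  \qquad
  1 - \frac{\lambda_1}{M_1}
  \;=\; 1 - \frac{\lambda_1}{\lambda_n}
  \;=\; 1 - \frac{1}{\kappa}
  \quad \text{when } M_1 = \lambda_n .
\]
```

A smaller admissible $M_1$ would give a smaller, hence better, rate in this formula.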

Abstract

Nonmonotone gradient methods generally perform better than their monotone counterparts, especially on unconstrained quadratic optimization. However, the known convergence rate of a monotone method is often much better than that of its nonmonotone variant. With the aim of shrinking the gap between theory and practice of nonmonotone gradient methods, we introduce a property for convergence analysis of a large collection of gradient methods. We prove that any gradient method using stepsizes satisfying the property will converge $R$-linearly at a rate of $1 - \lambda_{1} / M_{1}$, where $\lambda_{1}$ is the smallest eigenvalue of the Hessian matrix and $M_{1}$ is the upper bound of the inverse stepsize. Our results indicate that the existing convergence rates of many nonmonotone methods can be improved to $1 - 1/\kappa$, with $\kappa$ being the associated condition number.
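
To make the quantities in the abstract concrete, the following is a minimal numerical sketch, not the authors' code. It runs the Barzilai-Borwein (BB1) stepsize, chosen here only as a familiar nonmonotone gradient method, on a small strictly convex quadratic; the page itself does not say which methods satisfy the paper's property. The function name `bb1_quadratic`, the test problem, and the choice $M_1 = \lambda_n$ (the largest eigenvalue) are all assumptions made for illustration.

```python
import numpy as np

def bb1_quadratic(A, b, x0, tol=1e-8, max_iter=10_000):
    """BB1 gradient iteration for f(x) = 0.5 x^T A x - b^T x with A symmetric positive definite.

    Returns the final iterate, the gradient norms, and the inverse stepsizes used.
    (Hypothetical helper written for this illustration.)
    """
    x = np.asarray(x0, dtype=float).copy()
    g = A @ x - b
    # First step uses exact line search: inverse stepsize = Rayleigh quotient at g0.
    inv_step = (g @ (A @ g)) / (g @ g)
    norms, inv_steps = [np.linalg.norm(g)], []
    for _ in range(max_iter):
        if norms[-1] <= tol:
            break
        Ag = A @ g
        # BB1 inverse stepsize for the NEXT iteration: s^T y / s^T s with
        # s = -g / inv_step and y = A s, i.e. the Rayleigh quotient of A at g.
        next_inv_step = (g @ Ag) / (g @ g)
        x = x - g / inv_step
        g = g - Ag / inv_step  # gradient recurrence, valid for quadratics
        inv_steps.append(inv_step)
        norms.append(np.linalg.norm(g))
        inv_step = next_inv_step
    return x, np.array(norms), np.array(inv_steps)

# Illustrative problem: diagonal Hessian with eigenvalues 1, 12, ..., 100.
rng = np.random.default_rng(0)
lam = np.linspace(1.0, 100.0, 10)
A = np.diag(lam)
b = rng.standard_normal(10)

x, norms, inv_steps = bb1_quadratic(A, b, np.zeros(10))

# Assumption: take M1 = lambda_n, since Rayleigh quotients never exceed it.
M1 = lam[-1]
rate = 1.0 - lam[0] / M1  # equals 1 - 1/kappa for this choice of M1
observed = (norms[-1] / norms[0]) ** (1.0 / (len(norms) - 1))

print(f"iterations:                 {len(inv_steps)}")
print(f"rate 1 - lambda_1/M_1:      {rate:.4f}")
print(f"average contraction factor: {observed:.4f}")
print(f"inverse stepsize range:     [{inv_steps.min():.3f}, {inv_steps.max():.3f}]")
```

The printed average contraction factor is an empirical quantity for this one run; an $R$-linear bound involves a constant factor, so it should be read as a rough comparison with $1 - \lambda_1/M_1$ rather than a verification of the theorem.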


Taxonomy

Topics: Advanced Optimization Algorithms Research · Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques