Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity

Eduard Gorbunov; Nazarii Tupitsa; Sayantan Choudhury; Alen Aliev; Peter Richtárik; Samuel Horváth; Martin Takáč

arXiv:2409.14989·math.OC·December 30, 2024

TL;DR

This paper advances optimization methods for convex functions with generalized smoothness, introducing improved convergence guarantees for existing algorithms and proposing a new accelerated method that does not depend on standard smoothness assumptions.

Contribution

It provides new convergence rates for Gradient Descent with clipping and Polyak stepsizes under $(L_0,L_1)$-smoothness, and introduces a novel accelerated method for this class.
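To make the two baseline methods named above concrete, here is a minimal sketch of Gradient Descent with clipping and Gradient Descent with Polyak stepsizes in their textbook forms. It is not the paper's exact algorithm or stepsize rule: the test function `f(x) = sum_i x_i^4` (convex, with a gradient that is not globally Lipschitz), the helper names `clipped_gd` and `polyak_gd`, and the values of `eta`, `c`, and `steps` are all assumptions chosen for illustration.

```python
import math


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))


def f(x):
    # Illustrative convex test function (an assumption, not from the paper):
    # f(x) = sum_i x_i^4, whose gradient is not globally Lipschitz.
    return sum(xi ** 4 for xi in x)


def grad_f(x):
    return [4.0 * xi ** 3 for xi in x]


def clipped_gd(x0, eta, c, steps):
    """Textbook clipped GD: x <- x - eta * min(1, c / ||g||) * g."""
    x = list(x0)
    for _ in range(steps):
        g = grad_f(x)
        g_norm = norm(g)
        if g_norm == 0.0:  # already at a stationary point
            break
        scale = eta * min(1.0, c / g_norm)
        x = [xi - scale * gi for xi, gi in zip(x, g)]
    return x


def polyak_gd(x0, f_star, steps):
    """Textbook Polyak stepsize: gamma = (f(x) - f_star) / ||g||^2."""
    x = list(x0)
    for _ in range(steps):
        g = grad_f(x)
        g_sq = sum(gi * gi for gi in g)
        if g_sq == 0.0:  # already at a stationary point
            break
        gamma = (f(x) - f_star) / g_sq
        x = [xi - gamma * gi for xi, gi in zip(x, g)]
    return x


x0 = [3.0, -2.0]
x_clip = clipped_gd(x0, eta=0.05, c=1.0, steps=2000)
x_polyak = polyak_gd(x0, f_star=0.0, steps=100)
print(f(x0), f(x_clip), f(x_polyak))
```

Note that the Polyak variant needs the optimal value `f_star` (here known to be 0), while the clipped variant needs a stepsize `eta` and a clipping level `c`; both caps keep the update bounded when the gradient is large, which is the regime where a fixed stepsize tuned for standard smoothness can fail.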

Findings

1. Improved convergence rates for Gradient Descent with clipping.

2. Enhanced rates for Gradient Descent with Polyak stepsizes.

3. A new accelerated method with better convergence guarantees.

Abstract

Due to the non-smoothness of optimization problems in Machine Learning, generalized smoothness assumptions have been gaining a lot of attention in recent years. One of the most popular assumptions of this type is $(L_0,L_1)$-smoothness (Zhang et al., 2020). In this paper, we focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods. In particular, we derive improved convergence rates for Gradient Descent with (Smoothed) Gradient Clipping and for Gradient Descent with Polyak Stepsizes. In contrast to the existing results, our rates do not rely on the standard smoothness assumption and do not suffer from an exponential dependence on the initial distance to the solution. We also extend these results to the stochastic case under the over-parameterization assumption, propose a new accelerated method for convex…
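For reference, the $(L_0,L_1)$-smoothness condition as originally stated by Zhang et al. (2020) for a twice-differentiable function $f$ bounds the Hessian norm by an affine function of the gradient norm. The display below restates that standard definition; the paper itself may work with a related formulation (for example, one that does not require second derivatives), so treat this as background rather than the authors' exact assumption.

```latex
\[
  \|\nabla^2 f(x)\| \;\le\; L_0 + L_1 \,\|\nabla f(x)\| \quad \text{for all } x .
\]
% Setting L_1 = 0 recovers the bound \|\nabla^2 f(x)\| \le L_0,
% i.e. standard L_0-smoothness for twice-differentiable f.
```

The condition allows the local curvature to grow with the gradient norm, which is why functions such as polynomials of degree higher than two can satisfy it without being smooth in the standard sense.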

Videos

Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity · SlidesLive

Taxonomy

Topics: Advanced Optimization Algorithms Research · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research

Methods: Softmax · Attention Is All You Need · Focus · Gradient Clipping