Almost sure convergence rates of stochastic gradient methods under gradient domination

Simon Weissmann; Sara Klein; Waïss Azizian; Leif Döring

arXiv:2405.13592 · cs.LG · March 18, 2025

TL;DR

This paper establishes almost sure convergence rates for stochastic gradient methods under gradient domination conditions, extending theoretical understanding beyond classical assumptions like strong convexity.

Contribution

It proves almost sure convergence rates for stochastic gradient descent under gradient domination, applicable to supervised and reinforcement learning.

Findings

01

Almost sure convergence $f(X_n) - f^* \to 0$ of the last iterate, at rate $o(n^{-\frac{1}{4\beta - 1} + \epsilon})$

02

Rates are close to recent expectation-based rates

03

Application to training in supervised and reinforcement learning

Abstract

Stochastic gradient methods are among the most important algorithms in training machine learning problems. While classical assumptions such as strong convexity allow a simple analysis, they are rarely satisfied in applications. In recent years, global and local gradient domination properties have been shown to be a more realistic replacement of strong convexity. They were proved to hold in diverse settings such as (simple) policy gradient methods in reinforcement learning and training of deep neural networks with analytic activation functions. We prove almost sure convergence rates $f(X_{n}) - f^{*} \in o(n^{-\frac{1}{4\beta - 1} + \epsilon})$ of the last iterate for stochastic gradient descent (with and without momentum) under global and local $\beta$-gradient domination assumptions. The almost sure rates get arbitrarily close to recent rates in expectation. Finally, we demonstrate how to…
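
To make the exponent in the stated rate concrete, here is a minimal Python sketch. It is not taken from the paper. It assumes the common form of $\beta$-gradient domination, $\|\nabla f(x)\| \ge c\,(f(x) - f^*)^{\beta}$ with $\beta \in [1/2, 1]$, which the abstract does not spell out. The function names, the toy objective $f(x) = x^2/2$ (which satisfies this inequality with $\beta = 1/2$), the Gaussian gradient noise, and the step size $1/n$ are all illustrative choices, not the paper's setup.

```python
import random


def rate_exponent(beta: float) -> float:
    """Exponent 1/(4*beta - 1) in the abstract's rate o(n^{-1/(4*beta-1) + eps}).

    Assumption: beta lies in [1/2, 1], the range used in this sketch.
    """
    if not 0.5 <= beta <= 1.0:
        raise ValueError("this sketch assumes beta in [1/2, 1]")
    return 1.0 / (4.0 * beta - 1.0)


def sgd_last_iterate_gap(n_steps: int = 10_000, x0: float = 5.0,
                         noise_std: float = 1.0, seed: int = 0) -> float:
    """Run plain SGD on the toy objective f(x) = x^2 / 2 and return f(x_n) - f^*.

    The gradient estimate is the true gradient plus zero-mean Gaussian noise.
    The step size 1/n is an illustrative choice, not the paper's schedule.
    """
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        grad = x + rng.gauss(0.0, noise_std)  # unbiased estimate of f'(x) = x
        x -= grad / n                         # decreasing step size 1/n
    return 0.5 * x * x                        # f^* = 0 for this objective


if __name__ == "__main__":
    for beta in (0.5, 0.75, 1.0):
        print(f"beta = {beta}: exponent = {rate_exponent(beta):.4f}")
    print("last-iterate gap on the toy problem:", sgd_last_iterate_gap())
```

Under this reading, $\beta = 1/2$ (the Polyak-Łojasiewicz case) gives an exponent of $1$, so the rate is $o(n^{-1 + \epsilon})$, while $\beta = 1$ gives the slower $o(n^{-1/3 + \epsilon})$. The simulation shows only a single sample path on one toy problem and is not evidence for the theorem.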

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Mathematical Biology Tumor Growth · Topological and Geometric Data Analysis