Convergence Rate for the Last Iterate of Stochastic Gradient Descent Schemes
Marcel Hudiani

TL;DR
This paper analyzes the convergence rate of the last iterate of stochastic gradient descent (SGD) and the stochastic heavy ball (SHB) method for convex and non-convex objectives with Hölder continuous gradients, using the discrete Gronwall inequality.
Contribution
It provides new convergence rate results for SGD and SHB without relying on the Robbins-Siegmund theorem, including probabilistic bounds for SHB with constant momentum on convex functions.
Findings
SGD and SHB achieve specific convergence rates for non-convex objectives.
SHB with constant momentum attains a logarithmic convergence rate in probability for convex functions.
The paper recovers known results and extends them to broader settings with Hölder continuous gradients.
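To make the two schemes concrete, here is a minimal sketch of the SGD and SHB updates on a toy problem. Everything specific in it is an assumption for illustration and not taken from the paper: the objective F(w) = 0.5 * w**2, the Gaussian gradient noise, the step-size schedule alpha_t = t**(-p) with p = 0.75, and the constant momentum beta = 0.5.

```python
import random


def noisy_grad(w, rng, noise=0.1):
    # Gradient of the toy objective F(w) = 0.5 * w**2, plus zero-mean
    # Gaussian noise (an assumed noise model, for illustration only).
    return w + rng.gauss(0.0, noise)


def sgd(w0, steps, p=0.75, seed=0):
    # Plain SGD: w_{t+1} = w_t - alpha_t * g_t, with the assumed
    # decaying step size alpha_t = t**(-p).
    rng = random.Random(seed)
    w = w0
    for t in range(1, steps + 1):
        alpha = t ** (-p)
        w = w - alpha * noisy_grad(w, rng)
    return w  # the last iterate


def shb(w0, steps, p=0.75, beta=0.5, seed=0):
    # Stochastic heavy ball with constant momentum beta:
    # w_{t+1} = w_t - alpha_t * g_t + beta * (w_t - w_{t-1}).
    rng = random.Random(seed)
    w_prev, w = w0, w0
    for t in range(1, steps + 1):
        alpha = t ** (-p)
        w_next = w - alpha * noisy_grad(w, rng) + beta * (w - w_prev)
        w_prev, w = w, w_next
    return w  # the last iterate


if __name__ == "__main__":
    print("SGD last iterate:", sgd(5.0, 2000))
    print("SHB last iterate:", shb(5.0, 2000))
```

Both functions return the final iterate and not an average over iterates, which is the quantity the results above concern. On this toy problem both end up near the minimizer w = 0.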
Abstract
We study the convergence rate for the last iterate of stochastic gradient descent (SGD) and stochastic heavy ball (SHB) in the parametric setting when the objective function is globally convex or non-convex whose gradient is -Hölder. Using only the discrete Gronwall inequality without the Robbins-Siegmund theorem, we recover results for both SGD and SHB: for non-convex objectives and for , , and for convex objectives whose minimum is . In addition, we proved that SHB with constant momentum parameter attains a convergence rate of with probability at…
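For reference, the two analytic ingredients named in the abstract can be written out as follows. The notation is an assumption, since the symbols did not survive in the text above: $F$ for the objective, $\gamma$ for the Hölder exponent, and $L$ for the Hölder constant. The Gronwall statement is one standard textbook form and may differ from the variant used in the paper.

```latex
% Hölder-continuous gradient (assumed notation: exponent \gamma, constant L).
% The case \gamma = 1 is the usual Lipschitz-gradient assumption.
\[
  \|\nabla F(x) - \nabla F(y)\| \le L \, \|x - y\|^{\gamma}
  \quad \text{for all } x, y, \qquad \gamma \in (0, 1], \; L > 0 .
\]

% One standard form of the discrete Gronwall inequality.
\[
  u_n \le a + \sum_{k=0}^{n-1} b_k u_k \ \ (n \ge 0), \quad a \ge 0,\ b_k \ge 0
  \;\Longrightarrow\;
  u_n \le a \prod_{k=0}^{n-1} (1 + b_k) \le a \exp\Bigl(\sum_{k=0}^{n-1} b_k\Bigr).
\]
```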
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Stochastic processes and financial applications
