Convergence Analysis of SGD under Expected Smoothness

Yuta Kawamoto; Hideaki Iiduka

arXiv:2510.20608 · cs.LG · October 28, 2025

Open Access

TL;DR

This paper gives a self-contained convergence analysis of stochastic gradient descent (SGD) under the expected smoothness condition, deriving bounds on the expected squared norm of the full gradient and $O(1/K)$ rates with explicit residual errors, in a treatment that unifies recent theoretical results.

Contribution

The paper presents a self-contained analysis of SGD under expected smoothness: it refines the condition, derives gradient-norm bounds, and establishes explicit convergence rates with detailed proofs.

Findings

01

SGD achieves $O(1/K)$ convergence rates under expected smoothness.

02

The analysis includes explicit residual errors for various step-size schedules.

03

The paper unifies and extends recent theoretical results on SGD convergence.
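To make the step-size findings concrete, here is a minimal sketch that runs SGD on a small synthetic least-squares problem under two schedules and records the average squared norm of the full gradient over the iterates. Everything in it is an illustrative assumption and not taken from the paper: the problem, the helper names (`make_problem`, `run_sgd`), the constants, and the choice of the averaged squared gradient norm as the reported quantity.

```python
import random

def make_problem(n=100, seed=0):
    # Hypothetical synthetic least-squares data: y = <x, w_true> + small noise.
    rng = random.Random(seed)
    w_true = [1.0, -2.0, 0.5]
    X = [[rng.gauss(0, 1) for _ in w_true] for _ in range(n)]
    y = [sum(a * b for a, b in zip(x, w_true)) + 0.1 * rng.gauss(0, 1) for x in X]
    return X, y

def sample_grad(w, x, yi):
    # Gradient of the per-sample loss 0.5 * (<x, w> - y)^2.
    r = sum(a * b for a, b in zip(x, w)) - yi
    return [r * a for a in x]

def full_grad_sq_norm(w, X, y):
    # Squared Euclidean norm of the full (averaged) gradient.
    g = [0.0] * len(w)
    for x, yi in zip(X, y):
        for j, v in enumerate(sample_grad(w, x, yi)):
            g[j] += v / len(X)
    return sum(v * v for v in g)

def run_sgd(X, y, step, K=500, seed=1):
    # Single-sample SGD; returns the average of ||grad f(w_k)||^2 over k < K.
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    total = 0.0
    for k in range(K):
        total += full_grad_sq_norm(w, X, y)
        i = rng.randrange(len(X))
        g = sample_grad(w, X[i], y[i])
        eta = step(k)
        w = [wj - eta * gj for wj, gj in zip(w, g)]
    return total / K

X, y = make_problem()
avg_const = run_sgd(X, y, lambda k: 0.01)                    # constant step size
avg_decay = run_sgd(X, y, lambda k: 0.05 / (k + 1) ** 0.5)   # diminishing step size
print(f"constant: {avg_const:.4f}  diminishing: {avg_decay:.4f}")
```

Swapping the `step` callable is enough to try other schedules; the numbers printed depend on the seeds and constants chosen here and say nothing about the paper's bounds.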

Abstract

Stochastic gradient descent (SGD) is the workhorse of large-scale learning, yet classical analyses rely on assumptions that can be either too strong (bounded variance) or too coarse (uniform noise). The expected smoothness (ES) condition has emerged as a flexible alternative that ties the second moment of stochastic gradients to the objective value and the full gradient. This paper presents a self-contained convergence analysis of SGD under ES. We (i) refine ES with interpretations and sampling-dependent constants; (ii) derive bounds on the expected squared norm of the full gradient; and (iii) prove $O(1/K)$ rates with explicit residual errors for various step-size schedules. All proofs are given in full detail in the appendix. Our treatment unifies and extends recent threads (Khaled and Richtárik, 2020; Umeda and Iiduka, 2025).
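For reference, the ES condition as formulated by Khaled and Richtárik (2020) can be written as follows. The notation is an assumption made here: $g(x)$ denotes the stochastic gradient at $x$, $f^{\inf}$ a lower bound on $f$, and $A, B, C \ge 0$ are constants. The refined version with sampling-dependent constants that the abstract mentions is not reproduced on this page and may differ in form.

$$
\mathbb{E}\left[\|g(x)\|^2\right] \le 2A\left(f(x) - f^{\inf}\right) + B\,\|\nabla f(x)\|^2 + C \quad \text{for all } x.
$$

For an unbiased $g(x)$ with variance at most $\sigma^2$, the identity $\mathbb{E}\|g(x)\|^2 = \|\nabla f(x)\|^2 + \mathbb{E}\|g(x) - \nabla f(x)\|^2$ shows that the bounded-variance assumption is the special case $A = 0$, $B = 1$, $C = \sigma^2$.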

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Gaussian Processes and Bayesian Inference