Stopping Rules for Stochastic Gradient Descent via Anytime-Valid Confidence Sequences

Liviu Aolaritei; Michael I. Jordan

arXiv:2512.13123·math.OC·February 24, 2026

Open Access

TL;DR

This paper introduces a new framework of anytime-valid confidence sequences for stochastic gradient descent, enabling statistically valid, trajectory-dependent stopping rules that work in both convex and nonconvex optimization without prior horizon knowledge.

Contribution

The paper develops the first time-uniform, statistically valid stopping rules for SGD that apply to both convex and nonconvex problems and depend only on the observed trajectory.

Findings

01. Provides anytime-valid certificates for suboptimality in convex SGD.

02. Offers time-uniform stationarity certificates in nonconvex optimization.

03. Characterizes stopping-time complexity under standard stepsize schedules.

Abstract

The problem of stopping stochastic gradient descent (SGD) in an online manner, based solely on the observed trajectory, is a challenging theoretical problem with significant consequences for applications. While SGD is routinely monitored as it runs, the classical theory of SGD provides guarantees only at pre-specified iteration horizons and offers no valid way to decide, based on the observed trajectory, when further computation is justified. We address this longstanding gap by developing anytime-valid confidence sequences for stochastic gradient methods, which remain valid under continuous monitoring and directly induce statistically valid, trajectory-dependent stopping rules: stop as soon as the current upper confidence bound on an appropriate performance measure falls below a user-specified tolerance. The confidence sequences are constructed using nonnegative supermartingales, are…
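The stopping pattern described in the abstract (stop once a time-uniform upper confidence bound drops below a tolerance) can be illustrated with a toy sketch. This is not the paper's construction: the truncated abstract does not give the supermartingale-based bounds, so the sketch swaps in a much cruder technique, a Hoeffding bound combined with a union bound over time, and it assumes an i.i.d. stream of observations in [0, 1] rather than an SGD trajectory. The function names `hoeffding_union_radius` and `stop_when_certified` are hypothetical.

```python
import math
from typing import Iterable, Optional, Tuple


def hoeffding_union_radius(t: int, alpha: float) -> float:
    """Radius r_t of a crude time-uniform upper confidence sequence.

    ASSUMPTION (not from the paper): observations are i.i.d. in [0, 1].
    Hoeffding is applied at level alpha_t = 6 * alpha / (pi^2 * t^2); these
    levels sum to alpha over t = 1, 2, ..., so the bound
    mean <= running_mean_t + r_t holds for all t at once w.p. >= 1 - alpha.
    """
    alpha_t = 6.0 * alpha / (math.pi ** 2 * t ** 2)
    return math.sqrt(math.log(1.0 / alpha_t) / (2.0 * t))


def stop_when_certified(
    observations: Iterable[float],
    tolerance: float,
    alpha: float = 0.05,
    max_steps: Optional[int] = None,
) -> Optional[Tuple[int, float]]:
    """Stop at the first t whose upper confidence bound is <= tolerance.

    Returns (stopping_time, upper_bound), or None if no certificate was
    obtained before the stream ended or max_steps was reached.
    """
    total = 0.0
    for t, x in enumerate(observations, start=1):
        total += x
        upper_bound = total / t + hoeffding_union_radius(t, alpha)
        if upper_bound <= tolerance:
            return t, upper_bound
        if max_steps is not None and t >= max_steps:
            break
    return None


if __name__ == "__main__":
    # Hypothetical stream: a bounded error proxy that is identically zero.
    print(stop_when_certified([0.0] * 2000, tolerance=0.1))
```

Because the bound holds simultaneously for every t, checking it after each observation does not inflate the error probability, which is the property that makes continuous monitoring legitimate. The union-bound radius is generally looser than martingale-based confidence sequences, so it should be read only as an illustration of the stopping logic.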

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Distributed Control Multi-Agent Systems · Advanced Bandit Algorithms Research