Bias-Variance Trade-off for Clipped Stochastic First-Order Methods: From Bounded Variance to Infinite Mean

Chuan He

arXiv:2512.14686·cs.LG·December 17, 2025

Open Access

TL;DR

This paper extends the analysis of stochastic first-order methods to heavy-tailed noise with infinite mean, showing clipping improves complexity guarantees across all tail indices from bounded variance to infinite mean.

Contribution

It provides a unified analysis of the bias-variance trade-off in clipped stochastic first-order methods for all tail indices, including the scarcely studied infinite-mean case.

Findings

01 Clipped SFOMs achieve better complexity bounds under heavy-tailed noise.

02 The analysis applies to all tail indices in (0,2], including infinite mean.

03 Numerical experiments confirm the theoretical improvements.

Abstract

Stochastic optimization is fundamental to modern machine learning. Recent research has extended the study of stochastic first-order methods (SFOMs) from light-tailed to heavy-tailed noise, which frequently arises in practice, with clipping emerging as a key technique for controlling heavy-tailed gradients. Extensive theoretical advances have further shown that the oracle complexity of SFOMs depends on the tail index $\alpha$ of the noise. Nonetheless, existing complexity results often cover only the case $\alpha \in (1, 2]$, that is, the regime where the noise has a finite mean, while the complexity bounds tend to infinity as $\alpha$ approaches $1$. This paper tackles the general case of noise with tail index $\alpha \in (0, 2]$, covering regimes ranging from noise with bounded variance to noise with an infinite mean, where the latter case has been scarcely studied. Through a novel…
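To make the clipping idea concrete, the following is a minimal sketch of norm-clipped stochastic gradient descent under heavy-tailed noise. It is not the paper's algorithm or experimental setup: the names `clip`, `clipped_sgd`, and `noisy_quadratic_grad`, the quadratic objective, the symmetric Pareto-type noise model, and the values of the step size and clipping threshold are all illustrative assumptions. The noise is drawn with tail index $\alpha = 0.8$, so its mean is infinite, which is the regime the abstract highlights.

```python
import math
import random


def clip(g, tau):
    """Rescale g so that its Euclidean norm is at most tau (norm clipping)."""
    norm = math.sqrt(sum(x * x for x in g))
    if norm <= tau:
        return list(g)
    return [x * (tau / norm) for x in g]


def clipped_sgd(grad_oracle, x0, step, tau, iters):
    """Plain SGD in which every stochastic gradient is clipped before use."""
    x = list(x0)
    for _ in range(iters):
        g = clip(grad_oracle(x), tau)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def noisy_quadratic_grad(x, rng, alpha=0.8):
    """Gradient of 0.5 * ||x||^2 plus symmetric heavy-tailed noise.

    Assumption for illustration only: each coordinate's noise magnitude M
    satisfies P(M > t) = (1 + t) ** (-alpha), so for alpha <= 1 its mean
    is infinite. The sign is chosen uniformly at random.
    """
    noisy = []
    for xi in x:
        u = 1.0 - rng.random()                 # uniform on (0, 1]
        magnitude = u ** (-1.0 / alpha) - 1.0  # inverse-CDF sampling
        sign = 1.0 if rng.random() < 0.5 else -1.0
        noisy.append(xi + sign * magnitude)
    return noisy


# Hypothetical usage: step size and threshold are arbitrary choices.
rng = random.Random(0)
x_final = clipped_sgd(
    lambda x: noisy_quadratic_grad(x, rng),
    x0=[5.0, -5.0],
    step=0.05,
    tau=1.0,
    iters=2000,
)
```

Clipping caps each update at `step * tau` in norm, so a single extreme noise sample cannot throw the iterate arbitrarily far. The price is a biased gradient estimate; a smaller `tau` reduces variance but increases bias, which is the trade-off named in the title.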

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Risk and Portfolio Optimization · Simulation Techniques and Applications