Bias-Variance Trade-off for Clipped Stochastic First-Order Methods: From Bounded Variance to Infinite Mean
Chuan He

TL;DR
This paper extends the analysis of stochastic first-order methods to heavy-tailed noise with infinite mean, showing clipping improves complexity guarantees across all tail indices from bounded variance to infinite mean.
Contribution
It provides a unified analysis of the bias-variance trade-off in clipped stochastic first-order methods for all tail indices, including the scarcely studied infinite-mean case.
Findings
Clipped stochastic first-order methods (SFOMs) achieve improved complexity guarantees under heavy-tailed noise.
The analysis applies to all tail indices in (0,2], including infinite mean.
Numerical experiments confirm the theoretical improvements.
Abstract
Stochastic optimization is fundamental to modern machine learning. Recent research has extended the study of stochastic first-order methods (SFOMs) from light-tailed to heavy-tailed noise, which frequently arises in practice, with clipping emerging as a key technique for controlling heavy-tailed gradients. Extensive theoretical advances have further shown that the oracle complexity of SFOMs depends on the tail index α of the noise. Nonetheless, existing complexity results often cover only the case α ∈ (1,2], that is, the regime where the noise has a finite mean, while the complexity bounds tend to infinity as α approaches 1. This paper tackles the general case of noise with tail index α ∈ (0,2], covering regimes ranging from noise with bounded variance to noise with an infinite mean, where the latter case has been scarcely studied. Through a novel…
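To make the role of clipping concrete, here is a minimal Python sketch of a clipped stochastic gradient step. It is an illustration under stated assumptions, not the paper's algorithm: the quadratic objective 0.5*||x||^2, the symmetric Pareto-type noise model, and the names clip, heavy_tailed_noise, and clipped_sgd are all hypothetical choices made for this example. Clipping caps the norm of each stochastic gradient at a threshold tau, which bounds the size of every update (less variance) at the cost of distorting the expected update direction (more bias).

```python
import numpy as np


def clip(g, tau):
    """Rescale g so its Euclidean norm is at most tau; the direction is unchanged."""
    norm = np.linalg.norm(g)
    if norm <= tau:
        return g
    return g * (tau / norm)


def heavy_tailed_noise(rng, dim, alpha):
    """Symmetric noise with Pareto-type tails of index alpha.

    Illustrative assumption: each coordinate is a Lomax (Pareto II) draw
    with a random sign. For alpha <= 1 the magnitude has an infinite mean.
    """
    magnitude = rng.pareto(alpha, size=dim)
    sign = rng.choice([-1.0, 1.0], size=dim)
    return sign * magnitude


def clipped_sgd(x0, alpha, tau, step_size, num_steps, seed=0):
    """Run clipped SGD on f(x) = 0.5 * ||x||^2 with heavy-tailed gradient noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(num_steps):
        # Exact gradient of the toy objective is x; add heavy-tailed noise.
        grad = x + heavy_tailed_noise(rng, x.size, alpha)
        # Each update has norm at most step_size * tau, whatever the noise.
        x = x - step_size * clip(grad, tau)
    return x


if __name__ == "__main__":
    x_final = clipped_sgd(np.ones(5), alpha=0.8, tau=1.0, step_size=0.01, num_steps=500)
    print(np.linalg.norm(x_final))
```

Because every clipped step has norm at most step_size * tau, the iterates stay finite even when alpha <= 1 and a single raw noisy gradient can be arbitrarily large. A smaller tau tightens this control but moves the clipped gradient further from the true one, which is the trade-off the paper analyzes.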
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Risk and Portfolio Optimization · Simulation Techniques and Applications
