Improved Convergence in High Probability of Clipped Gradient Methods with Heavy Tails
Ta Duy Nguyen, Alina Ene, Huy L. Nguyen

TL;DR
This paper introduces a new analysis method for clipped gradient algorithms with heavy-tailed noise, improving high-probability convergence guarantees and allowing adaptive parameters without prior problem knowledge.
Contribution
We develop a novel supermartingale-based analysis that reduces the dependence of the failure probability on the number of iterations, enabling adaptive step sizes and clipping parameters in heavy-tailed settings.
Findings
Improved high-probability convergence bounds for clipped gradient methods.
Applicable to both convex and nonconvex stochastic optimization.
Eliminates need for prior problem constants in parameter setting.
Abstract
In this work, we study the convergence in high probability of clipped gradient methods when the noise distribution has heavy tails, i.e., with bounded $p$th moments, for some $1 < p \le 2$. Prior works in this setting follow the same recipe of using concentration inequalities and an inductive argument with a union bound to bound the iterates across all iterations. This method results in an increase in the failure probability by a factor of $T$, where $T$ is the number of iterations. We instead propose a new analysis approach based on bounding the moment generating function of a well-chosen supermartingale sequence. We improve the dependency on $T$ in the convergence guarantee for a wide range of algorithms with clipped gradients, including stochastic (accelerated) mirror descent for convex objectives and stochastic gradient descent for nonconvex objectives. This approach naturally…
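To make the setting in the abstract concrete, here is a minimal sketch of clipped stochastic gradient descent with time-varying step sizes and clipping thresholds. It is an illustration only, not the paper's algorithm or its parameter choices: the function names `clip` and `clipped_sgd`, the schedules `eta0 / sqrt(t)` and `lam0 * t**0.25`, the quadratic test objective, and the Student-t noise with 1.5 degrees of freedom (which has finite moments only of order below 1.5) are all assumptions chosen for the example.

```python
import numpy as np


def clip(g, lam):
    """Rescale g so that its Euclidean norm is at most lam."""
    norm = np.linalg.norm(g)
    if norm <= lam:
        return g
    return g * (lam / norm)


def clipped_sgd(grad_fn, x0, num_steps, eta0=0.1, lam0=1.0, seed=0):
    """Clipped SGD with heavy-tailed gradient noise.

    The step-size and clipping schedules below are hypothetical
    placeholders, not the schedules analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for t in range(1, num_steps + 1):
        # Heavy-tailed noise: Student-t with 1.5 degrees of freedom
        # has infinite variance (assumed noise model for illustration).
        noise = rng.standard_t(df=1.5, size=x.shape)
        g = grad_fn(x) + noise
        eta_t = eta0 / np.sqrt(t)   # assumed time-varying step size
        lam_t = lam0 * t ** 0.25    # assumed time-varying clipping threshold
        x = x - eta_t * clip(g, lam_t)
    return x


# Usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x.
x_final = clipped_sgd(lambda x: x, x0=np.ones(5), num_steps=1000)
```

Because each update is `eta_t` times a vector of norm at most `lam_t`, no single heavy-tailed noise sample can move the iterate by more than `eta_t * lam_t`. This bounded-increment property is what makes high-probability analysis of clipped methods tractable.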
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference
