An Improved Analysis of the Clipped Stochastic subGradient Method under Heavy-Tailed Noise
Daniela Angela Parletta, Andrea Paudice, Saverio Salzo

TL;DR
This paper proves improved convergence rates for the clipped stochastic subgradient method under heavy-tailed noise on nonsmooth convex problems, and supports the theory with an application to kernel learning and preliminary experiments.
Contribution
The paper provides novel optimal (or near-optimal) convergence rates for the clipped stochastic subgradient method under heavy-tailed noise, covering possibly unbounded domains and both the last and the average iterates.
Findings
Convergence in expectation of the objective values at a rate of order (log^{1/p} k)/k^{(p-1)/p} for the last iterate.
Improved convergence rates for the average iterates, both in expectation and with high probability.
Application of the results to supervised learning with kernels.
Abstract
In this paper, we provide novel optimal (or near optimal) convergence rates for a clipped version of the stochastic subgradient method. We consider nonsmooth convex problems over possibly unbounded domains, under heavy-tailed noise that possesses only the first p moments for p in (1, 2]. For the last iterate, we establish convergence in expectation for the objective values with rates of order (log^{1/p} k)/k^{(p-1)/p} and 1/k^{(p-1)/p}, for anytime and finite-horizon respectively. We also derive new convergence rates, in expectation and with high probability, for the objective values along the average iterates, improving existing results by a factor. Those results are applied to the problem of supervised learning with kernels, demonstrating the effectiveness of our theory. Finally, we give preliminary experiments.
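To make the object of study concrete, here is a minimal sketch of a clipped stochastic subgradient iteration: each noisy subgradient is rescaled so that its norm does not exceed a clipping level before the usual subgradient step is taken. The function names (`clip`, `clipped_subgradient`), the optional `project` argument, the test problem, and the step-size and clipping schedules are all illustrative assumptions. They are not the schedules analyzed in the paper, so the sketch should not be expected to reproduce its rates.

```python
import numpy as np


def clip(g, lam):
    """Rescale g so that its Euclidean norm is at most lam."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else g * (lam / norm)


def clipped_subgradient(subgrad_oracle, x0, n_iters, step, clip_level, project=None):
    """Run n_iters clipped subgradient steps.

    subgrad_oracle(x) returns a (possibly noisy) subgradient at x;
    step(k) and clip_level(k) give the step size and clipping level at
    iteration k; project is an optional projection onto the feasible
    set (hypothetical argument; the identity is used when it is None).
    Returns the last iterate and the running average of the iterates.
    """
    x = np.asarray(x0, dtype=float)
    avg = np.zeros_like(x)
    for k in range(1, n_iters + 1):
        g = clip(subgrad_oracle(x), clip_level(k))
        x = x - step(k) * g
        if project is not None:
            x = project(x)
        avg += (x - avg) / k  # running mean of x_1, ..., x_k
    return x, avg


# Illustrative problem (assumption): minimize f(x) = ||x||_1 with
# Student-t noise, df = 1.5, whose moments are finite only for
# orders strictly below 1.5 (infinite variance).
rng = np.random.default_rng(0)


def noisy_oracle(x):
    return np.sign(x) + rng.standard_t(df=1.5, size=x.shape)


x_last, x_avg = clipped_subgradient(
    noisy_oracle,
    x0=np.ones(5),
    n_iters=2000,
    step=lambda k: 0.5 / np.sqrt(k),    # placeholder schedule
    clip_level=lambda k: np.sqrt(k),    # placeholder schedule
)
print("f(last iterate):   ", np.abs(x_last).sum())
print("f(average iterate):", np.abs(x_avg).sum())
```

Returning both the last and the average iterate mirrors the two kinds of guarantees described in the abstract. Replacing the placeholder schedules is the natural place to experiment.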
Taxonomy
Topics: Statistical and numerical algorithms
