Stochastic Weakly Convex Optimization Under Heavy-Tailed Noises
Tianxi Zhu, Yi Xu, Xiangyang Ji

TL;DR
This paper studies the convergence behavior of stochastic weakly convex optimization algorithms under heavy-tailed noises, providing new theoretical insights into their performance in non-convex, non-smooth settings.
Contribution
It extends convergence analysis of stochastic subgradient methods to weakly convex functions under heavy-tailed noises, including sub-Weibull and p-BCM assumptions.
Findings
The dependence of vanilla stochastic subgradient descent (SsGD) on the failure probability remains stable under sub-Weibull noise.
Clipped SsGD's dependence on failure probability is unaffected by non-convexity and non-smoothness under p-BCM noise.
Sample complexity under p-BCM noise is worse than the lower bound for smooth optimization.
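To make the vanilla versus clipped distinction concrete, here is a minimal sketch of a clipped stochastic subgradient step. It is an illustration only, not the authors' algorithm or experimental setup: the helper names (`clip`, `subgradient`, `clipped_ssgd`), the test function f(x) = | ||x||^2 - 1 |, the symmetrized Pareto noise, and the step size and clipping threshold are all assumptions chosen for the example.

```python
import numpy as np

def clip(g, tau):
    """Rescale g so that its Euclidean norm is at most tau."""
    norm = np.linalg.norm(g)
    return g if norm <= tau else g * (tau / norm)

def subgradient(x):
    """A subgradient of f(x) = | ||x||^2 - 1 | (weakly convex, non-smooth).

    This test function is an assumption made for the sketch.
    """
    s = np.sign(np.dot(x, x) - 1.0)  # 0 on the unit sphere, a valid choice there
    return 2.0 * s * x

def clipped_ssgd(x0, steps, lr, tau, rng, tail_index=1.5):
    """Run clipped SsGD with zero-mean heavy-tailed noise (hypothetical setup).

    The noise is a Pareto (Lomax) sample with a random sign, so its p-th
    moment is finite only for p < tail_index; with tail_index = 1.5 the
    variance is infinite.
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        signs = rng.choice([-1.0, 1.0], size=x.shape)
        noise = rng.pareto(tail_index, size=x.shape) * signs
        g = subgradient(x) + noise   # noisy subgradient estimate
        x -= lr * clip(g, tau)       # vanilla SsGD would use g directly
    return x

rng = np.random.default_rng(0)
x_final = clipped_ssgd([2.0, 0.0], steps=200, lr=0.01, tau=1.0, rng=rng)
print(x_final)
```

Because each update has norm at most `lr * tau`, a single extreme noise sample cannot move the iterate far, which is the intuition behind using clipping when the noise variance may be infinite.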
Abstract
An increasing number of studies have focused on stochastic first-order methods (SFOMs) under heavy-tailed gradient noises, which have been observed in the training of practical deep learning models. In this paper, we focus on two types of gradient noises: one is sub-Weibull noise, and the other is noise under the assumption that it has a bounded p-th central moment (p-BCM) with p ∈ (1, 2]. The latter is more challenging due to the occurrence of infinite variance when p < 2. Under these two gradient noise assumptions, the in-expectation and high-probability convergence of SFOMs have been extensively studied in the contexts of convex optimization and standard smooth optimization. However, for weakly convex objectives (a class that includes all Lipschitz-continuous convex objectives and smooth objectives), our understanding of the in-expectation and high-probability convergence…
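For reference, the three notions named in the abstract are commonly stated as follows. These are the standard textbook forms, written with assumed symbols (ρ for the weak-convexity modulus, σ for the noise scale, θ for the sub-Weibull tail parameter, g(x) for the stochastic subgradient); the paper's exact statements and constants may differ.

```latex
% Weak convexity: f becomes convex after adding a quadratic.
f \text{ is } \rho\text{-weakly convex}
  \iff x \mapsto f(x) + \tfrac{\rho}{2}\,\|x\|^2 \text{ is convex.}

% Bounded p-th central moment (p-BCM), with p \in (1, 2]:
\mathbb{E}\big[g(x)\big] \in \partial f(x),
\qquad
\mathbb{E}\,\big\| g(x) - \mathbb{E}[g(x)] \big\|^{p} \le \sigma^{p}.

% Sub-Weibull noise \xi = g(x) - \mathbb{E}[g(x)] with tail parameter \theta > 0:
\mathbb{E}\,\exp\!\Big( \big( \|\xi\| / \sigma \big)^{1/\theta} \Big) \le 2.
```

Under this convention, θ = 1/2 recovers sub-Gaussian noise and θ = 1 recovers sub-exponential noise, while the p-BCM condition with p < 2 allows the variance to be infinite.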
Taxonomy
Topics: Distributed Sensor Networks and Detection Algorithms · Stochastic processes and financial applications
