High Probability Convergence of Distributed Clipped Stochastic Gradient Descent with Heavy-tailed Noise
Yuchen Yang, Kaihong Lu, Long Wang

TL;DR
This paper introduces a distributed clipped stochastic gradient descent algorithm designed for heavy-tailed noise environments, providing high probability convergence guarantees in networked multi-agent optimization.
Contribution
It presents the first high probability convergence analysis of distributed SGD under heavy-tailed noise using a clipping operator, extending prior work limited to light-tailed noise.
Findings
Algorithm converges with high probability under mild graph conditions
Effective handling of heavy-tailed noise in distributed optimization
Simulation confirms theoretical convergence results
Abstract
This paper studies distributed optimization over a network of agents. Each agent has access only to a noisy gradient of its own objective function and can communicate with its neighbors over the network. To solve this problem, a distributed clipped stochastic gradient descent algorithm is proposed, and its high probability convergence is studied. Existing works on distributed algorithms involving stochastic gradients consider only light-tailed noise. In contrast, we study the heavy-tailed setting. Under mild assumptions on the graph connectivity, we prove that the algorithm converges with high probability under a certain clipping operator. Finally, a simulation is provided to demonstrate the effectiveness of our theoretical results.
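The abstract does not state the update rule, so the following is only a minimal sketch of what a distributed clipped SGD iteration can look like, not the authors' algorithm. Everything specific here is an assumption made for illustration: the norm-clipping form of `clip`, the ring mixing matrix built by the hypothetical helper `ring_weights`, the quadratic local objectives, the Student-t noise with 1.5 degrees of freedom (infinite variance), the step size 1/k, and the fixed threshold `tau`.

```python
import numpy as np


def clip(g, tau):
    """Norm clipping (assumed form): rescale g so that ||g|| <= tau."""
    norm = np.linalg.norm(g)
    return g if norm <= tau else g * (tau / norm)


def ring_weights(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 agents (illustrative choice)."""
    shift = np.roll(np.eye(n), 1, axis=1)  # permutation linking each agent to one neighbor
    return 0.5 * np.eye(n) + 0.25 * (shift + shift.T)


def distributed_clipped_sgd(W, targets, steps=500, tau=5.0, seed=0):
    """Each agent i holds f_i(x) = 0.5 * ||x - targets[i]||^2 (assumed toy objective).

    Per iteration: average with neighbors through W, then take a step along
    the clipped noisy local gradient.
    """
    rng = np.random.default_rng(seed)
    n, d = targets.shape
    x = np.zeros((n, d))
    for k in range(1, steps + 1):
        alpha = 1.0 / k                 # assumed decaying step size
        new_x = W @ x                   # consensus step with neighbors
        for i in range(n):
            noise = rng.standard_t(df=1.5, size=d)  # heavy-tailed, infinite variance
            grad = (x[i] - targets[i]) + noise      # noisy local gradient
            new_x[i] -= alpha * clip(grad, tau)     # clipped descent step
        x = new_x
    return x


# Toy usage: 4 agents on a ring, 2-dimensional decision variable.
targets = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, 2.0]])
estimates = distributed_clipped_sgd(ring_weights(4), targets)
print("agent estimates:\n", estimates)
print("minimizer of the summed objective:", targets.mean(axis=0))
```

Because every clipped step has norm at most `tau`, a single extreme noise sample can move an agent by at most `alpha * tau`, which is the intuition for why clipping helps when the noise has no finite variance. The thresholds and step sizes required for the paper's guarantee may differ from the ones assumed here.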
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques
