High Probability Convergence of Distributed Clipped Stochastic Gradient Descent with Heavy-tailed Noise

Yuchen Yang; Kaihong Lu; Long Wang

arXiv:2506.11647 · math.OC · June 19, 2025 · Syst. Control. Lett.

Open Access

TL;DR

This paper introduces a distributed clipped stochastic gradient descent algorithm designed for heavy-tailed noise environments, providing high probability convergence guarantees in networked multi-agent optimization.

Contribution

It presents the first high probability convergence analysis of distributed SGD under heavy-tailed noise using a clipping operator, extending prior work limited to light-tailed noise.

Findings

01 Algorithm converges with high probability under mild graph conditions

02 Effective handling of heavy-tailed noise in distributed optimization

03 Simulation confirms theoretical convergence results

Abstract

In this paper, the problem of distributed optimization is studied via a network of agents. Each agent has access only to a noisy gradient of its own objective function and can communicate with its neighbors over the network. To solve this problem, a distributed clipped stochastic gradient descent algorithm is proposed, and its high probability convergence is studied. Existing works on distributed algorithms involving stochastic gradients consider only light-tailed noise. In contrast, we study the heavy-tailed setting. Under mild assumptions on the graph connectivity, we prove that the algorithm converges in high probability under a certain clipping operator. Finally, a simulation is provided to demonstrate the effectiveness of our theoretical results.
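
The abstract does not spell out the update rule, so the following is only a minimal sketch of what a distributed clipped SGD iteration of this kind might look like, not the authors' algorithm. Everything in it is an assumption made for illustration: the ring graph with uniform 1/3 weights, the quadratic local objectives, the symmetric Pareto noise (tail index 1.5, so zero mean but infinite variance), the constant clipping threshold `tau`, the step-size schedule, and all function names. Each agent averages its neighbors' iterates, then takes a step along its own clipped noisy gradient.

```python
import math
import random


def clip(g, tau):
    """Rescale vector g so that its Euclidean norm is at most tau."""
    norm = math.sqrt(sum(v * v for v in g))
    if norm <= tau:
        return list(g)
    return [v * tau / norm for v in g]


def ring_weights(n):
    """Doubly stochastic mixing matrix for a ring (assumed topology, n >= 3)."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def heavy_tailed_noise(rng, scale):
    """Symmetric Pareto sample, tail index 1.5: zero mean, infinite variance."""
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return scale * sign * rng.paretovariate(1.5)


def distributed_clipped_sgd(targets, steps=2000, tau=10.0, noise_scale=1.0, seed=0):
    """Agent i holds f_i(x) = 0.5 * ||x - targets[i]||^2 (assumed objective).

    The sum of the f_i is minimized at the mean of the targets.
    Returns the list of final iterates, one per agent.
    """
    rng = random.Random(seed)
    n, d = len(targets), len(targets[0])
    W = ring_weights(n)
    x = [[0.0] * d for _ in range(n)]
    for k in range(steps):
        alpha = 0.5 / (k + 1) ** 0.75  # diminishing step size (assumed schedule)
        new_x = []
        for i in range(n):
            # Consensus step: weighted average of the neighbors' iterates.
            mixed = [sum(W[i][j] * x[j][c] for j in range(n)) for c in range(d)]
            # Noisy local gradient at the agent's own iterate, then clipping.
            g = [x[i][c] - targets[i][c] + heavy_tailed_noise(rng, noise_scale)
                 for c in range(d)]
            g = clip(g, tau)
            new_x.append([mixed[c] - alpha * g[c] for c in range(d)])
        x = new_x
    return x


if __name__ == "__main__":
    targets = [[1.0, -1.0], [2.0, 0.0], [3.0, 1.0], [4.0, 2.0], [5.0, 3.0]]
    for i, xi in enumerate(distributed_clipped_sgd(targets)):
        print(f"agent {i}: {xi}")
```

In this toy setup the minimizer of the summed objective is the mean of the targets, `[3.0, 1.0]`. Clipping bounds every step by `alpha * tau`, so a single extreme noise sample cannot throw an agent arbitrarily far from its neighbors. The paper's actual clipping operator and step-size conditions may differ from the constant threshold used here.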

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques