Smoothed Gradient Clipping and Error Feedback for Decentralized Optimization under Symmetric Heavy-Tailed Noise
Shuhua Yu, Dusan Jakovetic, Soummya Kar

TL;DR
This paper introduces a decentralized gradient clipping method with error feedback that handles symmetric heavy-tailed gradient noise. The method converges without assuming bounded gradients, and numerical experiments support the analysis.
Contribution
The paper develops a smoothed gradient clipping operator with error feedback for decentralized optimization under heavy-tailed noise, and provides the first convergence guarantees in this setting that do not assume bounded gradients.
Findings
The method achieves a mean-squared error (MSE) convergence rate of O(1/t^δ) for some δ in (0, 2/5).
The convergence rate does not depend on higher-order noise moments.
Numerical experiments are consistent with the theoretical results.
Abstract
Motivated by understanding and analysis of large-scale machine learning under heavy-tailed gradient noise, we study decentralized optimization with gradient clipping, i.e., in which certain clipping operators are applied to the gradients or gradient estimates computed from local nodes prior to further processing. While vanilla gradient clipping has proven effective in mitigating the impact of heavy-tailed gradient noise in non-distributed setups, it incurs bias that causes convergence issues in heterogeneous distributed settings. To address the inherent bias introduced by gradient clipping, we develop a smoothed clipping operator, and propose a decentralized gradient method equipped with an error feedback mechanism, i.e., the clipping operator is applied on the difference between some local gradient estimator and local stochastic gradient. We consider strongly convex and smooth local…
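The error feedback idea described in the abstract, clipping the difference between a local gradient estimator and the local stochastic gradient rather than the gradient itself, can be illustrated with a small sketch. This is not the paper's algorithm: the clipping form `c * y / sqrt(y^2 + tau)`, the quadratic local objectives, the ring mixing matrix `W`, the step-size schedule, the estimator gain `beta`, and the Student-t noise are all assumptions chosen for illustration.

```python
import numpy as np


def smooth_clip(y, c=1.0, tau=1.0):
    """Component-wise smooth clipping (assumed form): c * y / sqrt(y^2 + tau).

    The output is odd in y and bounded in magnitude by c.
    """
    return c * y / np.sqrt(y * y + tau)


def run_sketch(n_nodes=4, dim=3, steps=2000, beta=0.1, seed=0):
    """Illustrative decentralized loop with clipped error feedback.

    Hypothetical setup: node i holds f_i(x) = 0.5 * ||x - b_i||^2, so the
    minimizer of the sum of the f_i is the mean of the b_i.
    """
    rng = np.random.default_rng(seed)
    b = rng.normal(size=(n_nodes, dim))  # heterogeneous local targets

    # Symmetric, doubly stochastic mixing matrix for a ring graph (assumption).
    W = np.zeros((n_nodes, n_nodes))
    for i in range(n_nodes):
        W[i, i] = 0.5
        W[i, (i - 1) % n_nodes] += 0.25
        W[i, (i + 1) % n_nodes] += 0.25

    x = np.zeros((n_nodes, dim))  # local iterates, one row per node
    m = np.zeros((n_nodes, dim))  # local gradient estimators

    for t in range(steps):
        step = 1.0 / (t + 10)  # decaying step size (assumption)
        # Symmetric heavy-tailed noise: Student-t with 1.5 degrees of freedom
        # has a finite first moment but infinite variance.
        noise = rng.standard_t(df=1.5, size=(n_nodes, dim))
        g = (x - b) + noise  # local stochastic gradients
        # Error feedback: clip the gap between stochastic gradient and estimator.
        m = m + beta * smooth_clip(g - m)
        # Mix with neighbors, then step along the local estimator.
        x = W @ x - step * m

    return x, b.mean(axis=0)


if __name__ == "__main__":
    x, target = run_sketch()
    print("max distance to target:", np.abs(x - target).max())
```

Because each estimator update is bounded by `beta * c` per coordinate, a single extreme noise sample can move `m` only slightly, while the estimator can still drift toward the true local gradient over many steps. That is the intuition for why clipping the difference is less biased under heterogeneity than clipping the raw gradient.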
Taxonomy
Topics: Target Tracking and Data Fusion in Sensor Networks · Neural Networks and Applications · Advanced Algorithms and Applications
