Decentralized Relaxed Smooth Optimization with Gradient Descent Methods
Zhanhong Jiang, Aditya Balu, Soumik Sarkar

TL;DR
This paper introduces a novel decentralized gradient descent framework under relaxed $(L_0,L_1)$-smoothness, achieving improved convergence rates and bridging theoretical advances with practical applications in modern tasks like deep learning.
Contribution
It is the first to extend $(L_0,L_1)$-smoothness to decentralized optimization, providing new analysis techniques and adaptive methods with state-of-the-art convergence guarantees.
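For reference, the relaxed condition is usually stated in the literature as follows for a twice-differentiable $f$. This is the common centralized form; the exact variant used in the paper (for example, a first-order version or a per-agent version) is not given on this page and may differ.

```latex
% (L_0, L_1)-smoothness: the local curvature may grow with the gradient norm.
\[
  \|\nabla^2 f(x)\| \;\le\; L_0 + L_1 \,\|\nabla f(x)\| \qquad \text{for all } x .
\]
% Setting L_1 = 0 recovers the classical L-smoothness bound with L = L_0:
\[
  \|\nabla^2 f(x)\| \;\le\; L_0 .
\]
```

A function such as $f(x) = x^4$ satisfies the first bound for suitable $L_0, L_1$ but has no finite global $L$, which is why the relaxed condition covers more objectives.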
Findings
In deterministic settings, adaptive clipping achieves the best-known convergence rates for convex/nonconvex functions.
In stochastic settings, derives complexity bounds and identifies conditions under which the bound for convex optimization improves.
Empirical results validate gradient-norm-dependent smoothness in real datasets.
Abstract
$L$-smoothness, which has been pivotal to advancing decentralized optimization theory, is often fairly restrictive for modern tasks like deep learning. The recent advent of the relaxed $(L_0,L_1)$-smoothness condition enables improved convergence rates for gradient methods. Despite centralized advances, its decentralized extension remains unexplored and challenging. In this work, we propose the first general framework for decentralized gradient descent (DGD) under $(L_0,L_1)$-smoothness by introducing novel analysis techniques. For deterministic settings, our method with adaptive clipping achieves the best-known convergence rates for convex/nonconvex functions without requiring prior knowledge of $L_0$ and $L_1$ or a bounded-gradient assumption. In stochastic settings, we derive complexity bounds and identify conditions for an improved complexity bound in convex optimization. The empirical validation…
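To make the idea of decentralized gradient descent with clipping concrete, here is a minimal Python sketch. It is not the authors' algorithm: the ring topology, the mixing weights, the quartic local objectives, the step rule `min(eta, gamma / |g|)`, and all function and parameter names are assumptions chosen for illustration. Each agent averages with its neighbors and then takes a clipped step along its own local gradient.

```python
from typing import Callable, List


def ring_mixing_matrix(n: int) -> List[List[float]]:
    """Doubly stochastic matrix for a ring: weight 1/3 on self and each neighbor."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def clipped_dgd(
    grads: List[Callable[[float], float]],
    x0: List[float],
    W: List[List[float]],
    eta: float,
    gamma: float,
    num_iters: int,
) -> List[float]:
    """Scalar DGD with a clipped step (illustrative rule, assumed here):
    x_i <- sum_j W[i][j] * x_j - min(eta, gamma / |g_i|) * g_i,
    where g_i is agent i's local gradient at its current iterate.
    """
    x = list(x0)
    n = len(x)
    for _ in range(num_iters):
        # Consensus step: each agent averages over its neighborhood.
        mixed = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        new_x = []
        for i in range(n):
            g = grads[i](x[i])
            # Clipping caps the move at gamma when the gradient is large.
            step = eta if g == 0.0 else min(eta, gamma / abs(g))
            new_x.append(mixed[i] - step * g)
        x = new_x
    return x


# Hypothetical local objectives f_i(x) = (x - a_i)^4: their curvature grows
# with the gradient, so they are not globally L-smooth.
targets = [-2.0, -1.0, 1.0, 2.0]
grads = [lambda x, a=a: 4.0 * (x - a) ** 3 for a in targets]


def total_objective(x: float) -> float:
    return sum((x - a) ** 4 for a in targets)


W = ring_mixing_matrix(len(targets))
x_init = [5.0] * len(targets)
x_final = clipped_dgd(grads, x_init, W, eta=0.05, gamma=0.1, num_iters=1000)
x_avg = sum(x_final) / len(x_final)
print("final iterates:", x_final)
print("network average:", x_avg)
```

With a constant step size the agents do not reach exact consensus; they settle in a neighborhood of the minimizer of the summed objective (here $x = 0$ by symmetry of the targets), and the clipping threshold `gamma` bounds how far any agent moves in a single iteration.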
