On the Convergence of Decentralized Adaptive Gradient Methods
Xiangyi Chen, Belhal Karimi, Weijie Zhao, Ping Li

TL;DR
This paper introduces a general framework for converting adaptive gradient methods into decentralized algorithms, providing convergence guarantees and demonstrating benefits through theoretical analysis and numerical experiments.
Contribution
The paper presents a general, rigorously analyzed framework that converts existing adaptive gradient methods into decentralized counterparts and proves convergence for the resulting algorithms.
Findings
Decentralized AMSGrad converges under certain conditions.
The framework applies to various adaptive methods, ensuring their decentralized versions are convergent.
Numerical results show improved performance in distributed settings.
Abstract
Adaptive gradient methods including Adam, AdaGrad, and their variants have been very successful for training deep learning models, such as neural networks. Meanwhile, given the need for distributed computing, distributed optimization algorithms are rapidly becoming a focal point. With the growth of computing power and the need for using machine learning models on mobile devices, the communication cost of distributed training algorithms needs careful consideration. In this paper, we introduce novel convergent decentralized adaptive gradient methods and rigorously incorporate adaptive gradient methods into decentralized training procedures. Specifically, we propose a general algorithmic framework that can convert existing adaptive gradient methods to their decentralized counterparts. In addition, we thoroughly analyze the convergence behavior of the proposed algorithmic framework and show…
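To make the idea of a "decentralized counterpart" concrete, here is a minimal sketch of an AMSGrad-style update run on a ring of nodes. It is an illustration only and is not taken from the paper: the toy objective (node i minimizes 0.5 * (x - c_i)^2), the equal-weight ring mixing rule, the hyperparameters, and the choice to gossip both the parameters and a shared second-moment estimate `u` are all assumptions made for this example. The function name `decentralized_amsgrad` is hypothetical.

```python
import math

def decentralized_amsgrad(centers, steps=5000, lr=0.01, b1=0.9, b2=0.99, eps=1e-8):
    """Toy sketch: node i holds x[i] and minimizes 0.5 * (x - centers[i])**2.

    Assumes a ring of n >= 3 nodes; every detail here is illustrative.
    """
    n = len(centers)

    def mix(values):
        # Gossip step: each node averages itself with its two ring neighbours.
        # These equal weights form a symmetric, doubly stochastic mixing matrix.
        return [(values[i - 1] + values[i] + values[(i + 1) % n]) / 3.0
                for i in range(n)]

    x = [0.0] * n       # local parameter copies
    m = [0.0] * n       # local first-moment estimates
    v = [0.0] * n       # local second-moment estimates
    v_hat = [0.0] * n   # running max of v (the AMSGrad correction)
    u = [0.0] * n       # gossiped estimate of the network-average v_hat

    for _ in range(steps):
        grads = [x[i] - centers[i] for i in range(n)]  # local gradients
        x_mixed = mix(x)
        u_mixed = mix(u)
        for i in range(n):
            m[i] = b1 * m[i] + (1.0 - b1) * grads[i]
            v[i] = b2 * v[i] + (1.0 - b2) * grads[i] ** 2
            new_v_hat = max(v_hat[i], v[i])
            # Add the local change in v_hat to the mixed value, so the
            # network average of u always equals the average of v_hat.
            u[i] = u_mixed[i] + new_v_hat - v_hat[i]
            v_hat[i] = new_v_hat
            # Adaptive step taken from the neighbour-averaged parameters.
            x[i] = x_mixed[i] - lr * m[i] / math.sqrt(max(u[i], eps))
    return x

if __name__ == "__main__":
    xs = decentralized_amsgrad([1.0, 2.0, 3.0, 4.0])
    print(sum(xs) / len(xs), max(xs) - min(xs))
```

In this sketch the nodes agree on the step-size scaling because `u` is averaged across neighbours. If each node scaled its step by its own `v_hat` instead, the consensus point would generally be a weighted rather than a plain average of the local minimizers. With a constant learning rate the local copies stay close to each other but do not coincide exactly.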
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research
Methods: AdaGrad · Adam · AMSGrad
