Cooperative SGD with Dynamic Mixing Matrices
Soumya Sarkar, Shweta Jain

TL;DR
This paper introduces a unified framework for distributed stochastic gradient descent that incorporates dynamic topologies and non-uniform aggregation, with convergence guarantees that improve on or match those of existing fixed-topology methods.
Contribution
It proposes a framework for distributed SGD with dynamic mixing matrices that covers several existing Local-Update SGD algorithms and provides improved or matching theoretical convergence guarantees.
Findings
Enhanced convergence guarantees under dynamic topologies
Improved performance with non-uniform client contributions
Unified analysis covering multiple local-update SGD algorithms
Abstract
Stochastic gradient descent (SGD) is one of the most common methods for training machine learning models today. In the distributed setting, SGD-based algorithms have been shown to converge under specific conditions. Many works on distributed SGD assume a fixed topology for the edge devices and a uniform contribution of nodes to the global model. However, experiments have shown that these assumptions are suboptimal: a non-uniform aggregation strategy coupled with a dynamically shifting topology and client selection can significantly improve the performance of such models. This paper details a unified framework that covers several Local-Update SGD-based distributed algorithms with dynamic topologies and provides improved or matching theoretical convergence guarantees compared to existing work.
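To make the idea of a dynamic mixing matrix concrete, the following is a minimal sketch of a generic local-update scheme, not the paper's algorithm. It assumes m nodes that each take noisy gradient steps on a toy quadratic objective and, every tau steps, average their models through a freshly sampled mixing matrix. The function names, the random-graph sampler, the Metropolis-style weights, and all hyperparameters are illustrative assumptions that do not appear in the abstract. The weights here are symmetric and doubly stochastic; a non-uniform aggregation scheme would replace them with unequal weights.

```python
import numpy as np


def random_mixing_matrix(m, rng, p_edge=0.5):
    """Sample a symmetric, doubly stochastic matrix for a random undirected graph.

    Assumption for illustration: Metropolis-style weights on a random graph
    in which each edge is present with probability p_edge.
    """
    upper = np.triu(rng.random((m, m)) < p_edge, k=1)
    adj = upper | upper.T                      # symmetric adjacency, no self-loops
    deg = adj.sum(axis=1)
    W = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            if adj[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()             # remaining mass stays on the node
    return W


def cooperative_sgd(targets, steps=200, tau=5, lr=0.1, noise=0.1, seed=0):
    """Toy local-update SGD with a mixing matrix that changes at every sync.

    Node i minimizes 0.5 * ||x - targets[i]||^2 using noisy gradients.
    Every tau steps the stacked models X are replaced by W_t @ X, where
    W_t is sampled anew (the "dynamic topology").
    """
    rng = np.random.default_rng(seed)
    m, d = targets.shape
    X = np.zeros((m, d))                       # row i holds node i's model
    for t in range(1, steps + 1):
        grads = (X - targets) + noise * rng.standard_normal((m, d))
        X = X - lr * grads                     # local SGD step on every node
        if t % tau == 0:
            W = random_mixing_matrix(m, rng)   # new topology for this round
            X = W @ X                          # neighbor averaging
    return X
```

Because each sampled matrix is doubly stochastic, mixing leaves the network-wide average model unchanged, so in this toy problem the average moves toward the mean of the node targets. Calling `cooperative_sgd(targets, noise=0.0)` on an `(m, d)` array of targets is one way to check that behavior.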
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Reinforcement Learning in Robotics · Neural Networks and Applications
