On Consensus-Optimality Trade-offs in Collaborative Deep Learning
Zhanhong Jiang, Aditya Balu, Chinmay Hegde, and Soumik Sarkar

TL;DR
This paper introduces new distributed deep learning algorithms that balance consensus and optimality, providing theoretical convergence guarantees and demonstrating improved performance through experiments.
Contribution
The paper proposes the i-CDSGD and g-CDSGD algorithms, enabling flexible trade-offs between consensus and disagreement in distributed deep learning.
Findings
Algorithms converge for strongly convex and nonconvex objectives
Momentum variants improve convergence in strongly convex cases
Numerical experiments show significant performance improvements
Abstract
In distributed machine learning, where agents collaboratively learn from diverse private data sets, there is a fundamental tension between consensus and optimality. In this paper, we build on recent algorithmic progress in distributed deep learning to explore various consensus-optimality trade-offs over a fixed communication topology. First, we propose the incremental consensus-based distributed SGD (i-CDSGD) algorithm, which involves multiple consensus steps (where each agent communicates information with its neighbors) within each SGD iteration. Second, we propose the generalized consensus-based distributed SGD (g-CDSGD) algorithm that enables us to navigate the full spectrum from complete consensus (all agents agree) to complete disagreement (each agent converges to individual model parameters). We analytically establish convergence of the proposed algorithms for strongly convex…
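The two ideas described in the abstract can be illustrated with a small NumPy sketch. It is not taken from the paper: the update rules below are one plausible reading of "multiple consensus steps within each SGD iteration" and of a consensus/disagreement dial. The ring mixing matrix, the quadratic local objectives, the noise-free gradients, and the names `tau` and `omega` are all assumptions made for illustration.

```python
import numpy as np

def make_ring_mixing_matrix(n_agents):
    """Hypothetical doubly stochastic mixing matrix for a ring topology:
    each agent keeps weight 1/2 on itself and 1/4 on each ring neighbor."""
    pi = np.zeros((n_agents, n_agents))
    for j in range(n_agents):
        pi[j, j] += 0.5
        pi[j, (j - 1) % n_agents] += 0.25
        pi[j, (j + 1) % n_agents] += 0.25
    return pi

def i_cdsgd_step(x, pi, grads, lr, tau):
    """Assumed i-CDSGD-style update: tau rounds of neighbor averaging,
    followed by one local gradient step."""
    mixed = x
    for _ in range(tau):
        mixed = pi @ mixed
    return mixed - lr * grads

def g_cdsgd_step(x, pi, grads, lr, omega):
    """Assumed g-CDSGD-style update: omega in [0, 1] interpolates between
    no communication (omega = 0) and one full consensus step (omega = 1)."""
    return (1.0 - omega) * x + omega * (pi @ x) - lr * grads

def run(step, targets, pi, n_iters=500, lr=0.05, **step_kwargs):
    """Toy setting: agent j minimizes 0.5 * ||x_j - targets[j]||^2, so its
    local gradient is x_j - targets[j] (exact gradients, no sampling noise)."""
    x = np.zeros_like(targets)
    for _ in range(n_iters):
        grads = x - targets
        x = step(x, pi, grads, lr, **step_kwargs)
    return x

def disagreement(x):
    """Distance of the agents' parameters from their average."""
    return float(np.linalg.norm(x - x.mean(axis=0)))

if __name__ == "__main__":
    targets = np.array([[0.0], [1.0], [2.0], [3.0]])  # one private optimum per agent
    pi = make_ring_mixing_matrix(len(targets))
    for tau in (1, 5):
        x = run(i_cdsgd_step, targets, pi, tau=tau)
        print(f"i-CDSGD tau={tau}: disagreement={disagreement(x):.4f}")
    for omega in (0.0, 0.2, 1.0):
        x = run(g_cdsgd_step, targets, pi, omega=omega)
        print(f"g-CDSGD omega={omega}: disagreement={disagreement(x):.4f}")
```

In this toy problem, raising `tau` pulls the agents closer together, while `omega = 0` lets each agent settle at its own private optimum. That mirrors the spectrum from complete consensus to complete disagreement that the abstract describes, though the behavior on real deep learning objectives may differ.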
Taxonomy
Topics: Distributed Control Multi-Agent Systems · Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data
Methods: Stochastic Gradient Descent
