Cooperative SGD: A unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms
Jianyu Wang, Gauri Joshi

TL;DR
This paper introduces Cooperative SGD, a unified framework that analyzes and designs communication-efficient distributed SGD algorithms, improving scalability and convergence guarantees while balancing communication costs.
Contribution
The paper provides a unified analysis of existing communication-efficient SGD algorithms and proposes new algorithms with optimized communication and convergence properties.
Findings
Unified convergence guarantees for multiple algorithms
Design of new communication-efficient SGD algorithms
Balance between communication reduction and convergence speed
Abstract
Communication-efficient SGD algorithms, which allow nodes to perform local updates and periodically synchronize local models, are highly effective in improving the speed and scalability of distributed SGD. However, a rigorous convergence analysis and comparative study of different communication-reduction strategies remains a largely open problem. This paper presents a unified framework called Cooperative SGD that subsumes existing communication-efficient SGD algorithms such as periodic-averaging, elastic-averaging and decentralized SGD. By analyzing Cooperative SGD, we provide novel convergence guarantees for existing algorithms. Moreover, this framework enables us to design new communication-efficient SGD algorithms that strike the best balance between reducing communication overhead and achieving fast error convergence with low error floor.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsStochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Stochastic Gradient Descent
