Adaptive Worker Grouping For Communication-Efficient and Straggler-Tolerant Distributed SGD
Feng Zhu, Jingjing Zhang, Osvaldo Simeone, Xin Wang

TL;DR
This paper introduces G-CADA, a grouping-based scheme for distributed SGD in parameter server settings. It adaptively schedules groups of workers to reduce communication load and improve robustness to stragglers, which shortens wall-clock convergence time.
Contribution
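G-CADA extends CADA by partitioning workers into groups that are assigned the same data shards. Groups are scheduled adaptively at each iteration, and the server waits only for the fastest worker in each selected group. This improves straggler tolerance while retaining CADA's reduction in communication load, at the cost of additional storage at the workers.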
Findings
G-CADA significantly reduces wall-clock convergence time.
G-CADA lowers communication and computation loads.
In experiments, G-CADA outperforms benchmark schemes.
Abstract
Wall-clock convergence time and communication load are key performance metrics for the distributed implementation of stochastic gradient descent (SGD) in parameter server settings. Communication-adaptive distributed Adam (CADA) has been recently proposed as a way to reduce communication load via the adaptive selection of workers. CADA is subject to performance degradation in terms of wall-clock convergence time in the presence of stragglers. This paper proposes a novel scheme named grouping-based CADA (G-CADA) that retains the advantages of CADA in reducing the communication load, while increasing the robustness to stragglers at the cost of additional storage at the workers. G-CADA partitions the workers into groups of workers that are assigned the same data shards. Groups are scheduled adaptively at each iteration, and the server only waits for the fastest worker in each selected…
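The waiting rule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `make_groups` and `iteration_time`, the consecutive-index grouping, and the example timings are all assumptions made for illustration, and the adaptive rule that decides which groups are selected is not modeled (the selected groups are simply passed in).

```python
def make_groups(num_workers, group_size):
    """Partition worker ids into consecutive groups (hypothetical layout).

    Every worker in a group is assumed to hold the same data shard.
    """
    if num_workers % group_size != 0:
        raise ValueError("num_workers must be divisible by group_size")
    return [
        list(range(start, start + group_size))
        for start in range(0, num_workers, group_size)
    ]


def iteration_time(groups, selected, compute_times):
    """Time the server waits in one iteration.

    Within each selected group only the fastest worker matters (min),
    and the server needs one result from every selected group (max).
    """
    if not selected:
        return 0.0
    return max(
        min(compute_times[w] for w in groups[g])
        for g in selected
    )


# Illustrative numbers only: 6 workers in 3 groups of 2.
groups = make_groups(6, 2)
compute_times = [1.0, 9.0, 2.0, 3.0, 8.0, 4.0]  # assumed per-worker times
selected = [0, 2]  # assumed output of the adaptive scheduling step
wait = iteration_time(groups, selected, compute_times)
print(groups, wait)
```

In this toy setting, the slow workers 1 and 4 do not delay the iteration, because each has a faster group mate holding the same shard. The price is the one the abstract names: each shard is stored at every worker of its group.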
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · 3D Shape Modeling and Analysis · Privacy-Preserving Technologies in Data
Methods: Adam
