Communication-Computation Efficient Gradient Coding

Min Ye; Emmanuel Abbe

arXiv:1802.03475 · stat.ML · February 13, 2018 · 73 cites

Open Access

TL;DR

This paper introduces a coding scheme for distributed gradient computation that achieves the optimal tradeoff among computation load, straggler tolerance, and communication cost, and reports a 32% reduction in running time over uncoded schemes on Amazon EC2.

Contribution

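It characterizes the fundamental tradeoff among computation load, straggler tolerance, and communication cost in distributed gradient computation, and presents an explicit coding scheme, based on recursive polynomial constructions, that achieves the optimal tradeoff.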

Findings

01

Reduces running time by 32% compared to uncoded schemes in experiments on Amazon EC2 clusters.

02

Maintains the same generalization error as uncoded schemes.

03

Reduces running time by 23% compared to prior coded schemes that focus only on stragglers (Tandon et al., ICML 2017).

Abstract

This paper develops coding techniques to reduce the running time of distributed learning tasks. It characterizes the fundamental tradeoff in computing gradients (and, more generally, vector summations) in terms of three parameters: computation load, straggler tolerance, and communication cost. It further gives an explicit coding scheme that achieves the optimal tradeoff, based on recursive polynomial constructions, coding both across data subsets and across vector components. As a result, the proposed scheme makes it possible to minimize the running time of gradient computations. The scheme is implemented on Amazon EC2 clusters using Python with the mpi4py package. Results show that the proposed scheme maintains the same generalization error while reducing the running time by 32% compared to uncoded schemes and by 23% compared to prior coded schemes that focus only on stragglers (Tandon et al., ICML 2017).
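To make "straggler tolerance" concrete, the following is a minimal sketch of the well-known three-worker example from the prior straggler-only approach (Tandon et al., ICML 2017), not the scheme proposed in this paper. Each worker holds two of three data subsets and sends one linear combination of its partial gradients, so the full gradient sum can be recovered from any two workers. The gradient values and the helper name `combine` are hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Hypothetical partial gradients for three data subsets (illustrative values).
g1 = [1.0, 2.0]
g2 = [3.0, -1.0]
g3 = [0.5, 4.0]
full_gradient = [a + b + c for a, b, c in zip(g1, g2, g3)]


def combine(coeffs, vectors):
    """Return the linear combination sum_i coeffs[i] * vectors[i]."""
    dim = len(vectors[0])
    return [sum(c * v[j] for c, v in zip(coeffs, vectors)) for j in range(dim)]


# Encoding: each worker computes two partial gradients and sends one vector.
encode = {
    1: [0.5, 1.0, 0.0],   # worker 1 sends g1/2 + g2
    2: [0.0, 1.0, -1.0],  # worker 2 sends g2 - g3
    3: [0.5, 0.0, 1.0],   # worker 3 sends g1/2 + g3
}
messages = {w: combine(c, [g1, g2, g3]) for w, c in encode.items()}

# Decoding coefficients for each pair of workers that responds in time.
decode = {
    (1, 2): [2.0, -1.0],
    (1, 3): [1.0, 1.0],
    (2, 3): [1.0, 2.0],
}

# Whichever single worker straggles, the other two suffice.
recovered = {
    pair: combine(decode[pair], [messages[w] for w in pair])
    for pair in combinations([1, 2, 3], 2)
}

for pair, value in recovered.items():
    print(pair, value)
```

In this sketch every worker still sends a vector of the full gradient dimension, so only straggler tolerance is addressed. According to the abstract, the proposed scheme additionally codes across vector components, which is what brings communication cost into the tradeoff.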

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques