Stream Iterative Distributed Coded Computing for Learning Applications in Heterogeneous Systems

Homa Esfahanizadeh; Alejandro Cohen; Muriel Medard

arXiv:2204.13195·cs.DC·April 29, 2022

Open Access

TL;DR

This paper introduces a novel joint scheduling-coding framework for heterogeneous distributed systems to minimize delay in learning applications, utilizing optimized load splitting and redundant computations.

Contribution

It proposes a new approach that combines scheduling and coding to efficiently allocate computational loads and reduce delays in heterogeneous systems.

Findings

01. Significantly lower delay compared to uniform load splitting.

02. Delay close to an ideal lower bound with minimal redundant computations.

03. Effective handling of system heterogeneity in distributed learning.

Abstract

To improve the utility of learning applications and render machine learning solutions feasible for complex applications, a substantial amount of heavy computation is needed. Thus, it is essential to distribute the computations among several workers, which brings up the major challenge of coping with delays and failures caused by the system's heterogeneity and uncertainties. In particular, minimizing the end-to-end in-order execution delay of a job, from arrival to delivery, is of great importance for real-world delay-sensitive applications. In this paper, for the computation of each job iteration in a stochastic heterogeneous distributed system where the workers vary in their computing and communication power, we present a novel joint scheduling-coding framework that optimally splits the coded computational load among the workers. This closes the gap between the workers' response times, and is…
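As a rough illustration of why load splitting matters in a heterogeneous system, the sketch below compares a uniform split with a split proportional to worker speed. It is not the paper's framework: it ignores coding, redundancy, and randomness, and the function names, the worker rates, and the load value are all hypothetical, chosen only for the example. Each worker is assumed to process its share at a fixed rate, and the iteration finishes when the slowest worker responds.

```python
def uniform_split(total_load, n_workers):
    """Give every worker the same share of the load."""
    return [total_load / n_workers] * n_workers


def proportional_split(total_load, rates):
    """Give each worker a share proportional to its (assumed known) rate."""
    total_rate = sum(rates)
    return [total_load * rate / total_rate for rate in rates]


def completion_time(loads, rates):
    """The iteration ends when the last worker finishes its share."""
    return max(load / rate for load, rate in zip(loads, rates))


# Hypothetical deterministic worker rates (units of load per second).
rates = [1.0, 2.0, 5.0]
total_load = 80.0

uniform_delay = completion_time(uniform_split(total_load, len(rates)), rates)
proportional_delay = completion_time(proportional_split(total_load, rates), rates)

print(f"uniform split delay:      {uniform_delay:.2f}")
print(f"proportional split delay: {proportional_delay:.2f}")
```

Under these assumptions the proportional split makes all workers finish at the same moment, which is the intuition behind closing the gap between response times; the uniform split is held back by the slowest worker.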

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Error Correcting Code Techniques · Privacy-Preserving Technologies in Data