Stream Iterative Distributed Coded Computing for Learning Applications in Heterogeneous Systems
Homa Esfahanizadeh, Alejandro Cohen, Muriel Medard

TL;DR
This paper introduces a novel joint scheduling-coding framework for heterogeneous distributed systems to minimize delay in learning applications, utilizing optimized load splitting and redundant computations.
Contribution
It proposes a new approach that combines scheduling and coding to efficiently allocate computational loads and reduce delays in heterogeneous systems.
Findings
The framework achieves significantly lower delay than uniform load splitting.
Its delay stays close to an ideal lower bound while requiring few redundant computations.
It handles system heterogeneity effectively in distributed learning.
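To illustrate why uniform load splitting suffers in a heterogeneous system, here is a minimal sketch under a deliberately simplified model. It assumes deterministic worker rates, no communication delay, and no coding or redundancy, so it is not the paper's stochastic framework or its optimization. The function names, the rates, and the load value are hypothetical and chosen only for illustration.

```python
# Toy comparison of uniform vs. rate-proportional load splitting.
# Assumptions (not from the paper): each worker i processes work at a fixed,
# known rate rates[i] (units of work per second), and the job finishes when
# the slowest worker finishes its share.

def uniform_split(total_load, rates):
    """Give every worker the same share, ignoring its speed."""
    n = len(rates)
    return [total_load / n for _ in rates]


def proportional_split(total_load, rates):
    """Give each worker a share proportional to its rate, so all finish together."""
    total_rate = sum(rates)
    return [total_load * r / total_rate for r in rates]


def completion_time(loads, rates):
    """The job delay is set by the last worker to finish."""
    return max(load / rate for load, rate in zip(loads, rates))


rates = [1.0, 2.0, 5.0]   # hypothetical heterogeneous workers
total_load = 80.0         # hypothetical units of work per job iteration

uniform_delay = completion_time(uniform_split(total_load, rates), rates)
balanced_delay = completion_time(proportional_split(total_load, rates), rates)

print(f"uniform split delay:      {uniform_delay:.2f} s")
print(f"proportional split delay: {balanced_delay:.2f} s")
```

In this toy setting the uniform split is held back by the slowest worker, while the proportional split equalizes the finishing times. With random response times, equal finishing times can no longer be guaranteed. The abstract describes addressing that case by jointly optimizing the load split together with coded, redundant computations.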
Abstract
To improve the utility of learning applications and render machine learning solutions feasible for complex applications, a substantial amount of heavy computation is needed. Thus, it is essential to delegate the computations among several workers, which brings up the major challenge of coping with delays and failures caused by the system's heterogeneity and uncertainties. In particular, minimizing the end-to-end job in-order execution delay, from arrival to delivery, is of great importance for real-world delay-sensitive applications. In this paper, for computation of each job iteration in a stochastic heterogeneous distributed system where the workers vary in their computing and communication powers, we present a novel joint scheduling-coding framework that optimally splits the coded computational load among the workers. This closes the gap between the workers' response times, and is…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Error Correcting Code Techniques · Privacy-Preserving Technologies in Data
