Distributed Computations with Layered Resolution
Homa Esfahanizadeh, Alejandro Cohen, Muriel Médard, Shlomo Shamai (Shitz)

TL;DR
This paper introduces layered-resolution distributed coded computations, enabling early approximate results in time-sensitive applications, thereby improving deadline adherence and resource efficiency in distributed systems.
Contribution
It proposes a novel layered-resolution approach that provides early approximate results, enhancing deadline-based performance in distributed coded computing.
Findings
Early resolutions have significantly lower execution delay.
Early resolutions meet their deadlines with probability one.
Layered approach improves resource utilization and system success rate.
Abstract
Modern computationally-heavy applications are often time-sensitive, demanding distributed strategies to accelerate them. On the other hand, distributed computing suffers from the bottleneck of slow workers in practice. Distributed coded computing is an attractive solution that adds redundancy such that a subset of distributed computations suffices to obtain the final result. However, the final result is still either obtained within a desired time or not, and for the latter, the resources that are spent are wasted. In this paper, we introduce the novel concept of layered-resolution distributed coded computations such that lower resolutions of the final result are obtained from collective results of the workers -- at an earlier stage than the final result. This innovation makes it possible to have more effective deadline-based systems, since even if a computational job is terminated…
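The idea of obtaining a lower-resolution result before the final one can be illustrated with a minimal sketch. It assumes, purely for illustration, that the job is a matrix-vector product with non-negative integer entries and that the layers are formed by splitting each matrix entry into its most significant and least significant bits. The function names (`split_layers`, `matvec`, `layered_product`) and the `low_bits` parameter are hypothetical, and the sketch omits the coding and redundancy across workers entirely, so it should not be read as the authors' scheme.

```python
def split_layers(matrix, low_bits):
    """Split a non-negative integer matrix into a coarse and a fine layer.

    Assumption for illustration: matrix == scale * high + low, entry by entry.
    """
    scale = 1 << low_bits
    high = [[v // scale for v in row] for row in matrix]  # most significant part
    low = [[v % scale for v in row] for row in matrix]    # least significant part
    return high, low, scale


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def layered_product(matrix, vector, low_bits):
    """Return (coarse, exact) results of matrix @ vector.

    The coarse result needs only the high layer, so it could be released
    before the low layer has been processed.
    """
    high, low, scale = split_layers(matrix, low_bits)
    coarse = [scale * v for v in matvec(high, vector)]  # early, approximate
    refinement = matvec(low, vector)                    # later, fine detail
    exact = [c + r for c, r in zip(coarse, refinement)]
    return coarse, exact


A = [[200, 17, 96], [5, 255, 64]]
x = [1, 2, 3]
coarse, exact = layered_product(A, x, low_bits=4)
print(coarse)  # approximate result from the high layer only
print(exact)   # equals the full product A @ x
```

If a deadline expires after the high layer is processed, `coarse` is still a usable approximation; in this sketch its error is bounded by the contribution of the low layer.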
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Distributed systems and fault tolerance · Distributed and Parallel Computing Systems
