General Coded Computing in a Probabilistic Straggler Regime
Parsa Moradi, Mohammad Ali Maddah-Ali

TL;DR
This paper analyzes the approximation error of two general coded computing schemes, BACC and LeTCC, under a probabilistic straggler model, and shows that their average approximation error converges to zero as the number of servers grows.
Contribution
It provides the first theoretical analysis demonstrating convergence of approximation error for BACC and LeTCC under probabilistic straggler scenarios.
Findings
Average approximation error converges to zero for BACC and LeTCC.
Convergence rates are at least $\mathcal{O}(\log^3_{1/p}(N) \cdot N^{-3})$ and $\mathcal{O}(\log^4_{1/p}(N) \cdot N^{-2})$, respectively.
Results validated through experiments on neural networks and other functions.
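To get a feel for how quickly the two rate expressions in the findings shrink, the following minimal sketch evaluates them for a few server counts. It assumes that $\log^k_{1/p}(N)$ means $(\log_{1/p} N)^k$ and drops all constant factors, so the numbers show only the trend, not actual error values. The function names `bacc_rate` and `letcc_rate` are hypothetical and introduced here for illustration.

```python
import math

def bacc_rate(n, p):
    """(log_{1/p} n)^3 * n^-3, constants omitted (assumed reading of the BACC rate)."""
    return math.log(n, 1 / p) ** 3 * n ** -3

def letcc_rate(n, p):
    """(log_{1/p} n)^4 * n^-2, constants omitted (assumed reading of the LeTCC rate)."""
    return math.log(n, 1 / p) ** 4 * n ** -2

if __name__ == "__main__":
    p = 0.1  # illustrative straggler probability, not a value from the paper
    for n in (10, 100, 1000):
        print(n, bacc_rate(n, p), letcc_rate(n, p))
```

Both expressions decrease as the server count grows, with the polynomial factor dominating the logarithmic one.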
Abstract
Coded computing has demonstrated promising results in addressing straggler resiliency in distributed computing systems. However, most coded computing schemes are designed for exact computation, requiring the number of responding servers to exceed a certain recovery threshold. Additionally, these schemes are tailored for highly structured functions. Recently, new coded computing schemes for general computing functions, where exact computation is replaced with approximate computation, have emerged. In these schemes, the availability of additional results corresponds to more accurate estimation of computational tasks. This flexibility introduces new questions that need to be addressed. This paper addresses the practically important scenario in the context of general coded computing, where each server may become a straggler with probability $p$, independently of the others. We theoretically…
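The probabilistic straggler model described in the abstract can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: each server is marked a straggler independently with the same probability, and the longest run of consecutive stragglers is reported as one diagnostic a reader might want to inspect. That diagnostic and the helper names `sample_stragglers` and `longest_straggler_run` are our own choices, not taken from the paper.

```python
import random

def sample_stragglers(n_servers, p, rng):
    """Return a list of booleans; True marks a straggler.

    Each server straggles independently with probability p.
    """
    return [rng.random() < p for _ in range(n_servers)]

def longest_straggler_run(mask):
    """Length of the longest block of consecutive stragglers in the mask."""
    longest = current = 0
    for is_straggler in mask:
        current = current + 1 if is_straggler else 0
        longest = max(longest, current)
    return longest

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    mask = sample_stragglers(1000, 0.2, rng)  # illustrative N and p
    print("responding servers:", mask.count(False))
    print("longest straggler run:", longest_straggler_run(mask))
```

In this model the number of responding servers is random, so any error analysis has to average over the straggler pattern rather than assume a fixed recovery threshold.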
Taxonomy
Topics: Computability, Logic, AI Algorithms
