Autonomous Task Dropping Mechanism to Achieve Robustness in Heterogeneous Computing Systems
Ali Mokhtari, Chavit Denninnart, Mohsen Amini Salehi

TL;DR
This paper proposes an autonomous task dropping mechanism based on probabilistic analysis to enhance robustness in heterogeneous distributed computing systems facing uncertainties in task execution and arrival times.
Contribution
It introduces a mathematical model and heuristic for proactive task dropping to maximize system robustness against uncertainties in heterogeneous systems.
Findings
Improves system robustness by up to 20%.
Provides a generic probabilistic model for task dropping decisions.
Develops a task dropping heuristic that maximizes robustness in feasible time.
Abstract
Robustness of a distributed computing system is defined as the ability to maintain its performance in the presence of uncertain parameters. Uncertainty is a key problem in heterogeneous (and even homogeneous) distributed computing systems that perturbs system robustness. Notably, the performance of these systems is perturbed by uncertainty in both task execution time and arrival. Accordingly, our goal is to make the system robust against these uncertainties. Considering task execution time as a random variable, we use probabilistic analysis to develop an autonomous proactive task dropping mechanism to attain our robustness goal. Specifically, we provide a mathematical model that identifies the optimality of a task dropping decision, so that the system robustness is maximized. Then, we leverage the mathematical model to develop a task dropping heuristic that achieves the system…
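The idea of treating execution time as a random variable and dropping tasks proactively can be illustrated with a small sketch. This is not the authors' model or heuristic: the discrete PMF representation, the function names (`convolve`, `on_time_probability`, `filter_queue`), and the fixed-threshold drop rule are all assumptions made for illustration. The sketch considers a single machine with a FIFO queue, where each task's completion-time distribution is the convolution of the execution-time distributions of the tasks kept ahead of it.

```python
from collections import defaultdict


def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent discrete random variables.

    Each PMF is a dict mapping a time value to its probability.
    """
    out = defaultdict(float)
    for ta, pa in pmf_a.items():
        for tb, pb in pmf_b.items():
            out[ta + tb] += pa * pb
    return dict(out)


def on_time_probability(completion_pmf, deadline):
    """Probability that the task completes at or before its deadline."""
    return sum(p for t, p in completion_pmf.items() if t <= deadline)


def filter_queue(queue, threshold=0.5, start=0):
    """Walk a FIFO queue and drop tasks unlikely to meet their deadline.

    queue: list of (name, exec_time_pmf, deadline) tuples (hypothetical format).
    threshold: assumed cut-off; the paper derives an optimality condition
               instead of using a fixed threshold.
    Returns (kept_names, dropped_names).
    """
    ready = {start: 1.0}  # distribution of the time the machine becomes free
    kept, dropped = [], []
    for name, exec_pmf, deadline in queue:
        completion = convolve(ready, exec_pmf)
        if on_time_probability(completion, deadline) < threshold:
            # Dropped tasks consume no machine time, so `ready` is unchanged.
            dropped.append(name)
        else:
            kept.append(name)
            ready = completion
    return kept, dropped


# Example queue: task B cannot meet its deadline once A is ahead of it.
queue = [
    ("A", {2: 0.5, 4: 0.5}, 5),
    ("B", {3: 1.0}, 4),
    ("C", {1: 1.0}, 6),
]
kept, dropped = filter_queue(queue)
print(kept, dropped)  # ['A', 'C'] ['B']
```

In this example, B has no chance of finishing by its deadline, and keeping it would reduce C's on-time probability from 1.0 to 0.5. Dropping B therefore raises the expected number of tasks completed on time from 1.5 to 2.0. The sketch ignores effects a full model would need to handle, such as tasks arriving over time and multiple heterogeneous machines.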
