On Delay-Optimal Scheduling in Queueing Systems with Replications
Yin Sun, C. Emre Koksal, and Ness B. Shroff

TL;DR
This paper studies delay-optimal scheduling of task replications in multi-server queueing systems, providing theoretical guarantees and low-complexity policies for both centralized and distributed settings.
Contribution
It introduces delay-optimal and near-optimal scheduling policies for replications, with comprehensive theoretical analysis and novel sample-path tools for general system settings.
Findings
Delay-optimal policies are identified for both centralized and distributed multi-server systems.
Proposed policies are proven to be delay-optimal or near-optimal.
Results apply to systems with arbitrary arrivals, job sizes, and server heterogeneity.
Abstract
In modern computer systems, jobs are divided into short tasks and executed in parallel. Empirical observations in practical systems suggest that the task service times are highly random and the job service time is bottlenecked by the slowest straggling task. One common solution for straggler mitigation is to replicate a task on multiple servers and wait for one replica of the task to finish early. The delay performance of replications depends heavily on the scheduling decisions of when to replicate, which servers to replicate on, and which job to serve first. So far, little is understood about how to optimize these scheduling decisions to minimize the delay to complete the jobs. In this paper, we present a comprehensive study on delay-optimal scheduling of replications in both centralized and distributed multi-server systems. Low-complexity scheduling policies are designed and are…
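The straggler effect and the benefit of replication described in the abstract can be illustrated with a small sketch. This is not the paper's scheduling policy: it ignores queueing, server contention, and the cost of occupying extra servers, and it assumes i.i.d. exponential service times purely for illustration. The function names and parameters are hypothetical.

```python
import random

def job_time_no_replication(task_times):
    # A job finishes only when its slowest (straggling) task finishes.
    return max(task_times)

def job_time_with_replication(replica_times):
    # replica_times[i] holds the service times of all replicas of task i.
    # Each task finishes as soon as its fastest replica finishes;
    # the job still waits for its slowest task.
    return max(min(replicas) for replicas in replica_times)

def simulate(num_tasks=10, num_replicas=2, trials=10000, seed=0):
    # Assumption: i.i.d. exponential service times with rate 1.
    rng = random.Random(seed)
    total_plain = 0.0
    total_repl = 0.0
    for _ in range(trials):
        times = [[rng.expovariate(1.0) for _ in range(num_replicas)]
                 for _ in range(num_tasks)]
        # Without replication, only the first copy of each task runs.
        total_plain += job_time_no_replication([r[0] for r in times])
        total_repl += job_time_with_replication(times)
    return total_plain / trials, total_repl / trials

if __name__ == "__main__":
    plain, repl = simulate()
    print(f"mean job time without replication: {plain:.3f}")
    print(f"mean job time with replication:    {repl:.3f}")
```

In this idealized setting replication can only reduce the job time, because servers are assumed to be free. In a queueing system each replica occupies a server that another job could use, which is why the scheduling decisions listed in the abstract matter.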
Taxonomy
Topics: Advanced Queuing Theory Analysis · Age of Information Optimization · Distributed systems and fault tolerance
