SPLIT: SymPathy for Large jobs Improves Tail latency
Zhouzi Li, Mor Harchol-Balter, Alan Scheller-Wolf

TL;DR
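This paper introduces SPLIT, a scheduling policy that improves tail latency in multi-server (M/G/n) queues with heavy-tailed job sizes by giving large jobs a small amount of priority ("sympathy"). This contrasts with tail-optimal single-server policies such as SRPT, which strictly prioritize short jobs.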
Contribution
It presents the first strongly tail-optimal scheduling policies for multi-server queues with heavy-tailed workloads, highlighting the importance of sympathetic treatment of large jobs.
Findings
SPLIT achieves strong (or arbitrarily close to strong) tail optimality across the full stability region.
Sympathy for large jobs is crucial for tail optimality in multi-server systems.
The policy works with or without knowledge of job sizes.
Abstract
We study the asymptotic response time tail in the M/G/n multi-server queue with heavy-tailed (regularly varying) job sizes, a setting representative of modern computing workloads. For single-server systems, tail optimization is well understood: under heavy-tailed job sizes, policies such as SRPT that strictly prioritize short jobs are strongly tail optimal, and giving any priority to large jobs is harmful. For multi-server systems, the question has been almost entirely open. This paper gives the first strongly tail-optimal scheduling policies for the M/G/n queue with heavy-tailed job sizes. Our central finding is that the multi-server case is intrinsically different from the single-server case: giving a small amount of "sympathy" to large jobs is essential for strong tail optimality. We establish strong (or arbitrarily close to strong) tail optimality across the full stability…
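To make the setting in the abstract concrete, here is a minimal Python sketch of an M/G/n queue with Pareto job sizes (one example of a regularly varying distribution), together with an empirical estimate of the response time tail P(T > t). It uses plain first-come-first-served as a baseline and does not implement SPLIT, whose rules are not described on this page. The function names and all parameter values are assumptions chosen for illustration.

```python
import heapq
import random


def simulate_mgn_fcfs(n_servers, arrival_rate, alpha, x_min, n_jobs, seed=0):
    """Simulate an M/G/n queue under FCFS (baseline only, NOT SPLIT).

    Arrivals are Poisson with rate `arrival_rate`; job sizes are Pareto with
    tail index `alpha` and minimum size `x_min`. Returns one response time
    (completion time minus arrival time) per job.
    """
    rng = random.Random(seed)
    # Min-heap of the times at which each server next becomes free.
    free_at = [0.0] * n_servers
    now = 0.0
    responses = []
    for _ in range(n_jobs):
        now += rng.expovariate(arrival_rate)      # next Poisson arrival
        size = x_min * rng.paretovariate(alpha)   # heavy-tailed job size
        earliest = heapq.heappop(free_at)         # first server to free up
        start = max(now, earliest)                # wait if all servers are busy
        finish = start + size
        heapq.heappush(free_at, finish)
        responses.append(finish - now)
    return responses


def empirical_tail(responses, t):
    """Fraction of jobs whose response time exceeds t."""
    return sum(1 for r in responses if r > t) / len(responses)


if __name__ == "__main__":
    # Illustrative values: alpha = 1.5 gives mean size 3, so load = 0.8 * 3 / 4 = 0.6.
    resp = simulate_mgn_fcfs(n_servers=4, arrival_rate=0.8, alpha=1.5,
                             x_min=1.0, n_jobs=100_000)
    for t in (10, 100, 1000):
        print(t, empirical_tail(resp, t))
```

Comparing such tail estimates across scheduling policies is one informal way to see the quantity the paper optimizes, although a finite simulation can only hint at asymptotic tail behavior.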
