Reducing Tail Latency via Safe and Simple Duplication
Hafiz Mohsin Bashir, Abdullah Bin Faisal, Muhammad Asim Jamshed, Peter Vondras, Ali Musa Iftikhar, Ihsan Ayyub Qazi, Fahad R. Dogar

TL;DR
This paper introduces duplicate-aware scheduling (DAS), which makes duplication a safe way to reduce tail latency in cloud services by relying on prioritization and purging. A supporting abstraction, D-Stage, lets DAS be applied across diverse system layers.
Contribution
It proposes DAS and the D-Stage abstraction to enable safe, simple duplication across cloud system layers, improving tail latency without overloading the system.
Findings
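DAS reduces tail latency for two data-parallel applications (HDFS and an in-memory workload generator) and a network function (a Snort-based IDS cluster).
Duplication under DAS is safe to use: prioritization and purging address the risk of overloading the system.
Experiments on the public cloud and Emulab show that the improvement holds across a wide range of workloads.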
Abstract
Duplication can be a powerful strategy for overcoming stragglers in cloud services, but it is often used conservatively because of the risk of overloading the system. We present duplicate-aware scheduling, or DAS, which makes duplication safe and easy to use by leveraging the two well-known primitives of prioritization and purging. To support DAS across diverse layers of a cloud system (e.g., network and storage), we propose the D-Stage abstraction, which decouples the duplication policy from the mechanism and facilitates working with legacy layers of a system. Using this abstraction, we evaluate the benefits of DAS for two data-parallel applications (HDFS and an in-memory workload generator) and a network function (a Snort-based IDS cluster). Our experiments on the public cloud and Emulab show that DAS is safe to use and that the tail latency improvement holds across a wide range of workloads.
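The two primitives named in the abstract can be illustrated with a minimal sketch. It assumes a per-server queue with two strict priority levels, where duplicates always wait behind primary requests and are dropped once another copy of the same job has finished. The class and method names (`DuplicateAwareQueue`, `enqueue`, `purge`, `dequeue`) are hypothetical and are not taken from the paper or from the D-Stage interface.

```python
import heapq
import itertools

# Assumption: two strict priority levels; a lower value is served first.
PRIMARY, DUPLICATE = 0, 1


class DuplicateAwareQueue:
    """Illustrative per-server queue combining prioritization and purging."""

    def __init__(self):
        self._heap = []                 # entries: (priority, arrival order, job_id)
        self._seq = itertools.count()   # tie-breaker that keeps FIFO order per level
        self._purged = set()            # job ids whose copies should be skipped

    def enqueue(self, job_id, is_duplicate=False):
        # Prioritization: duplicates are queued at the lower priority level.
        priority = DUPLICATE if is_duplicate else PRIMARY
        heapq.heappush(self._heap, (priority, next(self._seq), job_id))

    def purge(self, job_id):
        # Purging: mark the job so any queued copy is discarded lazily.
        self._purged.add(job_id)

    def dequeue(self):
        # Return (job_id, is_duplicate) for the next live request, or None.
        while self._heap:
            priority, _, job_id = heapq.heappop(self._heap)
            if job_id in self._purged:
                continue  # another copy already finished elsewhere
            return job_id, priority == DUPLICATE
        return None


# Usage: server 2 holds a duplicate of job A and the primary copy of job B.
server1, server2 = DuplicateAwareQueue(), DuplicateAwareQueue()
server1.enqueue("A")
server2.enqueue("A", is_duplicate=True)
server2.enqueue("B")

first_on_server2 = server2.dequeue()   # B is served before the earlier duplicate
server1.dequeue()                      # server 1 finishes the primary copy of A
server2.purge("A")                     # so the duplicate of A is no longer needed
remaining_on_server2 = server2.dequeue()
```

In this sketch the duplicate only consumes capacity when the server has no primary work queued, which is the intuition behind using duplication without overloading the system. Purging here is lazy (entries are skipped at dequeue time); a real system would also need to cancel copies that are already in service.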
Taxonomy
Topics: Cloud Computing and Resource Management · IoT and Edge/Fog Computing · Distributed and Parallel Computing Systems
