Mitigating Interference of Microservices with a Scoring Mechanism in Large-scale Clusters
Dingyu Yang, Kangpeng Zheng, Shiyou Qian, Jian Cao, Guangtao Xue

TL;DR
This paper introduces PISM, a proactive framework that models and scores interference from best-effort jobs to latency-critical services, optimizing scheduling to significantly reduce interference and improve performance in large-scale clusters.
Contribution
PISM is a novel data-driven framework that characterizes BEJs, models their impact on LCS response times, and uses interference scoring for optimized scheduling.
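To make the idea of interference scoring concrete, here is a minimal sketch. It is not PISM's actual model. It assumes a hypothetical additive score in which each BEJ type carries a fixed weight, and a greedy placement rule that picks the node with the lowest score after placement. The type labels, weight values, and the names WEIGHTS, Node, interference_score, and place are all invented for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical per-type interference weights (illustrative values, not from the paper).
WEIGHTS = {"cpu_bound": 1.0, "memory_bound": 2.5, "io_bound": 1.5}


@dataclass
class Node:
    name: str
    bejs: list = field(default_factory=list)  # BEJ type labels placed on this node


def interference_score(bej_types):
    """Sum of assumed per-type weights; a stand-in for a learned interference model."""
    return sum(WEIGHTS[t] for t in bej_types)


def place(bej_type, nodes):
    """Greedily place one BEJ on the node whose score after placement is lowest."""
    best = min(nodes, key=lambda n: interference_score(n.bejs + [bej_type]))
    best.bejs.append(bej_type)
    return best


# Two nodes that each host one BEJ, but with different compositions.
nodes = [Node("node-a", ["memory_bound"]), Node("node-b", ["cpu_bound"])]
chosen = place("io_bound", nodes)
print(chosen.name, interference_score(chosen.bejs))
```

In this toy setup two nodes hosting the same number of BEJs can still receive different scores, which mirrors the observation that BEJ composition, not just resource consumption, affects interference. A real system would replace the fixed weights with a model fitted to trace data.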
Findings
PISM reduces cluster interference by up to 41.5%.
PISM improves long-tail LCS throughput by 76.4%.
Effective BEJ scheduling mitigates interference in large-scale clusters.
Abstract
Co-locating latency-critical services (LCSs) and best-effort jobs (BEJs) constitutes the principal approach for enhancing resource utilization in production. Nevertheless, the co-location practice hurts the performance of LCSs due to resource competition, even when employing isolation technology. Through an extensive analysis of voluminous real trace data derived from two production clusters, we observe that BEJs typically exhibit periodic execution patterns and serve as the primary sources of interference to LCSs. Furthermore, despite occupying the same level of resource consumption, the diverse compositions of BEJs can result in varying degrees of interference on LCSs. Subsequently, we propose PISM, a proactive Performance Interference Scoring and Mitigating framework for LCSs through the optimization of BEJ scheduling. Firstly, PISM adopts a data-driven approach to establish a…
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · Peer-to-Peer Network Technologies
