PANDA: Noise-Resilient Antagonist Identification in Production Datacenters
Sixiang Zhou, Nan Deng, Krzysiek Rzadca, Xiaojun Lin, Y. Charlie Hu

TL;DR
PANDA is a noise-resilient framework for identifying performance-interfering jobs (antagonists) in large production datacenters. It combines global historical data from all machines with a machine-level CPI metric to improve accuracy and scalability over existing methods.
Contribution
PANDA introduces a novel, noise-resilient approach using global historical knowledge and a machine-level CPI metric for better antagonist detection in production datacenters.
Findings
Significantly improves antagonist ranking accuracy from 50-55% to 82.6%.
Performs reliably under multi-victim scenarios.
Maintains negligible runtime overhead.
Abstract
Modern warehouse-scale datacenters commonly collocate multiple jobs on shared machines to improve resource utilization. However, such collocation often leads to performance interference caused by antagonistic jobs that overconsume shared resources. Existing antagonist-detection approaches either rely on offline profiling, which is costly and unscalable, or use a sample-from-production approach, which suffers from noisy measurements and fails under multi-victim scenarios. We present PANDA, a noise-resilient antagonist identification framework for production-scale datacenters. Like prior correlation-based methods, PANDA uses cycles per instruction (CPI) as its performance metric, but it differs by (i) leveraging global historical knowledge across all machines to suppress sampling noise and (ii) introducing a machine-level CPI metric that captures shared-resource contention among multiple…
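As a rough illustration of the correlation-based baseline the abstract refers to, the sketch below ranks co-located jobs by how strongly their CPU usage tracks a victim's CPI over the same sampling windows. It is not PANDA's algorithm: the function names, the use of Pearson correlation, and the sample data are all assumptions made for this example, and neither the historical-knowledge step nor the machine-level CPI metric is modeled here.

```python
from math import sqrt
from statistics import mean


def cpi(cycles, instructions):
    """Cycles per instruction for one sampling window."""
    return cycles / instructions


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_suspects(victim_cpi, suspect_usage):
    """Sort suspects from most to least correlated with the victim's CPI.

    victim_cpi: list of CPI samples for the victim job (hypothetical input).
    suspect_usage: dict mapping job name -> CPU-usage samples over the
    same windows (hypothetical input).
    """
    scores = {job: pearson(victim_cpi, usage)
              for job, usage in suspect_usage.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up samples: job "a" rises with the victim's CPI, "b" falls, "c" is flat.
victim = [cpi(c, 100) for c in (100, 200, 300, 400)]
ranking = rank_suspects(victim, {
    "a": [1, 2, 3, 4],
    "b": [4, 3, 2, 1],
    "c": [1, 1, 1, 1],
})
print(ranking)
```

With only a handful of noisy samples per machine, a ranking like this can be unstable, which is the weakness of sample-from-production approaches that the abstract points to.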
Taxonomy
Topics: Cloud Computing and Resource Management · Software System Performance and Reliability · Software-Defined Networks and 5G
