Arcus: SLO Management for Accelerators in the Cloud with Traffic Shaping
Jiechen Zhao, Ran Shu, Katie Lim, Zewen Fan, Thomas Anderson, Mingyu Gao, Natalie Enright Jerger

TL;DR
Arcus introduces a traffic shaping approach to manage accelerator resources in cloud systems, effectively reducing latency and ensuring SLO compliance amidst diverse and unpredictable communication patterns.
Contribution
It presents a novel SLO management framework that proactively shapes traffic for accelerators, addressing communication contention overlooked by prior solutions.
Findings
Up to 45% reduction in tail latency.
Less than 1% throughput variance.
Effective SLO guarantees across diverse workloads.
Abstract
Cloud servers use accelerators for common tasks (e.g., encryption, compression, hashing) to improve CPU/GPU efficiency and overall performance. However, users' Service-level Objectives (SLOs) can be violated due to accelerator-related contention. The root cause is that existing solutions for accelerators only focus on isolation or fair allocation of compute and memory resources; they overlook the contention for communication-related resources. Specifically, three communication-induced challenges drive us to re-think the problem: (1) accelerator traffic patterns are diverse, hard to predict, and mixed across users, (2) communication-related components lack effective, configurable low-level isolation mechanisms, and (3) the computational heterogeneity of accelerators leads to unique relationships between the traffic mixture and the corresponding accelerator performance. The focus of this work is…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management
