A4: Microarchitecture-Aware LLC Management for Datacenter Servers with Emerging I/O Devices
Haneul Park, Jiaqi Lou, Sangjin Lee, Yifan Yuan, Kyoung Soo Park, Yongseok Son, Ipoom Jeong, Nam Sung Kim

TL;DR
This paper uncovers hidden LLC contention issues caused by high-bandwidth I/O devices in modern CPUs and proposes a runtime management framework to mitigate these contentions, significantly improving latency-sensitive workload performance.
Contribution
It reveals two previously unrecognized LLC contention mechanisms triggered by emerging I/O devices and introduces A4, a hardware-aware runtime framework to alleviate these issues.
Findings
Improves latency-sensitive workload performance by 51%.
Effectively mitigates LLC contention caused by high-bandwidth I/O devices.
Enhances overall datacenter server efficiency with minimal impact on low-priority workloads.
Abstract
In modern server CPUs, the Last-Level Cache (LLC) serves not only as a victim cache for higher-level private caches but also as a buffer for low-latency DMA transfers between CPU cores and I/O devices through Direct Cache Access (DCA). However, prior work has shown that high-bandwidth network-I/O devices can rapidly flood the LLC with packets, often causing significant contention with co-running workloads. One step further, this work explores hidden microarchitectural properties of the Intel Xeon CPUs, uncovering two previously unrecognized LLC contentions triggered by emerging high-bandwidth I/O devices. Specifically, (C1) DMA-written cache lines in LLC ways designated for DCA (referred to as DCA ways) are migrated to certain LLC ways (denoted as inclusive ways) when accessed by CPU cores, unexpectedly contending with non-I/O cache lines within the inclusive ways. In addition, (C2)…
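The migration described in (C1) can be illustrated with a toy model of a single LLC set. This is only a sketch under stated assumptions, not the paper's method or Intel's actual replacement logic: the way count, the positions of the DCA and inclusive ways, and the "evict the lowest-numbered allowed way" policy are all hypothetical choices made to keep the example small.

```python
# Toy model of contention (C1) in one LLC set.
# All constants below are hypothetical; the abstract does not give them.
NUM_WAYS = 8
DCA_WAYS = {0, 1}          # assumed: ways that DMA writes may fill
INCLUSIVE_WAYS = {6, 7}    # assumed: ways that receive migrated I/O lines


def fill(ways, allowed, line):
    """Place `line` in a free way from `allowed`.

    If none is free, evict the lowest-numbered allowed way (a simplistic
    stand-in for the real replacement policy). Returns the evicted line
    or None.
    """
    for w in sorted(allowed):
        if ways[w] is None:
            ways[w] = line
            return None
    victim_way = min(allowed)
    victim = ways[victim_way]
    ways[victim_way] = line
    return victim


def dma_write(ways, tag):
    """An I/O device writes a line into the DCA ways."""
    return fill(ways, DCA_WAYS, ("io", tag))


def core_read(ways, tag):
    """A core touches a DMA-written line: it leaves its DCA way and is
    re-inserted into the inclusive ways, possibly evicting a non-I/O line."""
    for w in sorted(DCA_WAYS):
        if ways[w] == ("io", tag):
            ways[w] = None
            return fill(ways, INCLUSIVE_WAYS, ("io", tag))
    return None


# Demo: two application lines occupy the inclusive ways, then a packet
# arrives by DMA and is consumed by a core.
ways = [None] * NUM_WAYS
fill(ways, INCLUSIVE_WAYS, ("app", "a"))
fill(ways, INCLUSIVE_WAYS, ("app", "b"))
dma_write(ways, "pkt0")
evicted = core_read(ways, "pkt0")
print("evicted:", evicted)
print("ways:", ways)
```

In this model the application line is displaced even though the workload never shared the DCA ways with the device, which is the kind of unexpected interference the abstract attributes to (C1).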
Taxonomy
Topics: Cloud Computing and Resource Management · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
