Harvesting L2 Caches in Server Processors
Majid Jalili, Mattan Erez

TL;DR
This paper proposes a cache management scheme that uses idle L2 caches in modern server processors to raise performance and cut latency, reporting up to 2X speedup for single-application runs and up to 32% lower P99 latency for user-facing tasks.
Contribution
It introduces a logical path for LLC evictions to unused private caches, enhancing resource utilization and system performance in multi-core server processors.
Findings
Up to 2X system performance improvement for single-application runs.
Up to 32% P99 latency reduction for user-facing tasks.
Up to 50% IPC improvement for background jobs.
Abstract
We make three observations in modern processors: (1) LLC capacity is getting larger (up to 1GB); (2) core counts are increasing (up to 128 cores), accumulating a more significant amount of private L2 cache capacity on the chip; and (3) overall processor utilization in the cloud remains very low despite many efforts, leaving many large private caches unused. To enable better use of these beefy processors, we propose to open up a logical path for LLC evictions to unused private caches. In other words, instead of writing LLC evictions back to slow and busy main memory, we send some of them that are still alive up to idle L2 caches to avoid unnecessarily long and costly main memory accesses. Our scheme takes the importance of applications (user-facing vs. background) and system load into account to provide each application with a fair share of idle resources. Our results show that we can improve…
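The eviction path described in the abstract can be sketched as a toy model. This is only an illustration under stated assumptions, not the authors' design: the names `L2Cache` and `route_llc_eviction`, the `is_live` flag (standing in for whatever liveness prediction the paper uses), and the "least-filled idle L2" target choice are all hypothetical. Application importance and fair sharing are not modeled here.

```python
from collections import OrderedDict
from dataclasses import dataclass, field


@dataclass
class L2Cache:
    """Toy private L2 cache (hypothetical model, not the paper's hardware)."""
    core_id: int
    capacity: int              # number of lines this toy cache can hold
    idle: bool = True          # True when the owning core runs nothing
    lines: OrderedDict = field(default_factory=OrderedDict)

    def insert(self, addr: int) -> None:
        """Store a harvested line, dropping the oldest line if the cache is full."""
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)


def route_llc_eviction(addr: int, is_live: bool, l2_caches: list) -> tuple:
    """Decide where a line evicted from the LLC goes.

    Returns ("l2", core_id) if the line is kept on chip in an idle L2,
    or ("memory", None) if it leaves the chip as in a conventional design.
    """
    if not is_live:
        # Assumption: lines predicted dead are not worth harvesting.
        return ("memory", None)
    idle = [c for c in l2_caches if c.idle]
    if not idle:
        # No unused private cache is available.
        return ("memory", None)
    # Assumption: pick the least-filled idle L2 to spread harvested lines.
    target = min(idle, key=lambda c: len(c.lines))
    target.insert(addr)
    return ("l2", target.core_id)


if __name__ == "__main__":
    caches = [L2Cache(0, 2, idle=False), L2Cache(1, 2), L2Cache(2, 2)]
    print(route_llc_eviction(0x100, True, caches))
    print(route_llc_eviction(0x180, False, caches))
```

In this sketch a live line is kept on chip only when some core is idle; otherwise the eviction behaves like a conventional one, which matches the abstract's framing that only some evictions are redirected.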
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Stochastic Gradient Optimization Techniques · Cloud Computing and Resource Management
