CachePerf: A Unified Cache Miss Classifier via Hybrid Hardware Sampling
Jin Zhou, Steven (Jiaxun) Tang, Hanmei Yang, Tongping Liu

TL;DR
CachePerf is a profiling tool built on hybrid hardware sampling. It classifies cache misses by type, separates allocator-induced misses from application-induced ones, and pinpoints the underlying bugs at low performance overhead.
Contribution
It introduces a hybrid sampling scheme that combines PMU-based coarse-grained sampling with breakpoint-based fine-grained sampling for unified cache miss classification and bug detection.
Findings
Imposes only 14% performance overhead
Identifies 9 previously unknown bugs
Achieves up to 3788% performance speedup after bug fixes
Abstract
The cache plays a key role in determining the performance of applications, whether sequential or concurrent, on both homogeneous and heterogeneous architectures. Fixing cache misses requires understanding their origin and type. However, this remains an unresolved issue even after decades of research. This paper proposes a unified profiling tool, CachePerf, that can correctly identify different types of cache misses, differentiate allocator-induced issues from application-induced ones, and filter out minor issues that have little impact on performance. The core idea behind CachePerf is a hybrid sampling scheme: it employs PMU-based coarse-grained sampling to select a small number of susceptible instructions (those with frequent cache misses) and then employs breakpoint-based fine-grained sampling to collect the memory access patterns of these instructions. Based on our…
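The two-stage idea in the abstract can be illustrated with a small simulation. The sketch below is not CachePerf's implementation: the real tool relies on hardware PMU samples and hardware breakpoints, whereas this code replays a synthetic list of (instruction address, data address) miss events. All function names, the sampling period, the threshold, and the record limit are hypothetical values chosen for illustration.

```python
from collections import Counter, defaultdict

def make_trace():
    """Synthetic miss events: one hot instruction striding by 64 bytes,
    plus a few cold instructions that each miss only once."""
    trace = []
    for i in range(1000):
        trace.append((0x400100, 0x10000 + 64 * i))        # hot instruction
        if i % 100 == 0:
            trace.append((0x400200 + i, 0x90000 + 8 * i))  # cold instruction
    return trace

def coarse_sample(trace, period, threshold):
    """Stage 1 (stand-in for PMU sampling): look at every `period`-th miss
    and keep instructions sampled at least `threshold` times."""
    counts = Counter(pc for i, (pc, _) in enumerate(trace) if i % period == 0)
    return {pc for pc, n in counts.items() if n >= threshold}

def fine_sample(trace, hot_pcs, max_records):
    """Stage 2 (stand-in for breakpoint sampling): record the first
    `max_records` data addresses touched by each selected instruction."""
    patterns = defaultdict(list)
    for pc, addr in trace:
        if pc in hot_pcs and len(patterns[pc]) < max_records:
            patterns[pc].append(addr)
    return dict(patterns)

def strides(addrs):
    """Differences between consecutive addresses: a simple access-pattern summary."""
    return [b - a for a, b in zip(addrs, addrs[1:])]

if __name__ == "__main__":
    trace = make_trace()
    hot = coarse_sample(trace, period=10, threshold=5)
    for pc, addrs in fine_sample(trace, hot, max_records=8).items():
        print(hex(pc), strides(addrs))
```

The point of the split is cost: the cheap first stage looks at only a fraction of events, and the detailed second stage runs only for the few instructions that stage selects. How the collected patterns map to specific miss types is not covered by this sketch.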
