Enhancing Instruction Prefetching via Cache and TLB Management
Alexandre Valentin Jamet, Georgios Vavouliotis, Marti Torrents, Dimitrios Chasapis, and Marc Casas

TL;DR
This paper introduces IP-CaT, a microarchitectural framework that jointly optimizes TLB and cache management to significantly improve instruction prefetching performance in modern server workloads.
Contribution
IP-CaT is the first framework to jointly optimize TLB and cache management specifically for L1 instruction prefetching, addressing translation latency and reuse heterogeneity.
Findings
IP-CaT+EPI achieves an 8.7% speedup over EPI alone.
IP-CaT outperforms state-of-the-art instruction TLB prefetchers and cache replacement policies.
IP-CaT delivers consistent performance improvements across 105 server workloads.
Abstract
Modern server workloads exhibit massive instruction footprints that heavily pressure the processor front-end, making L1 instruction (L1I) prefetching critical for sustaining performance. However, this paper shows that current L1I prefetchers fail to reach their full potential due to two key limitations. First, L1I prefetches crossing page boundaries require address translation before issuance, and translation latency reduces prefetch timeliness. Second, the reuse behavior of code lines fetched by L1I prefetches is highly heterogeneous: while some lines are reused many times, others are dead-on-arrival. This paper introduces Instruction Prefetch-Centric Cache and TLB Management (IP-CaT), the first microarchitectural framework jointly optimizing TLB and cache management for L1I prefetching. IP-CaT consists of two components: (i) the translation Prefetch Buffer (tPB), a small structure…
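To illustrate the first limitation, here is a minimal sketch of how a prefetch candidate can be classified as crossing a page boundary, and therefore needing an address translation before it can be issued. This is not the paper's implementation. The 4 KiB page size and the names `page_number`, `crosses_page`, and `split_prefetches` are assumptions made for illustration only.

```python
# Hypothetical parameter, not taken from the paper.
PAGE_SIZE = 4096  # assumed 4 KiB virtual pages


def page_number(addr: int) -> int:
    """Return the virtual page number that contains addr."""
    return addr // PAGE_SIZE


def crosses_page(trigger_addr: int, prefetch_addr: int) -> bool:
    """True if the prefetch target lies on a different page than the
    address that triggered it, so its translation is not implied by
    the trigger's own translation."""
    return page_number(trigger_addr) != page_number(prefetch_addr)


def split_prefetches(trigger_addr: int, candidates: list[int]):
    """Partition candidates into same-page prefetches and cross-page
    prefetches (the latter must wait for an address translation)."""
    same_page = [a for a in candidates if not crosses_page(trigger_addr, a)]
    cross_page = [a for a in candidates if crosses_page(trigger_addr, a)]
    return same_page, cross_page
```

For example, with a trigger at 0x1FC0 (the last 64-byte line of its page), candidates 0x2000 and 0x2040 fall on the next page and land in the cross-page list, while 0x1F80 stays in the same-page list.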
