Exposing Shadow Branches
Chrysanthos Pepi, Bhargav Reddy Godala, Krishnam Tibrewala, Gino Chacon, Paul V. Gratz, Daniel A. Jiménez, Gilles A. Pokam, David I. August

TL;DR
Skeia is a novel technique that detects and decodes shadow branches within instruction cache lines to improve branch prediction accuracy and processor performance, especially when BTB misses occur.
Contribution
Skeia introduces a shadow branch decoding method and buffer that enhance branch prediction during BTB misses with minimal storage overhead.
Findings
Achieves ~5.7% speedup over traditional BTB
Provides ~2% speedup compared to larger BTB implementations
Consistently improves performance across multiple applications
Abstract
Modern processors implement a decoupled front-end in the form of Fetch Directed Instruction Prefetching (FDIP) to avoid front-end stalls. FDIP is driven by the Branch Prediction Unit (BPU), relying on the BPU's accuracy and branch target tracking structures to speculatively fetch instructions into the Instruction Cache (L1I). As data center applications become more complex, their code footprints also grow, resulting in an increase in Branch Target Buffer (BTB) misses. FDIP can alleviate L1I cache misses, but when it encounters a BTB miss, the BPU may not identify the current instruction as a branch to FDIP. This can prevent FDIP from prefetching or cause it to speculate down the wrong path, further polluting the L1I cache. We observe that the vast majority, 75%, of BTB-missing, unidentified branches are actually present in instruction cache lines that FDIP has previously fetched but,…
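To make the idea concrete, here is a minimal Python sketch of how branches sitting in an already-fetched cache line might be decoded into a small side buffer and consulted on a BTB miss. It is a toy model under stated assumptions, not the authors' design. The fixed 4-byte instruction size, the 64-byte line, the rule that a "shadow" branch is one lying outside the bytes the front-end actually consumed, the FIFO-style replacement, and every name (`Instr`, `ShadowBranchBuffer`, `find_shadow_branches`, `predict_target`) are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Dict, List, Optional

LINE_BYTES = 64   # assumption: 64-byte instruction cache line
INSTR_BYTES = 4   # assumption: fixed-length instructions


@dataclass(frozen=True)
class Instr:
    pc: int
    is_branch: bool = False
    target: Optional[int] = None


class ShadowBranchBuffer:
    """Hypothetical small buffer of decoded shadow branches (oldest entry evicted first)."""

    def __init__(self, capacity: int = 4) -> None:
        self.capacity = capacity
        self.entries: "OrderedDict[int, int]" = OrderedDict()

    def insert(self, pc: int, target: int) -> None:
        if pc in self.entries:
            self.entries.move_to_end(pc)
        self.entries[pc] = target
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry

    def lookup(self, pc: int) -> Optional[int]:
        return self.entries.get(pc)


def find_shadow_branches(line: List[Instr], entry_pc: int, exit_pc: int,
                         btb: Dict[int, int]) -> List[Instr]:
    """Branches in a fetched line that lie outside [entry_pc, exit_pc] and miss in the BTB."""
    return [
        ins for ins in line
        if ins.is_branch
        and not (entry_pc <= ins.pc <= exit_pc)
        and ins.pc not in btb
    ]


def predict_target(pc: int, btb: Dict[int, int],
                   sbb: ShadowBranchBuffer) -> Optional[int]:
    """Use the BTB first; fall back to the shadow branch buffer on a BTB miss."""
    if pc in btb:
        return btb[pc]
    return sbb.lookup(pc)


# Build one toy cache line starting at 0x1000 with three branches in it.
base = 0x1000
branch_targets = {0x1004: 0x2000, 0x1020: 0x3000, 0x1038: 0x4000}
line = [
    Instr(pc, pc in branch_targets, branch_targets.get(pc))
    for pc in range(base, base + LINE_BYTES, INSTR_BYTES)
]

# The front-end entered the line at 0x1010 and left via the taken branch at 0x1020,
# which is the only branch the BTB already knows about.
btb = {0x1020: 0x3000}
sbb = ShadowBranchBuffer(capacity=4)

shadow = find_shadow_branches(line, entry_pc=0x1010, exit_pc=0x1020, btb=btb)
for ins in shadow:
    sbb.insert(ins.pc, ins.target)

shadow_pcs = [ins.pc for ins in shadow]
```

In this toy run the branches at 0x1004 and 0x1038 were never executed on the fetched path, yet their bytes were already in the line, so a later lookup such as `predict_target(0x1038, btb, sbb)` can return a target even though the BTB misses. A real front-end would also have to handle variable-length instruction encodings and indirect targets, which this sketch ignores.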
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Low-power high-performance VLSI design
