TL;DR
Triangel is an on-chip temporal prefetcher that extends Triage with sampling-based techniques to improve accuracy and performance, achieving a 26.4% speedup over the baseline with only a modest increase in memory traffic.
Contribution
The paper introduces Triangel, a temporal prefetcher that addresses Triage's limitations by using sampling-based methods to decide when to prefetch aggressively and when to hold back, improving both accuracy and performance.
Findings
26.4% system speedup over baseline
Limits the increase in memory traffic to 10%
Outperforms Triage in speed and efficiency
Abstract
Temporal prefetching, where correlated pairs of addresses are logged and replayed on repeat accesses, has recently become viable in commercial designs. Arm's latest processors include Correlating Miss Chaining prefetchers, which store such patterns in a partition of the on-chip cache. However, the state-of-the-art on-chip temporal prefetcher in the literature, Triage, features some design inconsistencies and inaccuracies that pose challenges for practical implementation. We first examine and design fixes for these inconsistencies to produce an implementable baseline. We then introduce Triangel, a prefetcher that extends Triage with novel sampling-based methodologies to allow it to be aggressive and timely when the prefetcher is able to handle observed long-term patterns, and to avoid inaccurate prefetches when less able to do so. Triangel gives a 26.4% speedup compared to a baseline…
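The log-and-replay idea that the abstract describes can be illustrated with a minimal sketch. This is a generic toy model of pairwise temporal prefetching, not the design of Triage or Triangel: the class name, the `capacity` and `degree` parameters, and the eviction policy are all assumptions made for illustration, and it omits the sampling-based filtering that the paper contributes.

```python
from collections import OrderedDict


class PairwiseTemporalPrefetcher:
    """Toy model: remembers which address followed each address last time."""

    def __init__(self, capacity=1024, degree=1):
        self.table = OrderedDict()  # address -> address that followed it
        self.capacity = capacity    # hypothetical bound on stored pairs
        self.degree = degree        # how many chained successors to replay
        self.last = None            # previous address in the observed stream

    def access(self, addr):
        """Log the (previous, current) pair and return addresses to prefetch."""
        # Replay: follow the chain of recorded successors starting at addr.
        prefetches = []
        cursor = addr
        for _ in range(self.degree):
            nxt = self.table.get(cursor)
            if nxt is None or nxt == addr or nxt in prefetches:
                break
            prefetches.append(nxt)
            cursor = nxt
        # Log: record that addr followed the previous address.
        if self.last is not None:
            self.table[self.last] = addr
            self.table.move_to_end(self.last)
            if len(self.table) > self.capacity:
                self.table.popitem(last=False)  # evict least recently updated pair
        self.last = addr
        return prefetches


if __name__ == "__main__":
    pf = PairwiseTemporalPrefetcher(degree=2)
    for a in (0xA0, 0xB0, 0xC0, 0xD0):  # first pass: nothing to replay yet
        pf.access(a)
    print([hex(a) for a in pf.access(0xA0)])  # repeat access replays the chain
```

On the first pass over a sequence nothing is known, so no prefetches are issued; on a repeat access to the first address, the recorded successors are replayed up to `degree` steps ahead. A higher degree makes the sketch more aggressive, which helps only when the recorded pattern actually repeats.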
