Accelerating Graph Analytics on a Reconfigurable Architecture with a Data-Indirect Prefetcher
Yichen Yang, Jingtao Li, Nishil Talati, Subhankar Pal, Siying Feng, Chaitali Chakrabarti, Trevor Mudge, Ronald Dreslinski

TL;DR
This paper introduces a data prefetcher for Transmuter, a manycore reconfigurable architecture (MRA), improving graph workload performance by adapting CPU prefetching techniques to the architecture's constraints.
Contribution
The paper designs and evaluates a new data prefetcher tailored for MRAs, incorporating runtime reconfiguration and cache redesign to enhance graph analytics performance.
Findings
Achieves up to 2.72x speedup on graph workloads
Outperforms baseline without prefetcher by 1.27x on average
Demonstrates effective adaptation of CPU prefetching techniques to MRAs
Abstract
The irregular memory accesses of graph workloads make them perform poorly on modern computing platforms. On manycore reconfigurable architectures (MRAs), in particular, even state-of-the-art graph prefetchers do not work well (only 3% speedup), since they are designed for traditional CPUs. This is because caches in MRAs are typically not large enough to host a large quantity of prefetched data, and many MRAs employ shared caches that such prefetchers simply do not support. This paper studies the design of a data prefetcher for an MRA called Transmuter. The prefetcher is built on top of Prodigy, the current best-performing data prefetcher for CPUs. The key design elements that adapt the prefetcher to the MRA include fused prefetcher status handling registers and a prefetch handshake protocol to support run-time reconfiguration, as well as a redesign of the cache structure in…
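To make the "irregular memory accesses" mentioned in the abstract concrete, the sketch below records which array elements are touched when one vertex's neighbors are scanned in a graph stored in compressed sparse row (CSR) form. It is only an illustration of the general data-indirect access pattern, not the paper's prefetcher or benchmark code; the array names (offsets, neighbors, visited), the function name, and the tiny graph are all assumptions made for this example.

```python
# Illustrative sketch only: names and graph data are hypothetical, not from the paper.

def neighbor_scan_trace(offsets, neighbors, visited, vertex):
    """Return the (array, index) pairs touched while scanning one vertex's neighbors."""
    trace = [("offsets", vertex), ("offsets", vertex + 1)]
    # The two offsets bound a contiguous range in the neighbors array.
    start, end = offsets[vertex], offsets[vertex + 1]
    for e in range(start, end):
        trace.append(("neighbors", e))   # sequential within the range
        dst = neighbors[e]               # loaded value becomes the next index
        trace.append(("visited", dst))   # data-dependent, irregular access
        _ = visited[dst]
    return trace

# A 4-vertex graph in CSR form: vertex 0 -> {3, 1}, 1 -> {2}, 2 -> {0, 3}, 3 -> {}.
offsets = [0, 2, 3, 5, 5]
neighbors = [3, 1, 2, 0, 3]
visited = [False] * 4

trace = neighbor_scan_trace(offsets, neighbors, visited, 0)
visited_indices = [i for name, i in trace if name == "visited"]
```

The indices into visited come from values stored in neighbors rather than from a fixed stride, so the next address is known only after the previous load returns. Prefetchers that follow such data-dependent indirection chains target this kind of access.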
Taxonomy
Topics: Interconnection Networks and Systems · Graph Theory and Algorithms · Parallel Computing and Optimization Techniques
