DR-CGRA: Supporting Loop-Carried Dependencies in CGRAs Without Spilling Intermediate Values
Elad Hadar, Yoav Etsion

TL;DR
This paper introduces DR-CGRA, a novel architecture that handles loop-carried dependencies within CGRAs through inter-thread communication, eliminating spilling and significantly improving performance on benchmark programs.
Contribution
The paper proposes DR-CGRA, a massively-multithreaded CGRA architecture that manages loop-carried dependencies internally, avoiding spilling and enhancing loop execution efficiency.
Findings
Achieved a 2.1x to 4.5x speedup on SPEC CPU 2017 benchmarks.
Eliminated the need for spilling loop-carried data out of the grid.
Demonstrated significant performance gains over state-of-the-art CGRA architectures.
Abstract
Coarse-grain reconfigurable architectures (CGRAs) are gaining traction thanks to their performance and power efficiency. Utilizing CGRAs to accelerate the execution of tight loops holds great potential for achieving significant overall performance gains, as a substantial portion of program execution time is dedicated to tight loops. But loop parallelization using CGRAs is challenging because of loop-carried data dependencies. Traditionally, loop-carried dependencies are handled by spilling dependent values out of the reconfigurable array to a memory medium and then feeding them back to the grid. Spilling the values and feeding them back into the grid imposes additional latencies and logic that impede performance and limit parallelism. In this paper, we present the Dependency Resolved CGRA (DR-CGRA) architecture that is designed to accelerate the execution of tight loops. DR-CGRA,…
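For readers unfamiliar with the term, the following sketch shows what a loop-carried dependency looks like in ordinary code. It is an illustration only and is not taken from the paper: the function names `elementwise_scale` and `running_sum` are hypothetical, and the code models neither a CGRA nor the DR-CGRA mechanism. In the first loop every iteration is independent; in the second, each iteration needs the value `acc` produced by the previous one, which is the kind of value the abstract describes as traditionally being spilled out of the grid and fed back in.

```python
from typing import List


def elementwise_scale(xs: List[int], k: int) -> List[int]:
    """No loop-carried dependency: iteration i reads only xs[i]."""
    out = [0] * len(xs)
    for i in range(len(xs)):
        out[i] = xs[i] * k  # independent of every other iteration
    return out


def running_sum(xs: List[int]) -> List[int]:
    """Loop-carried dependency: iteration i needs acc from iteration i - 1."""
    out = [0] * len(xs)
    acc = 0  # value carried from one iteration to the next
    for i in range(len(xs)):
        acc = acc + xs[i]  # reads the acc written by the previous iteration
        out[i] = acc
    return out
```

Iterations of `elementwise_scale` could run in any order or all at once, whereas iterations of `running_sum` are chained through `acc`, so the cost of moving that one value between iterations bounds how quickly the loop can proceed.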
Taxonomy
Topics: Constraint Satisfaction and Optimization
