Dynamic Loop Fusion in High-Level Synthesis
Robert Szafarczyk, Syed Waqar Nabi, Wim Vanderbauwhede

TL;DR
This paper introduces a novel dynamic loop fusion technique for High-Level Synthesis that allows parallel execution of multiple irregular loops with memory dependencies, significantly improving throughput.
Contribution
The paper presents a compiler/hardware co-design that lets multiple irregular loops run in parallel in HLS, using dynamic memory disambiguation based on address monotonicity.
Findings
Average speedup of 14x over static HLS
Average speedup of 4x over existing dynamic HLS
Enables parallel execution of irregular loops with dependencies
Abstract
Dynamic High-Level Synthesis (HLS) uses additional hardware to perform memory disambiguation at runtime, increasing loop throughput in irregular codes compared to static HLS. However, most irregular codes consist of multiple sibling loops, which currently have to be executed sequentially by all HLS tools. Static HLS performs loop fusion only on regular codes, while dynamic HLS relies on loops with dependencies to run to completion before the next loop starts. We present dynamic loop fusion for HLS, a compiler/hardware co-design approach that enables multiple loops to run in parallel, even if they contain unpredictable memory dependencies. Our only requirement is that memory addresses are monotonically non-decreasing in inner loops. We present a novel program-order schedule for HLS, inspired by polyhedral compilers, that together with our address monotonicity analysis enables dynamic…
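To make the monotonicity requirement concrete, here is a minimal software sketch of why non-decreasing addresses allow two sibling loops to overlap. It is an illustration under stated assumptions, not the paper's hardware or its program-order schedule: the function names (`is_non_decreasing`, `run_sequential`, `run_fused`), the stored values, and the one-operation-per-step interleaving are all hypothetical. The idea modelled is that if the first loop's store addresses never decrease, a load from address `a` in the second loop can safely issue as soon as the next pending store address is greater than `a`, because no later store can touch `a` again.

```python
from typing import List, Sequence, Tuple


def is_non_decreasing(addresses: Sequence[int]) -> bool:
    """Return True if every address is >= the one before it."""
    return all(a <= b for a, b in zip(addresses, addresses[1:]))


def run_sequential(mem: Sequence[int], store_addrs: Sequence[int],
                   load_addrs: Sequence[int]) -> List[int]:
    """Reference semantics: loop 1 (stores) finishes before loop 2 (loads)."""
    mem = list(mem)
    for i, addr in enumerate(store_addrs):
        mem[addr] = 100 + i  # hypothetical value written by iteration i
    return [mem[addr] for addr in load_addrs]


def run_fused(mem: Sequence[int], store_addrs: Sequence[int],
              load_addrs: Sequence[int]) -> Tuple[List[int], int]:
    """Interleave both loops, issuing each load as early as is safe.

    Returns the loaded values and the number of loads that issued while
    stores were still pending (a rough measure of overlap).
    """
    # The abstract requires monotonic addresses; the safety test below
    # relies on the store stream in particular being non-decreasing.
    if not (is_non_decreasing(store_addrs) and is_non_decreasing(load_addrs)):
        raise ValueError("addresses must be monotonically non-decreasing")

    mem = list(mem)
    loaded: List[int] = []
    overlapped = 0
    s = 0  # index of the next pending store
    l = 0  # index of the next pending load

    while l < len(load_addrs):
        stores_pending = s < len(store_addrs)
        # Safe if no store remains, or the next store is already past this
        # address (later stores can only go to equal or higher addresses).
        if not stores_pending or store_addrs[s] > load_addrs[l]:
            loaded.append(mem[load_addrs[l]])
            l += 1
            if stores_pending:
                overlapped += 1
        else:
            mem[store_addrs[s]] = 100 + s
            s += 1

    return loaded, overlapped


if __name__ == "__main__":
    memory = list(range(8))
    stores = [0, 2, 2, 5, 7]  # data-dependent, but never decreasing
    loads = [1, 2, 4, 6, 7]
    fused_values, overlap = run_fused(memory, stores, loads)
    assert fused_values == run_sequential(memory, stores, loads)
    print(fused_values, overlap)
```

In this toy run, most loads issue before the store loop has finished, yet the loaded values match the sequential order. The sketch covers only a read-after-write dependency between one store loop and one load loop; other dependency types and nested loops are outside its scope.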
