StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems
Johannes de Fine Licht, Andreas Kuster, Tiziano De Matteis, Tal Ben-Nun, Dominic Hofer, Torsten Hoefler

TL;DR
StencilFlow is a framework that maps large, heterogeneous stencil programs onto distributed spatial computing systems, maximizing performance and ensuring deadlock freedom, demonstrated by record-high FPGA performance in weather simulation applications.
Contribution
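It introduces a method for mapping directed acyclic graphs (DAGs) of heterogeneous stencil computations to distributed spatial hardware, maximizing temporal locality while guaranteeing deadlock freedom, with end-to-end analysis from a high-level program description down to a high-performance FPGA implementation.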
Findings
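Achieved 1.31 TOp/s on a single device and 4.18 TOp/s across multiple devices on a Stratix 10 FPGA testbed, the highest performance recorded for stencil programs on FPGAs to date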
Demonstrated effective mapping of complex stencil DAGs
Provided insights into architecture requirements for efficiency
Abstract
Spatial computing devices have been shown to significantly accelerate stencil computations, but have so far relied on unrolling the iterative dimension of a single stencil operation to increase temporal locality. This work considers the general case of mapping directed acyclic graphs of heterogeneous stencil computations to spatial computing systems, assuming large input programs without an iterative component. StencilFlow maximizes temporal locality and ensures deadlock freedom in this setting, providing end-to-end analysis and mapping from a high-level program description to distributed hardware. We evaluate our generated architectures on a Stratix 10 FPGA testbed, yielding 1.31 TOp/s and 4.18 TOp/s on single-device and multi-device, respectively, demonstrating the highest performance recorded for stencil programs on FPGAs to date. We then leverage the framework to study a complex…
Taxonomy
Topics: Cloud Computing and Resource Management · Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques
