High-Level FPGA Accelerator Design for Structured-Mesh-Based Explicit Numerical Solvers
Kamalavasan Kamalakkannan, Gihan R. Mudalige, Istvan Z. Reguly, Suhaib A. Fahmy

TL;DR
This paper introduces a workflow for designing FPGA accelerators for structured-mesh explicit solvers. It combines established techniques with new optimizations and a predictive model, achieving near-optimal performance, runtimes that match GPUs, and over 2x energy savings.
Contribution
It presents a unified workflow combining state-of-the-art techniques with new optimizations and a predictive model for FPGA-based stencil application acceleration.
Findings
FPGA implementations match GPU runtime performance
Over 2x energy savings with FPGA over GPU
Predictive model achieves over 85% accuracy
Abstract
This paper presents a workflow for synthesizing near-optimal FPGA implementations of structured-mesh-based stencil applications for explicit solvers. It leverages key characteristics of the application class, its computation-communication pattern, and the architectural capabilities of the FPGA to accelerate solvers from the high-performance computing domain. Key new features of the workflow are (1) the unification of standard state-of-the-art techniques with a number of high-gain optimizations, such as batching and spatial blocking/tiling, motivated by increasing throughput for real-world workloads, and (2) the development and use of a predictive analytic model for exploring the design space, resource estimates, and performance. Three representative applications are implemented using the design workflow on a Xilinx Alveo U280 FPGA, demonstrating near-optimal performance and over 85%…
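To make the idea of spatial blocking/tiling concrete, the following is a minimal Python sketch, not the authors' FPGA implementation. It assumes a 2D 5-point averaging stencil as a stand-in for an explicit solver update. The function names `jacobi_step` and `jacobi_step_tiled`, the tile sizes, and the stencil itself are all hypothetical choices made for illustration. Each tile is copied into a small local buffer together with a one-cell halo, loosely analogous to staging a block in on-chip memory, and the tiled sweep is checked against the untiled one.

```python
def jacobi_step(grid):
    """One explicit 5-point stencil sweep; boundary cells are copied unchanged."""
    ny, nx = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            out[j][i] = 0.25 * (grid[j - 1][i] + grid[j + 1][i]
                                + grid[j][i - 1] + grid[j][i + 1])
    return out


def jacobi_step_tiled(grid, tile_y, tile_x):
    """Same sweep, computed tile by tile from a local buffer with a 1-cell halo."""
    ny, nx = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for j0 in range(1, ny - 1, tile_y):
        for i0 in range(1, nx - 1, tile_x):
            # Exclusive upper bounds of this tile, clipped to the interior.
            j1 = min(j0 + tile_y, ny - 1)
            i1 = min(i0 + tile_x, nx - 1)
            # Local copy covering rows j0-1..j1 and columns i0-1..i1 (tile plus halo).
            local = [grid[j][i0 - 1:i1 + 1] for j in range(j0 - 1, j1 + 1)]
            for lj in range(1, j1 - j0 + 1):
                for li in range(1, i1 - i0 + 1):
                    out[j0 + lj - 1][i0 + li - 1] = 0.25 * (
                        local[lj - 1][li] + local[lj + 1][li]
                        + local[lj][li - 1] + local[lj][li + 1])
    return out


if __name__ == "__main__":
    # Small hypothetical mesh; tile sizes deliberately do not divide the interior evenly.
    mesh = [[float((3 * j + 5 * i) % 7) for i in range(9)] for j in range(7)]
    assert jacobi_step_tiled(mesh, 2, 4) == jacobi_step(mesh)
    print("tiled and untiled sweeps agree")
```

The halo is what lets each tile be computed independently from its own buffer. In this sketch the halo width is one cell because the stencil reaches one cell in each direction and only one sweep is performed per tile.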
