High-Level FPGA Accelerator Design for Structured-Mesh-Based Explicit Numerical Solvers

Kamalavasan Kamalakkannan; Gihan R. Mudalige; Istvan Z. Reguly; Suhaib A. Fahmy

arXiv:2101.01177·cs.AR·January 8, 2021

TL;DR

This paper introduces a workflow for designing FPGA accelerators for structured-mesh explicit solvers. It combines optimizations such as batching and spatial blocking/tiling with a predictive analytic model, reaching near-optimal performance, runtimes that match GPUs, and over 2x energy savings.

Contribution

It presents a unified workflow combining state-of-the-art techniques with new optimizations and a predictive model for FPGA-based stencil application acceleration.

Findings

01 FPGA implementations match GPU runtime performance

02 Over 2x energy savings with FPGA over GPU

03 Predictive model achieves over 85% accuracy

Abstract

This paper presents a workflow for synthesizing near-optimal FPGA implementations for structured-mesh-based stencil applications for explicit solvers. It leverages key characteristics of the application class, its computation-communication pattern, and the architectural capabilities of the FPGA to accelerate solvers from the high-performance computing domain. Key new features of the workflow are (1) the unification of standard state-of-the-art techniques with a number of high-gain optimizations such as batching and spatial blocking/tiling, motivated by increasing throughput for real-world workloads, and (2) the development and use of a predictive analytic model for exploring the design space, resource estimates, and performance. Three representative applications are implemented using the design workflow on a Xilinx Alveo U280 FPGA, demonstrating near-optimal performance and over 85%…
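
As a rough illustration of the terms used in the abstract, the sketch below applies one explicit time step of a 5-point stencil to a 2D structured mesh, first over the whole mesh, then tile by tile (spatial blocking), and finally over a batch of independent meshes. It is plain Python on the CPU, not the paper's FPGA design; the stencil weights, mesh sizes, and all function names are assumptions made for this example.

```python
# Illustrative sketch only: a 5-point explicit stencil on a 2D structured mesh.
# The stencil, mesh contents, and every function name here are hypothetical;
# none of this is taken from the paper's implementation.

def stencil_point(u, i, j):
    """Equal-weight average of a point and its four neighbours (assumed stencil)."""
    return 0.2 * (u[i][j] + u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])


def step_full(u):
    """One explicit time step over all interior points; boundary values are kept."""
    rows, cols = len(u), len(u[0])
    out = [row[:] for row in u]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = stencil_point(u, i, j)
    return out


def step_tiled(u, tile_rows, tile_cols):
    """Same step, visiting interior points in tiles of tile_rows x tile_cols.

    Every tile reads its one-cell halo from the old mesh u, so the result
    is identical to step_full regardless of the tile size.
    """
    rows, cols = len(u), len(u[0])
    out = [row[:] for row in u]
    for i0 in range(1, rows - 1, tile_rows):
        for j0 in range(1, cols - 1, tile_cols):
            for i in range(i0, min(i0 + tile_rows, rows - 1)):
                for j in range(j0, min(j0 + tile_cols, cols - 1)):
                    out[i][j] = stencil_point(u, i, j)
    return out


def step_batched(meshes, tile_rows, tile_cols):
    """Batching: apply the same step to several independent meshes in one call."""
    return [step_tiled(u, tile_rows, tile_cols) for u in meshes]


def make_mesh(rows, cols, seed):
    """Deterministic test mesh (hypothetical initial condition)."""
    return [[float((i * 31 + j * 17 + seed) % 11) for j in range(cols)]
            for i in range(rows)]
```

Because each tile reads only old values, `step_tiled(make_mesh(7, 9, 0), 3, 4)` returns the same mesh as `step_full(make_mesh(7, 9, 0))`; on an FPGA the motivation for tiling and batching is resource use and throughput, which this CPU sketch does not model.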
