Exploring the Versal AI engines for accelerating stencil-based atmospheric advection simulation
Nick Brown

TL;DR
This paper investigates the use of AMD's Versal AI engines to accelerate atmospheric advection simulations, demonstrating potential performance improvements and exploring optimal implementation strategies for stencil-based HPC workloads.
Contribution
It provides an initial exploration of porting a stencil-based atmospheric advection scheme onto Versal AI engines, highlighting performance benefits and implementation considerations.
Findings
AI engines can double performance over traditional FPGA configurations.
Performance is limited by channel bandwidth between fabric and AI engines.
Versal ACAP shows promise for accelerating HPC simulations.
Abstract
AMD Xilinx's new Versal Adaptive Compute Acceleration Platform (ACAP) is an FPGA architecture combining reconfigurable fabric with other on-chip hardened compute resources. AI engines are one of these and, by operating in a highly vectorized manner, they provide significant raw compute that is potentially beneficial for a range of workloads including HPC simulation. However, this technology is still at an early stage and as yet unproven for accelerating HPC codes, with a lack of benchmarking and best practice. This paper presents an experience report, exploring porting of the Piacsek and Williams (PW) advection scheme onto the Versal ACAP, using the chip's AI engines to accelerate the compute. A stencil-based algorithm, advection is commonplace in atmospheric modelling, including in several codes from the Met Office, who initially developed this scheme. Using this algorithm as a vehicle, we explore…
