Accelerating the Dutch Atmospheric Large-Eddy Simulation (DALES) model with OpenACC
Lucas Esclapez, Laurent Soucasse, Caspar Jungbacker, Fredrik Jansson, Stephan R. de Roode, Pedro Costa, Gijs van den Oord, and Alessio Sclocco

TL;DR
This paper ports the Dutch Atmospheric Large-Eddy Simulation (DALES) model to GPUs using OpenACC directives. It compares time-to-solution on regular and GPU-accelerated HPC nodes, analyzes weak scaling, discusses portability across NVIDIA A100 and H100 hardware, and shows how targeted kernels can be optimized further with Kernel Tuner.
Contribution
The paper introduces a GPU acceleration approach for DALES using OpenACC, enabling high-performance atmospheric simulations on modern HPC hardware.
Findings
Significant reduction in simulation time on GPU-accelerated nodes
Effective weak scaling across multiple GPU hardware platforms
Enhanced kernel performance through auto-tuning with Kernel Tuner
Abstract
This paper presents the GPU porting through OpenACC directives of the Dutch Atmospheric Large-Eddy Simulation (DALES) application, a high-resolution atmospheric model. The code is written in Fortran 90 and features parallel (distributed) execution through spatial domain decomposition. We assess the performance of the GPU offloading, comparing the time-to-solution on regular and accelerated HPC nodes. A weak scaling analysis is conducted and portability across NVIDIA A100 and H100 hardware is discussed. Finally, we show how targeted kernels can benefit from further optimization with Kernel Tuner, a GPU kernels auto-tuning package.
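For readers unfamiliar with weak scaling, the following is a minimal Python sketch of how weak-scaling efficiency is commonly computed: the work per GPU is held fixed while the number of GPUs grows, and efficiency is the baseline run time divided by the run time at each larger count. The function name `weak_scaling_efficiency` and the timings are hypothetical illustrations. They are not measurements or code from the paper, and the paper may define its metric differently.

```python
def weak_scaling_efficiency(timings):
    """Map each GPU count to T(baseline) / T(n).

    `timings` maps a GPU count to the wall-clock time in seconds of a run
    whose per-GPU problem size is fixed. The smallest GPU count is used
    as the baseline. A value of 1.0 means ideal weak scaling.
    """
    baseline_count = min(timings)
    t_ref = timings[baseline_count]
    return {n: t_ref / t for n, t in sorted(timings.items())}


# Hypothetical timings in seconds (NOT results from the paper).
example_timings = {1: 100.0, 2: 104.0, 4: 110.0, 8: 125.0}

efficiency = weak_scaling_efficiency(example_timings)
for n_gpus, eff in efficiency.items():
    print(f"{n_gpus:>3} GPUs: efficiency = {eff:.2f}")
```

An efficiency that stays close to 1.0 as GPUs are added indicates that communication between subdomains and other parallel overheads grow slowly relative to the fixed per-GPU workload.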
