GPU-Accelerated DNS of Compressible Turbulent Flows

Youngdae Kim; Debojyoti Ghosh; Emil M. Constantinescu; Ramesh Balakrishnan

arXiv:2211.16718 · cs.CE · December 7, 2022

TL;DR

This paper presents a GPU-accelerated version of HyPar, a CFD solver for compressible turbulent flows. Optimized kernels run about 200x faster than on a single CPU core, and the solver scales to 1,024 NVIDIA V100 GPUs for large-scale turbulence simulations.

Contribution

The paper introduces a GPU-optimized implementation of HyPar aimed at high-resolution turbulence simulations on emerging exascale heterogeneous (CPU+GPU) platforms, with strong and weak scaling demonstrated on NVIDIA V100 GPUs.

Findings

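01. Computationally intensive kernels run about 200x faster on a GPU than on a single CPU core

02. Strong and weak scaling demonstrated on NVIDIA V100 GPUs using CUDA-aware MPI

03. Decaying homogeneous isotropic turbulence simulated on grids of up to 1024^3 points on up to 1,024 GPUs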

Abstract

This paper explores strategies to transform an existing CPU-based high-performance computational fluid dynamics solver, HyPar, for compressible flow simulations on emerging exascale heterogeneous (CPU+GPU) computing platforms. The scientific motivation for developing a GPU-enhanced version of HyPar is to simulate canonical turbulent flows at the highest resolution possible on such platforms. We show that optimizing memory operations and thread blocks results in 200x speedup of computationally intensive kernels compared with a CPU core. Using multiple GPUs and CUDA-aware MPI communication, we demonstrate both strong and weak scaling of our GPU-based HyPar implementation on the NVIDIA Volta V100 GPUs. We simulate the decay of homogeneous isotropic turbulence in a triply periodic box on grids with up to $1024^{3}$ points (5.3 billion degrees of freedom) and on up to 1,024 GPUs. We compare…
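The problem size quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is not taken from the paper: it assumes five conserved variables per grid point (density, three momentum components, and energy, as is standard for the 3D compressible Navier-Stokes equations), double-precision storage, and an even split of grid points across GPUs. The function name `dof_summary` and its parameters are hypothetical.

```python
def dof_summary(n_per_dim, n_gpus, n_vars=5, bytes_per_value=8):
    """Rough size estimate for a cubic grid split evenly across GPUs.

    Assumptions (not from the paper): n_vars=5 conserved variables per
    point, 8-byte doubles, and a perfectly even domain decomposition.
    """
    points = n_per_dim ** 3                 # total grid points
    dof = points * n_vars                   # total degrees of freedom
    points_per_gpu = points // n_gpus       # even split, ignoring ghost cells
    bytes_per_gpu = points_per_gpu * n_vars * bytes_per_value
    return {
        "points": points,
        "dof": dof,
        "points_per_gpu": points_per_gpu,
        "bytes_per_gpu": bytes_per_gpu,     # one copy of the solution only
    }


summary = dof_summary(n_per_dim=1024, n_gpus=1024)
print(f"degrees of freedom: {summary['dof']:,}")
print(f"grid points per GPU: {summary['points_per_gpu']:,}")
print(f"solution storage per GPU: {summary['bytes_per_gpu'] / 2**20:.0f} MiB")
```

Under these assumptions the total comes to about 5.37 billion unknowns, consistent with the "5.3 billion degrees of freedom" in the abstract, and each GPU holds roughly a million grid points. The per-GPU storage figure covers a single copy of the solution vector; a real solver also keeps ghost cells, fluxes, and time-integration stages, so actual memory use is higher.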

Taxonomy

Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems