Method for portable, scalable, and performant GPU-accelerated simulation of multiphase compressible flow
Anand Radhakrishnan, Henry Le Berre, Benjamin Wilfong, Jean-Sebastien Spratt, Mauro Rodriguez Jr., Tim Colonius, Spencer H. Bryngelson

TL;DR
This paper introduces a portable, scalable GPU acceleration strategy for multiphase compressible flow simulations, achieving significant speedups and efficient scaling on large GPU clusters using OpenACC, metaprogramming, and MPI optimizations.
Contribution
The paper presents a portable GPU acceleration approach for multiphase compressible flow solvers that combines OpenACC offloading with Fypp metaprogramming to improve kernel performance and scalability.
Findings
8x speedup of compute kernels
46% of peak FLOPs utilization on NVIDIA GPUs
97% weak scaling efficiency on 13824 GPUs
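The weak scaling figure above can be read with the usual definition: the problem size per GPU is held fixed, and efficiency is the wall time of the base run divided by the wall time of the scaled run. The sketch below is illustrative only. The function name and both timings are hypothetical, chosen to produce a 97% result, and are not measurements from the paper.

```python
def weak_scaling_efficiency(t_base: float, t_scaled: float) -> float:
    """Return base wall time divided by scaled wall time.

    Assumes the work per device is identical in both runs, so a value
    of 1.0 means the larger run took no longer than the base run.
    """
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("wall times must be positive")
    return t_base / t_scaled


# Hypothetical timings in seconds per time step (not from the paper).
t_base = 9.7     # small reference run
t_scaled = 10.0  # same work per GPU on many more GPUs

eff = weak_scaling_efficiency(t_base, t_scaled)
print(f"weak scaling efficiency: {eff:.0%}")
```

An efficiency below 100% means the scaled run spends extra time per step, typically on communication or I/O.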
Abstract
Multiphase compressible flows are often characterized by a broad range of space and time scales, entailing large grids and small time steps. Simulations of these flows on CPU-based clusters can thus take several wall-clock days. Offloading the compute kernels to GPUs appears attractive, but standard finite-volume and finite-difference methods are memory-bound, which dampens speedups. Even when speedups are realized, faster GPU-based kernels make communication and I/O times a larger share of the run time. We present a portable strategy for GPU acceleration of multiphase compressible flow solvers that addresses these challenges and obtains large speedups at scale. We use OpenACC for portable offloading of all compute kernels while maintaining low-level control when needed. An established Fortran preprocessor and metaprogramming tool, Fypp, enables otherwise hidden compile-time optimizations. This strategy exposes…
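The memory-bound remark in the abstract can be illustrated with a simple roofline estimate, in which attainable throughput is the smaller of the device's peak compute rate and the product of arithmetic intensity and memory bandwidth. This is a generic sketch, not the paper's analysis. The function name and all device and kernel numbers are hypothetical placeholders that do not describe any specific GPU or solver kernel.

```python
def roofline_flops(peak_flops: float, bandwidth_bytes: float, intensity: float) -> float:
    """Attainable FLOP/s under the basic roofline model.

    intensity is arithmetic intensity in FLOPs per byte moved to or
    from device memory.
    """
    return min(peak_flops, intensity * bandwidth_bytes)


# Hypothetical device and kernel parameters (illustrative only).
peak_flops = 7000e9      # peak compute rate, FLOP/s
bandwidth_bytes = 900e9  # memory bandwidth, bytes/s
intensity = 0.5          # FLOPs per byte for a stencil-like kernel

attainable = roofline_flops(peak_flops, bandwidth_bytes, intensity)
ridge = peak_flops / bandwidth_bytes  # intensity at which compute becomes the limit
memory_bound = intensity < ridge

print(f"attainable: {attainable / 1e9:.0f} GFLOP/s")
print(f"fraction of peak: {attainable / peak_flops:.1%}")
print(f"memory-bound: {memory_bound}")
```

Under these placeholder numbers the kernel sits well below the ridge point, so raising arithmetic intensity or reducing memory traffic, rather than adding compute capacity, is what moves it toward peak.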
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
