Multiple-GPU accelerated high-order gas-kinetic scheme on three-dimensional unstructured meshes
Yuhang Wang, Waixiang Cao, Liang Pan

TL;DR
This paper presents a GPU-accelerated implementation of high-order gas-kinetic schemes on 3D unstructured meshes, achieving significant speedups and scalable performance for compressible flow simulations.
Contribution
It introduces a multiple-GPU framework for HGKS on unstructured meshes using CUDA and MPI, enabling efficient parallel computation and domain decomposition.
Findings
5x speedup with RTX A5000 GPU
9x speedup with Tesla V100 GPU
Proper scaling as the number of GPUs increases
Abstract
Recently, successes have been achieved for the high-order gas-kinetic schemes (HGKS) on unstructured meshes for compressible flows. In this paper, to accelerate the computation, HGKS is implemented on the graphics processing unit (GPU) using the compute unified device architecture (CUDA). HGKS on unstructured meshes is a fully explicit scheme, so the acceleration framework can be built on cell-level parallelism. For single-GPU computation, the connectivity of the geometric information is generated to meet the requirements of data localization and independence. Based on this data structure, the CUDA kernels and their corresponding grids are set. With a one-to-one mapping between the indices of cells and CUDA threads, the single-GPU computation using CUDA can be implemented for HGKS. For multiple-GPU computation, the domain decomposition and data exchange need to be taken into…
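The one-to-one mapping between cell indices and CUDA threads described in the abstract can be illustrated with a small sketch. It is written in Python and emulates the CUDA launch on the CPU; it is not the authors' code. The names `launch_config`, `cell_kernel`, and `run_step`, the block size of 256, the `-1` marker for a boundary face, and the neighbor-averaging update are all assumptions made for illustration. In particular, the averaging step only stands in for the HGKS reconstruction and flux evaluation, which the abstract does not detail.

```python
def launch_config(num_cells, threads_per_block=256):
    """Return (num_blocks, threads_per_block) so that every cell gets one thread.

    Mirrors the usual CUDA rounding-up idiom; the block size is an assumption.
    """
    num_blocks = (num_cells + threads_per_block - 1) // threads_per_block
    return num_blocks, threads_per_block


def cell_kernel(block_idx, thread_idx, block_dim, neighbors, u_old, u_new):
    """Emulate one CUDA thread: it owns exactly one cell.

    The thread reads only old values of its own cell and its face neighbors
    (data localization) and writes only its own entry (data independence).
    """
    cell = block_idx * block_dim + thread_idx  # global thread index == cell index
    if cell >= len(u_old):
        return  # surplus threads in the last block do nothing
    # Placeholder update (NOT the HGKS flux): average over the cell and its
    # face neighbors; -1 marks a boundary face with no neighbor (assumption).
    values = [u_old[cell]] + [u_old[n] for n in neighbors[cell] if n >= 0]
    u_new[cell] = sum(values) / len(values)


def run_step(neighbors, u_old, threads_per_block=256):
    """Emulate a kernel launch by looping over all blocks and threads."""
    num_blocks, block_dim = launch_config(len(u_old), threads_per_block)
    u_new = [0.0] * len(u_old)
    for block_idx in range(num_blocks):
        for thread_idx in range(block_dim):
            cell_kernel(block_idx, thread_idx, block_dim, neighbors, u_old, u_new)
    return u_new


# Three cells in a row; each entry lists the face neighbors of that cell.
neighbors = [[1, -1], [0, 2], [1, -1]]
u_next = run_step(neighbors, [0.0, 3.0, 6.0], threads_per_block=2)
```

Because each emulated thread writes only to its own cell and reads from a separate array of old values, the order in which the threads run does not affect the result. This is the property that lets an explicit scheme be parallelized at the cell level.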
Taxonomy
Topics: Gas Dynamics and Kinetic Theory · Lattice Boltzmann Simulation Studies · Fluid Dynamics and Heat Transfer
