Performance Comparison on Parallel CPU and GPU Algorithms for Unified Gas-Kinetic Scheme
Jizhou Liu, Fang Q. Hu, Xiaodong Li

TL;DR
This paper compares parallel CPU and GPU algorithms for the Unified Gas-Kinetic Scheme (UGKS) using performance tests on a 2D channel flow case. The GPU generally achieves the higher speedup, while the parallel CPU algorithm may perform better when the velocity space grid is large.
Contribution
It adopts a two-level parallelization for the GPU implementation of UGKS and systematically compares its performance with a parallel CPU algorithm across a series of mesh sizes.
Findings
The GPU algorithm reaches a speedup of 118.38x on the latest device tested.
The parallel CPU algorithm may perform better when the number of grid points in velocity space is large.
Performance varies with problem size and velocity space range.
Abstract
Parallel algorithms on CPU and GPU are implemented for the Unified Gas-Kinetic Scheme (UGKS), and their performance is investigated and compared using a two-dimensional channel flow case. The parallel CPU algorithm uses a one-dimensional block partition that parallelizes only over physical space. Due to the intrinsic features of the UGKS, a compromise two-level parallelization is adopted for the GPU algorithm. A series of meshes of different sizes is tested to reveal how the performance of the algorithms evolves with problem size. Special attention is then paid to UGKS applications where the molecular velocity space range is large. The comparison confirms that the GPU achieves relatively high acceleration, with the latest device reaching a speedup of 118.38x. The parallel CPU algorithm, by contrast, might provide better performance when the number of grid points in velocity space is large.
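As a rough illustration of the one-dimensional block partition mentioned in the abstract, the sketch below splits a row of spatial cells into contiguous blocks, one per CPU process. The abstract does not say how the authors size the blocks or handle a remainder, so the function name `block_partition`, the half-open index ranges, and the rule of giving leftover cells to the lowest ranks are all assumptions made for this example, not the paper's method.

```python
def block_partition(n_cells: int, n_ranks: int) -> list[tuple[int, int]]:
    """Split n_cells contiguous cell indices into n_ranks half-open ranges.

    Hypothetical helper: the remainder policy (extra cells go to the
    lowest ranks) is an assumption, not taken from the paper.
    """
    if n_ranks <= 0 or n_cells < 0:
        raise ValueError("n_ranks must be positive and n_cells non-negative")
    base, extra = divmod(n_cells, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks each take one additional cell.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Example: 10 cells along the partitioned direction, 4 processes.
    for rank, (lo, hi) in enumerate(block_partition(10, 4)):
        print(f"rank {rank}: cells [{lo}, {hi})")
```

In a partition of this kind each process would keep the full velocity space for its own cells, which is consistent with the abstract's statement that only physical space is parallelized on the CPU.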
Taxonomy
Topics: Gas Dynamics and Kinetic Theory · Fluid Dynamics and Heat Transfer · Lattice Boltzmann Simulation Studies
