Multi-GPU Acceleration of PALABOS Fluid Solver using C++ Standard Parallelism
Jonas Latt, Christophe Coreixas

TL;DR
This paper presents a GPU-accelerated version of the Palabos lattice Boltzmann solver written in modern C++. The port supports hybrid CPU-GPU execution and modular development, and it demonstrates high performance and scalability on multiple GPUs.
Contribution
Introduces a hybrid CPU-GPU architecture for Palabos with modern C++ techniques, enhancing performance, modularity, and ease of porting existing features to GPU.
Findings
Single-GPU performance comparable to CUDA-native solvers
Good weak and strong scaling across multiple GPUs
Successful validation with multiphysics benchmarks
Abstract
This article presents the principles, software architecture, and performance analysis of the GPU port of the lattice Boltzmann software library Palabos (J. Latt et al., "Palabos: Parallel lattice Boltzmann solver", Comput. Math. Appl. 81, 334-350, (2021)). A hybrid CPU-GPU execution model is adopted, in which numerical components are selectively assigned to either the CPU or the GPU, depending on considerations of performance or convenience. This design enables a progressive porting strategy, allowing most features of the original CPU-based codebase to be gradually and seamlessly adapted to GPU execution. The new architecture builds upon two complementary paradigms: a classical object-oriented structure for CPU execution, and a data-oriented counterpart for GPUs, which reproduces the modularity of the original code while eliminating object-oriented overhead detrimental to GPU…
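To make the contrast between the object-oriented and data-oriented paradigms concrete, the following is a minimal sketch, not the Palabos API. It assumes a toy lattice with one scalar per cell and two hypothetical cell models (`Relax` and `Frozen`). Instead of each cell holding a pointer to a polymorphic object with virtual methods, the model is stored as a small integer tag in a flat array and selected with a `switch`, which keeps the modularity while avoiding virtual dispatch inside the loop. The loop is written with `std::for_each`; an execution policy such as `std::execution::par_unseq` could be passed as its first argument to request parallel execution, but it is omitted here so the sketch builds without a parallel backend.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// All names below are hypothetical and chosen for illustration only.

// Tag stored per cell in place of a pointer to a polymorphic object.
enum class CellKind : std::uint8_t { Relax, Frozen };

// Flat, contiguous per-cell data. One scalar per cell keeps the sketch
// short; a real lattice Boltzmann cell holds a set of populations.
struct Lattice {
    std::vector<double> value;   // per-cell state
    std::vector<CellKind> kind;  // per-cell model tag
};

// Toy local update: relax a value toward a target with rate omega.
inline double relax(double v, double target, double omega) {
    return v + omega * (target - v);
}

// One sweep over all cells. The lambda captures raw pointers and scalars
// by value, so it refers to no host-side objects and uses no virtual calls.
inline void sweep(Lattice& lat, double target, double omega) {
    std::vector<std::size_t> idx(lat.value.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});

    double* value = lat.value.data();
    const CellKind* kind = lat.kind.data();

    std::for_each(idx.begin(), idx.end(), [=](std::size_t i) {
        switch (kind[i]) {
            case CellKind::Relax:
                value[i] = relax(value[i], target, omega);
                break;
            case CellKind::Frozen:
                break;  // cell is left unchanged
        }
    });
}
```

Calling `sweep(lat, 1.0, 0.5)` on a lattice whose cells are tagged `Relax`, `Frozen`, `Relax` updates the first and third cells and leaves the second untouched. Adding a new cell model in this style means adding a tag and a `case`, rather than a new subclass.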
Taxonomy
Topics: Lattice Boltzmann Simulation Studies · Advanced Numerical Methods in Computational Mathematics · Fluid Dynamics and Vibration Analysis
