Particle-resolved thermal lattice Boltzmann simulation using OpenACC on multi-GPUs
Ao Xu, Bo-Tao Li

TL;DR
This paper accelerates particle-resolved thermal lattice Boltzmann simulation on GPUs with OpenACC, reaching 1750 MLUPS on a single GPU and 10846 MLUPS on 8 GPUs through load balancing and communication overlap.
Contribution
It introduces an OpenACC-based GPU acceleration framework for particle-resolved thermal LB simulations and extends it to multiple GPUs through a hybrid OpenACC and MPI approach with optimized communication.
Findings
Achieved 1750 MLUPS on a single GPU.
Reached 10846 MLUPS using 8 GPUs.
Implemented load balancing and communication overlap techniques.
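The throughput figures above are reported in MLUPS (million lattice updates per second). The following is a minimal sketch of how such a figure is usually computed, and of what the two reported numbers imply when divided by each other. The function names and the timing values in the first example are hypothetical, and the speedup calculation assumes the single-GPU and 8-GPU figures are directly comparable, which this summary does not state.

```python
def mlups(nx, ny, steps, seconds):
    """Million lattice updates per second for an nx-by-ny lattice."""
    return nx * ny * steps / (seconds * 1.0e6)


def speedup_and_efficiency(single_gpu_mlups, multi_gpu_mlups, n_gpus):
    """Naive speedup and parallel efficiency from two throughput figures."""
    speedup = multi_gpu_mlups / single_gpu_mlups
    return speedup, speedup / n_gpus


# Hypothetical timing, not from the paper: 1000 x 1000 nodes, 1000 steps, 2 s.
example_rate = mlups(1000, 1000, 1000, 2.0)

# Reported figures: 1750 MLUPS on one GPU, 10846 MLUPS on 8 GPUs.
# Assumption: both runs are comparable enough for the ratio to be meaningful.
speedup, efficiency = speedup_and_efficiency(1750.0, 10846.0, 8)
print(f"speedup ~ {speedup:.2f}x, efficiency ~ {efficiency:.0%}")
```

Under that assumption the ratio works out to roughly a 6.2x speedup on 8 GPUs, or about 77% parallel efficiency.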
Abstract
We utilize the Open Accelerator (OpenACC) approach for graphics processing unit (GPU) accelerated particle-resolved thermal lattice Boltzmann (LB) simulation. We adopt the momentum-exchange method to calculate fluid-particle interactions to preserve the simplicity of the LB method. To address load imbalance issues, we extend the indirect addressing method to collect fluid-particle link information at each timestep and store the indices of fluid-particle links in a fixed index array. We simulate the sedimentation of 4,800 hot particles in cold fluids with a domain size of , and the simulation achieves 1750 million lattice updates per second (MLUPS) on a single GPU. Furthermore, we implement a hybrid OpenACC and message passing interface (MPI) approach for multi-GPU accelerated simulation. This approach incorporates four optimization strategies, including building domain lists,…
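The link-collection idea in the abstract can be illustrated with a short sketch: scan the lattice, find every fluid node whose neighbor along a lattice direction lies inside a particle, and write a packed index for that link into a preallocated array. This is an assumed, serial illustration and not the authors' OpenACC code; the D2Q9 velocity ordering, the packed index layout `(y * nx + x) * 9 + q`, the single circular particle, and all names are hypothetical choices made for the example.

```python
# D2Q9 discrete velocities, rest direction first (ordering is an assumption).
EX = [0, 1, 0, -1, 0, 1, -1, -1, 1]
EY = [0, 0, 1, 0, -1, 1, 1, -1, -1]


def inside(x, y, cx, cy, r):
    """True if lattice node (x, y) is covered by a circle at (cx, cy)."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= r * r


def collect_links(nx, ny, cx, cy, r, max_links):
    """Fill a fixed-size index array with fluid-particle links.

    Each entry packs (fluid node, direction) as (y * nx + x) * 9 + q.
    Unused slots keep the sentinel value -1.
    """
    link_index = [-1] * max_links
    count = 0
    for y in range(ny):
        for x in range(nx):
            if inside(x, y, cx, cy, r):
                continue  # only fluid nodes can start a link
            for q in range(1, 9):
                xn, yn = x + EX[q], y + EY[q]
                if 0 <= xn < nx and 0 <= yn < ny and inside(xn, yn, cx, cy, r):
                    if count >= max_links:
                        raise ValueError("fixed index array is too small")
                    link_index[count] = (y * nx + x) * 9 + q
                    count += 1
    return link_index, count


# Hypothetical case: one particle covering a single node of an 11 x 11 lattice.
links, n_links = collect_links(11, 11, 5.0, 5.0, 0.5, 16)
```

A later momentum-exchange pass could then loop over only the first `n_links` entries, so each iteration does the same amount of work regardless of where the particles sit in the domain. That uniformity is the plausible motivation for using a compact index array on a GPU.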
Taxonomy
Topics: Lattice Boltzmann Simulation Studies · Aerosol Filtration and Electrostatic Precipitation
