Adapting AREPO-RT for Exascale Computing: GPU Acceleration and Efficient Communication
Oliver Zier, Rahul Kannan, Aaron Smith, Mark Vogelsberger, Erkin Verbeek

TL;DR
This paper enhances the AREPO-RT code for exascale supercomputers by implementing GPU acceleration and optimized communication strategies, significantly improving simulation speed and scalability for astrophysical radiative transfer modeling.
Contribution
It introduces GPU-based computation and a novel node-to-node communication method to optimize AREPO-RT for exascale architectures, enabling faster and more scalable astrophysical simulations.
Findings
GPU implementation yields ~15x speedup on benchmarks.
Communication optimizations improve performance on both large- and small-scale systems.
Overall efficiency triples in cosmological simulations of the Epoch of Reionization.
Abstract
Radiative transfer (RT) is a crucial ingredient for self-consistent modelling of numerous astrophysical phenomena across cosmic history. However, on-the-fly integration into radiation-hydrodynamics (RHD) simulations is computationally demanding, particularly due to the stringent time-stepping conditions and increased dimensionality inherent in multi-frequency collisionless Boltzmann physics. The emergence of exascale supercomputers, equipped with extensive CPU cores and GPU accelerators, offers new opportunities for enhancing RHD simulations. We present a novel optimization of AREPO-RT explicitly tailored for such high-performance computing environments. We implement a novel node-to-node communication strategy that utilizes shared memory to substitute intra-node communication with direct memory access. Furthermore, combining multiple inter-node messages into a single message…
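The two communication ideas named in the abstract (replacing intra-node messages with shared-memory access, and combining multiple inter-node messages into one) can be illustrated with a small, self-contained sketch. This is not AREPO-RT code: the function `plan_exchange`, the `ranks_per_node` parameter, and the assumption that ranks are assigned to nodes in contiguous blocks are all hypothetical, chosen only to show how the number of network messages drops.

```python
from collections import defaultdict


def plan_exchange(messages, ranks_per_node):
    """Split messages into shared-memory copies and combined inter-node buffers.

    messages: list of (src_rank, dst_rank, payload) tuples.
    ranks_per_node: hypothetical number of ranks sharing one node; ranks are
    assumed to be placed on nodes in contiguous blocks (an illustration only).
    """
    shared_copies = []            # same node: direct memory access, no network message
    combined = defaultdict(list)  # (src_node, dst_node) -> entries packed into one message

    for src, dst, payload in messages:
        src_node = src // ranks_per_node
        dst_node = dst // ranks_per_node
        if src_node == dst_node:
            shared_copies.append((src, dst, payload))
        else:
            combined[(src_node, dst_node)].append((src, dst, payload))
    return shared_copies, dict(combined)


# Toy workload: 4 ranks per node, so ranks 0-3 sit on node 0, 4-7 on node 1, 8-11 on node 2.
messages = [
    (0, 1, "a"),  # node 0 -> node 0 (shared memory)
    (0, 4, "b"),  # node 0 -> node 1
    (1, 5, "c"),  # node 0 -> node 1 (combined with the previous message)
    (2, 8, "d"),  # node 0 -> node 2
    (5, 6, "e"),  # node 1 -> node 1 (shared memory)
]
shared_copies, combined = plan_exchange(messages, ranks_per_node=4)

# Rank-to-rank messaging would send one network message per cross-node entry.
naive_network_messages = sum(1 for s, d, _ in messages if s // 4 != d // 4)
print(len(shared_copies), naive_network_messages, len(combined))
```

In this toy case, five rank-to-rank messages reduce to two shared-memory copies and two combined network messages, one per communicating node pair. A real implementation would additionally need packing and unpacking of the combined buffer and synchronization of shared-memory access, which the sketch omits.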
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques
