cuRAMSES: Scalable AMR Optimizations for Large-Scale Cosmological Simulations
Juhan Kim

TL;DR
cuRAMSES introduces hierarchical domain decomposition and algorithmic optimizations to enhance scalability, memory efficiency, and computational speed in large-scale cosmological simulations using AMR techniques.
Contribution
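The paper proposes a recursive k-section domain decomposition that replaces Hilbert-curve ordering, together with memory-saving measures (a Morton-key hash table for octree-neighbour lookup and on-demand array allocation) and acceleration strategies, improving strong scaling and performance over the traditional approach.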
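To make the idea of a recursive k-section decomposition concrete, here is a minimal sketch, not the cuRAMSES implementation. It assumes that the rank count is factorized into primes, that each recursion level cuts the current subdomain into k equal-load slabs along its longest axis, and that load is simply the point count. The names `prime_factors` and `ksection` are hypothetical, and the summary does not say how cuRAMSES chooses k, the cut axis, or the load measure.

```python
import random
from collections import Counter


def prime_factors(n):
    """Return the prime factors of n, largest first (e.g. 12 -> [3, 2, 2])."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return sorted(factors, reverse=True)


def ksection(points, nranks):
    """Assign each 3D point to a rank by recursive k-way equal-count cuts."""
    owner = [0] * len(points)

    def recurse(idx, factors, first_rank, nsub):
        if not factors:
            # Leaf of the hierarchy: every point here belongs to one rank.
            for i in idx:
                owner[i] = first_rank
            return
        k = factors[0]
        # Assumption: cut along the axis with the largest extent.
        axis = 0
        if idx:
            extents = [
                max(points[i][a] for i in idx) - min(points[i][a] for i in idx)
                for a in range(3)
            ]
            axis = extents.index(max(extents))
        order = sorted(idx, key=lambda i: points[i][axis])
        ranks_per_part = nsub // k
        n = len(order)
        for part in range(k):
            lo = part * n // k
            hi = (part + 1) * n // k
            recurse(order[lo:hi], factors[1:],
                    first_rank + part * ranks_per_part, ranks_per_part)

    recurse(list(range(len(points))), prime_factors(nranks), 0, nranks)
    return owner


# Usage: 1200 random points split over 12 ranks (3 x 2 x 2 cuts).
rng = random.Random(0)
points = [(rng.random(), rng.random(), rng.random()) for _ in range(1200)]
owner = ksection(points, 12)
counts = Counter(owner)
```

Because every cut is a plane through a box-shaped subdomain, each rank ends up owning a box whose neighbours are a small, local set of other boxes, which is the property that makes neighbour-only communication plausible.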
Findings
Achieves over 260x speedup in feedback routines.
Reduces the per-rank memory footprint with a Morton-key hash table for octree-neighbour lookup.
Improves strong scaling by replacing global all-to-all communication with neighbour-only point-to-point communication.
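As an illustration of how a Morton-key hash table can serve neighbour lookups, the sketch below interleaves the bits of integer cell coordinates into a single key and stores cells in a dictionary keyed by refinement level and Morton key. This is a simplified assumption-laden example, not the cuRAMSES data layout: the names `morton_key` and `CellTable`, the per-cell (rather than per-oct) granularity, and the periodic wrap are all hypothetical choices made for the illustration.

```python
def morton_key(ix, iy, iz, level):
    """Interleave the low `level` bits of (ix, iy, iz) into one integer key."""
    key = 0
    for bit in range(level):
        key |= ((ix >> bit) & 1) << (3 * bit)
        key |= ((iy >> bit) & 1) << (3 * bit + 1)
        key |= ((iz >> bit) & 1) << (3 * bit + 2)
    return key


class CellTable:
    """Hash table mapping (level, Morton key) to a local cell index."""

    def __init__(self):
        self._index = {}

    def insert(self, level, ix, iy, iz, cell_id):
        self._index[(level, morton_key(ix, iy, iz, level))] = cell_id

    def neighbour(self, level, ix, iy, iz, dx, dy, dz):
        """Return the cell index at offset (dx, dy, dz), or None if absent.

        Assumes a periodic box with 2**level cells per side at this level.
        """
        n = 1 << level
        jx, jy, jz = (ix + dx) % n, (iy + dy) % n, (iz + dz) % n
        return self._index.get((level, morton_key(jx, jy, jz, level)))


# Usage: two adjacent level-2 cells stored under hypothetical ids 10 and 11.
table = CellTable()
table.insert(2, 0, 0, 0, 10)
table.insert(2, 1, 0, 0, 11)
right = table.neighbour(2, 0, 0, 0, 1, 0, 0)
left = table.neighbour(2, 0, 0, 0, -1, 0, 0)
```

Only cells that actually exist occupy an entry, so memory scales with the number of local cells rather than with a dense array of neighbour pointers sized for the worst case.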
Abstract
We present cuRAMSES, a suite of advanced domain decomposition strategies and algorithmic optimizations for the RAMSES adaptive mesh refinement (AMR) code, designed to overcome the communication, memory, and solver bottlenecks inherent in massive cosmological simulations. The central innovation is a recursive k-section domain decomposition that replaces the traditional Hilbert curve ordering with a hierarchical spatial partitioning. This approach substitutes global all-to-all communications with neighbour-only point-to-point communications. By maintaining a constant number of communication partners regardless of the total rank count, it significantly improves strong scaling at high concurrency. To address critical memory constraints at scale, we introduce a Morton-key hash table for octree-neighbour lookup alongside on-demand array allocation, drastically reducing the per-rank memory…
