GRACOS: Scalable and Load Balanced P3M Cosmological N-body Code
Alexander Shirokov, Edmund Bertschinger

TL;DR
The paper introduces GRACOS, a parallel P3M cosmological N-body code for distributed-memory clusters that keeps the computational load balanced even for strongly inhomogeneous mass distributions.
Contribution
GRACOS implements a hybrid P3M algorithm that pairs a static one-dimensional slab decomposition of the mesh (long-range forces) with a dynamic Hilbert-curve decomposition of the particles (short-range forces), repartitioning the particles to maintain load balance.
Findings
Good load balance achieved on 40-processor cluster
Scalability up to 80 processes demonstrated
Potential improvements with adaptive mesh refinement
Abstract
We present a parallel implementation of the particle-particle/particle-mesh (P3M) algorithm for distributed memory clusters. The GRACOS (GRAvitational COSmology) code uses a hybrid method for both computation and domain decomposition. Long-range forces are computed using a Fourier transform gravity solver on a regular mesh; the mesh is distributed across parallel processes using a static one-dimensional slab domain decomposition. Short-range forces are computed by direct summation of close pairs; particles are distributed using a dynamic domain decomposition based on a space-filling Hilbert curve. A nearly optimal method was devised to dynamically repartition the particle distribution so as to maintain load balance even for extremely inhomogeneous mass distributions. Tests using simulations on a 40-processor Beowulf cluster showed good load balance and scalability up to 80…
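The Hilbert-curve decomposition described in the abstract can be illustrated with a small sketch. It is not the GRACOS implementation: it works in 2D rather than 3D, the function names `hilbert_index_2d` and `partition_curve` are hypothetical, per-cell work is assumed to be simply the particle count, and the cut rule is a plain prefix-sum heuristic rather than the paper's nearly optimal repartitioning method, whose details are not given here. The idea it shows is that ordering cells along a space-filling curve turns domain decomposition into cutting a one-dimensional list into contiguous segments of roughly equal work.

```python
from bisect import bisect_left
from itertools import accumulate


def hilbert_index_2d(order, x, y):
    """Map cell (x, y) on a 2**order x 2**order grid to its Hilbert-curve index."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def partition_curve(work, nproc):
    """Cut curve-ordered work into nproc contiguous (start, end) index ranges.

    Assumed heuristic: place the k-th cut where cumulative work first
    reaches k/nproc of the total. Segments may be empty if one cell dominates.
    """
    prefix = list(accumulate(work))
    total = prefix[-1]
    cuts = [0]
    for k in range(1, nproc):
        target = total * k / nproc
        cut = bisect_left(prefix, target) + 1
        cuts.append(max(cut, cuts[-1]))
    cuts.append(len(work))
    return [(cuts[i], cuts[i + 1]) for i in range(nproc)]


# Hypothetical test case: a uniform 8x8 grid with one dense clump of particles.
order = 3
n = 1 << order
counts = {(x, y): 1 for x in range(n) for y in range(n)}
counts[(5, 5)] = 40

# Order cells along the curve, then split the ordered work list among 4 processes.
cells = sorted(counts, key=lambda c: hilbert_index_2d(order, *c))
work = [counts[c] for c in cells]
segments = partition_curve(work, 4)
loads = [sum(work[a:b]) for a, b in segments]
print(segments, loads)
```

Because consecutive Hilbert indices always belong to neighbouring cells, each contiguous segment of the curve corresponds to a spatially compact region, which is what keeps the short-range pair summation mostly local to one process. Rebalancing after the particles move then amounts to shifting the cut positions along the curve.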
Taxonomy
Topics: Scientific Research and Discoveries · Opportunistic and Delay-Tolerant Networks · Distributed and Parallel Computing Systems
