High Performance P3M N-body code: CUBEP3M
Joachim Harnois-Deraps, Ue-Li Pen, Ilian T. Iliev, Hugh Merz, J.D. Emberson, Vincent Desjacques

TL;DR
CUBEP3M is a high-performance, scalable cosmological N-body simulation code with utilities for large-scale structure studies, optimized for speed and memory efficiency, suitable for diverse astrophysical applications.
Contribution
The paper introduces CUBEP3M, a publicly available N-body code with novel utilities, enhanced scalability, and optimized memory usage for large cosmological simulations.
Findings
Runs on more than 27,000 cores while staying within a factor of two of ideal weak scaling
Memory footprint as low as 37 bytes per particle in lean mode
Effective for large-scale cosmological applications
Abstract
This paper presents CUBEP3M, a publicly available, high-performance cosmological N-body code, and describes many utilities and extensions that have been added to the standard package. These include a memory-light runtime SO halo finder, a non-Gaussian initial conditions generator, and a system of unique particle identification. CUBEP3M is fast, its accuracy is tuneable to optimize speed or memory, and it has been run on more than 27,000 cores, achieving within a factor of two of ideal weak scaling even at this problem size. The code can be run in an extra-lean mode where the peak memory imprint for large runs is as low as 37 bytes per particle, which is almost two times leaner than other widely used N-body codes. However, load imbalances can increase this requirement by a factor of two, such that fast configurations with all the utilities enabled and load imbalances factored in require…
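The memory and scaling figures quoted in the abstract lend themselves to a quick back-of-the-envelope check. The sketch below is not part of CUBEP3M; the function names, the 1024^3 particle count, and the timings are hypothetical and chosen only for illustration. It assumes the 37 bytes per particle of the extra-lean mode and treats load imbalance as a simple multiplicative factor of up to two, as the abstract describes. Weak-scaling efficiency is taken in its usual sense: the ratio of wall-clock times when problem size and core count grow together, so that "within a factor of two of ideal" corresponds to an efficiency of at least 0.5.

```python
# Illustrative arithmetic only; none of these names come from CUBEP3M itself.

LEAN_BYTES_PER_PARTICLE = 37.0  # extra-lean mode figure quoted in the abstract


def peak_memory_bytes(n_particles, bytes_per_particle=LEAN_BYTES_PER_PARTICLE,
                      imbalance_factor=1.0):
    """Rough peak memory in bytes for a run with n_particles particles.

    imbalance_factor models load imbalance as a plain multiplier
    (1.0 = perfectly balanced, 2.0 = the factor of two in the abstract).
    """
    if n_particles < 0:
        raise ValueError("n_particles must be non-negative")
    if bytes_per_particle <= 0:
        raise ValueError("bytes_per_particle must be positive")
    if imbalance_factor < 1.0:
        raise ValueError("imbalance_factor must be at least 1.0")
    return n_particles * bytes_per_particle * imbalance_factor


def weak_scaling_efficiency(t_reference, t_scaled):
    """Weak-scaling efficiency: time of the small run over time of the large run.

    Both runs carry the same work per core; 1.0 is ideal, and 0.5 means
    the large run is a factor of two slower than ideal.
    """
    if t_reference <= 0 or t_scaled <= 0:
        raise ValueError("timings must be positive")
    return t_reference / t_scaled


if __name__ == "__main__":
    n = 1024 ** 3  # hypothetical particle count
    gib = 2 ** 30
    print(f"lean, balanced:   {peak_memory_bytes(n) / gib:.1f} GiB")
    print(f"lean, imbalanced: {peak_memory_bytes(n, imbalance_factor=2.0) / gib:.1f} GiB")
    # Hypothetical timings per step, in seconds.
    print(f"weak-scaling efficiency: {weak_scaling_efficiency(10.0, 20.0):.2f}")
```

Because the estimate is linear in every input, the per-particle cost of a different configuration can be substituted directly once it is known.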
