TL;DR
This paper presents PKDGRAV3, a GPU-accelerated N-body code that completed a 2 trillion particle cosmological simulation on Piz Daint and scaled to 18000 nodes on Titan, targeting the simulation needs of next-generation galaxy surveys.
Contribution
Introduction of PKDGRAV3, a cosmological simulation code that combines the Fast Multipole Method with individual, adaptive particle time steps to run trillion-particle simulations efficiently on supercomputers with GPU-accelerated nodes.
Findings
Completed a 2 trillion particle simulation on Piz Daint.
Achieved perfect scaling up to 18000 nodes on Titan.
Reached a peak performance of 10 Pflops.
Abstract
We report on the successful completion of a 2 trillion particle cosmological simulation to z=0 run on the Piz Daint supercomputer (CSCS, Switzerland), using 4000+ GPU nodes for a little less than 80h of wall-clock time or 350,000 node hours. Using multiple benchmarks and performance measurements on the US Oak Ridge National Laboratory Titan supercomputer, we demonstrate that our code, PKDGRAV3, delivers, to our knowledge, the fastest time-to-solution for large-scale cosmological N-body simulations. This was made possible by using the Fast Multipole Method in conjunction with individual and adaptive particle time steps, both deployed efficiently (and for the first time) on supercomputers with GPU-accelerated nodes. The very low memory footprint of PKDGRAV3 allowed us to run the first ever benchmark with 8 trillion particles on Titan, and to achieve perfect scaling up to 18000 nodes and a…
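The run figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is not from the paper: the function names are hypothetical, and it treats "a little less than 80h" as exactly 80 hours, so the node count it yields is a lower bound and the per-node particle load an upper bound.

```python
# Figures quoted in the abstract. "80h" is stated as an upper bound on the
# wall-clock time, so the results below are estimates, not reported values.
NODE_HOURS = 350_000       # total node hours quoted in the abstract
WALL_CLOCK_HOURS = 80      # assumption: "a little less than 80h" taken as 80
N_PARTICLES = 2e12         # 2 trillion particles


def implied_nodes(node_hours: float, wall_clock_hours: float) -> float:
    """Average node count implied by a node-hour budget and a wall-clock time."""
    return node_hours / wall_clock_hours


def particles_per_node(n_particles: float, nodes: float) -> float:
    """Average number of particles each node holds."""
    return n_particles / nodes


if __name__ == "__main__":
    nodes = implied_nodes(NODE_HOURS, WALL_CLOCK_HOURS)
    print(f"implied nodes: {nodes:.0f}")
    print(f"particles per node: {particles_per_node(N_PARTICLES, nodes):.3g}")
```

Under these assumptions the budget implies about 4,375 nodes, which agrees with the "4000+ GPU nodes" stated in the abstract, and roughly 460 million particles per node.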
