The Universe at Extreme Scale: Multi-Petaflop Sky Simulation on the BG/Q
Salman Habib, Vitali Morozov, Hal Finkel, Adrian Pope, Katrin Heitmann, Kalyan Kumaran, Tom Peterka, Joe Insley, David Daniel, Patricia Fasel, Nicholas Frontiere, and Zarija Lukic

TL;DR
This paper presents a highly scalable cosmological simulation framework, HACC, capable of running trillion-particle simulations at petaflop speeds on the IBM BG/Q, enabling detailed exploration of the universe's dark components.
Contribution
The paper introduces the HACC framework's novel algorithmic design and demonstrates its unprecedented performance and scalability on the IBM BG/Q supercomputer for extremely large cosmological simulations.
Findings
Achieved 13.94 PFlops performance at 69.2% of peak on 1.57 million cores.
Successfully simulated over 3.6 trillion particles, the largest cosmological simulation to date.
Demonstrated flexible tuning of the framework across diverse high-performance architectures.
Abstract
Remarkable observational advances have established a compelling cross-validated model of the Universe. Yet, two key pillars of this model -- dark matter and dark energy -- remain mysterious. Sky surveys that map billions of galaxies to explore the 'Dark Universe' demand a corresponding extreme-scale simulation capability; the HACC (Hybrid/Hardware Accelerated Cosmology Code) framework has been designed to deliver this level of performance now, and into the future. With its novel algorithmic structure, HACC allows flexible tuning across diverse architectures, including accelerated and multi-core systems. On the IBM BG/Q, HACC attains unprecedented scalable performance -- currently 13.94 PFlops at 69.2% of peak and 90% parallel efficiency on 1,572,864 cores with an equal number of MPI ranks, and a concurrency of 6.3 million. This level of performance was achieved at extreme problem…
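The performance figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, the inputs are the rounded values from the abstract, and the derived quantities (implied machine peak, per-core rates, threads per core) are back-of-the-envelope estimates rather than numbers reported by the authors.

```python
# Inputs: rounded figures quoted in the abstract.
sustained_pflops = 13.94   # sustained performance, PFlops
fraction_of_peak = 0.692   # 69.2% of peak
cores = 1_572_864          # cores (equal number of MPI ranks)
concurrency = 6.3e6        # quoted total concurrency (rounded)


def implied_peak_pflops(sustained, fraction):
    """Peak performance implied by a sustained rate and a fraction of peak."""
    return sustained / fraction


def per_core_gflops(pflops, n_cores):
    """Convert an aggregate PFlops figure to GFlops per core (1 PFlops = 1e6 GFlops)."""
    return pflops * 1e6 / n_cores


peak = implied_peak_pflops(sustained_pflops, fraction_of_peak)
peak_per_core = per_core_gflops(peak, cores)
sustained_per_core = per_core_gflops(sustained_pflops, cores)
threads_per_core = concurrency / cores  # estimate; the input is rounded

print(f"implied peak:        {peak:.2f} PFlops")
print(f"peak per core:       {peak_per_core:.2f} GFlops")
print(f"sustained per core:  {sustained_per_core:.2f} GFlops")
print(f"threads per core:    {threads_per_core:.2f}")
```

Under these inputs the implied peak comes out at roughly 20.1 PFlops, or about 12.8 GFlops per core, and the quoted concurrency corresponds to about four concurrent threads per core. Because the abstract's values are rounded, the results should be read as consistency checks, not exact machine specifications.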
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
