Strong scaling of general-purpose molecular dynamics simulations on GPUs
Jens Glaser, Trung Dac Nguyen, Joshua A. Anderson, Pak Lui, Filippo Spiga, Jaime A. Millan, David C. Morse, Sharon C. Glotzer

TL;DR
This paper presents an optimized MPI domain decomposition for the GPU-enabled molecular dynamics code HOOMD-blue, demonstrating strong scaling on up to 3,375 GPUs and up to 12.5x speed-up over CPU nodes.
Contribution
It introduces a GPU-optimized MPI domain decomposition method in HOOMD-blue, enabling scalable, high-performance molecular dynamics simulations on thousands of GPUs.
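To make the idea of spatial domain decomposition concrete, here is a minimal sketch of how a particle position could be mapped to the MPI rank that owns its subdomain. It is an illustration only, not HOOMD-blue's implementation: the function name `rank_of`, the cubic box with origin at zero, and the 15 x 15 x 15 rank grid (chosen because 15^3 = 3,375) are all assumptions.

```python
def rank_of(position, box_length, grid=(15, 15, 15)):
    """Return the rank owning `position` in a cubic periodic box.

    Hypothetical helper: assumes the box spans [0, box_length) on each
    axis and that ranks are numbered in row-major (x, y, z) order.
    """
    indices = []
    for coordinate, cells in zip(position, grid):
        # Wrap the coordinate into the periodic box, then find its cell.
        fraction = (coordinate % box_length) / box_length
        # Guard against rounding that would push the index to `cells`.
        indices.append(min(int(fraction * cells), cells - 1))
    ix, iy, iz = indices
    _, ny, nz = grid
    return (ix * ny + iy) * nz + iz
```

Each rank then integrates only the particles in its own subdomain and exchanges boundary ("ghost") particles with its neighbors, which is the communication step such a decomposition has to make efficient.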
Findings
Demonstrates strong scaling on up to 3,375 GPUs in Lennard-Jones and dissipative particle dynamics (DPD) benchmarks.
Demonstrates up to 12.5x speed-up over CPU nodes.
Benchmarks systems of up to 108 million particles.
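For readers who want to relate these numbers, the sketch below shows how strong-scaling efficiency is conventionally computed and what per-GPU load the headline figures would imply. The function names are hypothetical, and pairing the largest system with the largest GPU count is an assumption made for the arithmetic, not a statement about which benchmark the paper ran at that scale.

```python
def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Parallel efficiency at fixed problem size.

    t_ref: time per step on n_ref devices (the baseline run)
    t_n:   time per step on n devices
    An efficiency of 1.0 corresponds to ideal (linear) strong scaling.
    """
    speedup = t_ref / t_n
    ideal_speedup = n / n_ref
    return speedup / ideal_speedup


def particles_per_gpu(n_particles, n_gpus):
    """Average number of particles each GPU owns."""
    return n_particles / n_gpus


# Assumed pairing: 108 million particles spread over 3,375 GPUs.
load = particles_per_gpu(108_000_000, 3375)
```

Under that assumption each GPU would hold 32,000 particles on average, which illustrates why strong scaling is hard: as the GPU count grows at fixed problem size, per-device work shrinks while communication does not shrink as fast.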
Abstract
We describe a highly optimized implementation of MPI domain decomposition in a GPU-enabled, general-purpose molecular dynamics code, HOOMD-blue (Anderson and Glotzer, arXiv:1308.5587). Our approach is inspired by a traditional CPU-based code, LAMMPS (Plimpton, J. Comp. Phys. 117, 1995), but is implemented within a code that was designed for execution on GPUs from the start (Anderson et al., J. Comp. Phys. 227, 2008). The software supports short-ranged pair force and bond force fields and achieves optimal GPU performance using an autotuning algorithm. We are able to demonstrate equivalent or superior scaling on up to 3,375 GPUs in Lennard-Jones and dissipative particle dynamics (DPD) simulations of up to 108 million particles. GPUDirect RDMA capabilities in recent GPU generations provide better performance in full double precision calculations. For a representative polymer physics…
