Parallel implementation of the Density Matrix Renormalization Group method achieving a quarter petaFLOPS performance on a single DGX-H100 GPU node
Andor Menczer, Maarten van Damme, Alan Rask, Lee Huntington, Jeff Hammond, Sotiris S. Xantheas, Martin Ganahl, Örs Legeza

TL;DR
This paper demonstrates a highly optimized hybrid CPU-multi GPU implementation of the DMRG method that sustains roughly a quarter petaFLOPS on a single DGX-H100 node, evaluated on large active-space quantum chemistry problems (FeMoco and cytochrome P450).
Contribution
The paper introduces a hybrid CPU-multi GPU implementation of spin-adapted ab initio DMRG that reaches 246 teraFLOPS of sustained performance on a single DGX-H100 node, more than 2.5x the performance achieved on DGX-A100 architectures, showcasing the potential of tensor network algorithms on modern GPUs.
Findings
Achieved 246 teraFLOPS of sustained performance on a single DGX-H100 GPU node.
Outperformed previous DGX-A100 implementations by more than 2.5x.
Demonstrated the approach on large-scale quantum chemistry calculations, with active spaces up to CAS(113, 76) for FeMoco and CAS(63, 58) for cytochrome P450.
Abstract
We report cutting edge performance results for a hybrid CPU-multi GPU implementation of the spin adapted ab initio Density Matrix Renormalization Group (DMRG) method on current state-of-the-art NVIDIA DGX-H100 architectures. We evaluate the performance of the DMRG electronic structure calculations for the active compounds of the FeMoco and cytochrome P450 (CYP) enzymes with complete active space (CAS) sizes of up to 113 electrons in 76 orbitals [CAS(113, 76)] and 63 electrons in 58 orbitals [CAS(63, 58)], respectively. We achieve 246 teraFLOPS of sustained performance, an improvement of more than 2.5x compared to the performance achieved on the DGX-A100 architectures and an 80x acceleration compared to an OpenMP parallelized implementation on a 128-core CPU architecture. Our work highlights the ability of tensor network algorithms to efficiently utilize high-performance GPU hardware and…
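The ratios quoted in the abstract can be turned into rough implied baselines with simple arithmetic. The sketch below is illustrative only: the abstract does not report the baseline figures themselves, and it assumes (this is an assumption, not a statement from the paper) that the 2.5x and 80x ratios can be applied directly to the 246 teraFLOPS figure. The function and variable names are hypothetical.

```python
# Figures quoted in the abstract.
SUSTAINED_TFLOPS = 246.0   # sustained performance on one DGX-H100 node
SPEEDUP_VS_A100 = 2.5      # "more than 2.5x", so a lower bound on the ratio
SPEEDUP_VS_CPU = 80.0      # vs. an OpenMP implementation on a 128-core CPU


def implied_baseline(tflops: float, speedup: float) -> float:
    """Return the baseline throughput implied by a speedup ratio.

    Assumption: the speedup is a plain ratio of sustained throughputs
    measured on comparable workloads. The abstract does not confirm this.
    """
    return tflops / speedup


# Because the A100 ratio is "more than 2.5x", this is an upper bound
# on the implied DGX-A100 figure, not a measured value.
a100_upper_bound_tflops = implied_baseline(SUSTAINED_TFLOPS, SPEEDUP_VS_A100)

# Rough implied CPU figure under the same assumption.
cpu_estimate_tflops = implied_baseline(SUSTAINED_TFLOPS, SPEEDUP_VS_CPU)

# The title's "quarter petaFLOPS": 1 petaFLOPS = 1000 teraFLOPS.
fraction_of_petaflops = SUSTAINED_TFLOPS / 1000.0

if __name__ == "__main__":
    print(f"Implied DGX-A100 upper bound: {a100_upper_bound_tflops:.1f} TFLOPS")
    print(f"Implied 128-core CPU estimate: {cpu_estimate_tflops:.3f} TFLOPS")
    print(f"Fraction of a petaFLOPS: {fraction_of_petaflops:.3f}")
```

Under these assumptions, the implied DGX-A100 figure is below roughly 98.4 teraFLOPS and the implied CPU figure is about 3 teraFLOPS. Both are inferences from the quoted ratios rather than numbers reported in the abstract.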
Taxonomy
Topics: Theoretical and Computational Physics · Physics of Superconductivity and Magnetism · Advanced Condensed Matter Physics
