Efficient Coupled-Cluster Python Frameworks for Next-Generation GPUs: A Comparative Study of CuPy and PyTorch on the Hopper and Grace Hopper Architecture
Antonina Dobrowolska, Julian Świerczyński, Paweł Tecmer, Emil Sujkowski, Somayeh Ahmadkhani, Grzegorz Mazur, Klemens Noga, Jeff Hammond, Katharina Boguslawski

TL;DR
This paper develops and benchmarks GPU-accelerated coupled-cluster algorithms in Python using CuPy and PyTorch, demonstrating significant speedups and architecture-dependent performance improvements on Hopper and Grace Hopper GPUs.
Contribution
Introduces new batching algorithms and a generic tensor contraction protocol for efficient GPU-based CCSD calculations in Python, with comprehensive benchmarking on modern GPU architectures.
Findings
PyTorch outperforms CuPy by 20% on H100.
Both libraries perform similarly on GH200.
Achieves a 10-fold speedup over a previous GPU implementation.
Abstract
In this work, we introduce new batching algorithms that handle the large contractions encountered in coupled-cluster singles and doubles (CCSD) implementations in Python within the Video Random Access Memory (VRAM) of graphics processing units (GPUs), thereby improving performance. Specifically, we benchmark the performance of the CuPy and PyTorch libraries on a single NVIDIA Hopper (H100) GPU and on the Grace Hopper (GH200) architecture. We begin by optimizing the particle-particle ladder bottleneck contraction in CCSD using an asymmetric and dynamic splitting recipe, and then move toward a generic tensor contraction protocol that enables tensor contractions to be performed almost exclusively on GPUs. We benchmark our new, fully generic GPU-accelerated coupled-cluster implementations for various molecular systems and basis-set sizes, using both the CuPy and PyTorch libraries. While…
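The batching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pp_ladder_batched`, the fixed uniform `batch_size`, and the index convention `abcd,ijcd->ijab` are assumptions made for illustration, and NumPy stands in for the GPU library so the example runs anywhere. The paper's splitting recipe is described as asymmetric and dynamic, which this sketch does not attempt to reproduce.

```python
import numpy as np


def pp_ladder_batched(eri_vvvv, t2, batch_size):
    """Evaluate out[i,j,a,b] = sum_cd eri_vvvv[a,b,c,d] * t2[i,j,c,d]
    in slices along the first virtual index `a`.

    Only one slice of the large four-virtual-index tensor is needed
    per step, which is what lets such a contraction fit into limited
    device memory. Hypothetical helper, uniform batches only.
    """
    nocc = t2.shape[0]
    nvirt = eri_vvvv.shape[0]
    out = np.empty((nocc, nocc, nvirt, nvirt), dtype=t2.dtype)
    for start in range(0, nvirt, batch_size):
        stop = min(start + batch_size, nvirt)
        # In a GPU version, this slice would be transferred to the device.
        eri_slice = eri_vvvv[start:stop]
        # Contract the slice with the full doubles amplitudes.
        block = np.einsum("abcd,ijcd->ijab", eri_slice, t2)
        # In a GPU version, this block would be copied back to the host.
        out[:, :, start:stop, :] = block
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nocc, nvirt = 2, 6
    eri = rng.standard_normal((nvirt, nvirt, nvirt, nvirt))
    t2 = rng.standard_normal((nocc, nocc, nvirt, nvirt))
    reference = np.einsum("abcd,ijcd->ijab", eri, t2)
    # The batched result should agree with the single-shot contraction.
    print(np.allclose(pp_ladder_batched(eri, t2, batch_size=4), reference))
```

Because CuPy mirrors much of the NumPy interface, a GPU variant of this sketch could plausibly move each slice with `cupy.asarray` and retrieve each block with `cupy.asnumpy`; how the batch boundaries are chosen relative to available VRAM is where the paper's recipe would differ from this uniform split.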
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Chemical Physics Studies · Protein Structure and Dynamics
