Accelerating an Iterative Eigensolver for Nuclear Structure Configuration Interaction Calculations on GPUs using OpenACC
Pieter Maris, Chao Yang, Dossay Oryspayev, Brandon Cook

TL;DR
This paper shows that OpenACC directives can accelerate the solution of large eigenvalue problems in nuclear physics on GPUs with minimal code modifications, yielding notable speedups over the CPU implementation.
Contribution
The paper presents a minimal-change, OpenACC-based approach to porting a hybrid MPI/OpenMP eigensolver for nuclear structure calculations to GPUs, and highlights the architectural differences between many-core CPUs and GPUs that determine where directives should be placed.
Findings
The eigensolver runs significantly faster on GPUs than on CPUs.
The optimal placement of OpenACC directives differs from that of the original OpenMP directives because of the GPU architecture.
Communication overhead limits the overall speedup when running on multiple GPUs.
Abstract
To accelerate the solution of large eigenvalue problems arising from many-body calculations in nuclear physics on distributed-memory parallel systems equipped with general-purpose Graphics Processing Units (GPUs), we modified a previously developed hybrid MPI/OpenMP implementation of an eigensolver written in FORTRAN 90 by using an OpenACC directive-based programming model. Such an approach requires making minimal changes to the original code and enables a smooth migration of large-scale nuclear structure simulations from a distributed-memory many-core CPU system to a distributed GPU system. However, in order to make the OpenACC-based eigensolver run efficiently on GPUs, we need to take into account the architectural differences between a many-core CPU and a GPU device. Consequently, the optimal way to insert OpenACC directives may be different from the original way of inserting OpenMP…
Taxonomy
Topics: Advanced NMR Techniques and Applications · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
