GPU acceleration of non-equilibrium Green's function calculation using OpenACC and CUDA FORTRAN
Jia Yin, Khaled Z. Ibrahim, Mauro Del Ben, Jack Deslippe, Yang-hao Chan, Chao Yang

TL;DR
This paper develops GPU-accelerated methods using OpenACC and CUDA FORTRAN for solving the computationally intensive non-equilibrium Green's function equations, achieving significant speedups and scalability on modern hardware.
Contribution
It introduces optimized GPU implementations of NEGF calculations with detailed performance comparisons, highlighting the advantages of CUDA FORTRAN over OpenACC.
Findings
GPU approaches outperform CPU implementations significantly
Both CPU and GPU methods show excellent scalability
CUDA FORTRAN offers more advanced control and better performance
Abstract
The numerical solution of the Kadanoff-Baym nonlinear integro-differential equations, which yields the non-equilibrium Green's functions (NEGFs) of quantum many-body systems, poses significant challenges due to its high computational complexity. In this work, we present efficient implementations of a numerical method for solving these equations on distributed-memory architectures, including many-core CPUs and multi-GPU systems. For CPU-based platforms, we adopt a hybrid MPI/OpenMP programming model to exploit both inter-node and intra-node parallelism. On GPU-accelerated systems, we implement the method using two distinct approaches: MPI/OpenACC and MPI/CUDA FORTRAN. Several optimization strategies are employed to enhance GPU performance, including techniques to maximize computational resource utilization and minimize the overhead associated with kernel launches and memory…
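The abstract mentions reducing kernel-launch overhead without saying how. One common way to do this with OpenACC, shown here purely as an illustration and not as the authors' implementation, is to fuse many small independent operations into a single collapsed loop nest so that one kernel is launched instead of one per operation. The function name `batched_matvec`, the row-major block layout, and the use of real `double` values (NEGF data are complex in practice) are all assumptions made for this sketch. Compilers without OpenACC support ignore the directives and run the loops serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the paper): computes y[b] = A[b] * x[b]
 * for nb independent dense n-by-n blocks stored contiguously in row-major
 * order. The block loop and the row loop are collapsed into one parallel
 * iteration space, so an OpenACC compiler can emit a single kernel launch
 * rather than nb separate ones. */
void batched_matvec(size_t nb, size_t n,
                    const double *A, const double *x, double *y)
{
    assert(nb > 0 && n > 0);

    #pragma acc parallel loop collapse(2) \
        copyin(A[0:nb*n*n], x[0:nb*n]) copyout(y[0:nb*n])
    for (size_t b = 0; b < nb; ++b) {
        for (size_t i = 0; i < n; ++i) {
            double sum = 0.0;
            /* The short inner dot product stays sequential per thread. */
            #pragma acc loop seq
            for (size_t j = 0; j < n; ++j) {
                sum += A[(b * n + i) * n + j] * x[b * n + j];
            }
            y[b * n + i] = sum;
        }
    }
}
```

In a real code the `copyin`/`copyout` clauses would typically be replaced by a surrounding data region that keeps the arrays resident on the device across time steps, which addresses the memory-transfer side of the overhead.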
Taxonomy
Topics: Computational Physics and Python Applications · Scientific Research and Discoveries · Parallel Computing and Optimization Techniques
