DFT-FE 1.0: A massively parallel hybrid CPU-GPU density functional theory code using finite-element discretization
Sambit Das, Phani Motamarri, Vishal Subramanian, David M. Rogers, Vikram Gavini

TL;DR
DFT-FE 1.0 is a highly parallelized density functional theory code that leverages CPU-GPU hybrid architectures to perform large-scale, accurate electronic structure calculations efficiently, achieving significant speed-ups and scalability.
Contribution
This work introduces DFT-FE 1.0 with improved electrostatic treatment and GPU acceleration, enabling large-scale DFT calculations with enhanced efficiency and parallel scalability.
Findings
Achieves a ~20x speed-up from GPU acceleration on hybrid CPU-GPU architectures.
Demonstrates accurate results comparable to established DFT codes.
Performs ground-state calculations on systems with 6,000-15,000 electrons in 80-140 seconds.
Abstract
We present DFT-FE 1.0, building on DFT-FE 0.6 [Comput. Phys. Commun. 246, 106853 (2020)], to conduct fast and accurate large-scale density functional theory (DFT) calculations (reaching ~ electrons) on both many-core CPU and hybrid CPU-GPU computing architectures. This work involves improvements in the real-space formulation -- via an improved treatment of the electrostatic interactions that substantially enhances the computational efficiency -- as well as high-performance computing aspects, including the GPU acceleration of all the key compute kernels in DFT-FE. We demonstrate the accuracy by comparing the ground-state energies, ionic forces and cell stresses on a wide range of benchmark systems against those obtained from widely used DFT codes. Further, we demonstrate the numerical efficiency of our implementation, which yields ~20x CPU-GPU speed-up by using GPU…
