Advanced Techniques for High-Performance Fock Matrix Construction on GPU Clusters
Elise Palethorpe, Ryan Stocks, and Giuseppe M. J. Barca

TL;DR
This paper introduces two optimized multi-GPU algorithms, opt-UM and opt-Brc, for efficient Fock matrix construction in electronic structure calculations, achieving significant speedups and scalability improvements over existing methods.
Contribution
The paper presents novel multi-GPU algorithms that improve Fock matrix construction through better integral screening, exploitation of sparsity and symmetry, and a linear-scaling exchange matrix assembly, and that extend Hartree-Fock calculations to higher angular momentum functions.
Findings
The combined algorithms run faster than the existing GPU and CPU Fock build implementations they were benchmarked against.
Achieve up to 8.5× speedup on benchmark systems.
Maintain over 91% parallel efficiency on four GPUs.
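
The efficiency figure above can be turned into an implied speedup with one line of arithmetic. The sketch below assumes the usual definition, parallel efficiency = (speedup over one GPU) / (number of GPUs); the summary does not state which definition the paper uses, and the function names are illustrative only.

```python
def parallel_efficiency(speedup: float, n_gpus: int) -> float:
    """Assumed definition: speedup relative to one GPU divided by the GPU count."""
    return speedup / n_gpus


def implied_speedup(efficiency: float, n_gpus: int) -> float:
    """Invert the definition above: speedup = efficiency * GPU count."""
    return efficiency * n_gpus


# 91% efficiency on four GPUs corresponds to a speedup of about 3.64x
# over a single GPU, compared with an ideal 4x.
print(implied_speedup(0.91, 4))
```

Under this definition, "over 91% on four GPUs" means the four-GPU run is more than roughly 3.64 times faster than the single-GPU run.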
Abstract
This Article presents two optimized multi-GPU algorithms for Fock matrix construction, building on the work of Ufimtsev et al. and Barca et al. The novel algorithms, opt-UM and opt-Brc, introduce significant enhancements, including improved integral screening, exploitation of sparsity and symmetry, a linear scaling exchange matrix assembly algorithm, and extended capabilities for Hartree-Fock calculations up to f-type angular momentum functions. Opt-Brc excels for smaller systems and for highly contracted triple-ζ basis sets, while opt-UM is advantageous for large molecular systems. Performance benchmarks on NVIDIA A100 GPUs show that our algorithms in the EXtreme-scale Electronic Structure System (EXESS), when combined, outperform all current GPU and CPU Fock build implementations in TeraChem, QUICK, GPU4PySCF, LibIntX, ORCA, and Q-Chem. The implementations were benchmarked on…
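
To make "integral screening" concrete, here is a minimal CPU sketch in NumPy of Schwarz-screened Coulomb (J) and exchange (K) assembly. It is not opt-UM or opt-Brc and says nothing about how EXESS implements screening: it only illustrates the textbook bound |(μν|λσ)| ≤ Q_μν Q_λσ with Q_μν = sqrt((μν|μν)). The model integrals, the threshold `tau`, and all function names are assumptions made for illustration, and the loop does not exploit the eightfold permutational symmetry.

```python
import numpy as np


def make_model_eri(n, naux=6, seed=0):
    """Hypothetical stand-in for two-electron integrals (not real chemistry).

    Built as sum_P B[m,n,P] * B[l,s,P], so the Schwarz inequality holds exactly.
    Pairs (m, n) with distant indices are damped to mimic sparsity.
    """
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n, n, naux))
    B = B + B.transpose(1, 0, 2)                      # symmetric in (m, n)
    idx = np.arange(n)
    decay = 10.0 ** (-8.0 * np.abs(idx[:, None] - idx[None, :]))
    B = B * decay[:, :, None]
    return np.einsum("mnP,lsP->mnls", B, B)


def build_jk_screened(eri, D, tau=1e-10):
    """Assemble J and K, skipping quartets whose Schwarz bound is below tau."""
    n = D.shape[0]
    Q = np.sqrt(np.abs(np.einsum("mnmn->mn", eri)))   # Q[m, n] = sqrt((mn|mn))
    d_max = np.abs(D).max()
    J = np.zeros((n, n))
    K = np.zeros((n, n))
    skipped = 0
    for m in range(n):
        for v in range(n):
            for l in range(n):
                for s in range(n):
                    # Upper bound on |(mv|ls) * D| for this quartet.
                    if Q[m, v] * Q[l, s] * d_max < tau:
                        skipped += 1
                        continue
                    g = eri[m, v, l, s]
                    J[m, v] += g * D[l, s]            # J_mv = sum_ls (mv|ls) D_ls
                    K[m, l] += g * D[v, s]            # K_ml = sum_vs (mv|ls) D_vs
    return J, K, skipped


n = 4
eri = make_model_eri(n)
rng = np.random.default_rng(1)
D = rng.standard_normal((n, n))
D = 0.5 * (D + D.T)                                   # symmetric model density

J, K, skipped = build_jk_screened(eri, D)
J_ref = np.einsum("mnls,ls->mn", eri, D)              # unscreened reference
K_ref = np.einsum("mnls,ns->ml", eri, D)
print("quartets skipped:", skipped, "of", n ** 4)
print("max |J - J_ref|:", np.abs(J - J_ref).max())
```

In a closed-shell Hartree-Fock calculation these matrices would enter the Fock matrix as F = H + J - K/2, with H the one-electron core Hamiltonian. Because every skipped quartet is bounded by `tau`, the screened result differs from the reference by at most the number of skipped terms times `tau` per matrix element.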
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Nanocluster Synthesis and Applications · Machine Learning and ELM
