A Precision Emulation Approach to the GPU Acceleration of Ab Initio Electronic Structure Calculations
Hang Liu, Junjie Li, Yinzhi Wang, Niraj K. Nepal, Yang Wang

TL;DR
This paper presents an INT8-based emulation method for GPU acceleration of HPC workloads, enabling tunable precision to improve both accuracy and performance without altering existing algorithms.
Contribution
It introduces a novel INT8 emulation approach that preserves original algorithms while optimizing hardware utilization for scientific computing.
Findings
Accuracy depends on both the arithmetic precision and the properties of the operator.
Tunable precision emulation can improve both accuracy and performance.
The approach enables GPU acceleration without code modifications.
Abstract
This study explores the use of INT8-based emulation for accelerating traditional FP64-based HPC workloads on modern GPU architectures. Using SCILIB-Accel, an automatic BLAS offload tool for cache-coherent Unified Memory Architecture, we emulate FP64 matrix multiplications in the LSMS CPU application in the MuST suite without code changes. We find that accuracy depends on both arithmetic precision and the properties of the operator, which can be addressed through tunable precision emulation. Unlike traditional mixed-precision approaches, this method preserves original algorithms while optimizing hardware utilization. We show that accuracy and performance can potentially be improved at the same time. This work highlights the potential of AI-driven hardware to transform HPC, advocating for adaptive precision strategies in future scientific computing.
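To make the idea of tunable-precision INT8 emulation concrete, here is a minimal NumPy sketch. The abstract does not specify the emulation algorithm, so this assumes a slice-based fixed-point scheme (similar in spirit to the Ozaki scheme): each FP64 matrix is scaled by powers of two and split into several INT8 slices, the slices are multiplied with integer arithmetic, and the partial products are recombined in FP64. The names `split_int8`, `emulated_matmul`, and `BITS` are hypothetical, and NumPy integer matmul stands in for the GPU's INT8 tensor-core GEMM.

```python
import numpy as np

BITS = 6  # bits per slice (assumed); magnitudes up to 63 fit in int8


def split_int8(X, num_slices, axis):
    """Scale X by powers of two along `axis`, then peel off int8 slices."""
    # Exponent e such that max|X| along the axis is strictly below 2**e.
    _, exp = np.frexp(np.max(np.abs(X), axis=axis, keepdims=True))
    rem = np.ldexp(X, -exp)  # exact scaling; all entries now lie in (-1, 1)
    slices = []
    for _ in range(num_slices):
        rem = rem * 2.0**BITS      # exact: multiplication by a power of two
        digit = np.trunc(rem)      # integer part, magnitude at most 63
        slices.append(digit.astype(np.int8))
        rem = rem - digit          # exact remainder, back in (-1, 1)
    return slices, exp


def emulated_matmul(A, B, num_slices):
    """Approximate the FP64 product A @ B using only int8 operands."""
    A_sl, ea = split_int8(A, num_slices, axis=1)  # per-row scaling of A
    B_sl, eb = split_int8(B, num_slices, axis=0)  # per-column scaling of B
    C = np.zeros((A.shape[0], B.shape[1]))
    for s, As in enumerate(A_sl):
        for t, Bt in enumerate(B_sl):
            # Integer GEMM accumulated in int32; safe while the inner
            # dimension k satisfies k * 63**2 < 2**31.
            P = As.astype(np.int32) @ Bt.astype(np.int32)
            # Slice s carries weight 2**(-BITS*(s+1)), slice t likewise.
            C += np.ldexp(P.astype(np.float64), -BITS * (s + t + 2))
    return np.ldexp(C, ea + eb)  # undo the row and column scaling


rng = np.random.default_rng(0)
A = rng.standard_normal((16, 32))
B = rng.standard_normal((32, 8))
ref = A @ B
for n in (2, 4, 8):
    err = np.max(np.abs(emulated_matmul(A, B, n) - ref)) / np.max(np.abs(ref))
    print(f"slices={n}  max relative error={err:.3e}")
```

The number of slices is the tuning knob: more slices retain more mantissa bits at the cost of more integer GEMMs (this sketch performs `num_slices**2` of them), which is one way to read the abstract's trade-off between accuracy and performance.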
