Performance Enhancement of the Ozaki Scheme on Integer Matrix Multiplication Unit
Yuki Uchino, Katsuhisa Ozaki, Toshiyuki Imamura

TL;DR
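This paper improves the Ozaki scheme as implemented on integer matrix multiplication units. It proposes methods that reduce the number of lower-precision matrix multiplications and higher-precision matrix additions, with the goal of achieving both sufficient accuracy and high performance on architectures, such as NVIDIA GPUs, that provide fast units for low-precision matrix multiplication.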
Contribution
It introduces alternative approaches that decrease the number of lower-precision multiplications and higher-precision additions in the Ozaki scheme, boosting efficiency.
Findings
Numerical experiments confirm the accuracy of the proposed methods.
Performance benchmarks show improved efficiency over previous implementations.
The approaches are suited to next-generation architectures that provide low-precision matrix multiplication units.
Abstract
This study was aimed at simultaneously achieving sufficient accuracy and high performance for general matrix multiplications. Recent architectures, such as NVIDIA GPUs, feature high-performance units designed for low-precision matrix multiplications in machine learning models, and next-generation architectures are expected to follow the same design principle. The key to achieving superior performance is to fully leverage such architectures. The Ozaki scheme, a highly accurate matrix multiplication algorithm using error-free transformations, enables higher-precision matrix multiplication to be performed through multiple lower-precision matrix multiplications and higher-precision matrix additions. Ootomo et al. implemented the Ozaki scheme on high-performance matrix multiplication units with the aim of achieving both sufficient accuracy and high performance. This paper proposes…
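The idea described in the abstract, one higher-precision product assembled from several exact lower-precision products plus higher-precision additions, can be illustrated with a small NumPy emulation. This is only a sketch under stated assumptions, not the implementation of the authors or of Ootomo et al.: the function names `split_into_int8_slices` and `ozaki_like_matmul` are hypothetical, the slices are produced by simple truncation after power-of-two scaling, int8 storage with int32 accumulation is assumed as a stand-in for an integer matrix multiplication unit, and none of the paper's proposed reductions are applied.

```python
import numpy as np

def split_into_int8_slices(M, num_slices, bits, axis):
    """Hypothetical helper: split M into int8 slices and a power-of-two scale.

    Up to truncation of the remainder after the last slice:
        M ~= scale * sum_s 2**(-bits*(s+1)) * slices[s]
    axis=1 scales each row (left factor); axis=0 scales each column (right factor).
    """
    assert 1 <= bits <= 7, "slice entries must fit in int8"
    max_abs = np.max(np.abs(M), axis=axis, keepdims=True)
    _, exp = np.frexp(max_abs)          # max_abs = m * 2**exp with 0.5 <= m < 1
    scale = np.ldexp(1.0, exp)          # power of two, so scaling is exact
    R = M / scale                       # |R| < 1 everywhere
    slices = []
    for _ in range(num_slices):
        R = R * 2.0**bits               # shift the next `bits` bits above the point
        S = np.trunc(R)                 # integer part, |S| <= 2**bits - 1
        slices.append(S.astype(np.int8))
        R = R - S                       # exact remainder, |R| < 1 again
    return slices, scale

def ozaki_like_matmul(A, B, num_slices=8, bits=7):
    """Hypothetical sketch: float64 product built from exact integer products."""
    k = A.shape[1]
    # Each slice product must be exact when accumulated in int32.
    assert k * (2**bits - 1) ** 2 < 2**31, "int32 accumulation would overflow"
    A_slices, scale_A = split_into_int8_slices(A, num_slices, bits, axis=1)
    B_slices, scale_B = split_into_int8_slices(B, num_slices, bits, axis=0)
    C = np.zeros((A.shape[0], B.shape[1]), dtype=np.float64)
    for s, A_s in enumerate(A_slices):
        for t, B_t in enumerate(B_slices):
            # Lower-precision (integer) matrix multiplication, error-free.
            P = A_s.astype(np.int32) @ B_t.astype(np.int32)
            # Higher-precision (float64) addition of the rescaled partial product.
            C += np.ldexp(P.astype(np.float64), -bits * (s + t + 2))
    return C * scale_A * scale_B        # undo the row and column scaling

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 24))
B = rng.standard_normal((24, 8))
C = ozaki_like_matmul(A, B)
print(np.max(np.abs(C - A @ B)))        # small difference from the float64 product
```

In this sketch the cost is `num_slices**2` integer products and as many float64 additions, which are exactly the two counts the paper sets out to reduce; using fewer slices lowers the cost at the price of accuracy.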
Taxonomy
Topics: Cryptography and Residue Arithmetic · Numerical Methods and Algorithms · Coding theory and cryptography
