An Optimized Sparse Approximate Matrix Multiply for Matrices with Decay
Nicolas Bock, Matt Challacombe

TL;DR
This paper introduces an optimized implementation of a sparse approximate matrix multiplication algorithm for matrices with decay. It achieves lower error and higher speed than dense matrix routines on quantum chemical matrices and scales to large matrix sizes.
Contribution
The paper presents an optimized implementation of the \textsc{SpAMM} algorithm that is both more accurate and faster than standard dense routines, and significantly faster than naive \textsc{SpAMM} implementations built on vendor libraries.
Findings
Achieves $O(n \log n)$ complexity for matrices with decay.
Outperforms dense routines such as {\tt SGEMM} in both accuracy and speed, with the cross-over already at matrix dimensions of around 1000.
Improved hardware prefetching could potentially increase the speed by a further factor of two to three.
Abstract
We present an optimized single-precision implementation of the Sparse Approximate Matrix Multiply (\SpAMM{}) [M. Challacombe and N. Bock, arXiv {\bf 1011.3534} (2010)], a fast algorithm for matrix-matrix multiplication for matrices with decay that achieves an $O(n \log n)$ computational complexity with respect to matrix dimension $n$. We find that the max norm of the error achieved with a \SpAMM{} tolerance below a certain threshold is lower than that of the single-precision {\tt SGEMM} for dense quantum chemical matrices, while outperforming {\tt SGEMM} with a cross-over already for small matrices ($n \sim 1000$). Relative to naive implementations of \SpAMM{} using Intel's Math Kernel Library ({\tt MKL}) or AMD's Core Math Library ({\tt ACML}), our optimized version is found to be significantly faster. Detailed performance comparisons are made for quantum chemical matrices…
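The abstract does not spell out how the algorithm works. As a rough illustration of the general idea behind a sparse approximate multiply, the following NumPy sketch recursively splits both matrices into quadrants and skips any sub-product whose Frobenius-norm product falls below a tolerance. This is a minimal sketch under stated assumptions, not the authors' optimized implementation: the names `spamm`, `decay_matrix`, `tau`, and `leaf`, the exponentially decaying test matrix, and the restriction to square matrices of size `leaf * 2**k` are all chosen here for illustration.

```python
import numpy as np


def spamm(A, B, tau, leaf=16):
    """Approximate A @ B, skipping sub-products with small norm.

    Illustrative assumption: A and B are square with size leaf * 2**k,
    so every recursion level splits evenly into four quadrants.
    """
    n = A.shape[0]
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    # If the norm product is below tau, drop this sub-product entirely.
    if np.linalg.norm(A) * np.linalg.norm(B) < tau:
        return np.zeros((n, n), dtype=A.dtype)
    # Small blocks are multiplied densely.
    if n <= leaf:
        return A @ B
    h = n // 2
    C = np.zeros((n, n), dtype=A.dtype)
    # C[i, j] accumulates A[i, k] @ B[k, j] over the two k quadrants.
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                C[i * h:(i + 1) * h, j * h:(j + 1) * h] += spamm(
                    A[i * h:(i + 1) * h, k * h:(k + 1) * h],
                    B[k * h:(k + 1) * h, j * h:(j + 1) * h],
                    tau,
                    leaf,
                )
    return C


def decay_matrix(n, rate=1.0):
    """Hypothetical test matrix whose entries decay as exp(-rate * |i - j|)."""
    idx = np.arange(n)
    return np.exp(-rate * np.abs(idx[:, None] - idx[None, :])).astype(np.float32)


if __name__ == "__main__":
    A = decay_matrix(256)
    exact = A @ A
    approx = spamm(A, A, tau=1e-3)
    print("max-norm error:", np.max(np.abs(approx - exact)))
```

With `tau = 0` no sub-product is skipped and the recursion reproduces the dense product up to floating-point rounding. Larger tolerances trade accuracy for skipped work, which pays off when far-off-diagonal blocks have small norms, as in matrices with decay.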
