A survey of sparse matrix-vector multiplication performance on large matrices
Max Grossman, Christopher Thiele, Mauricio Araya-Polo, Florian Frank, Faruk O. Alpak, Vivek Sarkar

TL;DR
This paper surveys the performance of sparse matrix-vector multiplication on large matrices across various software libraries, hardware architectures, and matrix formats, providing a comprehensive comparison of their efficiency.
Contribution
It offers a detailed third-party performance comparison of multiple SpMV implementations on large matrices, spanning several hardware architectures and matrix formats, which was previously lacking.
Findings
Performance varies significantly across libraries and hardware.
Certain formats outperform others depending on the architecture.
The survey highlights optimal configurations for large-scale SpMV computations.
Abstract
We contribute a third-party survey of sparse matrix-vector (SpMV) product performance on industrial-strength, large matrices using: (1) The SpMV implementations in Intel MKL, the Trilinos project (Tpetra subpackage), the CUSPARSE library, and the CUSP library, each running on modern architectures. (2) NVIDIA GPUs and Intel multi-core CPUs (supported by each software package). (3) The CSR, BSR, COO, HYB, and ELL matrix formats (supported by each software package).
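For readers unfamiliar with the formats named in the abstract, the following is a minimal, illustrative sketch of the SpMV product y = A x for two of them, CSR and COO. It is written in plain Python for clarity and is not taken from the paper or from any of the surveyed libraries; the function names `csr_spmv` and `coo_spmv` and the small 3x3 example matrix are assumptions made purely for illustration.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x with A stored in CSR (compressed sparse row) form.

    row_ptr[i]:row_ptr[i + 1] delimits the nonzeros of row i inside
    col_idx (column indices) and values (nonzero entries).
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def coo_spmv(n_rows, rows, cols, values, x):
    """Compute y = A @ x with A stored in COO (coordinate) form.

    Each nonzero is an explicit (row, column, value) triple.
    """
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, values):
        y[r] += v * x[c]
    return y


# Hypothetical example matrix (illustration only):
#   [[4, 0, 1],
#    [0, 2, 0],
#    [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]           # CSR row offsets
coo_rows = [0, 0, 1, 2, 2]       # COO row index per nonzero

x = [1.0, 2.0, 3.0]
y_csr = csr_spmv(row_ptr, col_idx, values, x)
y_coo = coo_spmv(3, coo_rows, col_idx, values, x)
```

Both routines perform the same arithmetic; they differ only in how the nonzeros are indexed. CSR stores one offset per row, while COO stores an explicit row index for every nonzero. Storage layout is the kind of difference that a format comparison such as this survey examines.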
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
