CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication
Weifeng Liu, Brian Vinter

TL;DR
CSR5 is a sparse matrix storage format for high-throughput sparse matrix-vector multiplication (SpMV) on CPUs, GPUs and Xeon Phi. It benefits irregular matrices most, and converting a matrix from CSR to CSR5 can cost as little as a few SpMV operations.
Contribution
The paper introduces CSR5, a storage format that is insensitive to the sparsity structure of the input matrix, so a single format supports efficient SpMV on both regular and irregular matrices across diverse hardware, with low conversion cost from CSR.
Findings
Comparable or better performance on the 14 regular matrices in the benchmark suite.
Significant performance improvements on the 10 irregular matrices across the four tested processors.
Conversion from CSR to CSR5 can cost as little as a few SpMV operations, which makes CSR5 practical for real-world applications.
Abstract
Sparse matrix-vector multiplication (SpMV) is a fundamental building block for numerous applications. In this paper, we propose CSR5 (Compressed Sparse Row 5), a new storage format, which offers high-throughput SpMV on various platforms including CPUs, GPUs and Xeon Phi. First, the CSR5 format is insensitive to the sparsity structure of the input matrix. Thus the single format can support an SpMV algorithm that is efficient both for regular matrices and for irregular matrices. Furthermore, we show that the overhead of the format conversion from the CSR to the CSR5 can be as low as the cost of a few SpMV operations. We compare the CSR5-based SpMV algorithm with 11 state-of-the-art formats and algorithms on four mainstream processors using 14 regular and 10 irregular matrices as a benchmark suite. For the 14 regular matrices in the suite, we achieve comparable or better performance over…
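For orientation, here is a minimal sketch of SpMV over the plain CSR format, the starting point from which the abstract says matrices are converted to CSR5. It is not the CSR5 algorithm, and the names `csr_spmv`, `row_lengths`, `row_ptr`, `col_idx` and `values` are illustrative assumptions rather than identifiers from the paper. The helper `row_lengths` shows the per-row nonzero counts; when these vary widely (an irregular matrix), a row-by-row loop like this one does uneven work per row, which is one common reason a structure-insensitive format is attractive.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i + 1] is the slice of col_idx/values holding row i.
    This is baseline CSR, not CSR5.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Walk the nonzeros of row i and accumulate the dot product.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def row_lengths(row_ptr):
    """Number of stored nonzeros in each row."""
    return [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]


# Hypothetical 3x4 example matrix (not from the paper):
# [10  0  0  2]
# [ 0  0  0  0]
# [ 1  3  5  7]
row_ptr = [0, 2, 2, 6]
col_idx = [0, 3, 0, 1, 2, 3]
values = [10.0, 2.0, 1.0, 3.0, 5.0, 7.0]
x = [1.0, 2.0, 3.0, 4.0]

y = csr_spmv(row_ptr, col_idx, values, x)
lengths = row_lengths(row_ptr)
```

In this toy matrix the rows hold 2, 0 and 4 nonzeros, so even at this size the inner loop does a different amount of work for each row.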
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
