Compressed Multi-Row Storage Format for Sparse Matrices on Graphics Processing Units
Zbigniew Koza, Maciej Matyka, Sebastian Szkoda, and Łukasz Mirosław

TL;DR
This paper introduces a new sparse matrix storage format optimized for GPUs, significantly improving the performance of sparse matrix-vector multiplication compared to existing methods.
Contribution
The paper proposes a novel compressed multi-row storage format for sparse matrices that enhances GPU-based SpMV performance and is easily convertible from standard CRS format.
Findings
Achieved up to about 60% speedup over the best of five existing GPU kernels
Validated performance on over 130 sparse matrices on Fermi-class and Kepler-class GPUs
Demonstrated compatibility with standard CRS format for easy adoption
Abstract
A new format for storing sparse matrices is proposed for efficient sparse matrix-vector (SpMV) product calculation on modern graphics processing units (GPUs). This format extends the standard compressed row storage (CRS) format and can be quickly converted to and from it. Computational performance of two SpMV kernels for the new format is determined for over 130 sparse matrices on Fermi-class and Kepler-class GPUs and compared with that of five existing generic algorithms and industrial implementations, including Nvidia cuSparse CSR and HYB kernels. We found a speedup of up to about 60% over the best of the five alternative kernels.
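For readers unfamiliar with the baseline, the following is a minimal sketch of the SpMV product for a matrix held in the standard CRS format that the proposed format extends. It is a plain sequential reference, not the paper's GPU kernels and not the new multi-row format itself; the function name `crs_spmv` and the array names `values`, `col_idx`, and `row_ptr` are illustrative assumptions rather than names taken from the paper.

```python
def crs_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in standard CRS form.

    values  : nonzero entries of A, listed row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i] .. row_ptr[i + 1] delimits row i inside `values`
    (All names are hypothetical; this is a sequential reference sketch.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Walk only the stored nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Small 3x3 example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = crs_spmv(values, col_idx, row_ptr, x)
print(y)
```

The three arrays are all that CRS stores, so a format that extends CRS can, in principle, be converted to and from it by adding or dropping auxiliary index data while leaving `values` and `col_idx` in place.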
