New Row-grouped CSR format for storing the sparse matrices on GPU with implementation in CUDA
Tomáš Oberhuber, Atsushi Suzuki, Jan Vacata

TL;DR
This paper introduces a new GPU-optimized sparse matrix storage format called Row-grouped CSR, implemented in CUDA, and compares its performance against the Hybrid format across 1,600 matrices.
Contribution
The paper presents a new sparse matrix storage format tailored for GPUs, describes its CUDA implementation, and compares its performance in detail against the Hybrid format.
Findings
The new format performs better on certain matrix types.
The comparison identifies the strengths and weaknesses of both the new format and the Hybrid format.
Performance varies depending on matrix characteristics.
Abstract
In this article we present a new format for storing sparse matrices. The format is designed to perform well mainly on GPU devices. We present its implementation in CUDA. Its performance has been tested on 1,600 different types of matrices, and we compare our format with the Hybrid format. We give a detailed comparison of both formats and show their strengths and weaknesses.
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Matrix Theory and Algorithms · Distributed and Parallel Computing Systems
