CB-SpMV: A Data Aggregating and Balance Algorithm for Cache-Friendly Block-Based SpMV on GPUs

Xing Cong, Fukai Sun, Yifan Chen, Chenhao Xie*, Yi Liu, and Depei Qian

arXiv:2605.18515 · cs.DC · May 19, 2026

TL;DR

CB-SpMV is a cache-friendly, adaptive, and load-balanced algorithm for sparse matrix-vector multiplication (SpMV) on GPUs. It reports up to 3.95x speedup over existing methods, along with higher cache hit rates.

Contribution

The paper proposes a new data convergent 2D blocking structure, adaptive sub-block formats, and an inter-block load-balancing algorithm for optimized GPU-based SpMV.
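To make the idea of 2D blocking concrete, here is a minimal CPU-side sketch in plain Python. It is not the authors' data structure: the function names `build_blocks` and `blocked_spmv`, the fixed square tile size, and the COO-style storage inside each tile are all assumptions chosen for illustration. The point it shows is that each tile reads only one contiguous slice of the input vector, which is the property a blocked GPU kernel can exploit for cache locality.

```python
from collections import defaultdict


def build_blocks(rows, cols, vals, block_size):
    """Group COO nonzeros into block_size x block_size tiles (hypothetical layout)."""
    blocks = defaultdict(list)
    for r, c, v in zip(rows, cols, vals):
        key = (r // block_size, c // block_size)
        # Store tile-local coordinates so each tile is self-contained.
        blocks[key].append((r % block_size, c % block_size, v))
    return dict(blocks)


def blocked_spmv(blocks, x, n_rows, block_size):
    """Compute y = A @ x one tile at a time."""
    y = [0.0] * n_rows
    for (block_row, block_col), entries in blocks.items():
        row0 = block_row * block_size
        col0 = block_col * block_size
        # A tile only touches x[col0 : col0 + block_size].
        for local_row, local_col, v in entries:
            y[row0 + local_row] += v * x[col0 + local_col]
    return y


# A small 4x4 example with six nonzeros.
rows = [0, 0, 1, 2, 3, 3]
cols = [0, 3, 1, 2, 0, 3]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0, 4.0]

blocks = build_blocks(rows, cols, vals, block_size=2)
y = blocked_spmv(blocks, x, n_rows=4, block_size=2)
print(y)  # [9.0, 6.0, 12.0, 29.0]
```

With a tile size of 2, the six nonzeros fall into four tiles, and the result matches an unblocked SpMV over the same matrix.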

Findings

01. Achieves up to 3.95x speedup over existing methods.

02. Significantly improves cache hit rates.

03. Effectively balances workload across GPU thread blocks.
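As a rough illustration of what balancing work across thread blocks involves, the sketch below assigns sub-blocks to workers with a greedy longest-first heuristic. This is a generic technique swapped in for illustration, not the paper's inter-block load-balancing algorithm, which this page does not describe in detail. The name `balance_blocks`, the use of nonzero count as the cost measure, and the sample block sizes are assumptions.

```python
import heapq


def balance_blocks(block_nnz, n_workers):
    """Greedy longest-first assignment of sub-blocks to workers (illustrative only)."""
    # Min-heap of (current_load, worker_id) so the lightest worker is popped first.
    heap = [(0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    # Place the heaviest sub-blocks first, each on the least-loaded worker.
    for block_id, nnz in sorted(block_nnz.items(), key=lambda kv: -kv[1]):
        load, w = heapq.heappop(heap)
        assignment[w].append(block_id)
        heapq.heappush(heap, (load + nnz, w))
    loads = {w: sum(block_nnz[b] for b in ids) for w, ids in assignment.items()}
    return assignment, loads


# Hypothetical nonzero counts per sub-block.
block_nnz = {"b0": 90, "b1": 10, "b2": 40, "b3": 35, "b4": 25}
assignment, loads = balance_blocks(block_nnz, n_workers=2)
print(loads)  # {0: 100, 1: 100}
```

In this example one worker receives the single heavy sub-block plus the lightest one, while the other receives the three medium ones, so both end up with 100 nonzeros.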

Abstract

Sparse matrix-vector multiplication (SpMV) is crucial in computational science, engineering, and machine learning. Despite substantial efforts to improve SpMV performance on GPUs through various techniques, issues related to data locality, hardware utilization, and load balancing persist, leaving room for further optimization. This paper presents CB-SpMV, a cache-friendly SpMV optimization algorithm, using a novel data convergent and adaptable 2D blocking structure. The matrix in CB-SpMV is divided into independent sub-blocks, with virtual pointers aggregating different types of intra-block data for better cache-level data locality. To enhance hardware utilization, a block-aware column aggregation strategy and the selection of sub-block formats are proposed to accelerate computation and adapt to varying sparse matrices. Finally, an inter-block load-balancing algorithm is designed to…
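The abstract mentions selecting a storage format per sub-block so the method adapts to different sparsity patterns. A minimal sketch of that idea follows. The candidate formats (`dense`, `csr`, `coo`), the density-based rule, and both thresholds are assumptions for illustration; the abstract does not state which formats or criteria CB-SpMV uses.

```python
def choose_format(nnz, block_rows, block_cols,
                  dense_threshold=0.5, coo_threshold=0.05):
    """Pick a storage format for one sub-block from its density (hypothetical rule)."""
    density = nnz / (block_rows * block_cols)
    if density >= dense_threshold:
        # Mostly full: storing every entry avoids index overhead.
        return "dense"
    if density <= coo_threshold:
        # Very few nonzeros: explicit (row, col, value) triples are cheapest.
        return "coo"
    # In between: compressed rows amortize the row index.
    return "csr"


# Three hypothetical 16x16 sub-blocks with 200, 40, and 5 nonzeros.
formats = [choose_format(n, 16, 16) for n in (200, 40, 5)]
print(formats)  # ['dense', 'csr', 'coo']
```

A real implementation would likely weigh more than density, for example row-length variance or alignment with the GPU warp size, but the per-block decision has the same shape.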

Code & Models

Repositories

xing-cong/CB-Sparse (GitHub)