Technical Report on Hypergraph-Partitioning-Based Models and Methods for Exploiting Cache Locality in Sparse-Matrix Vector Multiplication
Kadir Akbudak, Enver Kayaaslan, Cevdet Aykanat

TL;DR
This paper introduces hypergraph-partitioning-based models and methods for reordering and splitting sparse matrices to improve cache locality in sparse matrix-vector multiplication (SpMxV), and reports better performance than existing schemes.
Contribution
It proposes novel cache-aware hypergraph partitioning and matrix splitting techniques for optimizing SpMxV computations, outperforming existing schemes.
Findings
Proposed methods outperform state-of-the-art schemes.
Cache-size-aware reordering improves cache utilization.
Matrix splitting enhances temporal locality in SpMxV.
Abstract
Sparse matrix-vector multiplication (SpMxV) is a kernel operation widely used in iterative linear solvers. The same sparse matrix is multiplied by a dense vector repeatedly in these solvers. Matrices with irregular sparsity patterns make it difficult to utilize cache locality effectively in SpMxV computations. In this work, we investigate single- and multiple-SpMxV frameworks for exploiting cache locality in SpMxV computations. For the single-SpMxV framework, we propose two cache-size-aware top-down row/column-reordering methods based on 1D and 2D sparse matrix partitioning by utilizing the column-net and enhancing the row-column-net hypergraph models of sparse matrices. The multiple-SpMxV framework depends on splitting a given matrix into a sum of multiple nonzero-disjoint matrices so that the SpMxV operation is performed as a sequence of multiple input- and output-dependent SpMxV…
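To make the multiple-SpMxV idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the matrix is stored in CSR form and uses two hypothetical helpers, spmv_csr and split_csr. The assignment of nonzeros to parts (part_of_nonzero) is an arbitrary illustrative input here; in the paper that assignment comes from hypergraph partitioning, which this sketch does not attempt. It only shows that y = A x can be computed as a sequence of SpMxV operations on nonzero-disjoint matrices A_1, ..., A_K that share the input vector x and accumulate into the same output vector y.

```python
def spmv_csr(row_ptr, col_idx, vals, x, y=None):
    """Accumulate y += A @ x for a CSR matrix A and return y.

    If y is None, a zero vector with one entry per row is created first.
    """
    n_rows = len(row_ptr) - 1
    if y is None:
        y = [0.0] * n_rows
    for i in range(n_rows):
        acc = y[i]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y


def split_csr(row_ptr, col_idx, vals, part_of_nonzero, n_parts):
    """Split A into n_parts nonzero-disjoint CSR matrices whose sum is A.

    part_of_nonzero[k] names the part that receives the k-th stored nonzero.
    This assignment is an assumption of the sketch, supplied by the caller.
    """
    # Each part is a (row_ptr, col_idx, vals) triple with the same row count as A.
    parts = [([0], [], []) for _ in range(n_parts)]
    n_rows = len(row_ptr) - 1
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            _, p_cols, p_vals = parts[part_of_nonzero[k]]
            p_cols.append(col_idx[k])
            p_vals.append(vals[k])
        # Close row i in every part, including parts with no nonzero in this row.
        for p_row_ptr, p_cols, _ in parts:
            p_row_ptr.append(len(p_cols))
    return parts


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

# Single-SpMxV: one pass over the whole matrix.
y_single = spmv_csr(row_ptr, col_idx, vals, x)

# Multiple-SpMxV: one pass per part, all accumulating into the same y.
parts = split_csr(row_ptr, col_idx, vals, [0, 1, 0, 1, 0], n_parts=2)
y_multi = None
for p_row_ptr, p_cols, p_vals in parts:
    y_multi = spmv_csr(p_row_ptr, p_cols, p_vals, x, y_multi)
```

Both routes give the same result vector; only the order in which entries of x and y are touched changes, and that order is what determines cache behavior.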
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Matrix Theory and Algorithms · Distributed and Parallel Computing Systems
