Technical Report on Hypergraph-Partitioning-Based Models and Methods for Exploiting Cache Locality in Sparse-Matrix Vector Multiplication

Kadir Akbudak; Enver Kayaaslan; Cevdet Aykanat

arXiv:1202.3856·cs.NA·February 28, 2012·2 cites

TL;DR

This paper introduces hypergraph-partitioning-based models and methods that reorder and split sparse matrices to improve cache locality in sparse matrix-vector multiplication (SpMxV).

Contribution

The paper proposes cache-size-aware, hypergraph-partitioning-based row/column reordering and matrix-splitting techniques for SpMxV computations, and reports that they outperform existing schemes.

Findings

01 Proposed methods outperform state-of-the-art schemes.

02 Cache-size-aware reordering improves cache utilization.

03 Matrix splitting enhances temporal locality in SpMxV.

Abstract

The sparse matrix-vector multiplication (SpMxV) is a kernel operation widely used in iterative linear solvers. The same sparse matrix is multiplied by a dense vector repeatedly in these solvers. Matrices with irregular sparsity patterns make it difficult to utilize cache locality effectively in SpMxV computations. In this work, we investigate single- and multiple-SpMxV frameworks for exploiting cache locality in SpMxV computations. For the single-SpMxV framework, we propose two cache-size-aware top-down row/column-reordering methods based on 1D and 2D sparse matrix partitioning by utilizing the column-net and enhancing the row-column-net hypergraph models of sparse matrices. The multiple-SpMxV framework depends on splitting a given matrix into a sum of multiple nonzero-disjoint matrices so that the SpMxV operation is performed as a sequence of multiple input- and output-dependent SpMxV…
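The multiple-SpMxV idea in the abstract can be illustrated with a minimal sketch: a matrix A in CSR form is split into two nonzero-disjoint matrices A1 and A2 with A = A1 + A2, and y = Ax is then computed as a sequence of SpMxV operations that share the input vector x and accumulate into the same output vector y. The function names (`spmv_accumulate`, `split_by_column`) and the splitting rule are assumptions made for illustration only. In particular, the fixed column threshold used here is a stand-in and is not the hypergraph-partitioning-based splitting that the paper proposes.

```python
from typing import List, Tuple

# CSR matrix as (row_ptr, col_idx, values); a hypothetical minimal layout.
CSR = Tuple[List[int], List[int], List[float]]


def spmv_accumulate(a: CSR, x: List[float], y: List[float]) -> None:
    """Compute y += A x for a CSR matrix A (y is updated in place)."""
    row_ptr, col_idx, values = a
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] += acc


def split_by_column(a: CSR, boundary: int) -> Tuple[CSR, CSR]:
    """Split A into nonzero-disjoint A1 + A2.

    Assumption for illustration: nonzeros in columns < boundary go to A1,
    the rest go to A2. The paper chooses the split differently.
    """
    row_ptr, col_idx, values = a
    parts = (([0], [], []), ([0], [], []))
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            part = parts[0] if col_idx[k] < boundary else parts[1]
            part[1].append(col_idx[k])
            part[2].append(values[k])
        for part in parts:
            part[0].append(len(part[1]))  # close row i in both parts
    return parts[0], parts[1]


# 3x4 example matrix:
# [[1, 0, 2, 0],
#  [0, 3, 0, 4],
#  [5, 0, 0, 6]]
a: CSR = ([0, 2, 4, 6], [0, 2, 1, 3, 0, 3], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x = [1.0, 2.0, 3.0, 4.0]

# Single-SpMxV: one pass over the whole matrix.
y_single = [0.0, 0.0, 0.0]
spmv_accumulate(a, x, y_single)

# Multiple-SpMxV: one pass per split matrix, accumulating into the same y.
a1, a2 = split_by_column(a, boundary=2)
y_multi = [0.0, 0.0, 0.0]
for part in (a1, a2):
    spmv_accumulate(part, x, y_multi)
```

Each pass in the multiple-SpMxV loop touches only the x entries referenced by its own split matrix, which is the property a locality-aware split would try to exploit. Both variants produce the same y because the split matrices have disjoint nonzeros that sum to A.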

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Matrix Theory and Algorithms · Distributed and Parallel Computing Systems