Algorithms for Parallel Shared-Memory Sparse Matrix-Vector Multiplication on Unstructured Matrices
Kobe Bergmans, Karl Meerbergen, Raf Vandebril

TL;DR
This paper develops and compares new parallel algorithms for sparse matrix-vector multiplication on unstructured matrices, addressing load balancing and memory-bound challenges, and provides high-performance open-source implementations.
Contribution
The paper introduces six new hybrid algorithms for parallel SpMV on shared-memory systems, which combine optimization techniques of existing algorithms, and analyzes format conversion costs.
Findings
One hybrid algorithm outperforms existing algorithms by 19% on multi-CPU systems.
Format conversion costs can take hundreds of SpMV multiplications to amortize.
Open-source implementation enables benchmarking across architectures.
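The amortization finding can be made concrete with a small break-even calculation. The sketch below is illustrative only: the function name and the timings are hypothetical and are not taken from the paper. It assumes the conversion is paid once and that each multiplication in the converted format saves a fixed amount of time over the baseline.

```python
import math

def break_even_multiplications(conversion_time, baseline_time, converted_time):
    """Smallest number of SpMV calls after which a one-off format conversion pays off.

    All arguments use the same time unit. Returns None when the converted
    format is not faster per multiplication, so the cost is never recovered.
    """
    saving_per_call = baseline_time - converted_time
    if saving_per_call <= 0:
        return None
    return math.ceil(conversion_time / saving_per_call)

# Hypothetical timings in microseconds (assumed values, not measurements):
# conversion costs 500 ms, and each multiplication drops from 10 ms to 8 ms.
calls_needed = break_even_multiplications(500_000, 10_000, 8_000)
print(calls_needed)  # 250
```

With these assumed numbers, conversion only pays off for workloads that reuse the same matrix at least 250 times, such as long-running iterative solvers.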
Abstract
Sparse matrix-vector (SpMV) multiplication is an important computational kernel, but it is notoriously difficult to execute efficiently. This paper investigates algorithm performance for unstructured sparse matrices, which are more common than ever because of the trend towards large-scale data collection. Developing an SpMV multiplication algorithm for this type of data is hard for two reasons. First, parallel load balancing issues arise because of the unpredictable nonzero structure. Second, SpMV multiplication algorithms are inevitably memory-bound because the sparsity causes a low arithmetic intensity. Three state-of-the-art algorithms for parallel SpMV multiplication on shared-memory systems are discussed. Six new hybrid algorithms are developed which combine optimization techniques of the current algorithms. These techniques include parallelization strategies,…
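The load balancing problem described in the abstract can be illustrated with a minimal sketch. It assumes the common CSR (compressed sparse row) storage format as a baseline and is not one of the paper's algorithms; all function names are hypothetical. Rows are split so that each chunk holds roughly the same number of nonzeros rather than the same number of rows. The chunks are processed sequentially here for clarity. Each chunk writes a disjoint slice of `y`, so each could be handed to its own thread.

```python
from bisect import bisect_left

def balanced_row_splits(row_ptr, parts):
    """Row boundaries so that each chunk holds roughly nnz/parts nonzeros."""
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    splits = [0]
    for p in range(1, parts):
        target = p * nnz / parts
        # First row whose starting offset reaches the target nonzero count.
        splits.append(bisect_left(row_ptr, target, lo=splits[-1], hi=n_rows))
    splits.append(n_rows)
    return splits

def spmv_csr_rows(row_ptr, col_idx, values, x, y, row_begin, row_end):
    """Compute y[i] = sum_k A[i, k] * x[k] for rows in [row_begin, row_end)."""
    for i in range(row_begin, row_end):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]  # 2 flops per stored nonzero
        y[i] = acc

def spmv_partitioned(row_ptr, col_idx, values, x, parts):
    """SpMV over nonzero-balanced row chunks (sequential stand-in for threads)."""
    y = [0.0] * (len(row_ptr) - 1)
    splits = balanced_row_splits(row_ptr, parts)
    for begin, end in zip(splits, splits[1:]):
        spmv_csr_rows(row_ptr, col_idx, values, x, y, begin, end)
    return y

# Skewed 4x4 example: row 0 is dense, rows 1..3 hold one nonzero each.
row_ptr = [0, 4, 5, 6, 7]
col_idx = [0, 1, 2, 3, 1, 2, 3]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

print(balanced_row_splits(row_ptr, 2))  # [0, 1, 4]
print(spmv_partitioned(row_ptr, col_idx, values, [1.0] * 4, 2))  # [10.0, 5.0, 6.0, 7.0]
```

In this example an even split by rows would give the two chunks 5 and 2 nonzeros, while the nonzero-balanced split gives 4 and 3. The inner loop also shows why the kernel is memory-bound: assuming 8-byte values and 4-byte column indices, each nonzero moves at least 12 bytes for 2 floating-point operations, which is well under one flop per byte.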
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Interconnection Networks and Systems · Matrix Theory and Algorithms
