A work-efficient parallel sparse matrix-sparse vector multiplication algorithm
Ariful Azad, Aydin Buluc

TL;DR
This paper presents a work-efficient multithreaded algorithm for sparse matrix-sparse vector multiplication that improves performance and scalability across diverse graph-related matrices on modern manycore processors.
Contribution
The paper introduces a simple, work-efficient parallel SpMSpV algorithm that avoids per-thread scans of the matrix columns and performs well on a variety of sparse matrices, improving on existing multithreaded methods.
Findings
Achieves up to 15x speedup on 24-core processors
Attains up to 49x speedup on 64-core manycore processors
Performs efficiently on diverse matrices with heterogeneous sparsity patterns
Abstract
We design and develop a work-efficient multithreaded algorithm for sparse matrix-sparse vector multiplication (SpMSpV) where the matrix, the input vector, and the output vector are all sparse. SpMSpV is an important primitive in the emerging GraphBLAS standard and is the workhorse of many graph algorithms including breadth-first search, bipartite graph matching, and maximal independent set. As thread counts increase, existing multithreaded SpMSpV algorithms can spend more time accessing the sparse matrix data structure than doing arithmetic. Our shared-memory parallel SpMSpV algorithm is work efficient in the sense that its total work is proportional to the number of arithmetic operations required. The key insight is to avoid having each thread individually scan the list of matrix columns. Our algorithm is simple to implement and operates on existing column-based sparse matrix formats. It…
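To make the notion of work efficiency concrete, here is a minimal sequential sketch of SpMSpV over a compressed sparse column (CSC) matrix. It is not the paper's multithreaded algorithm; it only illustrates the property the abstract describes, namely that the work done is proportional to the number of multiply-add operations, because only the columns selected by nonzeros of the input vector are visited. The function name `spmspv_csc`, the array names, and the dictionary-based accumulator are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List, Tuple

SparseVec = List[Tuple[int, float]]  # (index, value) pairs, nonzeros only


def spmspv_csc(col_ptr: List[int], row_idx: List[int], vals: List[float],
               x: SparseVec) -> SparseVec:
    """Compute y = A @ x with A in CSC form and x, y sparse (sketch only).

    Only columns j with x[j] != 0 are touched, so the loop body runs once
    per required multiply-add instead of once per matrix column.
    """
    acc: Dict[int, float] = {}  # sparse accumulator keyed by row index
    for j, xj in x:
        # Nonzeros of column j live in positions col_ptr[j] .. col_ptr[j+1]-1.
        for k in range(col_ptr[j], col_ptr[j + 1]):
            i = row_idx[k]
            acc[i] = acc.get(i, 0.0) + vals[k] * xj
    return sorted(acc.items())


# Hypothetical 3x3 example:
# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
col_ptr = [0, 2, 3, 5]
row_idx = [0, 2, 1, 0, 2]
vals = [1.0, 4.0, 3.0, 2.0, 5.0]
x = [(0, 1.0), (2, 2.0)]  # x = [1, 0, 2]

print(spmspv_csc(col_ptr, row_idx, vals, x))  # prints [(0, 5.0), (2, 14.0)]
```

Column 1 is never read in this example because `x[1]` is zero. A parallel version has to decide how threads share the selected columns and merge their partial results without each thread scanning the full column list, which is the problem the abstract says the paper addresses.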
Taxonomy
Topics: Graph Theory and Algorithms · Parallel Computing and Optimization Techniques · Interconnection Networks and Systems
