Improving SpGEMM Performance Through Matrix Reordering and Cluster-wise Computation
Abdullah Al Raqibul Islam, Helen Xu, Dong Dai, Aydın Buluç

TL;DR
This paper introduces hierarchical clustering and matrix reordering techniques for sparse matrix-sparse matrix multiplication (SpGEMM). They reduce the data movement that bottlenecks the kernel, and hierarchical clustering speeds up computation by 1.39x on average.
Contribution
It proposes a novel hierarchical clustering approach combined with row reordering for SpGEMM, decoupling these optimizations for flexible application and demonstrating substantial speedups.
Findings
Hierarchical clustering speeds up SpGEMM by 1.39x on average.
Reordering based on graph partitioning outperforms other methods.
The combined approach achieves higher speedups with comparable preprocessing costs.
Abstract
Sparse matrix-sparse matrix multiplication (SpGEMM) is a key kernel in many scientific applications and graph workloads. Unfortunately, SpGEMM is bottlenecked by data movement due to its irregular memory access patterns. Significant work has been devoted to developing row reordering schemes towards improving locality in sparse operations, but prior studies mostly focus on the case of sparse-matrix vector multiplication (SpMV). In this paper, we address these issues with hierarchical clustering for SpGEMM that leverages both row reordering and cluster-wise computation to improve reuse in the second input (B) matrix with a novel row-clustered matrix format and access pattern in the first input (A) matrix. We find that hierarchical clustering can speed up SpGEMM by 1.39x on average with low preprocessing cost (less than 20x the cost of a single SpGEMM on about 90% of inputs).…
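To make the reuse argument in the abstract concrete, here is a minimal sketch that compares row-wise (Gustavson-style) SpGEMM with a simple cluster-wise variant. It is an illustration under stated assumptions, not the paper's implementation: matrices are stored as nested dictionaries rather than the paper's row-clustered format, the clusters are supplied by hand rather than produced by hierarchical clustering, and the function names and the "B row fetch" counter are hypothetical stand-ins for memory traffic.

```python
from collections import defaultdict

# Matrices are dicts: row index -> {column index -> value}.
# This is an illustrative layout, not the paper's row-clustered format.

def spgemm_rowwise(A, B):
    """Row-wise SpGEMM: each row of A fetches the B rows it needs."""
    C = {}
    b_row_fetches = 0  # hypothetical proxy for data movement on B
    for i, a_row in A.items():
        acc = defaultdict(float)
        for k, a_ik in a_row.items():
            b_row_fetches += 1
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj
        C[i] = dict(acc)
    return C, b_row_fetches

def spgemm_clusterwise(A, B, clusters):
    """Cluster-wise SpGEMM: rows in a cluster share each fetched B row."""
    C = {}
    b_row_fetches = 0
    for cluster in clusters:
        # Group the cluster's nonzeros by column of A (= row of B).
        by_col = defaultdict(list)
        for i in cluster:
            for k, a_ik in A.get(i, {}).items():
                by_col[k].append((i, a_ik))
        accs = {i: defaultdict(float) for i in cluster}
        for k, users in by_col.items():
            b_row = B.get(k, {})
            b_row_fetches += 1  # fetched once for the whole cluster
            for i, a_ik in users:
                for j, b_kj in b_row.items():
                    accs[i][j] += a_ik * b_kj
        for i in cluster:
            C[i] = dict(accs[i])
    return C, b_row_fetches

# Rows 0 and 2 of A share a column pattern, as do rows 1 and 3.
A = {0: {0: 1.0, 1: 2.0}, 1: {2: 1.0, 3: 1.0},
     2: {0: 3.0, 1: 1.0}, 3: {2: 2.0, 3: 4.0}}
B = {0: {0: 1.0}, 1: {1: 1.0}, 2: {0: 1.0, 2: 1.0}, 3: {3: 1.0}}

C_row, fetches_row = spgemm_rowwise(A, B)
C_clu, fetches_clu = spgemm_clusterwise(A, B, clusters=[[0, 2], [1, 3]])
print(fetches_row, fetches_clu)
```

On this toy input both functions produce the same product, but the row-wise version fetches a B row once per nonzero of A (8 fetches), while the cluster-wise version fetches each needed B row once per cluster (4 fetches). Grouping rows with similar column patterns is what makes the cluster-wise pass pay off, which is why reordering and clustering are used together.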
