GE-SpMM: General-purpose Sparse Matrix-Matrix Multiplication on GPUs for Graph Neural Networks
Guyue Huang, Guohao Dai, Yu Wang, Huazhong Yang

TL;DR
This paper introduces GE-SpMM, a GPU-optimized sparse matrix multiplication method that supports general GNN operations without preprocessing overheads, significantly accelerating GNN training and inference.
Contribution
GE-SpMM provides a novel GPU algorithm for SpMM in CSR format, enabling efficient, general GNN operations with reduced data conversion and improved parallelism.
Findings
Up to 1.41X speedup over Nvidia cuSPARSE
Up to 1.81X speedup over GraphBLAST
Up to 3.67X speedup in GNN frameworks
Abstract
Graph Neural Networks (GNNs) have achieved significant improvements in various domains. Sparse Matrix-Matrix multiplication (SpMM), which multiplies a sparse matrix by a dense matrix, is a fundamental operator in GNNs. Accelerating SpMM on parallel hardware such as GPUs faces challenges from two perspectives. From the GNN application perspective, compatibility must be considered: general GNN algorithms require SpMM-like operations (e.g., pooling) between matrices, which are not supported in current high-performance GPU libraries (e.g., Nvidia cuSPARSE). Moreover, the sophisticated preprocessing in previous implementations leads to heavy data format conversion overheads in GNN frameworks. From the GPU hardware perspective, optimizations in SpMV (Sparse Matrix-Vector) designs on GPUs do not apply well to SpMM. SpMM exposes the column-wise parallelism in the…
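To make the terms in the abstract concrete, here is a minimal CPU reference sketch of SpMM over a CSR matrix, with a pluggable reduction so the same loop also covers an SpMM-like operation such as max pooling. It is not the paper's GPU kernel. The function name `csr_spmm`, its parameters, and the toy matrices are assumptions made for illustration only.

```python
import math

def csr_spmm(indptr, indices, data, dense, reduce=lambda acc, x: acc + x, init=0.0):
    """Reference SpMM-like op on a CSR matrix (hypothetical helper, not from the paper).

    out[i][j] = reduce over nonzeros k of row i of data[k] * dense[indices[k]][j]
    """
    n_rows = len(indptr) - 1
    n_cols = len(dense[0]) if dense else 0
    out = [[init] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Nonzeros of row i occupy positions indptr[i] .. indptr[i + 1] - 1.
        for k in range(indptr[i], indptr[i + 1]):
            col, val = indices[k], data[k]
            # Each output column j is computed independently of the others.
            for j in range(n_cols):
                out[i][j] = reduce(out[i][j], val * dense[col][j])
    return out

# Toy sparse matrix A = [[1, 0, 2], [0, 0, 3], [4, 5, 0]] stored in CSR form.
indptr = [0, 2, 3, 5]
indices = [0, 2, 2, 0, 1]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
# Toy dense matrix B (3 x 2), standing in for a node-feature matrix.
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Standard SpMM: sum reduction.
sum_out = csr_spmm(indptr, indices, data, B)
# SpMM-like max pooling: same traversal, max reduction, identity of -inf.
max_out = csr_spmm(indptr, indices, data, B, reduce=max, init=-math.inf)

print(sum_out)  # [[11.0, 14.0], [15.0, 18.0], [19.0, 28.0]]
print(max_out)  # [[10.0, 12.0], [15.0, 18.0], [15.0, 20.0]]
```

Only the reduction and its identity value change between the two calls; the CSR traversal is the same. In this sketch a row with no nonzeros keeps the identity value (`-inf` for max), which a real framework would likely handle explicitly.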
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Machine Learning in Materials Science
