Accelerating SpMM Kernel with Cache-First Edge Sampling for Graph Neural Networks
Chien-Yu Lin, Liang Luo, Luis Ceze

TL;DR
This paper introduces ES-SpMM, a cache-first edge sampling method and co-designed kernel that significantly accelerates sparse-dense matrix multiplication in GNNs on GPUs, enabling larger-scale applications.
Contribution
The paper presents a novel cache-first edge sampling mechanism and a co-designed SpMM kernel that substantially improves GNN inference performance on GPUs.
Findings
ES-SpMM outperforms cuSPARSE SpMM by up to 4.35x without accuracy loss.
ES-SpMM achieves up to 45.3x speedup with less than 1% accuracy loss.
The method effectively reduces computation and improves cache locality in GNN inference.
Abstract
Graph neural networks (GNNs), an emerging deep learning model class, can extract meaningful representations from highly expressive graph-structured data and are therefore gaining popularity for a wider range of applications. However, current GNNs suffer from the poor performance of their sparse-dense matrix multiplication (SpMM) operator, even when using powerful GPUs. Our analysis shows that 95% of the inference time could be spent on SpMM when running popular GNN models on NVIDIA's advanced V100 GPU. This SpMM performance bottleneck hinders GNNs' applicability to large-scale problems and the development of more sophisticated GNN models. To address this inference time bottleneck, we introduce ES-SpMM, a cache-first edge sampling mechanism and co-designed SpMM kernel. ES-SpMM uses edge sampling to downsize the graph to fit into the GPU's shared memory. It thus reduces the computation cost and…
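The idea of downsizing the graph by edge sampling before SpMM can be illustrated with a small CPU-side sketch. This is not the authors' GPU kernel: the names `sample_edges`, `spmm`, and `max_per_row` are hypothetical, the evenly strided selection is an assumed sampling rule chosen only for illustration, and the per-row cap merely stands in for a shared-memory budget.

```python
from typing import List, Tuple

# CSR sparse matrix as (indptr, indices, data); a hypothetical alias for this sketch.
CSR = Tuple[List[int], List[int], List[float]]


def sample_edges(csr: CSR, max_per_row: int) -> CSR:
    """Keep at most max_per_row nonzeros (edges) per row.

    Assumption: edges are picked at an even stride across each row.
    The paper's actual sampling strategy is not given in the text above.
    """
    indptr, indices, data = csr
    new_indptr, new_indices, new_data = [0], [], []
    for row in range(len(indptr) - 1):
        start, end = indptr[row], indptr[row + 1]
        nnz = end - start
        if nnz <= max_per_row:
            keep = list(range(start, end))  # row already fits the budget
        else:
            keep = [start + (k * nnz) // max_per_row for k in range(max_per_row)]
        new_indices.extend(indices[i] for i in keep)
        new_data.extend(data[i] for i in keep)
        new_indptr.append(len(new_indices))
    return new_indptr, new_indices, new_data


def spmm(csr: CSR, dense: List[List[float]]) -> List[List[float]]:
    """Multiply a CSR sparse matrix by a dense row-major matrix."""
    indptr, indices, data = csr
    n_cols = len(dense[0])
    out = [[0.0] * n_cols for _ in range(len(indptr) - 1)]
    for row in range(len(indptr) - 1):
        for i in range(indptr[row], indptr[row + 1]):
            col, val = indices[i], data[i]
            for j in range(n_cols):
                out[row][j] += val * dense[col][j]
    return out


# Toy 3-node adjacency matrix: row 0 has three edges, row 1 one, row 2 two.
graph: CSR = ([0, 3, 4, 6], [0, 1, 2, 1, 0, 2], [1.0] * 6)
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

full = spmm(graph, features)
sampled = spmm(sample_edges(graph, max_per_row=2), features)
```

In this toy case only row 0 exceeds the cap, so only its output row changes; the sampled product is an approximation of the full one, which is why the reported results are stated together with an accuracy-loss bound.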
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Stochastic Gradient Optimization Techniques
