Accelerating SpMM Kernel with Cache-First Edge Sampling for Graph Neural Networks

Chien-Yu Lin; Liang Luo; Luis Ceze

arXiv:2104.10716·cs.LG·April 27, 2021·5 cites

TL;DR

This paper introduces ES-SpMM, a cache-first edge sampling mechanism and co-designed kernel that accelerates sparse-dense matrix multiplication (SpMM) for GNN inference on GPUs by sampling edges so that the graph fits into GPU shared memory.

Contribution

The paper presents a cache-first edge sampling mechanism and a co-designed SpMM kernel that together reduce the SpMM cost that dominates GNN inference time on GPUs.

Findings

1. ES-SpMM outperforms cuSPARSE SpMM by up to 4.35x without accuracy loss.

2. ES-SpMM achieves up to 45.3x speedup with less than 1% accuracy loss.

3. The method effectively reduces computation and improves cache locality in GNN inference.

Abstract

Graph neural networks (GNNs), an emerging deep learning model class, can extract meaningful representations from highly expressive graph-structured data and are therefore gaining popularity for wider ranges of applications. However, current GNNs suffer from the poor performance of their sparse-dense matrix multiplication (SpMM) operator, even when using powerful GPUs. Our analysis shows that 95% of the inference time could be spent on SpMM when running popular GNN models on NVIDIA's advanced V100 GPU. Such SpMM performance bottleneck hinders GNNs' applicability to large-scale problems or the development of more sophisticated GNN models. To address this inference time bottleneck, we introduce ES-SpMM, a cache-first edge sampling mechanism and codesigned SpMM kernel. ES-SpMM uses edge sampling to downsize the graph to fit into GPU's shared memory. It thus reduces the computation cost and…
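
The edge-sampling idea in the abstract can be illustrated with a small CPU-side sketch. It is not the authors' kernel: the function names (`sample_csr_rows`, `spmm_csr`), the per-row cap `max_edges`, and the evenly strided selection rule are all assumptions made for illustration, since the abstract does not specify the sampling strategy. The sketch caps the number of edges kept per row of a CSR adjacency matrix, which bounds the size of the sampled graph (the property that would let it fit in a fixed-size buffer such as shared memory), and then runs a plain SpMM over the smaller matrix.

```python
def sample_csr_rows(indptr, indices, data, max_edges):
    """Keep at most `max_edges` nonzeros per row of a CSR matrix.

    Assumption: rows longer than the cap are subsampled with an even
    stride. This is an illustrative rule, not necessarily the paper's.
    """
    new_indptr, new_indices, new_data = [0], [], []
    for row in range(len(indptr) - 1):
        start, end = indptr[row], indptr[row + 1]
        nnz = end - start
        if nnz <= max_edges:
            keep = range(start, end)  # short row: keep every edge
        else:
            # long row: pick max_edges positions spread across the row
            keep = [start + (k * nnz) // max_edges for k in range(max_edges)]
        for p in keep:
            new_indices.append(indices[p])
            new_data.append(data[p])
        new_indptr.append(len(new_indices))
    return new_indptr, new_indices, new_data


def spmm_csr(indptr, indices, data, dense):
    """Reference SpMM: CSR sparse matrix times a dense row-major matrix."""
    n_cols = len(dense[0])
    out = [[0.0] * n_cols for _ in range(len(indptr) - 1)]
    for row in range(len(indptr) - 1):
        for p in range(indptr[row], indptr[row + 1]):
            col, val = indices[p], data[p]
            for j in range(n_cols):
                out[row][j] += val * dense[col][j]
    return out


# Hypothetical 4-node graph: node 0 has 4 edges, nodes 1 and 2 have 1, node 3 has 0.
indptr = [0, 4, 5, 6, 6]
indices = [0, 1, 2, 3, 0, 3]
data = [1.0] * 6
features = [[1.0], [2.0], [3.0], [4.0]]  # one feature per node

full = spmm_csr(indptr, indices, data, features)
s_indptr, s_indices, s_data = sample_csr_rows(indptr, indices, data, max_edges=2)
sampled = spmm_csr(s_indptr, s_indices, s_data, features)
```

In this toy case only the high-degree row changes (`full[0]` sums four neighbors, `sampled[0]` sums two), so the sampled result is an approximation of the full one. That is the trade-off the findings describe: fewer edges mean less computation, at a possible cost in accuracy.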

Taxonomy

Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Stochastic Gradient Optimization Techniques