Scalable Graph Neural Network Training: The Case for Sampling

Marco Serafini; Hui Guan

arXiv:2105.02315·cs.LG·June 15, 2021

TL;DR

This paper reviews and compares approaches to scalable training of Graph Neural Networks, advocating for sample-based methods as more promising for handling large graphs efficiently.

Contribution

The paper compares whole-graph and sample-based training for GNNs and argues that scalability research should prioritize sample-based methods.

Findings

01

Scalability is challenging for both whole-graph and sample-based training, but the authors argue that sample-based training is the more promising of the two.

02

The paper reviews recent systems that support sample-based GNN training.

03

Standard approaches to distributed DNN training, such as data and model parallelism, do not directly apply to GNNs.

Abstract

Graph Neural Networks (GNNs) are a new and increasingly popular family of deep neural network architectures to perform learning on graphs. Training them efficiently is challenging due to the irregular nature of graph data. The problem becomes even more challenging when scaling to large graphs that exceed the capacity of single devices. Standard approaches to distributed DNN training, such as data and model parallelism, do not directly apply to GNNs. Instead, two different approaches have emerged in the literature: whole-graph and sample-based training. In this paper, we review and compare the two approaches. Scalability is challenging with both approaches, but we make a case that research should focus on sample-based training since it is a more promising approach. Finally, we review recent systems supporting sample-based training.
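
To make the distinction in the abstract concrete, the sketch below shows one common form of sample-based training input: multi-hop neighbor sampling with a fixed fanout per hop. It is a minimal illustration and not the paper's method or any particular system's API. The function name `sample_minibatch`, the adjacency-dictionary layout, and the fanout values are all assumptions made for this example. Whole-graph training would instead process every vertex and edge at each iteration, whereas here each mini-batch touches only a bounded subgraph around a few seed vertices.

```python
import random

def sample_minibatch(adj, seeds, fanouts, rng):
    """Sample a multi-hop neighborhood around `seeds` (hypothetical helper).

    adj     : dict mapping vertex -> list of neighbor vertices
    seeds   : vertices whose embeddings the mini-batch will compute
    fanouts : max neighbors to sample per vertex at each hop (one per GNN layer)
    rng     : random.Random instance, passed in for reproducibility
    Returns the set of vertices touched and the list of sampled edges (src, dst).
    """
    nodes = set(seeds)
    edges = []
    frontier = list(seeds)
    for fanout in fanouts:
        next_frontier = set()
        for v in frontier:
            neighbors = adj.get(v, [])
            # Sample without replacement, capped by the vertex degree.
            picked = rng.sample(neighbors, min(fanout, len(neighbors)))
            for u in picked:
                edges.append((u, v))  # messages flow from u to v
                next_frontier.add(u)
        nodes.update(next_frontier)
        frontier = sorted(next_frontier)
    return nodes, edges

# Toy graph (an assumption for the demo): 1000 vertices, 4 neighbors each.
n = 1000
adj = {v: [(v - 1) % n, (v + 1) % n, (v + 7) % n, (v - 7) % n] for v in range(n)}

seeds = [0, 250, 500, 750]
fanouts = [2, 2]  # two hops, i.e. a two-layer GNN
nodes, edges = sample_minibatch(adj, seeds, fanouts, random.Random(0))
print(f"mini-batch touches {len(nodes)} of {n} vertices and {len(edges)} edges")
```

With 4 seeds and a fanout of 2 over two hops, the mini-batch can touch at most 4 + 8 + 16 = 28 vertices regardless of graph size. That bound grows multiplicatively with the number of hops, which is one reason scalability remains a challenge for sample-based training as well.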
