Accelerating Graph Sampling for Graph Machine Learning using GPUs
Abhinav Jangda, Sandeep Polisetty, Arjun Guha, Marco Serafini

TL;DR
This paper introduces NextDoor, a GPU-accelerated system for graph sampling that significantly speeds up data preparation for graph machine learning models by employing transit-parallelism.
Contribution
NextDoor presents a novel GPU-based graph sampling system using transit-parallelism, enabling efficient load balancing and caching for various sampling algorithms.
Findings
NextDoor runs sampling algorithms orders of magnitude faster than existing systems.
The system effectively utilizes GPU resources despite graph irregularity.
It provides a high-level abstraction for diverse graph sampling methods.
Abstract
Representation learning algorithms automatically learn the features of data. Several representation learning algorithms for graph data, such as DeepWalk, node2vec, and GraphSAGE, sample the graph to produce mini-batches that are suitable for training a DNN. However, sampling time can be a significant fraction of training time, and existing systems do not efficiently parallelize sampling. Sampling is an embarrassingly parallel problem and may appear to lend itself to GPU acceleration, but the irregularity of graphs makes it hard to use GPU resources effectively. This paper presents NextDoor, a system designed to effectively perform graph sampling on GPUs. NextDoor employs a new approach to graph sampling that we call transit-parallelism, which allows load balancing and caching of edges. NextDoor provides end-users with a high-level abstraction for writing a variety of graph sampling…
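The abstract names transit-parallelism but this summary does not show how it works. The sketch below is one plausible reading, not NextDoor's implementation: it assumes a "transit" is the vertex whose neighbors a sample needs next, and that grouping samples by transit lets each adjacency list be fetched once and shared. It runs on the CPU in plain Python for a single random-walk step. The names `group_by_transit` and `step_transit_parallel` and the toy graph are hypothetical.

```python
import random
from collections import defaultdict

def group_by_transit(walks):
    """Map each transit vertex (the current end of a walk) to the walks that need it."""
    groups = defaultdict(list)
    for idx, walk in enumerate(walks):
        groups[walk[-1]].append(idx)
    return groups

def step_transit_parallel(adj, walks, rng):
    """Extend every walk by one vertex, reading each adjacency list once per step."""
    for transit, sample_ids in group_by_transit(walks).items():
        neighbors = adj[transit]  # fetched once, reused by every sample in the group
        if not neighbors:
            continue  # dead end: these walks are left unchanged
        for idx in sample_ids:  # on a GPU, this inner loop would map to threads (assumption)
            walks[idx].append(rng.choice(neighbors))
    return walks

# Toy graph as adjacency lists (hypothetical data).
adj = {0: [1, 2], 1: [2], 2: [0, 1], 3: []}
rng = random.Random(0)
walks = [[0], [0], [2], [3]]
step_transit_parallel(adj, walks, rng)
```

Under this reading, the work per group depends on how many samples share a transit, which is what would let a scheduler balance load, and the shared `neighbors` list is the natural candidate for caching.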
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Caching and Content Delivery
Methods: GraphSAGE · DeepWalk · node2vec
