Ginex: SSD-enabled Billion-scale Graph Neural Network Training on a Single Machine via Provably Optimal In-memory Caching

Yeonhong Park; Sunhong Min; Jae W. Lee

arXiv:2208.09151·cs.LG·August 22, 2022


TL;DR

Ginex is an SSD-based GNN training system that trains on billion-scale graphs on a single machine, using in-memory caching to reduce the data movement between main memory and SSD that otherwise bottlenecks training.

Contribution

Ginex restructures the GNN training pipeline and introduces a provably optimal in-memory caching algorithm, making billion-scale GNN training on a single machine efficient.
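
This summary does not spell out the caching algorithm itself. As a rough illustration of what "provably optimal" can mean for a cache, the sketch below implements Belady's MIN policy, the classic replacement rule that minimizes misses when the full access sequence is known in advance. It is an assumption that this matches the paper's setting (for example, that the node IDs to be fetched are known before features are gathered); the function name `belady_misses` and the toy trace are hypothetical and not taken from the Ginex code base.

```python
from collections import defaultdict, deque


def belady_misses(accesses, capacity):
    """Count cache misses under Belady's MIN policy for a known access trace.

    Illustrative only: keys stand in for node IDs whose feature vectors
    would be held in an in-memory cache (an assumption, not Ginex's API).
    """
    if capacity <= 0:
        # Nothing can be cached, so every access is a miss.
        return len(accesses)

    # For each key, record the positions at which it will be accessed.
    positions = defaultdict(deque)
    for i, key in enumerate(accesses):
        positions[key].append(i)

    def next_use(k):
        # Position of the next access to k, or infinity if it is never reused.
        return positions[k][0] if positions[k] else float("inf")

    cache = set()
    misses = 0
    for key in accesses:
        positions[key].popleft()  # consume the current access
        if key in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            # Evict the cached key whose next use lies farthest in the future.
            victim = max(cache, key=next_use)
            cache.remove(victim)
        cache.add(key)
    return misses
```

For instance, `belady_misses([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3)` returns 7, the textbook result for this trace with three cache slots. The policy is only usable when future accesses are known ahead of time, which is why an offline step that produces the access trace is a prerequisite for it.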

Findings

01

Achieves 2.11x higher throughput than SSD-extended PyTorch Geometric.

02

Effectively processes four billion-scale graph datasets on a single machine.

03

Demonstrates the effectiveness of optimal in-memory caching in SSD-based GNN training.

Abstract

Recently, Graph Neural Networks (GNNs) have been receiving a spotlight as a powerful tool that can effectively serve various inference tasks on graph structured data. As the size of real-world graphs continues to scale, the GNN training system faces a scalability challenge. Distributed training is a popular approach to address this challenge by scaling out CPU nodes. However, not much attention has been paid to disk-based GNN training, which can scale up the single-node system in a more cost-effective manner by leveraging high-performance storage devices like NVMe SSDs. We observe that the data movement between the main memory and the disk is the primary bottleneck in the SSD-based training system, and that the conventional GNN training pipeline is sub-optimal without taking this overhead into account. Thus, we propose Ginex, the first SSD-based GNN training system that can process…

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

snu-arc/ginex (PyTorch, official)