Accelerating Storage-Based Training for Graph Neural Networks
Myung-Hwan Jang, Jeong-Min Park, Yunyong Ko, and Sang-Wook Kim

TL;DR
This paper introduces AGNES, a storage-based framework for GNN training on large-scale graphs. It improves data I/O efficiency through block-wise I/O processing and a hyperbatch strategy, training up to 4.1X faster than four state-of-the-art methods.
Contribution
AGNES targets a data-preparation bottleneck that existing storage-based GNN training methods overlook: the large number of small storage I/Os. It combines block-wise storage I/O with hyperbatch processing to reduce that cost.
Findings
AGNES outperforms four state-of-the-art methods by up to 4.1X in training speed.
Block-wise I/O processing fully utilizes high-performance storage bandwidth.
The hyperbatch strategy, tailored to the characteristics of real-world graphs, further improves I/O efficiency.
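To make the block-wise idea concrete, here is a minimal sketch of how many small per-node reads can be coalesced into a few block-sized reads. It is an illustration only, not the AGNES implementation: the fixed-size record layout, the 4096-byte block size, the in-memory file, and the names `group_by_block`, `read_features_blockwise`, and `make_feature_file` are all assumptions made for this example.

```python
import io
import struct
from collections import defaultdict

BLOCK_SIZE = 4096   # bytes per storage block (assumed for illustration)
RECORD_SIZE = 64    # bytes per node feature record (assumed; divides BLOCK_SIZE)


def group_by_block(node_ids, record_size=RECORD_SIZE, block_size=BLOCK_SIZE):
    """Map each block index to the node IDs whose records live in that block."""
    per_block = block_size // record_size
    groups = defaultdict(list)
    for nid in node_ids:
        groups[nid // per_block].append(nid)
    return dict(groups)


def read_features_blockwise(f, node_ids, record_size=RECORD_SIZE, block_size=BLOCK_SIZE):
    """Fetch records with one read per touched block instead of one read per node."""
    per_block = block_size // record_size
    features = {}
    num_reads = 0
    groups = group_by_block(node_ids, record_size, block_size)
    for block_idx, ids in sorted(groups.items()):
        f.seek(block_idx * block_size)
        block = f.read(block_size)      # a single block-sized I/O
        num_reads += 1
        for nid in ids:
            start = (nid % per_block) * record_size
            features[nid] = block[start:start + record_size]
    return features, num_reads


def make_feature_file(num_nodes, record_size=RECORD_SIZE):
    """Build a toy in-memory 'feature file': each record starts with its node ID."""
    buf = io.BytesIO()
    for nid in range(num_nodes):
        buf.write(struct.pack("<Q", nid).ljust(record_size, b"\0"))
    buf.seek(0)
    return buf


f = make_feature_file(256)
wanted = [3, 5, 60, 64, 70, 200]
features, num_reads = read_features_blockwise(f, wanted)
print(len(features), "records fetched with", num_reads, "block reads")
# prints: 6 records fetched with 3 block reads
```

In this toy layout, six per-node requests fall into three blocks, so three reads replace six. A real system would issue such reads against an NVMe SSD and would likely need aligned buffers and asynchronous submission, which this sketch leaves out.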
Abstract
Graph neural networks (GNNs) have achieved breakthroughs in various real-world downstream tasks due to their powerful expressiveness. As the scale of real-world graphs has been continuously growing, a storage-based approach to GNN training has been studied, which leverages external storage (e.g., NVMe SSDs) to handle such web-scale graphs on a single machine. Although such storage-based GNN training methods have shown promising potential in large-scale GNN training, we observed that they suffer from a severe bottleneck in data preparation since they overlook a critical challenge: how to handle a large number of small storage I/Os. To address the challenge, in this paper, we propose a novel storage-based GNN training framework, named AGNES, that employs a method of block-wise storage I/O processing to fully utilize the I/O bandwidth of high-performance storage devices. Moreover, to…
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Big Data and Digital Economy
