Scaling Large-scale GNN Training to Thousands of Processors on CPU-based Supercomputers
Chen Zhuang, Lingqi Zhang, Du Wu, Peng Chen, Jiajun Huang, Xin Liu, Rio Yokota, Nikoli Dryden, Toshio Endo, Satoshi Matsuoka, and Mohamed Wahib

TL;DR
This paper introduces a scalable CPU-based GCN training framework that reduces communication overhead and improves efficiency, achieving significant speedups and scaling to thousands of processors without losing accuracy.
Contribution
The paper presents novel aggregation operators, a hierarchical communication scheme, and a quantization method tailored for CPU supercomputers in large-scale GCN training.
Findings
Achieves up to 6× speedup over state-of-the-art methods.
Scales to thousands of CPUs on large datasets.
Outperforms GPU-based frameworks in large-scale scenarios.
Abstract
Graph Convolutional Networks (GCNs), particularly for large-scale graphs, are crucial across numerous domains. However, training distributed full-batch GCNs on large-scale graphs suffers from inefficient memory access patterns and high communication overhead. To address these challenges, we introduce an efficient and scalable distributed GCN training framework tailored for CPU-powered supercomputers. Our contributions are threefold: (1) we develop general and efficient aggregation operators designed for irregular memory access, (2) we propose a hierarchical aggregation scheme that reduces communication costs without altering the graph structure, and (3) we present a communication-aware quantization scheme to enhance performance. Experimental results demonstrate that the framework achieves a speedup of up to 6× compared with state-of-the-art implementations, and scales to 1000s of…
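As a rough illustration of why quantizing messages can cut communication volume, the sketch below applies generic symmetric 8-bit quantization to one feature vector before it would be sent to another process. This is an assumption made for illustration only, not the paper's communication-aware scheme; the function names `quantize_row` and `dequantize_row` are hypothetical.

```python
def quantize_row(row, bits=8):
    """Symmetric uniform quantization of one feature vector (illustrative only).

    Returns the integer codes and the scale needed to reconstruct the values.
    """
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    max_abs = max((abs(v) for v in row), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # avoid division by zero
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, int(round(v / scale)))) for v in row]
    return codes, scale


def dequantize_row(codes, scale):
    """Reconstruct approximate floating-point values on the receiving side."""
    return [c * scale for c in codes]


# A sender would transmit `codes` (1 byte each) plus one float `scale`
# instead of 4 bytes per value for float32 features.
features = [0.5, -1.0, 0.25]
codes, scale = quantize_row(features)
restored = dequantize_row(codes, scale)
```

Each restored value differs from the original by at most half of `scale`, so the trade-off is a bounded rounding error in exchange for roughly a 4× smaller payload relative to float32.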
Taxonomy
Topics: Robotics and Automated Systems · Brain Tumor Detection and Classification
