MassiveGNN: Efficient Training via Prefetching for Massively Connected Distributed Graphs
Aishwarya Sarkar, Sayan Ghosh, Nathan R. Tallent, Ali Jannesari

TL;DR
This paper introduces MassiveGNN, a prefetch and eviction scheme that speeds up distributed GNN training on massively connected graphs by reducing communication overhead and load imbalance.
Contribution
It presents a novel prefetch and eviction scheme integrated into DistDGL, enhancing training performance on large-scale distributed graphs with minimal overhead.
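The general idea of a prefetch and eviction buffer can be sketched as follows. This is a minimal illustration and not the paper's algorithm or any DistDGL API: the class `PrefetchBuffer`, the `fetch_remote` callback, the `decay` parameter, and the scoring rule are all hypothetical names and choices made up for this example. It caches features of remote nodes, raises an entry's score each time a minibatch reuses it, ages all scores once per minibatch, and evicts the lowest-scored entry when the buffer is full.

```python
from typing import Callable, Dict, List


class PrefetchBuffer:
    """Toy per-trainer cache of remote node features (illustrative only)."""

    def __init__(self, fetch_remote: Callable[[int], float],
                 capacity: int, decay: float = 0.9) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.fetch_remote = fetch_remote  # stand-in for a remote feature pull
        self.capacity = capacity
        self.decay = decay                # hypothetical score-decay parameter
        self.features: Dict[int, float] = {}
        self.scores: Dict[int, float] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, node_ids: List[int]) -> List[float]:
        """Return features for the remote nodes of one minibatch."""
        # Age every cached entry once per minibatch.
        for node in self.scores:
            self.scores[node] *= self.decay
        out = []
        for node in node_ids:
            if node in self.features:
                # Served locally: no communication needed.
                self.hits += 1
                self.scores[node] += 1.0
            else:
                self.misses += 1
                if len(self.features) >= self.capacity:
                    # Evict the entry with the lowest score.
                    victim = min(self.scores, key=self.scores.get)
                    del self.features[victim]
                    del self.scores[victim]
                self.features[node] = self.fetch_remote(node)
                self.scores[node] = 1.0
            out.append(self.features[node])
        return out
```

In this sketch, the ratio of `hits` to `misses` gives a rough sense of how much remote traffic the buffer avoids. A real system would fetch features in batches and overlap the fetches with training rather than pulling one node at a time.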
Findings
Achieved a 15-40% training speedup on the NERSC Perlmutter supercomputer.
Demonstrated effectiveness on various OGB datasets.
Reduced communication overhead and load imbalance in distributed GNN training.
Abstract
Graph Neural Networks (GNNs) are indispensable in learning from graph-structured data, yet their rising computational costs, especially on massively connected graphs, pose significant challenges in terms of execution performance. To tackle this, distributed-memory solutions, such as partitioning the graph to concurrently train multiple replicas of GNNs, are used in practice. However, approaches requiring a partitioned graph usually suffer from communication overhead and load imbalance, even under optimal partitioning and communication strategies, due to irregularities in the neighborhood minibatch sampling. This paper proposes practical trade-offs for improving the sampling and communication overheads for representation learning on distributed graphs (using the popular GraphSAGE architecture) by developing a parameterized continuous prefetch and eviction scheme on top of the state-of-the-art…
Taxonomy
Topics: Brain Tumor Detection and Classification · IoT and Edge/Fog Computing · Advanced Graph Neural Networks
Methods: DistDGL · GraphSAGE
