Plexus: Taming Billion-edge Graphs with 3D Parallel Full-graph GNN Training

Aditya K. Ranjan; Siddharth Singh; Cunyang Wei; Abhinav Bhatele

arXiv:2505.04083·cs.LG·October 30, 2025

TL;DR

Plexus is a 3D parallel method for full-graph GNN training that scales to billion-edge graphs on up to 2048 GPUs, with reported speedups of 2.3-12.5x over prior methods.

Contribution

The paper presents a novel 3D parallel approach to full-graph GNN training, together with a double permutation scheme for load balancing and a performance model that predicts the optimal 3D configuration, enabling scalable training on billion-edge graphs.

Findings

1. Achieves 2.3-12.5x speedup over prior methods.

2. Reduces training time by up to 54.2x on large GPU clusters.

3. Successfully scales to billion-edge graphs with up to 2048 GPUs.

Abstract

Graph neural networks (GNNs) leverage the connectivity and structure of real-world graphs to learn intricate properties and relationships between nodes. Many real-world graphs exceed the memory capacity of a GPU due to their sheer size, and training GNNs on such graphs requires techniques such as mini-batch sampling to scale. The alternative approach of distributed full-graph training suffers from high communication overheads and load imbalance due to the irregular structure of graphs. We propose a three-dimensional (3D) parallel approach for full-graph training that tackles these issues and scales to billion-edge graphs. In addition, we introduce optimizations such as a double permutation scheme for load balancing, and a performance model to predict the optimal 3D configuration of our parallel implementation -- Plexus. We evaluate Plexus on six different graph datasets and show scaling…
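The abstract names a double permutation scheme for load balancing but does not describe it. The sketch below is one possible reading, not the authors' implementation: it assumes the scheme applies independent random permutations to the row and column indices of the adjacency matrix before partitioning it into a 2D grid of blocks. The helper names (`block_nnz`, `imbalance`, `hub_graph`) and the toy hub-heavy graph are hypothetical and exist only for illustration.

```python
import random


def block_nnz(edges, n, grid, row_perm=None, col_perm=None):
    """Count nonzeros in each block of a grid x grid partition of an n x n adjacency matrix."""
    block = -(-n // grid)  # ceiling division: rows/columns per block
    counts = [[0] * grid for _ in range(grid)]
    for u, v in edges:
        r = row_perm[u] if row_perm is not None else u
        c = col_perm[v] if col_perm is not None else v
        counts[r // block][c // block] += 1
    return counts


def imbalance(counts):
    """Ratio of the heaviest block to the mean block load (1.0 is perfectly balanced)."""
    flat = [x for row in counts for x in row]
    return max(flat) / (sum(flat) / len(flat))


def hub_graph(n, hubs):
    """Toy skewed graph (assumption): the first `hubs` vertices connect to every other vertex."""
    edges = set()
    for h in range(hubs):
        for v in range(n):
            if v != h:
                edges.add((h, v))
                edges.add((v, h))
    return sorted(edges)


n, grid = 1024, 4
edges = hub_graph(n, hubs=32)

rng = random.Random(0)
row_perm = rng.sample(range(n), n)  # permutation applied to row indices
col_perm = rng.sample(range(n), n)  # independent permutation for column indices

before = imbalance(block_nnz(edges, n, grid))
after = imbalance(block_nnz(edges, n, grid, row_perm, col_perm))
print(f"max/mean block load: {before:.2f} unpermuted, {after:.2f} permuted")
```

In the unpermuted layout every hub edge lands in the first block row or first block column, so a few blocks carry most of the work. Permuting spreads the hubs across the grid. A real implementation would also have to apply the matching permutations to the feature and output matrices so that results stay consistent, which this sketch omits.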
