Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training
Borui Wan, Juntao Zhao, Chuan Wu

TL;DR
This paper introduces AdaQP, a system that accelerates distributed full-graph GNN training through adaptive message quantization and communication-computation parallelization, improving training throughput by up to 3.01x with an accuracy drop of at most 0.30%.
Contribution
The paper proposes a novel adaptive quantization scheme and communication-computation parallelization for efficient distributed GNN training, with theoretical convergence guarantees.
Findings
Up to 3.01x improvement in training throughput.
Negligible accuracy drop of at most 0.30%.
Effective adaptive bit-width assignment for messages.
Abstract
Distributed full-graph training of Graph Neural Networks (GNNs) over large graphs is bandwidth-demanding and time-consuming. Frequent exchanges of node features, embeddings and embedding gradients (all referred to as messages) across devices bring significant communication overhead for nodes with remote neighbors on other devices (marginal nodes) and unnecessary waiting time for nodes without remote neighbors (central nodes) in the training graph. This paper proposes an efficient GNN training system, AdaQP, to expedite distributed full-graph GNN training. We stochastically quantize messages transferred across devices to lower-precision integers for communication traffic reduction and advocate communication-computation parallelization between marginal nodes and central nodes. We provide theoretical analysis to prove fast training convergence (at the rate of O(T^{-1}) with T being the…
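The stochastic quantization of messages to lower-precision integers described in the abstract can be illustrated with a minimal NumPy sketch. This is not AdaQP's implementation: the function names `stochastic_quantize` and `dequantize`, the per-tensor min/max scaling, and the `uint8` storage are assumptions made for illustration only. The sketch shows the general technique of stochastic rounding, in which each value is rounded up with probability equal to its fractional part, so the dequantized message is unbiased in expectation.

```python
import numpy as np

def stochastic_quantize(x, bits, rng):
    """Quantize a float array to `bits`-bit unsigned integers (hypothetical helper).

    Uses per-tensor min/max scaling and stochastic rounding; returns the
    integer codes plus the offset and scale needed to dequantize.
    """
    assert 1 <= bits <= 8, "this sketch stores codes in uint8"
    levels = 2 ** bits - 1
    lo, hi = float(x.min()), float(x.max())
    # Guard against a constant tensor, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    scaled = (x - lo) / scale              # values now lie in [0, levels]
    floor = np.floor(scaled)
    prob_up = scaled - floor               # fractional part = P(round up)
    q = floor + (rng.random(x.shape) < prob_up)
    q = np.clip(q, 0, levels).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    """Map integer codes back to approximate float values."""
    return q.astype(np.float32) * scale + lo

# Usage: quantize a small batch of messages to 4 bits before "sending" them.
rng = np.random.default_rng(0)
messages = rng.standard_normal((4, 8)).astype(np.float32)
codes, lo, scale = stochastic_quantize(messages, bits=4, rng=rng)
recovered = dequantize(codes, lo, scale)
```

In this sketch each element's reconstruction error is bounded by one quantization step (`scale`), and a lower `bits` value reduces traffic at the cost of a larger step.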
Taxonomy
Topics: Advanced Graph Neural Networks · Age of Information Optimization · IoT and Edge/Fog Computing
