Communication-Efficient Sampling for Distributed Training of Graph Convolutional Networks
Peng Jiang, Masuma Akter Rumi

TL;DR
This paper introduces a communication-efficient neighbor sampling method for distributed GCN training, significantly reducing communication costs while maintaining accuracy.
Contribution
It proposes a neighbor sampling algorithm that assigns higher sampling probabilities to nodes stored on the local machine, reducing inter-machine communication in distributed GCN training.
Findings
Reduces communication overhead in distributed GCN training
Maintains accuracy comparable to traditional methods
Effective on node classification benchmarks
Abstract
Training Graph Convolutional Networks (GCNs) is expensive as it needs to aggregate data recursively from neighboring nodes. To reduce the computation overhead, previous works have proposed various neighbor sampling methods that estimate the aggregation result based on a small number of sampled neighbors. Although these methods have successfully accelerated the training, they mainly focus on the single-machine setting. As real-world graphs are large, training GCNs in distributed systems is desirable. However, we found that the existing neighbor sampling methods do not work well in a distributed setting. Specifically, a naive implementation may incur a huge amount of communication of feature vectors among different machines. To address this problem, we propose a communication-efficient neighbor sampling method in this work. Our main idea is to assign higher sampling probabilities to the…
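The idea in the abstract, favoring neighbors stored on the local machine during sampling, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract is truncated and gives no formulas, so the function names, the `local_boost` parameter, and the importance-weight correction (dividing by the sampling probability so the estimate of the neighbor sum stays unbiased) are all assumptions made for illustration.

```python
import random


def locality_biased_sample(neighbors, is_local, k, local_boost=4.0, rng=random):
    """Draw k neighbors with replacement, favoring local ones.

    Hypothetical helper, not taken from the paper. Each local neighbor gets
    `local_boost` times the sampling score of a remote neighbor.
    Returns (neighbor, weight) pairs with weight = 1 / (k * p_i), so that
    sum(weight * h_i) over the sample is an unbiased estimate of the sum
    of h_i over all neighbors.
    """
    scores = [local_boost if is_local(v) else 1.0 for v in neighbors]
    total = sum(scores)
    probs = [s / total for s in scores]
    picks = rng.choices(range(len(neighbors)), weights=probs, k=k)
    return [(neighbors[i], 1.0 / (k * probs[i])) for i in picks]


def estimate_neighbor_sum(sample, features):
    """Importance-weighted estimate of the sum of neighbor features."""
    return sum(w * features[v] for v, w in sample)


def count_remote_fetches(sample, is_local):
    """Number of distinct remote feature vectors the sample would require."""
    return len({v for v, _ in sample if not is_local(v)})
```

With `local_boost=1.0` the sampler reduces to uniform sampling, so the two settings can be compared directly: a larger boost lowers the number of remote feature vectors that have to be fetched, at the cost of higher variance in the estimate when remote neighbors carry much of the signal.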
Taxonomy
Topics: Advanced Graph Neural Networks · Recommender Systems and Techniques · Privacy-Preserving Technologies in Data
Methods: Graph Convolutional Networks
