TL;DR
This paper systematically compares full-graph and mini-batch GNN training, analyzing model performance (convergence and generalization) and computational efficiency, both empirically and theoretically, through the lens of batch size and fan-out size.
Contribution
It introduces a novel Wasserstein distance-based generalization analysis and reveals non-isotropic effects of batch and fan-out sizes in GNN training.
Findings
Full-graph training does not always outperform mini-batch training.
Fan-out size significantly impacts GNN convergence and generalization.
The analysis yields practical guidance for hyperparameter tuning under resource constraints.
Abstract
Full-graph and mini-batch Graph Neural Network (GNN) training approaches have distinct system design demands, making it crucial to choose the appropriate approach to develop. A core challenge in comparing these two GNN training approaches lies in characterizing their model performance (i.e., convergence and generalization) and computational efficiency. While batch size has been an effective lens for analyzing such behaviors in deep neural networks (DNNs), GNNs extend this lens by introducing a fan-out size, as full-graph training can be viewed as mini-batch training with the largest possible batch size and fan-out size. However, the impact of batch and fan-out sizes for GNNs remains insufficiently explored. To this end, this paper systematically compares full-graph vs. mini-batch training of GNNs through empirical and theoretical analyses from the viewpoints of the batch size and…
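The abstract's observation that full-graph training is mini-batch training with the largest possible batch size and fan-out size can be illustrated with a small sketch. The code below is not the paper's implementation: it assumes a simple GraphSAGE-style uniform neighbor sampler, and the function name `sample_minibatch`, its parameters, and the toy adjacency list `adj` are all hypothetical names chosen for illustration.

```python
import random


def sample_minibatch(adj, batch_size, fan_out, num_layers, seed=0):
    """Uniform neighbor sampling (an assumed, GraphSAGE-style scheme).

    Picks `batch_size` seed nodes, then for each of `num_layers` hops samples
    at most `fan_out` neighbors of every node currently in the frontier.
    Returns (seeds, layers), where layers[k] is the set of sampled
    (destination, source) edges used for message passing at hop k.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    seeds = rng.sample(nodes, min(batch_size, len(nodes)))
    frontier = set(seeds)
    layers = []
    for _ in range(num_layers):
        edges = set()
        for v in sorted(frontier):  # sorted for reproducibility
            neighbors = adj[v]
            k = min(fan_out, len(neighbors))
            for u in rng.sample(neighbors, k):
                edges.add((v, u))
        layers.append(edges)
        # Nodes reached in this hop need their own neighbors in the next hop.
        frontier = frontier | {u for _, u in edges}
    return seeds, layers


# Toy undirected graph stored as an adjacency list (hypothetical data).
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
all_edges = {(v, u) for v in adj for u in adj[v]}
max_degree = max(len(neighbors) for neighbors in adj.values())

# Largest batch size and fan-out: every hop uses every edge of the graph.
full_seeds, full_layers = sample_minibatch(
    adj, batch_size=len(adj), fan_out=max_degree, num_layers=2
)

# Small batch size and fan-out: each hop sees only a sampled subgraph.
mini_seeds, mini_layers = sample_minibatch(
    adj, batch_size=2, fan_out=1, num_layers=2
)
```

With `batch_size=len(adj)` and `fan_out=max_degree`, every hop's edge set equals the full edge set, so the sampled computation graph coincides with the original graph. Shrinking either parameter trades that coverage for less computation per step, which is the two-dimensional trade-off the paper studies.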
