Implicit Regularization of Mini-Batch Training in Graph Neural Networks
Clement Wang, Antoine Vialle, Robin Vaysse, and Thomas Bonald

TL;DR
This paper reveals that simple random node sampling in GNN training acts as an implicit regularizer, often outperforming more complex structure-aware methods by reducing gradient variance and aligning mini-batch loss with full-graph loss.
Contribution
The study demonstrates that random node sampling implicitly regularizes GNN training, providing a theoretically grounded, scalable alternative to structure-aware sampling methods.
Findings
Random Node Sampling (RNS) matches or outperforms full-graph training on 8 of 10 datasets.
RNS produces mini-batches with lower gradient variance.
Implicit regularization explains the effectiveness of RNS in GNN training.
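The last finding can be written schematically. The abstract states that graph mini-batch SGD implicitly minimizes the sampled loss plus a regularizer proportional to the mini-batch gradient variance. The notation below is an assumption made for illustration and is not taken from the paper: $L_B$ is the loss on a sampled mini-batch $B$, $\bar{L}$ is its expectation under the sampler, and $\lambda > 0$ is an unspecified coefficient (in backward error analyses of i.i.d. SGD it scales with the learning rate; the exact constant for the graph setting is not given in the text above).

```latex
\tilde{L}(\theta) \;\approx\; \bar{L}(\theta)
  \;+\; \lambda \, \mathbb{E}_{B}\!\left[ \left\| \nabla L_B(\theta) - \nabla \bar{L}(\theta) \right\|^2 \right],
\qquad
\bar{L}(\theta) = \mathbb{E}_{B}\!\left[ L_B(\theta) \right].
```

Read this way, the sampler enters twice: it determines $\bar{L}$, which in the graph setting need not equal the full-graph loss, and it determines the variance term that acts as the regularizer.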
Abstract
Mini-batch training of Graph Neural Networks (GNNs) is fundamentally different from training on i.i.d. data: sampling a subgraph alters the topology and introduces boundary effects, leading prior work to develop structure-aware samplers that preserve local connectivity and reduce embedding variance. Surprisingly, we demonstrate that the simplest possible scheme, Random Node Sampling (RNS), training on the induced subgraph of uniformly sampled nodes, matches or outperforms full-graph training on 8 of 10 datasets at a fraction of the wall-clock time and memory. To explain this, we apply backward error analysis to graph mini-batch Stochastic Gradient Descent (SGD) and show that it implicitly minimizes the sampled loss plus a regularizer proportional to the mini-batch gradient variance, a quantity directly shaped by the sampler. Although RNS discards local structure, it produces…
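To make the sampling scheme concrete, here is a minimal sketch of Random Node Sampling as the abstract describes it: draw nodes uniformly, then keep only the edges whose two endpoints were both drawn. The function name `sample_induced_subgraph`, the edge-list representation, and sampling without replacement are assumptions made for illustration and are not the authors' implementation.

```python
import random


def sample_induced_subgraph(edges, num_nodes, batch_size, rng):
    """Uniformly sample nodes and return the subgraph they induce.

    edges: list of (src, dst) pairs over node ids 0..num_nodes-1.
    Returns (nodes, sub_edges); sub_edges uses local ids that index into nodes.
    """
    # Uniform sampling without replacement (an assumption of this sketch).
    nodes = sorted(rng.sample(range(num_nodes), batch_size))

    # Map each sampled global id to its local position in the batch.
    local_id = {g: i for i, g in enumerate(nodes)}

    # Keep an edge only if both endpoints were sampled (induced subgraph);
    # edges that cross the batch boundary are dropped.
    sub_edges = [(local_id[u], local_id[v])
                 for u, v in edges
                 if u in local_id and v in local_id]
    return nodes, sub_edges


# Toy usage: a directed 6-node ring, batch of 4 nodes.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
nodes, sub_edges = sample_induced_subgraph(edges, num_nodes=6, batch_size=4,
                                           rng=random.Random(0))
```

A GNN would then run message passing on `sub_edges` only, so each step's cost depends on the batch rather than on the full graph. The dropped boundary edges are the topology change that the abstract refers to.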
