Pitfalls of Graph Neural Network Evaluation
Oleksandr Shchur, Maximilian Mumme, Aleksandar Bojchevski, Stephan Günnemann

TL;DR
This paper critically examines the evaluation methods of graph neural networks, revealing that current practices can lead to misleading comparisons and that simpler models can outperform complex ones with proper tuning.
Contribution
It highlights flaws in GNN evaluation strategies and demonstrates the importance of fair data splits and training procedures for accurate model comparison.
Findings
Evaluating on different train/validation/test splits of the same datasets leads to dramatically different model rankings.
Simpler GNNs can outperform complex models with fair tuning.
Evaluation strategies need standardization for reliable comparisons.
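To make the point about data splits concrete, the following is a minimal sketch of how several models could be scored over many random splits rather than one fixed split. It is an illustration under stated assumptions, not the authors' code: the function names, the per-class split sizes, the number of splits, and the majority-class stand-in model are all hypothetical choices made for this example.

```python
import random
from collections import Counter, defaultdict
from statistics import mean, stdev


def random_split(labels, n_train_per_class, n_val_per_class, seed):
    """Draw a stratified random split; nodes not sampled form the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for node, label in enumerate(labels):
        by_class[label].append(node)
    train, val, test = [], [], []
    for label in sorted(by_class):
        nodes = by_class[label][:]
        rng.shuffle(nodes)
        cut = n_train_per_class + n_val_per_class
        train += nodes[:n_train_per_class]
        val += nodes[n_train_per_class:cut]
        test += nodes[cut:]
    return train, val, test


def evaluate_over_splits(models, labels, n_splits=20,
                         n_train_per_class=2, n_val_per_class=2):
    """Score every model on the same sequence of random splits.

    `models` maps a name to a callable(train, val, test, labels) that
    returns test accuracy. Returns name -> (mean, standard deviation).
    """
    scores = {name: [] for name in models}
    for seed in range(n_splits):
        split = random_split(labels, n_train_per_class, n_val_per_class, seed)
        for name, fit_and_score in models.items():
            scores[name].append(fit_and_score(*split, labels))
    return {name: (mean(s), stdev(s)) for name, s in scores.items()}


def majority_baseline(train, val, test, labels):
    """Hypothetical stand-in model: predict the most common training label."""
    guess = Counter(labels[i] for i in train).most_common(1)[0][0]
    return sum(labels[i] == guess for i in test) / len(test)


if __name__ == "__main__":
    toy_labels = [0] * 30 + [1] * 20 + [2] * 10  # toy data, not a real dataset
    print(evaluate_over_splits({"majority": majority_baseline}, toy_labels))
```

Because every model sees the same seeds, differences in the reported mean and spread reflect the models rather than one lucky or unlucky split. A real GNN would replace `majority_baseline` with a callable that trains on `train`, selects hyperparameters on `val`, and reports accuracy on `test`.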
Abstract
Semi-supervised node classification in graphs is a fundamental problem in graph mining, and the recently proposed graph neural networks (GNNs) have achieved unparalleled results on this task. Due to their massive success, GNNs have attracted a lot of attention, and many novel architectures have been put forward. In this paper we show that existing evaluation strategies for GNN models have serious shortcomings. We show that using the same train/validation/test splits of the same datasets, as well as making significant changes to the training procedure (e.g. early stopping criteria) precludes a fair comparison of different architectures. We perform a thorough empirical evaluation of four prominent GNN models and show that considering different splits of the data leads to dramatically different rankings of models. Even more importantly, our findings suggest that simpler GNN architectures…
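The abstract singles out early stopping criteria as one part of the training procedure that can differ between models. As a hedged illustration, one way to hold that part fixed is to route every model through the same stopping rule. The sketch below uses a generic patience-based rule; the function name, the callbacks, and the default values are assumptions for this example and are not taken from the paper.

```python
def train_with_early_stopping(train_step, validate, max_epochs=1000, patience=50):
    """Run training with one shared, patience-based stopping rule.

    `train_step(epoch)` performs one epoch of training and `validate(epoch)`
    returns validation accuracy; both are hypothetical callbacks supplied by
    the caller. Training stops once validation accuracy has not improved for
    `patience` consecutive epochs. Returns (best_epoch, best_val_accuracy).
    """
    best_val, best_epoch, wait = float("-inf"), -1, 0
    for epoch in range(max_epochs):
        train_step(epoch)
        val_acc = validate(epoch)
        if val_acc > best_val:
            # Strict improvement resets the patience counter.
            best_val, best_epoch, wait = val_acc, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return best_epoch, best_val
```

Passing each architecture's own `train_step` and `validate` callbacks to this single function keeps the stopping criterion identical across models, so it cannot by itself account for a difference in reported accuracy.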
Taxonomy
Topics: Advanced Graph Neural Networks · Complex Network Analysis Techniques · Graph Theory and Algorithms
Methods: Early Stopping
