Rethinking the Setting of Semi-supervised Learning on Graphs
Ziang Li, Ming Ding, Weikai Li, Zihan Wang, Ziyu Zeng, Yukuo Cen, Jie Tang

TL;DR
This paper examines how hyper-parameter tuning on the validation set can bias comparisons in semi-supervised learning on graphs, shows how far over-tuning can inflate accuracy, and introduces a new benchmark designed to avoid over-tuning and reduce evaluation variance.
Contribution
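It proposes ValidUtil, which uses an extra group of hyper-parameters to fully exploit the label information in the validation set and thereby probes the limit of over-tuning, and it introduces IGB, a new benchmark of i.i.d. graphs with merged training and validation sets, for fairer and more stable evaluation.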
Findings
With ValidUtil, even GCN reaches 85.8% accuracy on Cora.
IGB reduces evaluation variance compared to previous datasets.
Over-tuning hyper-parameters can significantly inflate performance metrics.
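The variance finding can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's evaluation protocol: each graph's accuracy is modeled as a binomial draw, and the true accuracy of 0.80, the 100 test nodes per graph, and all function names are hypothetical choices made for illustration. It compares the spread of a score reported on one graph with the spread of a score averaged over 100 independently sampled graphs.

```python
import random
import statistics

def simulate_graph_accuracy(rng, true_acc=0.80, n_test_nodes=100):
    """Accuracy on one hypothetical sampled graph.

    Assumption: each test node is classified correctly with
    probability true_acc, independently of the others.
    """
    correct = sum(rng.random() < true_acc for _ in range(n_test_nodes))
    return correct / n_test_nodes

def benchmark_score(rng, n_graphs):
    """Mean accuracy over n_graphs independently sampled graphs."""
    return statistics.mean(
        simulate_graph_accuracy(rng) for _ in range(n_graphs)
    )

def score_spread(n_graphs, n_repeats=100, seed=0):
    """Standard deviation of the reported score across repeated runs."""
    rng = random.Random(seed)
    scores = [benchmark_score(rng, n_graphs) for _ in range(n_repeats)]
    return statistics.stdev(scores)

# Spread of the score when evaluating on 1 graph versus 100 graphs.
spread_single = score_spread(1)
spread_hundred = score_spread(100)
print(f"1 graph:    std of reported accuracy = {spread_single:.4f}")
print(f"100 graphs: std of reported accuracy = {spread_hundred:.4f}")
```

For independent graphs with a common per-graph variance, the variance of the averaged score shrinks in proportion to the number of graphs, which is why averaging over many sampled graphs gives a more stable number than a single split.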
Abstract
We argue that the present setting of semi-supervised learning on graphs may result in unfair comparisons, due to its potential risk of over-tuning hyper-parameters for models. In this paper, we highlight the significant influence of tuning hyper-parameters, which leverages the label information in the validation set to improve the performance. To explore the limit of over-tuning hyper-parameters, we propose ValidUtil, an approach to fully utilize the label information in the validation set through an extra group of hyper-parameters. With ValidUtil, even GCN can easily reach a high accuracy of 85.8% on Cora. To avoid over-tuning, we merge the training set and the validation set and construct an i.i.d. graph benchmark (IGB) consisting of 4 datasets. Each dataset contains 100 i.i.d. graphs sampled from a large graph to reduce the evaluation variance. Our experiments suggest that IGB is a more…
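The way validation labels can leak through hyper-parameter tuning can be sketched with a toy example. The abstract only says that ValidUtil uses "an extra group of hyper-parameters"; the sketch below assumes, for illustration, that these are one pseudo-label per validation node, tuned greedily using nothing but the validation accuracy. The nearest-centroid model, the 1-D synthetic data, and every function name are hypothetical stand-ins, not the authors' GCN setup or implementation.

```python
import random

def fit_centroids(xs, ys, n_classes):
    """Toy 'model': the mean feature value of each class."""
    centroids = []
    for c in range(n_classes):
        members = [x for x, y in zip(xs, ys) if y == c]
        centroids.append(sum(members) / len(members))
    return centroids

def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(range(len(centroids)), key=lambda c: abs(x - centroids[c]))

def val_accuracy(pseudo, x_tr, y_tr, x_val, y_val, n_classes):
    """Train on labelled nodes plus pseudo-labelled validation nodes,
    then score on the validation set (the only signal the tuner sees)."""
    centroids = fit_centroids(x_tr + x_val, y_tr + pseudo, n_classes)
    hits = sum(predict(centroids, x) == y for x, y in zip(x_val, y_val))
    return hits / len(y_val)

def tune_pseudo_labels(x_tr, y_tr, x_val, y_val, n_classes, seed=0):
    """Greedy search over pseudo-labels, treated as hyper-parameters."""
    rng = random.Random(seed)
    pseudo = [rng.randrange(n_classes) for _ in y_val]  # random start
    start = val_accuracy(pseudo, x_tr, y_tr, x_val, y_val, n_classes)
    best = start
    for i in range(len(pseudo)):
        for c in range(n_classes):
            trial = list(pseudo)
            trial[i] = c
            acc = val_accuracy(trial, x_tr, y_tr, x_val, y_val, n_classes)
            if acc > best:  # keep a change only if validation accuracy rises
                best, pseudo = acc, trial
    return pseudo, start, best

# Synthetic data: class 0 centred at -1, class 1 centred at +1.
data_rng = random.Random(0)
x_tr, y_tr = [-1.0, 1.0], [0, 1]
y_val = [data_rng.randrange(2) for _ in range(20)]
x_val = [data_rng.gauss(2.0 * y - 1.0, 1.0) for y in y_val]

pseudo, start_acc, tuned_acc = tune_pseudo_labels(x_tr, y_tr, x_val, y_val, 2)
print(f"validation accuracy before tuning: {start_acc:.2f}")
print(f"validation accuracy after tuning:  {tuned_acc:.2f}")
```

The tuner never reads `y_val` directly, yet every accepted change is driven by it, so the tuned configuration encodes validation-label information. This is the kind of leakage that merging the training and validation sets, as IGB does, is meant to rule out.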
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning and Data Classification · Text and Document Classification Technologies
Methods: Graph Convolutional Network
