Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks
Péter Mernyei, Cătălina Cangea

TL;DR
Wiki-CS is a Wikipedia-derived dataset for benchmarking Graph Neural Networks on semi-supervised node classification and single-relation link prediction, in a domain whose structural properties differ from those of earlier benchmarks.
Contribution
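The paper introduces Wiki-CS, a Wikipedia-based dataset for GNN benchmarking, evaluates semi-supervised node classification and single-relation link prediction models on it, and publicly releases the dataset, the data pipeline, and the benchmark experiments.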
Findings
GNN models perform well on the new domain
Structural properties differ from previous benchmarks
Dataset supports both node classification and link prediction evaluations
Abstract
We present Wiki-CS, a novel dataset derived from Wikipedia for benchmarking Graph Neural Networks. The dataset consists of nodes corresponding to Computer Science articles, with edges based on hyperlinks and 10 classes representing different branches of the field. We use the dataset to evaluate semi-supervised node classification and single-relation link prediction models. Our experiments show that these methods perform well on a new domain, with structural properties different from earlier benchmarks. The dataset is publicly available, along with the implementation of the data pipeline and the benchmark experiments, at https://github.com/pmernyei/wiki-cs-dataset .
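To make the semi-supervised node classification setting concrete, here is a minimal Python sketch on a toy graph. Everything in it is an assumption made for illustration: the six nodes, their edges, the two class names, the train/test split, and the treatment of hyperlinks as undirected are invented and are not taken from Wiki-CS. The neighbour majority vote is a simple stand-in baseline, not one of the GNN models evaluated in the paper.

```python
from collections import Counter

# Toy stand-in for the dataset: article ids, hyperlink edges, class labels.
# All values are invented for illustration; none come from Wiki-CS itself.
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
train_nodes = {0, 1, 3, 4}  # labels visible during training
test_nodes = {2, 5}         # labels hidden, used only for scoring


def build_adjacency(edge_list):
    """Treat hyperlinks as undirected (an assumption) and map node -> neighbours."""
    adjacency = {}
    for src, dst in edge_list:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)
    return adjacency


def predict_by_neighbour_vote(node, adjacency, known_labels):
    """Return the most common label among labelled neighbours, or None if there are none."""
    votes = Counter(
        known_labels[n] for n in adjacency.get(node, ()) if n in known_labels
    )
    return votes.most_common(1)[0][0] if votes else None


adjacency = build_adjacency(edges)
# Only training labels are exposed to the predictor; test labels stay hidden.
visible = {n: labels[n] for n in train_nodes}
predictions = {
    n: predict_by_neighbour_vote(n, adjacency, visible) for n in test_nodes
}
accuracy = sum(predictions[n] == labels[n] for n in test_nodes) / len(test_nodes)
print(predictions, accuracy)
```

The point of the sketch is the split: the full graph structure is available at training time, but only a subset of node labels is, and accuracy is measured on the nodes whose labels were withheld.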
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Bioinformatics and Genomic Networks
