A data-centric approach for assessing progress of Graph Neural Networks
Tianqi Zhao, Ngan Thi Dong, Alan Hanjalic, Megha Khosla

TL;DR
This paper takes a data-centric approach to evaluating Graph Neural Networks on multi-label node classification, contributing new datasets, multi-label homophily metrics, and a large-scale comparison of existing methods.
Contribution
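It releases three real-world multi-label biological graph datasets and a multi-label graph generator with tunable properties, defines homophily and Cross-Class Neighborhood Similarity for the multi-label setting, and reports a large-scale comparative study of GNN methods.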
Findings
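Three real-world multi-label biological datasets collected and released, plus a multi-label graph generator with tunable properties
Homophily and Cross-Class Neighborhood Similarity defined for multi-label graphs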
Comparative analysis of 8 methods across 9 datasets
Abstract
Graph Neural Networks (GNNs) have achieved state-of-the-art results in node classification tasks. However, most improvements are in multi-class classification, with less focus on the cases where each node could have multiple labels. The first challenge in studying multi-label node classification is the scarcity of publicly available datasets. To address this, we collected and released three real-world biological datasets and developed a multi-label graph generator with tunable properties. We also argue that traditional notions of homophily and heterophily do not apply well to multi-label scenarios. Therefore, we define homophily and Cross-Class Neighborhood Similarity for multi-label classification and investigate collected multi-label datasets. Lastly, we conducted a large-scale comparative study with methods across nine datasets to evaluate current progress in multi-label node…
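To make the idea of multi-label homophily more concrete, the sketch below scores each edge by the Jaccard similarity of its endpoints' label sets and averages over all edges. This is one plausible formulation and is an assumption for illustration only; the abstract does not state the exact definition the authors use. The function names `jaccard` and `multilabel_edge_homophily`, and the toy graph, are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity of two label sets; defined as 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def multilabel_edge_homophily(
    edges: Iterable[Tuple[int, int]],
    labels: Dict[int, Set[int]],
) -> float:
    """Average Jaccard similarity of label sets over all edges.

    Assumption: this edge-averaged Jaccard score is an illustrative
    stand-in, not necessarily the metric defined in the paper.
    """
    edge_list = list(edges)
    if not edge_list:
        return 0.0
    total = sum(jaccard(labels[u], labels[v]) for u, v in edge_list)
    return total / len(edge_list)


# Toy path graph 0 - 1 - 2 - 3 with a set of labels per node (made-up data).
edges = [(0, 1), (1, 2), (2, 3)]
labels = {0: {0, 1}, 1: {0, 1}, 2: {1}, 3: {2}}

score = multilabel_edge_homophily(edges, labels)
print(f"multi-label edge homophily: {score:.2f}")
```

In the single-label case, where every label set has exactly one element, the per-edge Jaccard score is 1 for matching labels and 0 otherwise, so this quantity reduces to the usual edge homophily ratio.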
Taxonomy
Topics: Graph Theory and Algorithms · Neural Networks and Applications
