When Heterophily Meets Heterogeneity: Challenges and a New Large-Scale Graph Benchmark
Junhong Lin, Xiaojie Guo, Shuaicheng Zhang, Yada Zhu, Julian Shun

TL;DR
This paper introduces H2GB, a large-scale node-classification benchmark for graphs that are both heterogeneous and heterophilic. The benchmark exposes the limitations of current methods on such graphs, and the paper proposes a new model, H2G-former, to address them.
Contribution
The paper presents H2GB, a novel benchmark dataset and framework for heterophilic and heterogeneous graphs, and introduces H2G-former, a new model optimized for this complex setting.
Findings
Current methods struggle on graphs that are both heterophilic and heterogeneous.
H2G-former outperforms existing models on the benchmark.
The benchmark facilitates standardized evaluation and development of graph models.
Abstract
Graph mining has become crucial in fields such as social science, finance, and cybersecurity. Many large-scale real-world networks exhibit both heterogeneity, where multiple node and edge types exist in the graph, and heterophily, where connected nodes may have dissimilar labels and attributes. However, existing benchmarks primarily focus on either heterophilic homogeneous graphs or homophilic heterogeneous graphs, leaving a significant gap in understanding how models perform on graphs with both heterogeneity and heterophily. To bridge this gap, we introduce H2GB, a large-scale node-classification graph benchmark that brings together the complexities of both the heterophily and heterogeneity properties of real-world graphs. H2GB encompasses 9 real-world datasets spanning 5 diverse domains, 28 baseline models, and a unified benchmarking library with a standardized data loader, evaluator,…
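The two properties defined in the abstract can be made concrete with a small sketch. The code below counts node and edge types (heterogeneity) and computes the edge homophily ratio, a common measure of how often connected nodes share a label (low values indicate heterophily). This measure is an assumption chosen for illustration and may differ from whatever measure the paper itself uses. The toy graph, the node names, and the function `edge_homophily` are hypothetical and do not come from H2GB.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label.

    Edges touching an unlabelled node are skipped. Returns None when
    no edge has two labelled endpoints. This is a generic measure,
    not necessarily the one used by the paper.
    """
    scored = [(u, v) for u, v in edges if u in labels and v in labels]
    if not scored:
        return None
    agree = sum(labels[u] == labels[v] for u, v in scored)
    return agree / len(scored)


# Hypothetical toy graph: two node types, two edge types.
node_type = {"p1": "paper", "p2": "paper", "p3": "paper", "a1": "author"}
typed_edges = [
    ("p1", "cites", "p2"),
    ("p2", "cites", "p3"),
    ("p1", "cites", "p3"),
    ("a1", "writes", "p1"),
]
# Only the target node type ("paper") is labelled in this toy example.
labels = {"p1": 0, "p2": 1, "p3": 0}

# Heterogeneity: more than one node type and more than one edge type.
num_node_types = len(set(node_type.values()))
num_edge_types = len({rel for _, rel, _ in typed_edges})

# Heterophily: a low share of same-label edges among labelled pairs.
ratio = edge_homophily([(u, v) for u, _, v in typed_edges], labels)

print(num_node_types, num_edge_types, ratio)
```

In this toy graph only one of the three labelled edges joins same-label nodes, so the graph is both heterogeneous (two node types, two edge types) and mostly heterophilic.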
Taxonomy
Methods: Attention Is All You Need · Lib · Residual Connection · Byte Pair Encoding · Layer Normalization · Laplacian EigenMap · Label Smoothing · Linear Layer · Adam · Dropout
