Examining the Effects of Degree Distribution and Homophily in Graph Learning Models
Mustafa Yasir, John Palowitch, Anton Tsitsulin, Long Tran-Thanh, Bryan Perozzi

TL;DR
This paper enhances the GraphWorld benchmarking framework by integrating LFR and CABAM graph generators, enabling more diverse and realistic synthetic graph datasets for evaluating GNN performance under various structural properties.
Contribution
The authors introduce LFR and CABAM generators into GraphWorld, expanding its ability to produce diverse, realistic graphs for GNN benchmarking beyond the limitations of SBM.
Findings
GNN performance varies with homophily and degree distribution.
Models show different sensitivities to new graph structures.
Extended GraphWorld generates 300,000 graphs for comprehensive benchmarking.
Abstract
Despite a surge in interest in GNN development, homogeneity in benchmarking datasets still presents a fundamental issue to GNN research. GraphWorld is a recent solution which uses the Stochastic Block Model (SBM) to generate diverse populations of synthetic graphs for benchmarking any GNN task. Despite its success, the SBM imposed fundamental limitations on the kinds of graph structure GraphWorld could create. In this work we examine how two additional synthetic graph generators can improve GraphWorld's evaluation: LFR, a well-established model in the graph clustering literature, and CABAM, a recent adaptation of the Barabasi-Albert model tailored for GNN benchmarking. By integrating these generators, we significantly expand the coverage of graph space within the GraphWorld framework while preserving key graph properties observed in real-world networks. To demonstrate their…
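To make the two structural properties in the title concrete, here is a minimal sketch of a two-parameter SBM sampler together with an edge-homophily ratio and a degree histogram. It uses only the Python standard library and is not GraphWorld's implementation; the function names (`sample_sbm`, `edge_homophily`, `degree_histogram`) and the parameters `p_in` and `p_out` are hypothetical, chosen for illustration.

```python
import random
from collections import Counter


def sample_sbm(block_sizes, p_in, p_out, seed=0):
    """Sample an undirected graph from a simple SBM.

    Assumption: one within-block edge probability (p_in) and one
    between-block edge probability (p_out). Returns (labels, edges).
    """
    rng = random.Random(seed)
    # Node i gets the index of the block it belongs to as its class label.
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return labels, edges


def edge_homophily(labels, edges):
    """Fraction of edges whose two endpoints share a label."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def degree_histogram(n, edges):
    """Map each degree value to the number of nodes with that degree."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree)


labels, edges = sample_sbm([50, 50], p_in=0.2, p_out=0.02, seed=0)
print("edge homophily:", round(edge_homophily(labels, edges), 3))
print("degree histogram:", sorted(degree_histogram(len(labels), edges).items()))
```

In this sketch every node in a block has the same connection probabilities, so degrees cluster around a common mean. LFR and CABAM, by contrast, are built around heavy-tailed (power-law) degree distributions, which is one kind of structure a generator like this cannot express.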
Taxonomy
Topics: Advanced Graph Neural Networks · Complex Network Analysis Techniques · Data Quality and Management
