GLBench: A Comprehensive Benchmark for Graph with Large Language Models
Yuhan Li, Peisong Wang, Xiao Zhu, Aochuan Chen, Haiyun Jiang, Deng Cai, Victor Wai Kin Chan, Jia Li

TL;DR
GLBench is the first comprehensive benchmark for evaluating GraphLLM methods across supervised and zero-shot tasks, revealing insights into their performance, limitations, and the importance of structure and semantics.
Contribution
Introduces GLBench, a standardized benchmark for GraphLLM, enabling fair comparison and analysis of different methods on real-world datasets.
Findings
GraphLLM methods outperform traditional baselines in supervised tasks.
Using LLMs as predictors often causes uncontrollable outputs.
No clear scaling laws are observed for current GraphLLM methods.
Abstract
The emergence of large language models (LLMs) has revolutionized the way we interact with graphs, leading to a new paradigm called GraphLLM. Despite the rapid development of GraphLLM methods in recent years, the progress and understanding of this field remain unclear due to the lack of a benchmark with consistent experimental protocols. To bridge this gap, we introduce GLBench, the first comprehensive benchmark for evaluating GraphLLM methods in both supervised and zero-shot scenarios. GLBench provides a fair and thorough evaluation of different categories of GraphLLM methods, along with traditional baselines such as graph neural networks. Through extensive experiments on a collection of real-world datasets with consistent data processing and splitting strategies, we have uncovered several key findings. Firstly, GraphLLM methods outperform traditional baselines in supervised settings,…
Taxonomy
Topics: Semantic Web and Ontologies · Topic Modeling · Advanced Graph Neural Networks
