WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora
Pengyu Wang, Benfeng Xu, Licheng Zhang, Shaohan Wang, Mingxuan Du, Chiwei Zhu, Zhendong Mao

TL;DR
WildGraphBench introduces a realistic benchmark for GraphRAG systems using Wikipedia's long, heterogeneous documents, revealing strengths in multi-fact retrieval but challenges in detailed summarization.
Contribution
This work presents WildGraphBench, a novel benchmark that evaluates GraphRAG performance on long, complex, real-world documents, addressing limitations of previous short-passage benchmarks.
Findings
GraphRAG helps multi-fact aggregation when evidence comes from a moderate number of sources
Current systems tend to overemphasize high-level statements
Summarization performance is weaker because this focus on high-level information comes at the expense of detail
Abstract
Graph-based Retrieval-Augmented Generation (GraphRAG) organizes external knowledge as a hierarchical graph, enabling efficient retrieval and aggregation of scattered evidence across multiple documents. However, many existing benchmarks for GraphRAG rely on short, curated passages as external knowledge, failing to adequately evaluate systems in realistic settings involving long contexts and large-scale heterogeneous documents. To bridge this gap, we introduce WildGraphBench, a benchmark designed to assess GraphRAG performance in the wild. We leverage Wikipedia's unique structure, where cohesive narratives are grounded in long and heterogeneous external reference documents, to construct a benchmark reflecting real-world scenarios. Specifically, we sample articles across 12 top-level topics, using their external references as the retrieval corpus and citation-linked statements as ground…
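The construction described in the abstract (external references as the retrieval corpus, citation-linked statements as the evaluation targets) can be pictured with a minimal sketch. This is an illustration only, not the authors' pipeline: the names `Article`, `Statement`, `build_benchmark`, and the `multi_source` flag are hypothetical, and the sample data is invented purely to exercise the code.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """A sentence from an article plus the ids of the references it cites."""
    text: str
    cited_ref_ids: list[str]


@dataclass
class Article:
    """Hypothetical container for one sampled article."""
    title: str
    topic: str
    references: dict[str, str]  # ref_id -> full text of the external document
    statements: list[Statement] = field(default_factory=list)


def build_benchmark(articles):
    """Pool all references into one corpus and keep statements with resolvable citations."""
    corpus = {}
    ground_truth = []
    for article in articles:
        # Namespace reference ids by article title so ids from different articles cannot collide.
        for ref_id, text in article.references.items():
            corpus[f"{article.title}#{ref_id}"] = text
        for st in article.statements:
            cited = [f"{article.title}#{r}" for r in st.cited_ref_ids
                     if r in article.references]
            if cited:  # drop statements whose citations are all missing from the corpus
                ground_truth.append({
                    "topic": article.topic,
                    "statement": st.text,
                    "evidence_ids": cited,
                    "multi_source": len(cited) > 1,  # assumption: flags multi-fact items
                })
    return corpus, ground_truth


# Invented sample data, for illustration only.
sample = Article(
    title="Example",
    topic="History",
    references={"r1": "Long source document one.", "r2": "Long source document two."},
    statements=[
        Statement("Claim supported by two sources.", ["r1", "r2"]),
        Statement("Claim whose citation is unavailable.", ["r9"]),
    ],
)
corpus, ground_truth = build_benchmark([sample])
```

In this sketch a statement is kept only if at least one of its cited references is present in the corpus, so every retained item can in principle be answered from retrievable evidence.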
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Information Retrieval and Search Behavior
