Towards Global Retrieval Augmented Generation: A Benchmark for Corpus-Level Reasoning

Qi Luo; Xiaonan Li; Tingshuo Fan; Xinchi Chen; Xipeng Qiu

arXiv:2510.26205·cs.CL·November 5, 2025

TL;DR

This paper introduces GlobalQA, a benchmark for evaluating global retrieval-augmented generation (RAG), where answers must be aggregated across an entire document collection. It shows that current RAG methods perform poorly on such tasks and proposes GlobalRAG, a collaborative multi-tool framework that reaches 6.63 F1 on Qwen2.5-14B, against 1.51 F1 for the best baseline.

Contribution

The paper presents the first benchmark for global RAG tasks and proposes GlobalRAG, a novel multi-tool framework that enhances retrieval and reasoning across large corpora.

Findings

01. Existing RAG methods perform poorly on global tasks: the best baseline achieves only 1.51 F1.

02. GlobalRAG raises performance to 6.63 F1 on Qwen2.5-14B.

03. GlobalQA covers four core task types for corpus-level reasoning: counting, extremum queries, sorting, and top-k extraction.
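To make the four task types concrete, here is a minimal sketch of what each one computes when the whole corpus is available as structured records. The `Paper` record, the toy `CORPUS`, and all function names are hypothetical illustrations, not part of GlobalQA or GlobalRAG; the benchmark itself poses these questions over unstructured documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical record type; GlobalQA's actual schema is not shown in the source."""
    title: str
    year: int
    citations: int


# Toy corpus invented for illustration only.
CORPUS = [
    Paper("A", 2023, 120),
    Paper("B", 2023, 45),
    Paper("C", 2022, 300),
    Paper("D", 2023, 210),
    Paper("E", 2021, 15),
]


def count_in_year(corpus, year):
    """Counting: how many documents satisfy a condition."""
    return sum(1 for p in corpus if p.year == year)


def most_cited(corpus):
    """Extremum: the single document that maximizes a field."""
    return max(corpus, key=lambda p: p.citations).title


def sort_by_citations(corpus):
    """Sorting: order every document by a field, highest first."""
    ranked = sorted(corpus, key=lambda p: p.citations, reverse=True)
    return [p.title for p in ranked]


def top_k_in_year(corpus, year, k):
    """Top-k extraction: filter, rank, then keep the first k."""
    in_year = [p for p in corpus if p.year == year]
    ranked = sorted(in_year, key=lambda p: p.citations, reverse=True)
    return [p.title for p in ranked[:k]]
```

Each function has to read every record before it can answer, which is the property that separates these tasks from questions answerable from a handful of retrieved chunks.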

Abstract

Retrieval-augmented generation (RAG) has emerged as a leading approach to reducing hallucinations in large language models (LLMs). Current RAG evaluation benchmarks primarily focus on what we call local RAG: retrieving relevant chunks from a small subset of documents to answer queries that require only localized understanding within specific text chunks. However, many real-world applications require a fundamentally different capability -- global RAG -- which involves aggregating and analyzing information across entire document collections to derive corpus-level insights (for example, "What are the top 10 most cited papers in 2023?"). In this paper, we introduce GlobalQA -- the first benchmark specifically designed to evaluate global RAG capabilities, covering four core task types: counting, extremum queries, sorting, and top-k extraction. Through systematic evaluation across different…
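The local versus global distinction can be illustrated with a toy example: if a counting question is answered only from the top-k retrieved documents, the count is capped at k no matter how many documents in the corpus match. The word-overlap scorer, the `DOCS` list, and the function names below are assumptions made for this sketch; they do not reproduce any retriever or baseline evaluated in the paper.

```python
def overlap_score(query, text):
    """Hypothetical relevance score: number of shared lowercase tokens."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve_top_k(query, docs, k):
    """Local RAG style: keep only the k highest-scoring documents."""
    return sorted(docs, key=lambda d: overlap_score(query, d), reverse=True)[:k]


def count_matching(docs, token):
    """Count the documents that contain the given token."""
    return sum(1 for d in docs if token in d.split())


# Toy corpus invented for illustration only.
DOCS = [
    "paper alpha published 2023 on retrieval",
    "paper beta published 2023 on parsing",
    "paper gamma published 2022 on retrieval",
    "paper delta published 2023 on summarization",
    "paper epsilon published 2023 on translation",
]

query = "how many paper published 2023"

# Counting over the retrieved subset can never exceed k.
local_count = count_matching(retrieve_top_k(query, DOCS, k=2), "2023")

# Counting over the full corpus sees every matching document.
global_count = count_matching(DOCS, "2023")

print(local_count, global_count)
```

In this sketch the retrieved subset yields a count of 2 while the full scan yields 4, so raising retrieval quality alone does not fix the undercount; the aggregation has to cover the whole collection.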
