Aggregation Queries over Unstructured Text: Benchmark and Agentic Method
Haojia Zhu, Qinyuan Xu, Haoyu Li, Yuxi Liu, Hanchen Qiu, Jiaoyan Chen, Jiahui Jin

TL;DR
This paper introduces AGGBench, a benchmark for entity-level aggregation queries over text, and proposes DFA, a modular agentic method that improves evidence coverage in large-scale corpus analysis.
Contribution
It formalizes entity-level aggregation queries, introduces AGGBench for evaluation, and presents DFA, a novel modular baseline that enhances completeness in aggregation tasks.
Findings
DFA outperforms strong RAG baselines in evidence coverage.
AGGBench enables principled evaluation of aggregation completeness.
DFA exposes key failure modes in ambiguity, filtering, and aggregation.
Abstract
Aggregation querying over free text is a long-standing yet underexplored problem. Unlike ordinary question answering, aggregation queries require exhaustive evidence collection: systems must "find all," not merely "find one." Existing paradigms such as Text-to-SQL and Retrieval-Augmented Generation fail to achieve this completeness. In this work, we formalize entity-level aggregation querying over text in a corpus-bounded setting with a strict completeness requirement. To enable principled evaluation, we introduce AGGBench, a benchmark designed to evaluate completeness-oriented aggregation over a realistic large-scale corpus. To accompany the benchmark, we propose DFA (Disambiguation-Filtering-Aggregation), a modular agentic baseline that decomposes aggregation querying into interpretable stages and exposes key failure modes related to ambiguity, filtering, and aggregation.…
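To make the "find all" requirement concrete, here is a minimal toy sketch of a three-stage disambiguate, filter, aggregate flow over a small in-memory corpus, together with a recall-style evidence coverage measure. It is not the authors' implementation: the names `Doc`, `disambiguate`, `filter_docs`, `aggregate`, and `evidence_coverage`, the alias table, the naive substring matching, and the coverage formula are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str


def disambiguate(entity: str, aliases: dict[str, set[str]]) -> set[str]:
    """Stage 1 (hypothetical): expand the query entity into lowercase surface forms."""
    forms = {entity.lower()}
    forms |= {alias.lower() for alias in aliases.get(entity, set())}
    return forms


def filter_docs(corpus: list[Doc], surface_forms: set[str]) -> list[Doc]:
    """Stage 2 (hypothetical): scan the whole corpus, keeping every matching document.

    Uses naive substring matching; the point is that nothing is cut off at top-k.
    """
    return [
        doc for doc in corpus
        if any(form in doc.text.lower() for form in surface_forms)
    ]


def aggregate(evidence: list[Doc]) -> int:
    """Stage 3 (hypothetical): a COUNT aggregate over distinct evidence documents."""
    return len({doc.doc_id for doc in evidence})


def evidence_coverage(found: list[Doc], gold_ids: set[str]) -> float:
    """Assumed metric: fraction of gold evidence documents that were found."""
    if not gold_ids:
        return 1.0
    found_ids = {doc.doc_id for doc in found}
    return len(found_ids & gold_ids) / len(gold_ids)


if __name__ == "__main__":
    corpus = [
        Doc("d1", "Acme Corp opened a plant in Ohio."),
        Doc("d2", "ACME Corporation reported record revenue."),
        Doc("d3", "Globex acquired a startup."),
        Doc("d4", "Acme Corp hired a new CFO."),
    ]
    aliases = {"Acme Corp": {"ACME Corporation"}}
    gold = {"d1", "d2", "d4"}

    forms = disambiguate("Acme Corp", aliases)
    evidence = filter_docs(corpus, forms)
    print(aggregate(evidence), evidence_coverage(evidence, gold))

    # A "find one" style retriever that returns a single hit covers
    # only part of the gold evidence, so its count would be wrong.
    print(evidence_coverage(evidence[:1], gold))
```

In this toy setting the exhaustive scan recovers all three gold documents, while a single retrieved hit covers only one of three, which is the gap between "find one" and "find all" that the abstract describes.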
Taxonomy
Topics: Information Retrieval and Search Behavior · Expert finding and Q&A systems · Topic Modeling
