Incorporating Q&A Nuggets into Retrieval-Augmented Generation

Laura Dietz; Bryan Li; Gabrielle Liu; Jia-Huei Ju; Eugene Yang; Dawn Lawrie; William Walden; James Mayfield

arXiv:2601.13222 · cs.IR · March 30, 2026

TL;DR

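Crucible is a retrieval-augmented generation system that builds a bank of explicit Q&A nuggets from retrieved documents to preserve citation provenance and keep its reasoning interpretable. On the TREC NeuCLIR 2024 collection it outperforms Ginger, a recent nugget-based RAG system, in nugget recall, density, and citation grounding.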

Contribution

We introduce Crucible, a system that incorporates Q&A nuggets into RAG, enhancing citation provenance and interpretability in generated content.

Findings

01 Crucible substantially outperforms Ginger in nugget recall.

02 Crucible achieves higher nugget density and better citation grounding than Ginger.

03 These results are measured on the TREC NeuCLIR 2024 collection.

Abstract

RAGE systems integrate ideas from automatic evaluation (E) into Retrieval-Augmented Generation (RAG). As one such example, we present Crucible, a Nugget-Augmented Generation System that preserves explicit citation provenance by constructing a bank of Q&A nuggets from retrieved documents and using them to guide extraction, selection, and report generation. Reasoning on nuggets avoids repeated information through clear and interpretable Q&A semantics, rather than opaque cluster abstractions, while maintaining citation provenance throughout the entire generation process. Evaluated on the TREC NeuCLIR 2024 collection, our Crucible system substantially outperforms Ginger, a recent nugget-based RAG system, in nugget recall, density, and citation grounding.
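
To make the idea of a nugget bank with citation provenance concrete, here is a minimal Python sketch. It is not the Crucible implementation: the names `Nugget`, `NuggetBank`, and `render_report` are hypothetical, and exact matching on normalized text is an assumed stand-in for whatever Q&A-based reasoning the system actually uses to avoid repeated information. The sketch only shows how each Q&A pair can carry the IDs of its supporting documents from extraction through to the generated report.

```python
from dataclasses import dataclass, field


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


@dataclass
class Nugget:
    """One Q&A pair plus the documents that support it (hypothetical structure)."""
    question: str
    answer: str
    doc_ids: set[str] = field(default_factory=set)  # citation provenance


class NuggetBank:
    """Illustrative container; an assumption, not the paper's data structure."""

    def __init__(self) -> None:
        self._nuggets: dict[tuple[str, str], Nugget] = {}

    def add(self, question: str, answer: str, doc_id: str) -> Nugget:
        # Assumed dedup rule: same normalized question and answer => same nugget.
        key = (_normalize(question), _normalize(answer))
        nugget = self._nuggets.setdefault(key, Nugget(question, answer))
        nugget.doc_ids.add(doc_id)  # merge provenance instead of repeating content
        return nugget

    def nuggets(self) -> list[Nugget]:
        # Insertion order is preserved by dict.
        return list(self._nuggets.values())


def render_report(bank: NuggetBank) -> str:
    """Emit one line per nugget answer, followed by its sorted citations."""
    lines = []
    for n in bank.nuggets():
        cites = "".join(f"[{d}]" for d in sorted(n.doc_ids))
        lines.append(f"{n.answer} {cites}")
    return "\n".join(lines)


if __name__ == "__main__":
    bank = NuggetBank()
    bank.add("Who won?", "Team A won.", "doc1")
    bank.add("who  won?", "team a won.", "doc2")  # duplicate, new source
    bank.add("When?", "In 2024.", "doc1")
    print(render_report(bank))
```

Because duplicates are merged at the nugget level, the second document adds a citation rather than a repeated sentence, and every report line can be traced back to the documents it came from.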

Datasets

molmohsen/awesome-ai-agent-papers · dataset · 39 dl
