TL;DR
This paper introduces a training approach enabling large language models to reliably attribute information to specific documents without test-time retrieval, enhancing citation accuracy and robustness.
Contribution
It proposes Active Indexing, a training method that improves an LLM's ability to attribute what it generates to documents seen during pretraining, removing the need for external retrieval.
Findings
Active Indexing outperforms Passive Indexing, improving citation precision by up to 30.2%.
Scaling augmented data improves citation performance.
Internal citations increase robustness to retrieval noise.
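To make the precision figure above more concrete, here is a minimal sketch of one common way to score citation precision: the fraction of cited document identifiers that actually support the answer. This summary does not give the paper's exact metric definition, so the function name `citation_precision`, the identifier strings, and the convention of returning 0.0 when nothing is cited are all assumptions made for illustration.

```python
from typing import Iterable, Set


def citation_precision(cited: Iterable[str], supporting: Set[str]) -> float:
    """Fraction of cited document IDs that appear in the supporting set.

    Hypothetical helper: an illustrative definition, not necessarily
    the metric used in the paper.
    """
    cited_ids = list(cited)
    if not cited_ids:
        # Assumed convention: an answer with no citations scores 0.
        return 0.0
    correct = sum(1 for doc_id in cited_ids if doc_id in supporting)
    return correct / len(cited_ids)


# Hypothetical document identifiers: two of the three citations are supported.
example = citation_precision(["doc-a", "doc-b", "doc-c"], {"doc-a", "doc-c"})
```

Under this definition a model is penalized for every citation that points at a document which does not contain the claimed fact, which is the failure mode the findings describe for standalone LLMs.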
Abstract
Trustworthy language models should provide both correct and verifiable answers. However, citations generated directly by standalone LLMs are often unreliable. As a result, current systems insert citations by querying an external retriever at inference time, introducing latency, infrastructure dependence, and vulnerability to retrieval noise. We explore whether LLMs can be made to reliably attribute to the documents seen during continual pretraining without test-time retrieval, by revising the training process. To study this, we construct CitePretrainBench, a benchmark that mixes real-world corpora (Wikipedia, Common Crawl, arXiv) with novel documents and probes both short-form (single-fact) and long-form (multi-fact) citation tasks. Our approach follows a two-stage process: (1) continual pretraining to index factual knowledge by binding it to persistent document identifiers; and (2)…