Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage
Saron Samuel, Alexander Martin, Eugene Yang, Andrew Yates, Dawn Lawrie, Laura Dietz, Benjamin Van Durme

TL;DR
This paper systematically studies how retrieval metrics relate to the information coverage of generated responses in RAG systems, across multiple benchmarks and retrieval stacks.
Contribution
It provides empirical evidence that retrieval metrics can serve as reliable proxies for RAG response coverage, especially when aligned with generation objectives.
Findings
Coverage-based retrieval metrics correlate strongly with nugget coverage in generated responses, at both the topic and system levels.
Alignment of retrieval objectives with generation goals enhances this correlation.
Complex iterative RAG pipelines can weaken the retrieval-coverage relationship.
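To make the two quantities in these findings concrete, here is a minimal sketch of a nugget-style coverage score applied first to a retrieved ranking and then to a generated response. It assumes coverage is the fraction of a topic's nuggets that are supported; the function names, the `doc_to_nuggets` mapping, and the cutoff `k` are hypothetical illustrations, not the paper's metric definitions or the Auto-ARGUE or MiRAGE implementations.

```python
def nugget_coverage(covered: set[str], all_nuggets: set[str]) -> float:
    """Fraction of a topic's nuggets that appear in `covered`."""
    if not all_nuggets:
        return 0.0
    return len(covered & all_nuggets) / len(all_nuggets)


def retrieval_nugget_coverage(
    ranked_doc_ids: list[str],
    doc_to_nuggets: dict[str, set[str]],
    all_nuggets: set[str],
    k: int,
) -> float:
    """Coverage of the union of nuggets found in the top-k retrieved documents.

    Assumption: each document has already been judged against each nugget,
    and those judgments are stored in `doc_to_nuggets`.
    """
    covered: set[str] = set()
    for doc_id in ranked_doc_ids[:k]:
        covered |= doc_to_nuggets.get(doc_id, set())
    return nugget_coverage(covered, all_nuggets)


# Made-up toy data for one topic (illustration only).
all_nuggets = {"n1", "n2", "n3", "n4"}
doc_to_nuggets = {"d1": {"n1", "n2"}, "d2": {"n2"}, "d3": {"n4"}}
ranking = ["d1", "d2", "d3"]

retrieval_cov_at_2 = retrieval_nugget_coverage(ranking, doc_to_nuggets, all_nuggets, k=2)
retrieval_cov_at_3 = retrieval_nugget_coverage(ranking, doc_to_nuggets, all_nuggets, k=3)

# Nuggets a judge found supported in the generated response (also made up).
response_cov = nugget_coverage({"n1", "n4"}, all_nuggets)
```

The same `nugget_coverage` function scores both sides, which is one way to read "aligning retrieval objectives with generation goals": the retrieval metric and the response metric count the same units of information.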
Abstract
Retrieval-augmented generation (RAG) systems combine document retrieval with a generative model to address complex information-seeking tasks like report generation. While the relationship between retrieval quality and generation effectiveness seems intuitive, it has not been systematically studied. We investigate whether upstream retrieval metrics can serve as reliable early indicators of the final generated response's information coverage. Through experiments across two text RAG benchmarks (TREC NeuCLIR 2024 and TREC RAG 2024) and one multimodal benchmark (WikiVideo), we analyze 15 text retrieval stacks and 10 multimodal retrieval stacks across four RAG pipelines and multiple evaluation frameworks (Auto-ARGUE and MiRAGE). Our findings demonstrate strong correlations between coverage-based retrieval metrics and nugget coverage in generated responses at both topic and system levels. This…
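The abstract distinguishes topic-level from system-level correlation. The sketch below shows one plausible way to compute each with Kendall's tau. It is an assumption-laden illustration: the abstract does not say which correlation coefficient is used, the `scores` table is made-up data, and pooling all (system, topic) pairs is only one possible reading of "topic level".

```python
from itertools import combinations
from statistics import mean


def kendall_tau(xs: list[float], ys: list[float]) -> float:
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# scores[system][topic] = (retrieval metric, response nugget coverage).
# Hypothetical numbers for illustration only.
scores = {
    "sysA": {"t1": (0.2, 0.1), "t2": (0.5, 0.4), "t3": (0.4, 0.5)},
    "sysB": {"t1": (0.4, 0.3), "t2": (0.7, 0.6), "t3": (0.6, 0.5)},
    "sysC": {"t1": (0.6, 0.5), "t2": (0.9, 0.7), "t3": (0.8, 0.9)},
}


def system_level_tau(scores: dict) -> float:
    """Average each system over topics, then correlate the system means."""
    retrieval = [mean(r for r, _ in topics.values()) for topics in scores.values()]
    response = [mean(g for _, g in topics.values()) for topics in scores.values()]
    return kendall_tau(retrieval, response)


def topic_level_tau(scores: dict) -> float:
    """Correlate over every individual (system, topic) pair, without averaging."""
    pairs = [pair for topics in scores.values() for pair in topics.values()]
    return kendall_tau([r for r, _ in pairs], [g for _, g in pairs])


system_tau = system_level_tau(scores)
topic_tau = topic_level_tau(scores)
```

System-level correlation asks whether a better retrieval stack yields a better RAG system on average; topic-level correlation asks the stricter question of whether the relationship holds query by query, where per-topic noise is not averaged away.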
