Predicting Retrieval Utility and Answer Quality in Retrieval-Augmented Generation
Fangzheng Tian, Debasis Ganguly, Craig Macdonald

TL;DR
This paper introduces methods for predicting the utility of retrieved documents and the quality of generated answers in retrieval-augmented generation (RAG), drawing on several categories of features to estimate RAG performance.
Contribution
It defines two prediction tasks for RAG, retrieval performance prediction (RPP) and generation performance prediction (GPP), and combines multiple feature categories to improve prediction accuracy.
Findings
Combining multiple feature categories improves prediction accuracy.
Reader-centric features like perplexity enhance predictions.
Topical relevance correlates with retrieval utility.
Abstract
The quality of answers generated by large language models (LLMs) in retrieval-augmented generation (RAG) is largely influenced by the contextual information contained in the retrieved documents. A key challenge for improving RAG is to predict both the utility of retrieved documents -- quantified as the performance gain from using context over generation without context -- and the quality of the final answers in terms of correctness and relevance. In this paper, we define two prediction tasks within RAG. The first is retrieval performance prediction (RPP), which estimates the utility of retrieved documents. The second is generation performance prediction (GPP), which estimates the final answer quality. We hypothesise that in RAG, the topical relevance of retrieved documents correlates with their utility, suggesting that query performance prediction (QPP) approaches can be adapted for RPP…
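The abstract quantifies retrieval utility as the performance gain from generating with context over generating without it. A minimal sketch of that difference follows. It assumes token-level F1 against a reference answer as the answer-quality measure; the abstract does not name a metric, so this choice, along with the function names token_f1 and retrieval_utility, is illustrative only and not the authors' implementation.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two strings (assumed quality metric)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a perfect match, otherwise no match.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def retrieval_utility(answer_with_context: str,
                      answer_without_context: str,
                      reference: str) -> float:
    """Gain in answer quality from using retrieved context.

    Positive: the context helped. Negative: the context hurt.
    """
    with_ctx = token_f1(answer_with_context, reference)
    without_ctx = token_f1(answer_without_context, reference)
    return with_ctx - without_ctx
```

Under this reading, an RPP predictor would be trained or evaluated against values like the one returned by retrieval_utility, while a GPP predictor would target the quality of the with-context answer itself.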
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Text Readability and Simplification
