Towards Dependable Retrieval-Augmented Generation Using Factual Confidence Prediction
Florian Geissler, Francesco Carella, Laura Fieback, and Jakob Spiegelberg

TL;DR
This paper introduces a two-stage method to improve the factual reliability of retrieval-augmented language models by predicting the factual confidence of retrieved information and generated answers.
Contribution
It proposes a novel approach combining conformal prediction and an attention-based classifier to assess and enhance the factual faithfulness of RAG outputs.
Findings
Conformal prediction alone can improve answer quality by up to 6% on some of the studied datasets.
The attention-based classifier detects inconsistent answers with up to 77% accuracy.
Diagnostic metrics help determine the suitability of retriever setups.
Abstract
Incorporating specific knowledge into large language models via retrieval-augmented generation (RAG) is a widespread technique that fuels many of today's industry AI applications. A fundamental problem is to assess whether the context retrieved by a similarity search indeed provides supporting facts or instead misguides the generator with irrelevant information. It is critical to associate meaningful confidence measures about the factuality of the retrieval process with the generated answers. We present a new, two-stage approach to predict the fact faithfulness of the output of retrieval-augmented generation. First, we employ conformal prediction to select only those retrieved chunks that have a high chance of coming from the correct source. This approach in itself can improve answer quality by up to 6% in some of the studied datasets; however, the associated statistical guarantees do not…
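To make the first stage concrete, the following is a minimal sketch of generic split conformal prediction applied to chunk selection. It is not the authors' implementation: the nonconformity score (1 minus retrieval similarity), the function names `conformal_threshold` and `select_chunks`, and the error level `alpha` are all assumptions made for illustration. The idea is to calibrate a score threshold on queries whose correct source is known, then keep only retrieved chunks whose score falls within that threshold.

```python
import math
from typing import List, Sequence


def conformal_threshold(calibration_scores: Sequence[float], alpha: float) -> float:
    """Split-conformal threshold over nonconformity scores.

    calibration_scores: assumed to be 1 - similarity of the *correct* chunk
    for each calibration query (hypothetical choice of score).
    alpha: target miscoverage rate, e.g. 0.1 for roughly 90% coverage.
    """
    n = len(calibration_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: keep every chunk.
        return math.inf
    return sorted(calibration_scores)[rank - 1]


def select_chunks(similarities: Sequence[float], threshold: float) -> List[int]:
    """Return indices of retrieved chunks whose score 1 - similarity
    does not exceed the calibrated threshold."""
    return [i for i, s in enumerate(similarities) if 1.0 - s <= threshold]


# Hypothetical calibration scores and retrieval similarities.
calibration = [0.5, 0.125, 0.25, 0.75, 0.375]
threshold = conformal_threshold(calibration, alpha=0.5)
kept = select_chunks([0.75, 0.5, 0.625, 0.9], threshold)
```

Under the usual exchangeability assumption between calibration and test queries, a threshold chosen this way retains the correct chunk with probability of at least 1 - alpha. Whether such a guarantee carries over to a given retriever setup is a separate question that this sketch does not address.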
