SURE-RAG: Sufficiency and Uncertainty-Aware Evidence Verification for Selective Retrieval-Augmented Generation
Jingxi Qiu, Zeyu Han, Cheng Huang

TL;DR
SURE-RAG introduces a verification framework for retrieval-augmented generation that assesses evidence sufficiency and uncertainty, improving answer support and safety in multi-hop question answering.
Contribution
SURE-RAG presents a set-level evidence verification protocol whose interpretable signals make RAG answering more transparent and safer.
Findings
Achieves 0.9075 Macro-F1 on HotpotQA-RAG v3, outperforming baselines.
Reduces unsafe answer risk by 37% at 30% coverage.
Demonstrates that sufficiency verification and hallucination detection are distinct tasks.
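The risk-at-coverage figure above can be read through a standard selective-prediction calculation. The sketch below is a minimal illustration, not the paper's evaluation code. It assumes that "coverage" is the fraction of questions the system chooses to answer, and that "unsafe answer risk" is the share of those answered questions whose answer is not actually supported. The function name `risk_at_coverage` and its inputs are hypothetical.

```python
def risk_at_coverage(confidences, is_unsafe, coverage):
    """Fraction of unsafe answers among the most confident `coverage` share.

    confidences: one score per question (higher = more willing to answer).
    is_unsafe:   1 if answering that question would be unsafe, else 0.
    coverage:    fraction of questions to answer, in (0, 1].
    All three names are illustrative assumptions, not taken from the paper.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    if len(confidences) != len(is_unsafe) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(confidences)
    k = max(1, round(coverage * n))  # number of questions answered
    # Answer the k most confident questions, abstain on the rest.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    answered = order[:k]
    return sum(is_unsafe[i] for i in answered) / k


if __name__ == "__main__":
    conf = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    unsafe = [0, 0, 1, 0, 1, 0, 1, 1, 0, 1]
    print(risk_at_coverage(conf, unsafe, 0.3))
```

Under this reading, a relative risk reduction compares two such values at the same coverage, one for each system.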
Abstract
Retrieval-augmented generation (RAG) grounds answers in retrieved passages, but retrieval is not verification: a passage can be topical and still fail to justify the answer. We frame this gap as evidence sufficiency verification for selective RAG answering: given a question, a candidate answer, and retrieved evidence, predict whether the evidence supports, refutes, or is insufficient, and abstain unless support is established. We present SURE-RAG, a transparent aggregation protocol built on the observation that evidence sufficiency is a set-level property: missing hops and unresolved conflicts cannot be detected by independent passage scoring. A shared pair-level claim-evidence verifier produces local relation distributions, which SURE-RAG aggregates into interpretable answer-level signals -- coverage, relation strength, disagreement, conflict, and retrieval uncertainty -- yielding a…
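The set-level idea in the abstract can be illustrated with a small sketch. It assumes the answer is decomposed into claims (for example, one per reasoning hop) and that a pair-level verifier has already produced a distribution over (support, refute, insufficient) for every claim-passage pair. The aggregation formulas, the threshold `tau`, and all names below are illustrative assumptions, not the paper's definitions. Only coverage, strength, and conflict are sketched; disagreement and retrieval uncertainty are omitted.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    coverage: float  # share of claims with at least one supporting passage
    strength: float  # weakest best-support score across claims
    conflict: float  # strongest refutation score seen for any claim


def aggregate(probs, tau=0.5):
    """Aggregate pair-level distributions into answer-level signals.

    probs[c][p] = (p_support, p_refute, p_insufficient) for claim c and
    passage p. Every claim is assumed to have at least one passage.
    """
    best_support = [max(pair[0] for pair in row) for row in probs]
    best_refute = [max(pair[1] for pair in row) for row in probs]
    coverage = sum(s >= tau for s in best_support) / len(probs)
    return Signals(coverage, min(best_support), max(best_refute))


def decide(signals, tau=0.5):
    """Three-way label; the system answers only when it is 'supported'."""
    if signals.conflict >= tau:
        return "refuted"
    if signals.coverage == 1.0:
        return "supported"
    return "insufficient"


if __name__ == "__main__":
    # Two-hop question where the second hop has no supporting passage.
    missing_hop = [
        [(0.9, 0.05, 0.05), (0.1, 0.1, 0.8)],
        [(0.2, 0.1, 0.7), (0.3, 0.1, 0.6)],
    ]
    print(decide(aggregate(missing_hop)))
```

In the example, the first claim is strongly supported by one passage, so scoring passages independently would find a high-confidence match. The set-level view shows that the second claim is uncovered, so the label is "insufficient" and the system abstains.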
