Claim-Selective Certification for High-Risk Medical Retrieval-Augmented Generation

Shao Kan

arXiv:2605.21949·cs.CL·May 22, 2026

TL;DR

This paper introduces claim-selective certification for high-risk medical retrieval-augmented generation (RAG): each response is decomposed into verifiable claims, each claim is scored against the retrieved evidence, and an intent-aware selector maps the result to one of four actions (full, partial, conflict, abstain).

Contribution

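It proposes a claim-selective certification framework that replaces the single answer-or-abstain decision commonly used to evaluate high-risk medical QA systems with claim-level verification, so that mixed evidence can support one claim, attach conditions to another, and contradict a third within the same response.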

Findings

01

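The full system records UCCR=0.0000 on both dev (n=314) and test (n=319) under the primary weak-label certificate protocol. UCCR measures unsupported-claim risk within the certificate definition.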

02

Action accuracy is 0.9204 on dev and 0.8997 on test, with PAU Precision of 0.9901 and 0.9739 respectively.

03

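Because the real-source-only dev/test rows cover the naturally occurring non-abstain actions, abstain behavior is evaluated separately on a source-missing counterfactual slice with empty evidence.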

Abstract

Medical RAG systems in high-risk QA settings are often evaluated through a single answer-or-abstain decision, but mixed evidence may support one claim, require conditions for another, and contradict a third. We study claim-selective certification: each response is decomposed into verifiable claims, scored against retrieved evidence, and mapped by an intent-aware selector to {full, partial, conflict, abstain}. On the primary weak-label certificate protocol, whose real-source-only dev/test rows cover the naturally occurring non-abstain actions, the full system records UCCR=0.0000, PAU=1.0000, PAU Precision=0.9901, and action accuracy=0.9204 on dev (n=314), and UCCR=0.0000, PAU=0.9967, PAU Precision=0.9739, and action accuracy=0.8997 on test (n=319). UCCR measures unsupported-claim risk within the certificate definition, and a source-missing counterfactual slice evaluates abstain under…
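
The decompose, score, and select pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `ClaimScore` fields, the `select_action` function, the thresholds, and the rule order are hypothetical and are not taken from the paper. In particular, this toy selector is purely rule-based and not intent-aware, and it does not compute UCCR or PAU, whose definitions are not given in this listing.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # The four actions named in the abstract.
    FULL = "full"
    PARTIAL = "partial"
    CONFLICT = "conflict"
    ABSTAIN = "abstain"


@dataclass
class ClaimScore:
    # Hypothetical per-claim scores against retrieved evidence, each in [0, 1].
    claim: str
    support: float
    contradiction: float


def select_action(scores, evidence_count, support_t=0.7, contra_t=0.7):
    """Toy rule-based selector; thresholds are illustrative, not from the paper."""
    # No evidence or no claims: nothing can be certified.
    if evidence_count == 0 or not scores:
        return Action.ABSTAIN
    # Any strongly contradicted claim flags the whole response as a conflict.
    if any(s.contradiction >= contra_t for s in scores):
        return Action.CONFLICT
    supported = [s for s in scores if s.support >= support_t]
    if len(supported) == len(scores):
        return Action.FULL
    if supported:
        return Action.PARTIAL
    return Action.ABSTAIN


def action_accuracy(predicted, gold):
    """Fraction of rows whose predicted action equals the labeled action."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)
```

In this sketch, passing `evidence_count=0` always yields `Action.ABSTAIN`, which loosely mirrors the idea of a source-missing counterfactual slice. A real selector would also need to account for query intent and for claims that hold only under stated conditions.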
