VERI-DPO: Evidence-Aware Alignment for Clinical Summarization via Claim Verification and Direct Preference Optimization

Weixin Liu; Congning Ni; Qingyuan Song; Susannah L. Rose; Christopher Symons; Murat Kantarcioglu; Bradley A. Malin; Zhijun Yin

arXiv:2603.10494·cs.CL·March 12, 2026

TL;DR

This paper introduces VERI-DPO, which uses a claim verifier to mine preference pairs and distills them into a clinical summarizer with Direct Preference Optimization (DPO), aiming for summaries of electronic health records that are more faithful to the evidence while remaining informative.

Contribution

The paper combines retrieval-augmented claim verification with Direct Preference Optimization to improve the faithfulness of clinical summaries without sacrificing informativeness.
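
For readers unfamiliar with DPO, the sketch below computes the standard per-pair DPO objective from sequence log-probabilities. It is a minimal illustration of the general technique, not the authors' training code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard per-pair DPO loss (names and beta are illustrative).

    loss = -log sigmoid(beta * (chosen log-ratio - rejected log-ratio)),
    where each log-ratio is log pi_policy(y|x) - log pi_ref(y|x).
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) == log(1 + exp(-x)); branch for numerical stability.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# When the policy equals the reference, the loss is log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Raising the chosen summary's likelihood relative to the rejected one lowers the loss.
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)
```

In VERI-DPO's setting, the "chosen" and "rejected" sequences would be the summaries in a verifier-mined preference pair.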

Findings

01. Reduces the share of unsupported claims from about 10-11% to below 7%.

02. Improves summary validity from 76.7% to 82.5%.

03. Maintains informative length while reducing contradictions.

Abstract

Brief Hospital Course (BHC) narratives must be clinically useful yet faithful to fragmented EHR evidence. LLM-based clinical summarizers still introduce unsupported statements, and alignment can encourage omissions ("say-less" degeneration). We introduce VERI-DPO, which uses claim verification to mine preferences and distill them into the summarizer with Direct Preference Optimization (DPO). On MIMIC-III-Ext-VeriFact-BHC (100 ICU patients; patient-level splits), we train a retrieval-augmented verifier to label claim-evidence pairs as Supported, Not Supported, or Not Addressed via a single-token format. The verifier scores sentence-level claims from sampled BHC candidates and aggregates margins into a coverage-aware utility to mine length-controlled, contradiction-anchored preference pairs. On held-out patients, verifier-mined preferences separate candidates by contradiction density, and…
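
The pair-mining step described in the abstract can be pictured roughly as follows. This is a hedged sketch only: the abstract does not give the utility formula, the length-control rule, or the exact meaning of "contradiction-anchored", so `utility`, `mine_pairs`, the `penalty`, `max_len_ratio`, and `min_gap` parameters, and the filtering rules are all hypothetical stand-ins. The sketch also works on hard verifier labels, whereas the paper aggregates margins.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Candidate:
    text: str
    labels: list  # one verifier label per sentence-level claim


def utility(cand, penalty=2.0):
    """Hypothetical coverage-aware score: reward supported claims, penalize
    unsupported ones. Unnormalized, so simply saying less does not score higher."""
    supported = cand.labels.count("Supported")
    unsupported = cand.labels.count("Not Supported")
    return supported - penalty * unsupported


def mine_pairs(cands, max_len_ratio=1.25, min_gap=1.0):
    """Return (chosen, rejected) pairs under assumed filtering rules."""
    pairs = []
    for a, b in combinations(cands, 2):
        chosen, rejected = (a, b) if utility(a) >= utility(b) else (b, a)
        n_c, n_r = len(chosen.labels), len(rejected.labels)
        if min(n_c, n_r) == 0:
            continue
        # Length control (assumed): keep only pairs with similar claim counts.
        if max(n_c, n_r) / min(n_c, n_r) > max_len_ratio:
            continue
        # Contradiction anchoring (assumed): rejected has more unsupported claims.
        if rejected.labels.count("Not Supported") <= chosen.labels.count("Not Supported"):
            continue
        if utility(chosen) - utility(rejected) < min_gap:
            continue
        pairs.append((chosen, rejected))
    return pairs


a = Candidate("A", ["Supported"] * 4)
b = Candidate("B", ["Supported"] * 3 + ["Not Supported"])
c = Candidate("C", ["Supported"])  # much shorter; filtered out by length control
pairs = mine_pairs([a, b, c])
```

Here only the pair (A, B) survives: the two candidates have the same number of claims, B contains an unsupported claim, and A's utility is higher. The short candidate C is never paired, which reflects the intent of keeping preference pairs from rewarding omission.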

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare