CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making
Songtao Jiang, Yuan Wang, Ruizhe Chen, Yan Zhang, Ruilin Luo, Bohan Lei, Sibo Song, Yang Feng, Jimeng Sun, Jian Wu, Zuozhu Liu

TL;DR
This paper introduces CAPO, a reinforcement learning framework for medical vision-language models that improves reasoning consistency and accuracy in medical visual question answering, supported by a new dataset and extensive experiments.
Contribution
The paper presents the Med-Zero-17K dataset and CAPO, a novel RL framework that enhances reasoning fidelity and answer accuracy in medical VQA tasks.
Findings
CAPO outperforms strong VLM baselines in experiments.
CAPO generalizes strongly to 3D Med-VQA benchmarks.
CAPO is effective in both in-domain and out-of-domain scenarios.
Abstract
In medical visual question answering (Med-VQA), achieving accurate responses relies on three critical steps: precise perception of medical imaging data, logical reasoning grounded in visual input and textual questions, and coherent answer derivation from the reasoning process. Recent advances in general vision-language models (VLMs) show that large-scale reinforcement learning (RL) could significantly enhance both reasoning capabilities and overall model performance. However, their application in medical domains is hindered by two fundamental challenges: 1) misalignment between perceptual understanding and reasoning stages, and 2) inconsistency between reasoning pathways and answer generation, both compounded by the scarcity of high-quality medical datasets for effective large-scale RL. In this paper, we first introduce Med-Zero-17K, a curated dataset for pure RL-based training,…
Taxonomy
Topics: Semantic Web and Ontologies · Data Quality and Management · Electronic Health Records Systems
