Improving Medical Reasoning with Curriculum-Aware Reinforcement Learning

Shaohao Rui; Kaitao Chen; Weijie Ma; Xiaosong Wang

arXiv:2505.19213·cs.AI·May 27, 2025

TL;DR

This paper presents MedCCO, a curriculum-aware reinforcement learning framework that improves medical visual question answering by unifying close-ended and open-ended tasks, leading to better reasoning and generalization.

Contribution

MedCCO is the first multimodal RL framework for medical VQA that integrates close-ended and open-ended data through curriculum learning, enhancing reasoning and interpretability.

Findings

1. Achieves an 11.4% accuracy gain on in-domain tasks.
2. Improves performance on out-of-domain benchmarks by 5.7%.
3. Enhances reasoning and generalization in medical VQA models.

Abstract

Recent advances in reinforcement learning with verifiable, rule-based rewards have greatly enhanced the reasoning capabilities and out-of-distribution generalization of VLMs/LLMs, obviating the need for manually crafted reasoning chains. Despite these promising developments in the general domain, their translation to medical imaging remains limited. Current medical reinforcement fine-tuning (RFT) methods predominantly focus on close-ended VQA, thereby restricting the model's ability to engage in world knowledge retrieval and flexible task adaptation. More critically, these methods fall short of addressing the clinical demand for open-ended, reasoning-intensive decision-making. To bridge this gap, we introduce MedCCO, the first multimodal reinforcement learning framework tailored for medical VQA that unifies close-ended and open-ended data within a curriculum-driven RFT…
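
As a rough illustration of what "verifiable, rule-based rewards" and a curriculum over close-ended and open-ended data could look like in practice, here is a minimal Python sketch. It is not the MedCCO implementation. The `<answer>` tag format, the token-overlap F1 used for open-ended answers, the close-ended-first ordering, and all function and field names are assumptions made for illustration only.

```python
import re
from collections import Counter


def close_ended_reward(response: str, gold: str) -> float:
    """Return 1.0 if the option letter right after <answer> matches gold, else 0.0.

    Assumes (hypothetically) that the model wraps its final choice in <answer> tags.
    """
    match = re.search(r"<answer>\s*([A-Za-z])\b", response)
    return 1.0 if match and match.group(1).upper() == gold.upper() else 0.0


def open_ended_reward(response: str, reference: str) -> float:
    """Token-level F1 between a free-text answer and a reference answer.

    A stand-in for whatever open-ended reward the paper uses (not stated here).
    """
    pred = response.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def curriculum_order(samples):
    """Put close-ended samples before open-ended ones, keeping order within a stage.

    The stage order is an assumption; each sample is a dict with a "kind" key.
    """
    stage = {"close": 0, "open": 1}
    return sorted(samples, key=lambda s: stage[s["kind"]])
```

In a setup like this, the close-ended reward is exactly checkable, while the open-ended reward is only a soft proxy, which is one plausible reason to train on the checkable stage first.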

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning

Methods: Focus · Sparse Evolutionary Training