Med-R$^3$: Enhancing Medical Retrieval-Augmented Reasoning of LLMs via Progressive Reinforcement Learning

Keer Lu; Zheng Liang; Youquan Li; Jiejun Tan; Xili Wang; Da Pan; Shusen Zhang; Guosheng Dong; Bin Cui; Yunhuai Liu; Wentao Zhang

arXiv:2507.23541 · cs.CL · January 21, 2026

TL;DR

Med-R$^3$ introduces a progressive reinforcement learning framework that jointly optimizes retrieval and reasoning in medical language models, significantly improving their performance on medical reasoning tasks.

Contribution

This work presents the first joint optimization approach for retrieval and reasoning in medical LLMs using progressive reinforcement learning, enhancing generalization and domain-specific reasoning.

Findings

1. Achieves state-of-the-art performance on medical reasoning benchmarks.
2. LLaMA3.1-8B-Instruct + Med-R$^3$ surpasses GPT-4o-mini by 3.93%.
3. Qwen2.5-14B + Med-R$^3$ improves by 13.53%.

Abstract

In medical scenarios, effectively retrieving external knowledge and leveraging it for rigorous logical reasoning is of significant importance. Existing work, however, has predominantly focused on enhancing either the retrieval or the reasoning capabilities of models in isolation, with little attention given to their joint optimization, which leads to limited coordination between the two processes. Additionally, current methods rely heavily on supervised fine-tuning (SFT), which can cause models to memorize existing problem-solving pathways, thereby restricting their generalization ability when confronted with novel problem contexts. Furthermore, while some studies have explored improving retrieval-augmented reasoning in general domains via reinforcement learning, their reward function designs do not adequately capture the specific demands of the medical domain. To address…

Taxonomy

Topics: Natural Language Processing Techniques · Machine Learning in Healthcare · Topic Modeling