Med-R$^3$: Enhancing Medical Retrieval-Augmented Reasoning of LLMs via Progressive Reinforcement Learning
Keer Lu, Zheng Liang, Youquan Li, Jiejun Tan, Xili Wang, Da Pan, Shusen Zhang, Guosheng Dong, Bin Cui, Yunhuai Liu, Wentao Zhang

TL;DR
Med-R$^3$ introduces a progressive reinforcement learning framework that jointly optimizes retrieval and reasoning in medical language models, significantly improving their performance on medical reasoning tasks.
Contribution
This work presents the first joint optimization approach for retrieval and reasoning in medical LLMs using progressive reinforcement learning, enhancing generalization and domain-specific reasoning.
Findings
Achieves state-of-the-art performance on medical reasoning benchmarks.
LLaMA3.1-8B-Instruct + Med-R$^3$ surpasses GPT-4o-mini by 3.93%.
Qwen2.5-14B + Med-R$^3$ improves by 13.53%.
Abstract
In medical scenarios, effectively retrieving external knowledge and using it for rigorous logical reasoning is essential. Existing work has predominantly focused on enhancing either the retrieval or the reasoning capabilities of models in isolation, with little attention to their joint optimization, which limits coordination between the two processes. Additionally, current methods rely heavily on supervised fine-tuning (SFT), which can cause models to memorize existing problem-solving pathways and thereby restricts their ability to generalize to novel problem contexts. Furthermore, while some studies have explored improving retrieval-augmented reasoning in general domains via reinforcement learning, their reward function designs do not adequately capture the specific demands of the medical domain. To address…
Taxonomy
Topics: Natural Language Processing Techniques · Machine Learning in Healthcare · Topic Modeling
