Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards

Jaehoon Yun; Jiwoong Sohn; Jungwoo Park; Hyunjae Kim; Xiangru Tang; Yanjun Shao; Yonghoe Koo; Minhyeok Ko; Qingyu Chen; Mark Gerstein; Michael Moor; Jaewoo Kang

arXiv:2506.11474·cs.CL·September 23, 2025

TL;DR

Med-PRM introduces a process reward framework that verifies each reasoning step in medical decision making against clinical guidelines, significantly improving accuracy and error localization in large language models.

Contribution

The paper presents Med-PRM, a novel retrieval-augmented process reward modeling approach that enhances medical reasoning accuracy by verifying intermediate steps with medical knowledge bases.

Findings

01 Achieves state-of-the-art performance on five medical QA benchmarks.

02 Improves base model performance by up to 13.50% with Med-PRM.

03 Attains over 80% accuracy on MedQA with small-scale models.

Abstract

Large language models have shown promise in clinical decision making, but current approaches struggle to localize and correct errors at specific steps of the reasoning process. This limitation is critical in medicine, where identifying and addressing reasoning errors is essential for accurate diagnosis and effective patient care. We introduce Med-PRM, a process reward modeling framework that leverages retrieval-augmented generation to verify each reasoning step against established medical knowledge bases. By verifying intermediate reasoning steps against evidence retrieved from clinical guidelines and literature, our model can assess reasoning quality precisely and at a fine-grained level. Evaluations on five medical QA benchmarks and two open-ended diagnostic tasks demonstrate that Med-PRM achieves state-of-the-art performance, improving the performance of base models by up to…
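The stepwise, retrieval-backed verification described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the word-overlap retriever, the `overlap_scorer` stand-in for a learned reward model, the 0.5 threshold, and the minimum-over-steps aggregation are all assumptions chosen only to make the idea concrete.

```python
from typing import Callable, List, Sequence


def tokenize(text: str) -> set:
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return set(text.lower().split())


def retrieve(step: str, corpus: Sequence[str], k: int = 1) -> List[str]:
    """Return the k passages sharing the most words with the step (toy retriever)."""
    step_tokens = tokenize(step)
    ranked = sorted(corpus, key=lambda doc: len(step_tokens & tokenize(doc)), reverse=True)
    return ranked[:k]


def overlap_scorer(step: str, evidence: Sequence[str]) -> float:
    """Hypothetical stand-in for a learned reward model:
    the fraction of step words that appear in the retrieved evidence."""
    step_tokens = tokenize(step)
    if not step_tokens:
        return 0.0
    evidence_tokens = set().union(*(tokenize(doc) for doc in evidence))
    return len(step_tokens & evidence_tokens) / len(step_tokens)


def score_steps(
    steps: Sequence[str],
    corpus: Sequence[str],
    scorer: Callable[[str, Sequence[str]], float] = overlap_scorer,
) -> List[float]:
    """Score every reasoning step against evidence retrieved for that step."""
    return [scorer(step, retrieve(step, corpus)) for step in steps]


def first_weak_step(scores: Sequence[float], threshold: float = 0.5) -> int:
    """Localize the first step scoring below the threshold; -1 if none does."""
    for index, score in enumerate(scores):
        if score < threshold:
            return index
    return -1


def best_candidate(candidates: Sequence[Sequence[str]], corpus: Sequence[str]) -> int:
    """Pick the reasoning chain whose weakest step scores highest
    (minimum aggregation is an assumption of this sketch)."""
    totals = [min(score_steps(steps, corpus)) for steps in candidates]
    return max(range(len(candidates)), key=totals.__getitem__)


# Made-up miniature "guideline" corpus and two candidate reasoning chains.
corpus = [
    "metformin is first-line therapy for type 2 diabetes",
    "warfarin requires inr monitoring",
]
supported = ["type 2 diabetes is present", "metformin is first-line therapy"]
unsupported = ["type 2 diabetes is present", "insulin glargine must always start immediately"]

print(score_steps(supported, corpus))
print(first_weak_step(score_steps(unsupported, corpus)))
print(best_candidate([unsupported, supported], corpus))
```

In this sketch, per-step scores serve two purposes: they point to the first step that the retrieved evidence does not support, and they rank several candidate reasoning chains for the same question. A real system would replace both the retriever and the scorer with learned components.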

Taxonomy

Topics: Semantic Web and Ontologies · Biomedical Text Mining and Ontologies · AI-based Problem Solving and Planning

Methods: Balanced Selection