Med-REFL: Medical Reasoning Enhancement via Self-Corrected Fine-grained Reflection

Zongxian Yang; Jiayu Qian; Zegao Peng; Haoyu Zhang; Yu-An Huang; KC Tan; Zhi-An Huang

arXiv:2506.13793·cs.AI·February 26, 2026

TL;DR

Med-REFL introduces a self-correcting framework for medical reasoning models that automatically generates reflection data, significantly improving accuracy and reliability in high-stakes medical AI applications.

Contribution

It presents a novel, label-free reflection learning method that enhances reasoning accuracy by automatically assessing and correcting model fallacies in medical domains.

Findings

01 Boosts performance of Llama3.1-8B by +5.82% on MedQA

02 Achieves state-of-the-art results with Med-REFL-8B among 7-8B models

03 Generalizes to logical reasoning and reduces fake reflection phenomena

Abstract

Large reasoning models excel in domains like mathematics, where intermediate reasoning is straightforward to verify, but struggle to self-correct in medical fields, where evaluating intermediate reasoning is cumbersome and expensive. This verification bottleneck hinders the development of reliable AI reasoners for high-stakes applications. Here we propose Med-REFL, a novel framework that learns fine-grained reflection without human labels or model distillation. Med-REFL introduces a deterministic structural assessment of the reasoning space to automatically generate preference data for reflection. By globally evaluating all explored reasoning paths in a tree-of-thoughts, our method quantifies the value of corrective actions, enabling the automated construction of direct preference optimization pairs. This trains the model to recognize and amend its own reasoning fallacies. Extensive…
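As a rough illustration of the idea in the abstract, the sketch below scores each node of a small reasoning tree by the fraction of leaves beneath it that end in a correct final answer, then emits a preference pair wherever two sibling steps differ sharply in that score. This is not the authors' implementation: the `Node` class, the `value` and `preference_pairs` functions, the `margin` threshold, the reflection text templates, and the assumption that each leaf carries a correctness flag are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One reasoning step in a hypothetical tree-of-thoughts."""
    step: str
    correct: Optional[bool] = None  # assumed to be set on leaves only
    children: List["Node"] = field(default_factory=list)


def leaf_stats(node: Node) -> Tuple[int, int]:
    """Return (correct leaves, total leaves) in the subtree rooted at node."""
    if not node.children:
        return (1 if node.correct else 0, 1)
    correct = total = 0
    for child in node.children:
        c, t = leaf_stats(child)
        correct += c
        total += t
    return correct, total


def value(node: Node) -> float:
    """Score a step by the share of its leaves that reach a correct answer."""
    correct, total = leaf_stats(node)
    return correct / total


def preference_pairs(node: Node, prefix: str = "",
                     margin: float = 0.5) -> List[Dict[str, str]]:
    """Collect pairs where correcting a weak step beats accepting it.

    The prompt ends with the weak sibling step; the chosen response
    reflects and switches to the strong sibling, the rejected response
    accepts the weak step. Both templates are illustrative assumptions.
    """
    pairs: List[Dict[str, str]] = []
    context = prefix + node.step + "\n"
    ranked = sorted(node.children, key=value, reverse=True)
    if len(ranked) >= 2:
        best, worst = ranked[0], ranked[-1]
        if value(best) - value(worst) >= margin:
            pairs.append({
                "prompt": context + worst.step + "\n",
                "chosen": "Reflection: the previous step is flawed. "
                          "Corrected step: " + best.step,
                "rejected": "Reflection: the previous step is sound. Continue.",
            })
    for child in node.children:
        pairs.extend(preference_pairs(child, context, margin))
    return pairs
```

Each resulting dictionary has the prompt/chosen/rejected shape commonly used for direct preference optimization data. How the paper actually defines the value of a corrective action, and which paths it compares, may differ from this simplification.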

Code & Models

Repositories

tianyin123/med-refl
Official

Datasets

HANI-LAB/Med-REFL-DPO
dataset · 13 downloads

Taxonomy

Topics: Biomedical Text Mining and Ontologies

Methods: Focus