ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
Yu Sun, Xingyu Qian, Weiwen Xu, Hao Zhang, Chenghao Xiao, Long Li, Deli Zhao, Wenbing Huang, Tingyang Xu, Qifeng Bai, Yu Rong

TL;DR
ReasonMed is the largest medical reasoning dataset to date, with 370,000 examples created through multi-agent generation and refinement. Models trained on it outperform prior models on medical question answering benchmarks.
Contribution
This paper introduces ReasonMed, a large-scale, high-quality medical reasoning dataset created via a multi-agent generation, verification, and refinement process, and shows that models trained on it outperform existing models on medical QA benchmarks.
Findings
Models trained on ReasonMed outperform previous models on medical QA benchmarks.
Integrating detailed CoT reasoning with concise answers improves model robustness.
Training larger models on ReasonMed maintains high performance.
Abstract
Reasoning-based large language models have excelled in mathematics and programming, yet their potential in knowledge-intensive medical question answering remains underexplored and insufficiently validated in clinical contexts. To bridge this gap, we introduce ReasonMed, the largest medical reasoning dataset to date, comprising 370k high-quality examples distilled from 1.75 million initial reasoning paths generated by complementary LLMs and curated through a cost-efficient easy-medium-difficult (EMD) pipeline. ReasonMed is built through a multi-agent generation, verification, and refinement process, in which an Error Refiner improves reasoning paths by correcting error-prone steps identified by a verifier. Using ReasonMed, we investigate effective strategies for training medical reasoning models and find that integrating detailed CoT reasoning with concise answer summaries yields the…
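The abstract names an easy-medium-difficult (EMD) pipeline but the excerpt does not spell out how questions are bucketed or handled. The following is a minimal sketch of one way such routing could work, assuming each question has several candidate reasoning paths with a boolean verifier verdict. The names `ReasoningPath`, `difficulty`, and `curate`, the thresholds `easy_min` and `medium_min`, and the `refine` and `regenerate` callables are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReasoningPath:
    text: str
    is_correct: bool  # verifier verdict (assumed to be a plain boolean)
    error_steps: List[str] = field(default_factory=list)  # steps the verifier flagged


def difficulty(paths: List[ReasoningPath], easy_min: int = 5, medium_min: int = 2) -> str:
    """Bucket a question by how many candidate paths the verifier accepted.

    The thresholds are placeholders, not values from the paper.
    """
    n_ok = sum(p.is_correct for p in paths)
    if n_ok >= easy_min:
        return "easy"
    if n_ok >= medium_min:
        return "medium"
    return "difficult"


def curate(
    paths: List[ReasoningPath],
    refine: Callable[[ReasoningPath], str],  # stand-in for an error-refiner agent
    regenerate: Callable[[], str],           # stand-in for a stronger generator
    keep: int = 2,
) -> List[str]:
    """Pick a handling strategy per bucket (all three strategies are assumptions)."""
    tier = difficulty(paths)
    if tier == "easy":
        # Cheapest route: keep already-verified paths unchanged.
        return [p.text for p in paths if p.is_correct][:keep]
    if tier == "medium":
        # Repair the paths with the fewest flagged steps.
        ranked = sorted(paths, key=lambda p: len(p.error_steps))
        return [refine(p) for p in ranked[:keep]]
    # Difficult: discard the candidates and ask for a fresh path.
    return [regenerate()]
```

The cost-efficiency idea this illustrates is that the expensive agents run only on questions where the cheap candidates mostly failed; how ReasonMed actually allocates that effort should be checked against the full paper.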
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare
