End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning
Qiaoyu Zheng, Yuze Sun, Chaoyi Wu, Weike Zhao, Pengcheng Qiu, Yongguo Yu, Kun Sun, Jian Zhang, Yanfeng Wang, Ya Zhang, Weidi Xie

TL;DR
This paper presents Deep-DxSearch, an agentic RAG system trained end-to-end with reinforcement learning for traceable diagnostic reasoning in healthcare. It retrieves evidence from a large-scale biomedical corpus and outperforms prompt-engineering and training-free RAG baselines.
Contribution
The paper introduces Deep-DxSearch, an agentic RAG system whose retrieval and reasoning are co-optimized end-to-end with reinforcement learning, with the aim of making diagnostic reasoning both more accurate and traceable.
Findings
Outperforms prompt-engineering and training-free RAG methods.
Achieves 22.7% higher accuracy on benchmarks.
Increases physicians' diagnostic accuracy from 45.6% to 69.1%.
Abstract
The integration of Large Language Models (LLMs) into healthcare is constrained by knowledge limitations, hallucinations, and a disconnect from Evidence-Based Medicine (EBM). While Retrieval-Augmented Generation (RAG) offers a solution, current systems often rely on static workflows that miss the iterative, hypothetico-deductive reasoning of clinicians. To address this, we introduce Deep-DxSearch, an agentic RAG system trained end-to-end via reinforcement learning (RL) for traceable diagnostic reasoning. Deep-DxSearch acts as an active investigator, treating the LLM as an agent within an environment of 16,000+ guideline-derived disease profiles, 150,000+ patient records for case-based reasoning, and over 27 million biomedical documents. Using soft verifiable rewards that co-optimize retrieval and reasoning, the model learns to formulate queries, evaluate evidence, and refine searches to…
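The query, evaluate, refine loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the in-memory corpus, the keyword-overlap retriever, the `Step` record, and the `investigate` function with its `max_steps` and `min_score` parameters are all hypothetical, and the hand-written refinement rule only stands in for the policy that Deep-DxSearch learns through RL.

```python
from dataclasses import dataclass

# Toy stand-ins for the three retrieval sources named in the abstract.
# Every entry below is invented for illustration; none comes from the paper.
CORPUS = {
    "disease_profiles": [
        "influenza: fever cough myalgia",
        "migraine: headache nausea photophobia",
    ],
    "patient_records": ["case 1: fever cough fatigue, diagnosed influenza"],
    "documents": ["review: photophobia and nausea commonly accompany migraine"],
}


@dataclass
class Step:
    """One entry of the reasoning trace: what was asked, where, and what came back."""
    query: str
    source: str
    evidence: str
    score: int


def tokenize(text):
    """Lowercase and split on whitespace, treating ':' and ',' as separators."""
    return text.lower().replace(":", " ").replace(",", " ").split()


def retrieve(query, source):
    """Return the best keyword-overlap match in one source and its overlap count."""
    terms = set(tokenize(query))
    best, best_score = "", 0
    for text in CORPUS[source]:
        score = len(terms & set(tokenize(text)))
        if score > best_score:
            best, best_score = text, score
    return best, best_score


def investigate(symptoms, max_steps=3, min_score=3):
    """Hand-written stand-in for the learned policy: query, evaluate, refine."""
    query_terms = list(symptoms)
    sources = list(CORPUS)
    trace = []
    for step in range(max_steps):
        source = sources[step % len(sources)]
        query = " ".join(query_terms)
        evidence, score = retrieve(query, source)
        trace.append(Step(query, source, evidence, score))
        if score >= min_score:  # evidence judged sufficient, stop searching
            break
        # Refine: expand the query with unseen terms from the weak evidence.
        for word in tokenize(evidence):
            if word not in query_terms:
                query_terms.append(word)
    return trace


# Print the full trace so each retrieval step can be inspected.
for s in investigate(["headache", "nausea"]):
    print(s.source, "|", s.query, "|", s.evidence or "(no match)", "|", s.score)
```

Keeping every step in the returned trace is what makes the toy loop traceable: each query, the source it was sent to, and the evidence that came back can be audited after the fact. In the system the abstract describes, the decisions about what to ask next and when to stop are learned rather than hard-coded as they are here.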
Taxonomy
Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Topic Modeling
