MedExAgent: Training LLM Agents to Ask, Examine, and Diagnose in Noisy Clinical Environments

Yicheng Gao; Xiaolin Zhou; Yahan Li; Yue Zhao; Ruishan Liu

arXiv:2605.07058·cs.CL·May 11, 2026

TL;DR

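This paper introduces MedExAgent, an LLM agent trained with reinforcement learning to carry out interactive clinical diagnosis under noisy and uncertain information. The agent asks the patient questions, orders exams, and issues a diagnosis, with the whole process modeled as a Partially Observable Markov Decision Process (POMDP).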

Contribution

The paper formalizes clinical diagnosis as a POMDP with explicit noise models, and trains an agent using supervised fine-tuning and DAPO to optimize both diagnostic accuracy and cost-efficiency.
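To make the formulation concrete, here is a minimal sketch of the three action types (ask, examine, diagnose) together with a toy noisy environment and a reward that trades accuracy against exam cost. Everything in it is an illustrative assumption rather than the paper's implementation: the names `ActionType`, `Action`, and `ToyClinicEnv`, the `noise_prob` and `cost_weight` parameters, the "I'm not sure." noise model, and the linear reward are all hypothetical.

```python
import random
from dataclasses import dataclass, field
from enum import Enum


class ActionType(Enum):
    ASK = "ask"            # question the patient
    EXAM = "exam"          # order a medical exam (a tool call)
    DIAGNOSE = "diagnose"  # issue a final diagnosis; ends the episode


@dataclass
class Action:
    kind: ActionType
    content: str  # question topic, exam name, or diagnosis label


@dataclass
class ToyClinicEnv:
    """Hypothetical toy environment; NOT the paper's simulator."""
    true_diagnosis: str
    symptoms: dict            # topic -> true patient answer
    exam_results: dict        # exam name -> true result
    exam_costs: dict          # exam name -> cost (arbitrary units)
    noise_prob: float = 0.2   # assumed chance an answer is uninformative
    cost_weight: float = 0.1  # assumed accuracy/cost trade-off
    seed: int = 0
    total_cost: float = field(default=0.0, init=False)
    done: bool = field(default=False, init=False)

    def __post_init__(self):
        # Private RNG so episodes are reproducible for a given seed.
        self.rng = random.Random(self.seed)

    def step(self, action):
        """Apply one action and return (observation, reward, done)."""
        if self.done:
            raise RuntimeError("episode already finished")
        if action.kind is ActionType.ASK:
            answer = self.symptoms.get(action.content, "I don't know.")
            # Assumed noise model: the patient sometimes gives no information.
            if self.rng.random() < self.noise_prob:
                answer = "I'm not sure."
            return answer, 0.0, False
        if action.kind is ActionType.EXAM:
            # Exams return their stored result but accumulate cost.
            self.total_cost += self.exam_costs.get(action.content, 0.0)
            result = self.exam_results.get(action.content, "exam unavailable")
            return result, 0.0, False
        # DIAGNOSE: terminal action, rewarded for correctness minus exam cost.
        self.done = True
        correct = 1.0 if action.content == self.true_diagnosis else 0.0
        reward = correct - self.cost_weight * self.total_cost
        return "episode over", reward, True


# Example episode: ask one question, order one exam, then diagnose.
env = ToyClinicEnv(
    true_diagnosis="influenza",
    symptoms={"fever": "Yes, since yesterday."},
    exam_results={"rapid flu test": "positive"},
    exam_costs={"rapid flu test": 1.0},
)
for act in [
    Action(ActionType.ASK, "fever"),
    Action(ActionType.EXAM, "rapid flu test"),
    Action(ActionType.DIAGNOSE, "influenza"),
]:
    print(act.kind.value, "->", env.step(act))
```

The agent never sees `true_diagnosis` directly, only the possibly noisy observations returned by `step`, which is what makes the problem partially observable. In this sketch the reward arrives only at the diagnosis step, so ordering extra exams lowers the final return even when the diagnosis is correct.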

Findings

01. MedExAgent achieves diagnostic performance comparable to larger models.

02. The system effectively balances diagnostic accuracy with exam costs.

03. Extensive experiments validate the robustness of MedExAgent in noisy environments.

Abstract

Real-world clinical diagnosis is a complex process in which the doctor must gather information both by interacting with the patient and by conducting medical exams. The doctor also needs to adapt to different patient personas, as well as to noisy and incomplete information that can arise at any point in the process. However, existing benchmarks for medical LLMs and methods for automatic diagnosis largely simplify this process, reducing it to single-turn question answering, noise-free conversations, or sequential exam ordering, and thereby ignore the interactive and uncertain nature of clinical diagnosis. In this paper, we aim to address this gap by formalizing clinical diagnosis as a Partially Observable Markov Decision Process (POMDP) with three action types: questioning the patient, ordering medical exams as tool calls, and issuing a diagnosis. We also introduce a…
