Process-Supervised Multi-Agent Reinforcement Learning for Reliable Clinical Reasoning
Chaeeun Lee, T. Michael Yates, Pasquale Minervini, T. Ian Simpson

TL;DR
This paper presents a multi-agent reinforcement learning framework for clinical reasoning that emphasizes process-grounded decision-making, improving both outcome accuracy and process fidelity in gene-disease validity curation.
Contribution
It introduces a process-supervised multi-agent RL approach with hierarchical coordination, enhancing clinical reasoning alignment and accuracy over outcome-only methods.
Findings
Outcome accuracy improved from 0.195 to 0.732 with outcome-only rewards.
Process fidelity increased from 0.392 to 0.520 with combined process and outcome rewards.
The hierarchical multi-agent system balances reasoning quality against decision accuracy.
Abstract
Clinical decision-making requires nuanced reasoning over heterogeneous evidence and traceable justifications. While recent LLM multi-agent systems (MAS) show promise, they largely optimise for outcome accuracy while overlooking process-grounded reasoning aligned with clinical standards. One critical real-world case of this is gene-disease validity curation, where experts must determine whether a gene is causally implicated in a disease by synthesising diverse biomedical evidence. We introduce an agent-as-tool reinforcement learning framework for this task with two objectives: (i) process-level supervision to ensure reasoning follows valid clinical pathways, and (ii) efficient coordination via a hierarchical multi-agent system. Our evaluation on the ClinGen dataset shows that with outcome-only rewards, MAS with a GRPO-trained Qwen3-4B supervisor agent substantially improves final outcome…
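The interaction between the two reward signals and GRPO training can be illustrated with a small sketch. It is not the authors' implementation. The weighted-sum form of the reward, the `process_weight` parameter, and both function names are assumptions made for illustration only; the summary above does not state how the paper combines process and outcome rewards. The group-relative normalisation (reward minus group mean, divided by group standard deviation) is the standard GRPO advantage.

```python
from statistics import mean, pstdev


def combined_reward(outcome_correct: bool, process_score: float,
                    process_weight: float = 0.5) -> float:
    """Blend an outcome reward with a process-fidelity score in [0, 1].

    Hypothetical form: a convex combination. process_weight=0.0 gives
    an outcome-only reward.
    """
    outcome = 1.0 if outcome_correct else 0.0
    return (1.0 - process_weight) * outcome + process_weight * process_score


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO-style advantages: normalise each reward within its sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled rollouts for one case: (final answer correct?, process score).
rollouts = [(True, 0.9), (True, 0.3), (False, 0.6), (False, 0.1)]
rewards = [combined_reward(ok, p) for ok, p in rollouts]
advantages = group_relative_advantages(rewards)
```

Under this assumed reward, two rollouts with the same correct answer receive different advantages when their reasoning differs in process fidelity, whereas an outcome-only reward (`process_weight=0.0`) cannot tell them apart.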
Taxonomy
Topics: Genomics and Rare Diseases · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare
