Multi-Modal Multi-Agent Reinforcement Learning for Radiology Report Generation

Kaito Baba; Risa Kishikawa; Satoshi Kodera

arXiv:2603.16876 · cs.CV · May 11, 2026

TL;DR

MARL-Rad is a multi-modal multi-agent reinforcement learning framework for radiology report generation. It jointly optimizes region-specific agents and a global integrating agent within the deployed radiology workflow.

Contribution

It trains the entire agentic system on policy within its deployed radiology workflow using clinically verifiable rewards, instead of arranging fixed LLMs into hand-designed workflows without optimizing them for their assigned roles.

Findings

01

Achieves state-of-the-art clinical efficacy on MIMIC-CXR and IU X-ray, as measured by RadGraph, CheXbert, and GREEN scores.

02

Improves laterality consistency and produces more accurate and detailed reports.

03

Clinicians find reports produced by MARL-Rad clinically comparable to ground truth.

Abstract

We propose MARL-Rad, a multi-modal multi-agent reinforcement learning framework for radiology report generation that trains the entire agentic system on policy within its deployed radiology workflow. MARL-Rad addresses the limitation of post-hoc agentization, where fixed LLMs are organized into hand-designed agentic workflows without being optimized for their assigned roles. Our framework decomposes chest X-ray interpretation into region-specific agents and a global integrating agent, and jointly optimizes them using clinically verifiable rewards. Experiments on the MIMIC-CXR and IU X-ray datasets show that MARL-Rad consistently improves clinical efficacy metrics such as RadGraph, CheXbert, and GREEN scores, achieving state-of-the-art clinical efficacy performance. Further analyses show that MARL-Rad improves laterality consistency and produces more accurate and detailed reports. A…
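
The workflow the abstract describes (region-specific agents feeding a global integrating agent, all scored by one clinically verifiable reward) can be pictured with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the region names, the `Step` record, the `rollout` helper, the choice to give every agent the same final reward, and the token-overlap `token_f1` score that stands in for a clinical metric such as RadGraph or CheXbert. Image inputs and the policy update itself are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical region names; the paper's actual decomposition may differ.
REGIONS = ["lungs", "heart", "pleura"]

# An agent maps a text prompt to a text output (image input omitted here).
Agent = Callable[[str], str]


@dataclass
class Step:
    agent: str           # which agent produced this output
    output: str          # text the agent generated
    reward: float = 0.0  # filled in after the final report is scored


def token_f1(candidate: str, reference: str) -> float:
    """Toy stand-in for a clinical reward: F1 over unique lowercase tokens."""
    cand = set(candidate.lower().split())
    ref = set(reference.lower().split())
    overlap = len(cand & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def rollout(region_agents: Dict[str, Agent], global_agent: Agent,
            study: str, reference: str) -> List[Step]:
    """Run the whole workflow once and assign every agent the same reward.

    Sharing one final reward across agents is an assumption of this sketch.
    """
    # Each region agent describes its own region of the study.
    steps = [Step(name, agent(study)) for name, agent in region_agents.items()]
    # The global agent integrates the regional findings into one report.
    merged = " ".join(step.output for step in steps)
    steps.append(Step("global", global_agent(merged)))
    # Score only the final report, then credit that score to every step.
    reward = token_f1(steps[-1].output, reference)
    for step in steps:
        step.reward = reward
    return steps


# Stub agents for demonstration only; real agents would be trainable models.
demo_region_agents = {r: (lambda study, r=r: f"{r} normal") for r in REGIONS}
demo_global_agent = lambda findings: findings
demo_steps = rollout(demo_region_agents, demo_global_agent,
                     "study-001", "lungs normal heart normal pleura normal")
```

In an on-policy training loop, the per-step rewards collected this way would feed a policy-gradient update for each agent. Which update rule and which reward combination MARL-Rad uses is not stated in the abstract.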
