CXRMate-2: Structured Multimodal Temporal Embeddings and Tractable Reinforcement Learning for Clinically Acceptable Chest X-ray Radiology Report Generation

Aaron Nicolson, Elizabeth J. Cooper, Hwan-Jin Yoon, Claire McCafferty, Ramya Krishnan, Michelle Craigie, Nivene Saad, Jason Dowling, Ian A. Scott, and Bevan Koopman

arXiv:2604.18967·cs.CV·May 5, 2026


TL;DR

CXRMate-2 is a novel CXR report generation model that combines structured multimodal embeddings and reinforcement learning to improve clinical relevance and radiologist acceptance.

Contribution

The paper introduces CXRMate-2, a new model that uses structured embeddings and RL for more clinically acceptable radiology report generation.

Findings

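01. Across the MIMIC-CXR, CheXpert Plus, and ReXgradient datasets, CXRMate-2 achieves statistically significant improvements over strong benchmarks.

02. Compared with radiologist-written reports, the generated reports were judged acceptable in 45% of cases.

03. Radiologists preferred reports for readability, with similar acceptance rates for most findings.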

Abstract

Chest X-ray (CXR) radiology report generation (RRG) models have shown rapid progress on automated metrics, yet their clinical utility remains uncertain due to limited qualitative evaluation by radiologists. We present CXRMate-2, a state-of-the-art CXR RRG model that enables tractable reinforcement learning (RL) through structured multimodal temporal embeddings and high-resolution visual feature compression, for efficient, unified conditioning of an LLM decoder on visual, textual, and temporal context from a study and its prior. This enables group relative policy optimisation (GRPO), where a proposed reward function is used to improve semantic alignment with radiologist reports. Across the MIMIC-CXR, CheXpert Plus, and ReXgradient datasets, CXRMate-2 achieves statistically significant improvements over strong benchmarks, including gains of 11.2% and 24.4% in GREEN and RadGraph-XL,…
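The abstract names group relative policy optimisation (GRPO) as the RL method. As a minimal sketch of the general idea only, the snippet below computes group-relative advantages in the way GRPO is commonly described: several reports are sampled for the same study, each is scored, and each score is normalised against the mean and standard deviation of its own group. The function name `group_relative_advantages`, the `eps` constant, and the example reward values are illustrative assumptions; the paper's proposed reward function is not specified here and is not reproduced.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalise rewards within one group of sampled reports.

    `rewards` holds one scalar score per report sampled for the same study.
    The scores are assumed to come from some reward function; the one
    proposed in the paper is NOT reproduced here.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    # eps (an assumed constant) avoids division by zero when all rewards match
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scores for four reports sampled for a single study
rewards = [0.2, 0.5, 0.8, 0.5]
advantages = group_relative_advantages(rewards)
```

Reports that score above their group's mean receive a positive advantage and those below receive a negative one, so no separately learned value model is needed to form the baseline.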


Code & Models

Models

Hugging Face: aehrc/cxrmate-2 (model, 2.1k downloads)
