MedScribe: Clinically Grounded CT Reporting through Agentic Workflows

Giuseppe A. Orlando; Paolo Papotti; Maria A. Zuluaga; Olivier Humbert; Marco Lorenzi

arXiv:2605.01779 · cs.CV · May 5, 2026

TL;DR

MedScribe is a framework for CT report generation that iteratively gathers localized evidence with pathology-specific tools, improving accuracy and grounding without fine-tuning.

Contribution

It introduces a hypothesis-driven, sequential decision process for CT report generation that explicitly accumulates evidence, reducing hallucinations and improving clinical reliability.

Findings

01. Outperforms state-of-the-art models on the CT-RATE and RadChestCT datasets.

02. Enforces fine-grained grounding and reduces unsupported claims.

03. Improves clinical accuracy, factual consistency, and interpretability.

Abstract

Vision-language models (VLMs) have shown potential for automated radiology report generation, yet existing approaches rely on global embedding compression of volumetric data, often leading to hallucinated findings and limited anatomical grounding in 3D CT imaging. We introduce MedScribe, a hypothesis-driven framework that reformulates report generation as an iterative evidence acquisition process rather than a single-pass encoding task. MedScribe models reporting as a sequential decision process in which a large language model dynamically invokes pathology-specific diagnostic tools to extract localized volumetric features. These structured features are used to query a multidimensional retrieval space aligned with pathology-specific textual evidence. By explicitly accumulating quantitative evidence prior to synthesis, the framework enforces fine-grained grounding and reduces unsupported…
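
The iterative evidence acquisition described in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the tool functions, the feature names, the `RETRIEVAL` table, and the rule-based `next_hypothesis` policy (standing in for the large language model) are all hypothetical placeholders, and the feature values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Features = Dict[str, float]


@dataclass
class Evidence:
    pathology: str
    features: Features
    retrieved_text: str


# Hypothetical pathology-specific tools. A real tool would measure the CT
# volume; these stubs return fixed, made-up values for illustration only.
def nodule_tool(volume: object) -> Features:
    return {"max_diameter_mm": 7.0, "count": 1.0}


def effusion_tool(volume: object) -> Features:
    return {"volume_ml": 0.0}


TOOLS: Dict[str, Callable[[object], Features]] = {
    "lung nodule": nodule_tool,
    "pleural effusion": effusion_tool,
}

# Hypothetical retrieval space: per pathology, reference feature vectors
# paired with the textual evidence they are aligned with.
RETRIEVAL: Dict[str, List[Tuple[Features, str]]] = {
    "lung nodule": [
        ({"max_diameter_mm": 6.0, "count": 1.0},
         "A solitary pulmonary nodule is present."),
        ({"max_diameter_mm": 0.0, "count": 0.0},
         "No pulmonary nodules."),
    ],
    "pleural effusion": [
        ({"volume_ml": 0.0}, "No pleural effusion."),
        ({"volume_ml": 500.0}, "A pleural effusion is present."),
    ],
}


def retrieve(pathology: str, features: Features) -> str:
    """Return the text whose reference features are nearest (squared L2)."""
    def distance(reference: Features) -> float:
        return sum((features[k] - reference[k]) ** 2 for k in features)

    return min(RETRIEVAL[pathology], key=lambda entry: distance(entry[0]))[1]


def next_hypothesis(evidence: List[Evidence],
                    candidates: List[str]) -> Optional[str]:
    """Stand-in for the LLM's decision: pick the first unexamined pathology."""
    examined = {e.pathology for e in evidence}
    for candidate in candidates:
        if candidate not in examined:
            return candidate
    return None  # nothing left to examine, so stop acquiring evidence


def generate_report(volume: object, candidates: List[str],
                    max_steps: int = 10) -> Tuple[str, List[Evidence]]:
    """Accumulate evidence step by step, then synthesize the report."""
    evidence: List[Evidence] = []
    for _ in range(max_steps):
        hypothesis = next_hypothesis(evidence, candidates)
        if hypothesis is None:
            break
        features = TOOLS[hypothesis](volume)  # localized measurement
        evidence.append(
            Evidence(hypothesis, features, retrieve(hypothesis, features)))
    # Synthesis happens only after the evidence is collected; every sentence
    # in the report traces back to one Evidence record.
    report = " ".join(e.retrieved_text for e in evidence)
    return report, evidence
```

Calling `generate_report(None, ["lung nodule", "pleural effusion"])` walks through both hypotheses and returns the report together with the evidence list, so each statement can be checked against the measurement that produced it. Replacing `next_hypothesis` with a model call and the stub tools with real measurements would keep the same control flow.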
