MedScribe: Clinically Grounded CT Reporting through Agentic Workflows
Giuseppe A. Orlando, Paolo Papotti, Maria A. Zuluaga, Olivier Humbert, Marco Lorenzi

TL;DR
MedScribe is a novel framework that improves CT report generation by iteratively gathering localized evidence through pathology-specific tools, enhancing accuracy and grounding without fine-tuning.
Contribution
It introduces a hypothesis-driven, sequential decision process for CT report generation that explicitly accumulates evidence, reducing hallucinations and improving clinical reliability.
Findings
Outperforms state-of-the-art models on the CT-RATE and RadChestCT datasets.
Enforces fine-grained grounding and reduces unsupported claims.
Improves clinical accuracy, factual consistency, and interpretability.
Abstract
Vision-language models (VLMs) have shown potential for automated radiology report generation, yet existing approaches rely on global embedding compression of volumetric data, often leading to hallucinated findings and limited anatomical grounding in 3D CT imaging. We introduce MedScribe, a hypothesis-driven framework that reformulates report generation as an iterative evidence acquisition process rather than a single-pass encoding task. MedScribe models reporting as a sequential decision process in which a large language model dynamically invokes pathology-specific diagnostic tools to extract localized volumetric features. These structured features are used to query a multidimensional retrieval space aligned with pathology-specific textual evidence. By explicitly accumulating quantitative evidence prior to synthesis, the framework enforces fine-grained grounding and reduces unsupported…
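The loop the abstract describes (pick a hypothesis, call a pathology-specific tool, use the extracted features to retrieve matching text, accumulate evidence, then synthesize) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the tool functions, their hard-coded measurements, the toy `INDEX`, the nearest-neighbour `retrieve`, and the `choose_next` policy, which stands in for the large language model's decision step.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Features = Dict[str, float]


@dataclass
class Evidence:
    pathology: str
    features: Features
    retrieved_text: str


# Hypothetical pathology-specific tools. Real tools would measure the CT
# volume; these return fixed values so the sketch runs on its own.
def nodule_tool(volume: object) -> Features:
    return {"max_diameter_mm": 7.0, "count": 1.0}


def effusion_tool(volume: object) -> Features:
    return {"volume_ml": 0.0}


TOOLS: Dict[str, Callable[[object], Features]] = {
    "nodule": nodule_tool,
    "effusion": effusion_tool,
}

# Toy retrieval space (assumed): per pathology, reference feature vectors
# paired with textual evidence.
INDEX: Dict[str, List[Tuple[Features, str]]] = {
    "nodule": [
        ({"max_diameter_mm": 4.0, "count": 1.0}, "A small pulmonary nodule is noted."),
        ({"max_diameter_mm": 12.0, "count": 3.0}, "Multiple pulmonary nodules are present."),
    ],
    "effusion": [
        ({"volume_ml": 0.0}, "No pleural effusion."),
        ({"volume_ml": 300.0}, "Pleural effusion is present."),
    ],
}


def retrieve(pathology: str, features: Features) -> str:
    """Return the text whose reference features are nearest (Euclidean)."""
    def distance(entry: Tuple[Features, str]) -> float:
        ref = entry[0]
        return math.sqrt(sum((features[k] - ref[k]) ** 2 for k in ref))

    return min(INDEX[pathology], key=distance)[1]


def choose_next(pending: List[str], evidence: List[Evidence]) -> Optional[str]:
    """Stand-in for the LLM policy: take the next untested hypothesis."""
    return pending[0] if pending else None


def generate_report(volume: object, hypotheses: List[str]) -> Tuple[List[Evidence], str]:
    pending = list(hypotheses)
    evidence: List[Evidence] = []
    while True:
        pathology = choose_next(pending, evidence)
        if pathology is None:
            break
        pending.remove(pathology)
        features = TOOLS[pathology](volume)           # localized measurement
        text = retrieve(pathology, features)          # feature-conditioned retrieval
        evidence.append(Evidence(pathology, features, text))
    # Synthesis happens only after the evidence has been accumulated.
    report = " ".join(item.retrieved_text for item in evidence)
    return evidence, report
```

Calling `generate_report(None, ["nodule", "effusion"])` returns the evidence list alongside the report text, so every sentence in the report can be traced back to the measurement that produced it. In this sketch the report is a plain concatenation of retrieved sentences; an actual system would presumably pass the accumulated evidence to a language model for synthesis.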
