Reasoning Visual Language Model for Chest X-Ray Analysis

Andriy Myronenko; Dong Yang; Baris Turkbey; Mariam Aboian; Sena Azamat; Esra Akcicek; Hongxu Yin; Pavlo Molchanov; Marc Edgar; Yufan He; Pengfei Guo; Yucheng Tang; Daguang Xu

arXiv:2510.23968 · cs.CV · October 31, 2025

TL;DR

This paper introduces a reasoning-based vision-language model for chest X-ray analysis that provides transparent, step-by-step explanations of its diagnoses, enhancing interpretability, trustworthiness, and clinical utility.

Contribution

It presents a novel framework that incorporates chain-of-thought reasoning into chest X-ray interpretation, aligning model reasoning with radiological workflows and expert reasoning patterns.

Findings

01 Achieves competitive accuracy in multi-label classification

02 Improves interpretability and reasoning transparency

03 Enhances radiologist confidence and error auditing

Abstract

Vision-language models (VLMs) have shown strong promise for medical image analysis, but most remain opaque, offering predictions without the transparent, stepwise reasoning clinicians rely on. We present a framework that brings chain-of-thought (CoT) reasoning to chest X-ray interpretation. Inspired by reasoning-first training paradigms, our approach is designed to learn how experts reason, not just what they conclude, by aligning intermediate steps with observable image evidence and radiology workflow. Beyond accuracy, the explicit reasoning traces support clinical auditability: they reveal why a conclusion was reached, which alternatives were considered, and where uncertainty remains, enabling quality assurance, error analysis, and safer human-AI collaboration. Our model couples high-fidelity visual encoding with a two-stage training recipe: a reasoning-style supervised fine-tuning…
