CheXthought: A global multimodal dataset of clinical chain-of-thought reasoning and visual attention for chest X-ray interpretation

Sonali Sharma; Jin Long; George Shih; Sarah Eid; Christian Bluethgen; Francine L. Jacobson; Emily B. Tsai; Global Radiology Consortium; Ahmed M. Alaa; Curtis P. Langlotz

arXiv:2604.26288 · cs.CV · May 1, 2026

TL;DR

CheXthought is a comprehensive multimodal dataset of chest X-ray reasoning traces and visual attention annotations, designed to improve AI clinical reasoning, interpretability, and accuracy in medical imaging.

Contribution

The paper introduces a large multimodal dataset that pairs chain-of-thought reasoning traces with synchronized visual attention annotations, intended to support work on AI interpretability and clinical reasoning in chest X-ray analysis.

Findings

01

CheXthought reasoning outperforms existing vision-language models in accuracy and spatial grounding.

02

Visual attention data reduces hallucinations and recovers missed findings.

03

Models trained on CheXthought improve pathology classification and uncertainty communication.

Abstract

Chest X-ray interpretation is one of the most frequently performed diagnostic tasks in medicine and a primary target for AI development, yet current vision-language models are primarily trained on datasets of paired images and reports, not the cognitive processes and visual attention that underlie clinical reasoning. Here, we present CheXthought, a global, multimodal resource containing 103,592 chain-of-thought reasoning traces and 6,609,082 synchronized visual attention annotations across 50,312 multi-read chest X-rays from 501 radiologists in 71 countries. Our analysis reveals clinical reasoning patterns in how experts deploy distinct visual search strategies, integrate clinical context, and communicate uncertainty. We demonstrate the clinical utility of CheXthought across four dimensions. First, CheXthought reasoning significantly outperforms state-of-the-art vision-language model…
