EXACT: an explainable anomaly-aware vision foundation model for analysis of 3D chest CT

Xuguang Bai; Mingxuan Liu; Tongxi Song; Yifei Chen; Hongjia Yang; Kasidit Anmahapong; Zihan Li; Ying Zhou; Qiyuan Tian

arXiv:2604.24146 · cs.CV · April 28, 2026

TL;DR

EXACT is a 3D chest CT foundation model that learns spatially resolved, explainable representations from paired scans and radiology reports, improving diagnosis, anomaly localization, and report generation without manual annotations.

Contribution

EXACT introduces a weakly supervised, anatomy-aware training method for spatially resolved anomaly detection in chest CT, and it outperforms existing models on multiple clinical tasks.

Findings

01. EXACT improves multi-disease diagnosis accuracy.

02. It enables zero-shot anomaly localization.

03. The model enhances report generation with visual grounding.
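
To make the idea of zero-shot localization from spatially resolved features concrete, here is a minimal NumPy sketch. It is not EXACT's method, which this page does not describe in detail. It assumes a hypothetical encoder that yields one embedding per 3D patch and a text embedding in the same space, and it uses random toy data. The function name `cosine_similarity_map` and all shapes are illustrative assumptions.

```python
import numpy as np


def cosine_similarity_map(patch_embeddings, text_embedding, eps=1e-8):
    """Cosine similarity between every patch embedding and one text embedding.

    patch_embeddings: array of shape (D, H, W, C), one C-dim vector per 3D patch.
    text_embedding:   array of shape (C,).
    Returns an array of shape (D, H, W) with values in [-1, 1].
    """
    patch_norm = np.linalg.norm(patch_embeddings, axis=-1, keepdims=True)
    patches = patch_embeddings / (patch_norm + eps)
    text = text_embedding / (np.linalg.norm(text_embedding) + eps)
    return patches @ text


# Toy data (assumption): random features stand in for real encoder outputs.
rng = np.random.default_rng(0)
D, H, W, C = 4, 6, 6, 16
text_embedding = rng.normal(size=C)
patch_embeddings = rng.normal(size=(D, H, W, C))

# Plant one patch whose feature is aligned with the text embedding.
planted = (2, 3, 1)
patch_embeddings[planted] = 5.0 * text_embedding

# Spatially resolved scoring: one similarity value per patch.
heatmap = cosine_similarity_map(patch_embeddings, text_embedding)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(heatmap), heatmap.shape))

# Global scoring: pool all patches first, then compare once.
global_embedding = patch_embeddings.mean(axis=(0, 1, 2))
global_score = float(
    global_embedding @ text_embedding
    / (np.linalg.norm(global_embedding) * np.linalg.norm(text_embedding))
)

print("peak patch index:", peak)
print("global similarity (no location information):", global_score)
```

The per-patch map keeps the location of the best-matching region, whereas the pooled score is a single number with no spatial information. That is the limitation of global image-text representations that the abstract points to.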

Abstract

Chest computed tomography (CT) is central to the detection and management of thoracic disease, yet the growing scale and complexity of volumetric imaging increasingly exceed what can be addressed by scan-level prediction alone. Clinically useful AI for CT must not only recognize disease across the whole volume, but also localize abnormalities and provide interpretable visual evidence. Existing vision-language foundation models typically compress scans and reports into global image-text representations, limiting their ability to preserve spatial evidence and support clinically meaningful interpretation. Here we developed EXACT, an explainable anomaly-aware foundation model for three-dimensional chest CT that learns spatially resolved representations from paired clinical scans and radiology reports. EXACT was pre-trained on 25,692 CT-report pairs using anatomy-aware weak supervision,…
