EXACT: an explainable anomaly-aware vision foundation model for analysis of 3D chest CT
Xuguang Bai, Mingxuan Liu, Tongxi Song, Yifei Chen, Hongjia Yang, Kasidit Anmahapong, Zihan Li, Ying Zhou, Qiyuan Tian

TL;DR
EXACT is a 3D chest CT foundation model that learns spatially resolved, explainable representations from paired scans and radiology reports, improving diagnosis, anomaly localization, and report generation without manual annotations.
Contribution
It introduces a weakly supervised, anatomy-aware training method for spatially resolved anomaly detection in chest CTs, outperforming existing models in multiple clinical tasks.
Findings
EXACT improves multi-disease diagnosis accuracy.
It enables zero-shot anomaly localization.
The model enhances report generation with visual grounding.
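As a rough illustration of why spatially resolved representations matter for zero-shot localization, the sketch below contrasts a per-patch image-text similarity map with a single globally pooled score. It is not EXACT's implementation: the function names, the embedding shapes, and the use of plain cosine similarity are all assumptions made for illustration, and the embeddings are synthetic.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def anomaly_heatmap(patch_embeddings, text_embedding):
    """Cosine similarity between every 3D patch and one text query.

    patch_embeddings: (D, H, W, C) array, hypothetical per-patch features.
    text_embedding:   (C,) array, hypothetical embedding of a finding
                      such as "pleural effusion".
    Returns a (D, H, W) similarity map.
    """
    return l2_normalize(patch_embeddings) @ l2_normalize(text_embedding)


def global_score(patch_embeddings, text_embedding):
    """Scan-level score: average-pool all patches first, then compare."""
    pooled = patch_embeddings.mean(axis=(0, 1, 2))
    return float(l2_normalize(pooled) @ l2_normalize(text_embedding))


# Synthetic data (assumption): low-amplitude noise everywhere, with one
# patch planted to align with the text query.
rng = np.random.default_rng(0)
text = rng.normal(size=16)
patches = rng.normal(scale=0.1, size=(4, 4, 4, 16))
patches[1, 2, 3] = text

heat = anomaly_heatmap(patches, text)
peak = np.unravel_index(np.argmax(heat), heat.shape)
scan_level = global_score(patches, text)
```

In this toy setup the heatmap peaks at the planted patch, while the pooled score reduces the same evidence to one number with no location attached. That is the limitation of global image-text representations described in the abstract.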
Abstract
Chest computed tomography (CT) is central to the detection and management of thoracic disease, yet the growing scale and complexity of volumetric imaging increasingly exceed what can be addressed by scan-level prediction alone. Clinically useful AI for CT must not only recognize disease across the whole volume, but also localize abnormalities and provide interpretable visual evidence. Existing vision-language foundation models typically compress scans and reports into global image-text representations, limiting their ability to preserve spatial evidence and support clinically meaningful interpretation. Here we developed EXACT, an explainable anomaly-aware foundation model for three-dimensional chest CT that learns spatially resolved representations from paired clinical scans and radiology reports. EXACT was pre-trained on 25,692 CT-report pairs using anatomy-aware weak supervision,…
