TotalFM: An Organ-Separated Framework for 3D-CT Vision Foundation Models
Kohei Yamamoto, Tomohiro Kikuchi

TL;DR
TotalFM introduces an organ-separated framework for 3D-CT vision models that improves efficiency and generalization in clinical tasks by leveraging large-scale data, segmentation, and contrastive learning.
Contribution
The paper presents a novel organ-separated learning framework for 3D-CT foundation models, combining segmentation, large language models, and contrastive learning for improved clinical task performance.
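The organ-separation step can be pictured as cropping one sub-volume per organ out of the full CT using a segmentation label map. The snippet below is a minimal sketch of that idea only. The function name `crop_organ`, the `margin` parameter, and the integer label convention are assumptions made for illustration; this summary does not say which segmentation tool or cropping rule the authors use.

```python
import numpy as np

def crop_organ(volume, label_map, organ_id, margin=2):
    """Return the bounding-box crop of `volume` around one organ label.

    `label_map` is assumed to be an integer array with the same shape as
    `volume`, where each voxel holds an organ ID (hypothetical convention).
    Returns None when the organ is absent from the scan.
    """
    mask = label_map == organ_id
    if not mask.any():
        return None
    coords = np.argwhere(mask)  # (n_voxels, 3) indices of the organ
    # Expand the tight bounding box by `margin` voxels, clipped to the volume.
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, volume.shape)
    region = tuple(slice(a, b) for a, b in zip(lo, hi))
    return volume[region]

# Toy example: a 10x10x10 volume with a 3x3x3 "organ" labelled 1.
volume = np.arange(1000, dtype=np.float32).reshape(10, 10, 10)
label_map = np.zeros((10, 10, 10), dtype=np.int32)
label_map[3:6, 4:7, 5:8] = 1
organ_crop = crop_organ(volume, label_map, organ_id=1, margin=1)
```

Working on per-organ crops rather than the whole scan keeps each encoder input small, which is one plausible reading of the efficiency claim.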
Findings
Outperforms CT-CLIP and Merlin in zero-shot organ-wise lesion classification.
Achieves higher AUROC in zero-shot finding-wise lesion classification.
Performs comparably to existing Vision-Language Models in report generation.
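Zero-shot classification with a contrastively trained vision-language model is commonly done by comparing an image embedding against text-prompt embeddings. The following is a hedged sketch of that general recipe, not the paper's evaluation code. The function `zero_shot_scores`, the two-prompt setup ("normal" versus "abnormal"), and the temperature value are illustrative assumptions, and the embeddings here are toy vectors rather than encoder outputs.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def zero_shot_scores(volume_emb, prompt_embs, temperature=0.07):
    """Softmax over cosine similarities between one volume and K prompts.

    volume_emb:  shape (D,)   embedding of an organ volume
    prompt_embs: shape (K, D) embeddings of K text prompts
    Returns a length-K probability vector.
    """
    v = l2_normalize(np.asarray(volume_emb, dtype=np.float64))
    t = l2_normalize(np.asarray(prompt_embs, dtype=np.float64))
    logits = (t @ v) / temperature
    logits = logits - logits.max()  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy embeddings: the volume points mostly toward the "abnormal" prompt.
prompts = np.array([[1.0, 0.0, 0.0],   # hypothetical "normal" prompt
                    [0.0, 1.0, 0.0]])  # hypothetical "abnormal" prompt
scores = zero_shot_scores([0.2, 0.9, 0.1], prompts)
```

The probability assigned to the "abnormal" prompt can then serve as the score from which metrics such as F1 or AUROC are computed.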
Abstract
While foundation models in radiology are expected to be applied to various clinical tasks, computational cost constraints remain a major challenge when training on 3D-CT volumetric data. In this study, we propose TotalFM, a radiological foundation model that efficiently learns the correspondence between 3D-CT images and linguistic expressions based on the concept of organ separation, utilizing a large-scale dataset of 140,000 series. By automating the creation of organ volume and finding-sentence pairs through segmentation techniques and Large Language Model (LLM)-based radiology report processing, and by combining self-supervised pre-training via VideoMAE with contrastive learning using volume-text pairs, we aimed to balance computational efficiency and representation capability. In zero-shot organ-wise lesion classification tasks, the proposed model achieved higher F1 scores in 83%…
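Contrastive learning on volume-text pairs is often implemented as a symmetric, CLIP-style InfoNCE loss, in which matched pairs in a batch are pulled together and mismatched pairs are pushed apart. The sketch below assumes that formulation; the abstract does not state the exact loss, so the function `symmetric_contrastive_loss` and its `temperature` default are illustrative assumptions.

```python
import numpy as np

def log_softmax(logits, axis):
    """Numerically stable log-softmax along `axis`."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(volume_embs, text_embs, temperature=0.07):
    """CLIP-style loss for N paired embeddings, each of shape (N, D).

    Row i of `volume_embs` is assumed to be paired with row i of `text_embs`.
    """
    v = volume_embs / np.linalg.norm(volume_embs, axis=1, keepdims=True)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = (v @ t.T) / temperature  # (N, N) cosine similarities
    idx = np.arange(len(v))
    # Volume-to-text: each row should pick out its own column.
    loss_v2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-volume: each column should pick out its own row.
    loss_t2v = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_v2t + loss_t2v)

# Toy check: correctly paired embeddings give a much lower loss
# than the same embeddings with the pairing shifted by one.
embs = np.eye(4)
aligned_loss = symmetric_contrastive_loss(embs, embs)
shuffled_loss = symmetric_contrastive_loss(embs, embs[[1, 2, 3, 0]])
```

In an organ-separated setting, each row would correspond to one organ volume and the finding sentence extracted for that organ.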
Taxonomy
Topics: COVID-19 diagnosis using AI · Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging
