Universal CT Representations from Anatomy to Disease Phenotype through Agglomerative Pretraining

Yuheng Li; Yuan Gao; Haoyu Dong; Yuxiang Lai; Shansong Wang; Mojtaba Safari; James E. Baciak; and Xiaofeng Yang

arXiv:2605.21906·cs.CV·May 22, 2026



TL;DR

FlexiCT is a family of CT foundation models trained by agglomerative pretraining on 266,227 CT volumes from 56 public datasets, supporting slice-level, volume-level, and vision-language analysis and capturing disease-related features.

Contribution

This work introduces FlexiCT, a family of large-scale CT foundation models trained in three pretraining stages (2D axial, 3D anatomical, and report-guided semantic alignment) for diverse medical imaging tasks.

Findings

01

FlexiCT matches or exceeds prior methods on multiple benchmarks.

02

Embeddings organize scans along tumor stage gradients.

03

Supports slice, volume, and vision-language analysis.

Abstract

Computed tomography (CT) is central to three-dimensional medical imaging, yet CT-based artificial intelligence remains fragmented across task-specific models for segmentation, classification, registration, and report analysis. Here we present FlexiCT, a family of CT foundation models trained by agglomerative continual pretraining on 266,227 CT volumes from 56 publicly available datasets, forming a large-scale public resource for CT representation learning. FlexiCT uses agglomerative pretraining across three stages: two-dimensional axial pretraining, three-dimensional anatomical pretraining and report-guided semantic alignment. This training strategy supports slice-level, volume-level and vision-language analysis. Across five downstream task families (segmentation, classification, registration, vision-language understanding and clinical retrieval), FlexiCT matches or exceeds prior…
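The three-stage schedule described in the abstract can be pictured as a chain in which each stage starts from the checkpoint left by the previous one. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Stage` class, the `make_trainer` helper, the dictionary standing in for model weights, and the description of each stage's input data are all hypothetical and chosen only for illustration. Only the stage order and names come from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Stand-in for real model parameters (hypothetical; a real model holds tensors).
Weights = Dict[str, float]


@dataclass(frozen=True)
class Stage:
    name: str                              # stage name, following the abstract
    data: str                              # assumed kind of input for the stage
    train: Callable[[Weights], Weights]    # hypothetical training routine


def make_trainer(tag: str) -> Callable[[Weights], Weights]:
    """Return a toy trainer that marks the weights as touched by `tag`."""
    def train(weights: Weights) -> Weights:
        updated = dict(weights)   # continue from the previous checkpoint
        updated[tag] = 1.0        # placeholder for what the stage would learn
        return updated
    return train


# Stage order follows the abstract; the data descriptions are assumptions.
STAGES: List[Stage] = [
    Stage("2d_axial", "axial slices", make_trainer("2d_axial")),
    Stage("3d_anatomical", "CT volumes", make_trainer("3d_anatomical")),
    Stage("report_alignment", "volume-report pairs", make_trainer("report_alignment")),
]


def run_pretraining(stages: List[Stage]) -> List[Weights]:
    """Run stages in order, handing each checkpoint to the next stage."""
    checkpoints: List[Weights] = []
    weights: Weights = {}
    for stage in stages:
        weights = stage.train(weights)
        checkpoints.append(weights)
    return checkpoints


checkpoints = run_pretraining(STAGES)
for stage, ckpt in zip(STAGES, checkpoints):
    # Each later checkpoint still carries the marks of the earlier stages.
    print(stage.name, sorted(ckpt))
```

Keeping one checkpoint per stage mirrors the idea that intermediate models remain usable on their own, for example a slice-level model after the first stage, though how FlexiCT actually exposes its checkpoints is not stated here.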


Code & Models

Repositories

ricklisz/FlexiCT
github

Models

🤗
ricklisz123/FlexiCT
model
