Free Lunch in Medical Image Foundation Model Pre-training via Randomized Synthesis and Disentanglement
Yuhan Wei, Yuting He, Linshan Wu, Fuxiang Huang, Junlin Hou, Hao Chen

TL;DR
This paper introduces RaSD, a framework that pre-trains medical image foundation models entirely on synthetic data. The resulting models perform competitively across diverse tasks, reducing reliance on costly real datasets.
Contribution
RaSD is a scalable pre-training method driven by synthetic data that improves the robustness and transferability of medical image models without depending on real data for pre-training.
Findings
Models pre-trained with RaSD outperform models trained from scratch on all tasks.
RaSD achieves top performance on 17 out of 56 tasks.
Synthetic data alone can enable robust medical image representation learning.
Abstract
Medical image foundation models (MIFMs) have demonstrated remarkable potential for a wide range of clinical tasks, yet their development is constrained by the scarcity, heterogeneity, and high cost of large-scale annotated datasets. Here, we propose RaSD (Randomized Synthesis and Disentanglement), a scalable framework for pre-training MIFMs entirely on synthetic data. By modeling anatomical structures and appearance variations with randomized Gaussian distributions, RaSD exposes models to sufficient multi-scale structural and appearance perturbations, forcing them to rely on invariant and task-relevant anatomical cues rather than dataset-specific textures, thereby enabling robust and transferable representation learning. We pre-trained RaSD on 1.2 million 3D volumes and 9.6 million 2D images, and extensively evaluated the resulting models across 6 imaging modalities, 48 datasets, and 56…
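The abstract describes modeling anatomical structures and appearance variations with randomized Gaussian distributions. As a rough illustration of that general idea only, the sketch below builds a toy 2D label map from randomly placed Gaussian blobs and then draws each pixel's intensity from a per-label Gaussian. This is not the authors' implementation: the function name `synthesize_pair`, the blob-based layout, the 0.5 response threshold, and all parameter ranges are assumptions made for this example.

```python
import math
import random


def synthesize_pair(size=32, n_structures=4, seed=None):
    """Return (label_map, image) as size x size nested lists.

    Hypothetical toy generator, not the RaSD pipeline.
    """
    rng = random.Random(seed)

    # Randomized "anatomy": each structure is an isotropic Gaussian blob
    # with a random centre and a random scale (assumed ranges).
    blobs = [
        (rng.uniform(0, size), rng.uniform(0, size), rng.uniform(size / 16, size / 4))
        for _ in range(n_structures)
    ]

    # Randomized appearance: every label (0 is background) gets its own
    # intensity mean and standard deviation (assumed ranges).
    appearance = [
        (rng.uniform(0.0, 1.0), rng.uniform(0.01, 0.1))
        for _ in range(n_structures + 1)
    ]

    labels, image = [], []
    for y in range(size):
        label_row, image_row = [], []
        for x in range(size):
            # Assign the pixel to the blob with the strongest response,
            # or to background if no response exceeds the threshold.
            best_label, best_resp = 0, 0.5
            for k, (cx, cy, s) in enumerate(blobs, start=1):
                resp = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * s * s))
                if resp > best_resp:
                    best_label, best_resp = k, resp

            # Sample the intensity from the label's Gaussian, clipped to [0, 1].
            mean, std = appearance[best_label]
            value = min(1.0, max(0.0, rng.gauss(mean, std)))

            label_row.append(best_label)
            image_row.append(value)
        labels.append(label_row)
        image.append(image_row)
    return labels, image


if __name__ == "__main__":
    labels, image = synthesize_pair(size=16, n_structures=3, seed=0)
    print(len(labels), len(labels[0]))
```

Because both the layout and the per-label intensities are redrawn on every call, the same label can look bright in one sample and dark in the next. In a toy setting, that is what discourages a model from relying on dataset-specific textures instead of structure.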
Taxonomy
Topics: Medical Imaging and Analysis · Artificial Intelligence in Healthcare and Education · AI in cancer detection
