Knowledge Transfer Scaling Laws for 3D Medical Imaging
Ho Hin Lee, Dongna Du, Chu Wang, Yuankai Huo, Shi Gu, James C. Gee, Yifan Wu

TL;DR
This paper investigates how different 3D medical imaging domains transfer knowledge during pretraining, revealing variable scaling behaviors and proposing an optimized data allocation strategy that improves downstream clinical tasks.
Contribution
It uncovers asymmetric transfer dynamics among medical imaging domains and formulates a scaling-law-based data allocation method for better pretraining.
Findings
Transfer-aware data allocation outperforms proportional sampling by up to 58%.
Highly transferable domains act as hubs, benefiting many others.
Transfer-aware mixtures improve downstream disease classification and segmentation.
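To make the contrast between proportional sampling and a transfer-aware mixture concrete, here is a toy sketch. It is not the paper's formulation. It assumes each domain's loss follows a power law in the "effective" amount of data it sees, and that effective data is the domain's own data plus discounted contributions from other domains. The transfer matrix, the coefficients `a` and `b`, the budget, and the proportional weights are all made-up values for illustration, and `mixture_loss` and `best_mixture` are hypothetical helper names.

```python
import itertools
import numpy as np

def mixture_loss(weights, budget, a, b, transfer):
    """Total loss over domains under an assumed additive transfer model."""
    n = budget * np.asarray(weights, dtype=float)
    # effective[j] = sum_i transfer[i, j] * n[i]: data from source i counts
    # toward target j at a discount transfer[i, j] (asymmetric by design).
    effective = transfer.T @ n
    return float(np.sum(a * effective ** (-b)))

def best_mixture(budget, a, b, transfer, step=0.05):
    """Brute-force search over mixture weights on a simplex grid."""
    k = len(a)
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    best_w, best_loss = None, np.inf
    for head in itertools.product(grid, repeat=k - 1):
        last = 1.0 - sum(head)
        if last < -1e-9:
            continue  # weights would exceed the budget
        w = np.array(head + (max(last, 0.0),))
        loss = mixture_loss(w, budget, a, b, transfer)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss

# Hypothetical three-domain setup (labels such as CT, MRI, PET are illustrative).
a = np.array([2.0, 3.0, 2.5])        # assumed power-law coefficients
b = np.array([0.30, 0.25, 0.35])     # assumed power-law exponents
transfer = np.array([                # assumed asymmetric transfer discounts
    [1.0, 0.8, 0.6],                 # domain 0 helps the others a lot (a "hub")
    [0.1, 1.0, 0.2],
    [0.1, 0.1, 1.0],
])
budget = 10_000.0                    # assumed total number of training volumes
proportional = np.array([0.5, 0.3, 0.2])  # assumed dataset-size proportions

prop_loss = mixture_loss(proportional, budget, a, b, transfer)
best_w, best_loss = best_mixture(budget, a, b, transfer)
print("proportional loss:", prop_loss)
print("searched weights:", best_w, "loss:", best_loss)
```

The grid search is only practical for a handful of domains. With more domains, a constrained continuous optimizer would be the natural replacement.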
Abstract
Vision foundation models are increasingly moving beyond 2D to volumetric domains such as 3D medical imaging, where unified pretraining across different imaging modalities (e.g., CT, MRI, and PET) could provide foundational models for diverse clinical tasks. However, training such models requires mixing heterogeneous imaging domains, and current mixture strategies remain largely heuristic. In this work, we observe that different medical imaging domains scale at variable rates during pretraining, and knowledge transfer between domains is strongly asymmetric: training on one domain can substantially improve another, but the reverse may be much weaker. Interestingly, both MAE reconstruction loss and cross-domain transfer follow predictable power-law trends with domain-specific behaviors. Motivated by these findings, we formulate data allocation as a scaling-law optimization problem. The…
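The power-law trend mentioned in the abstract can be illustrated with a minimal fitting sketch. It assumes the simplest form, loss ≈ a · n^(−b) with no irreducible-loss term, which may differ from the functional form the paper actually fits. The data points are synthetic, and `fit_power_law` is a hypothetical helper name.

```python
import numpy as np

def fit_power_law(n, loss):
    """Fit loss ≈ a * n**(-b) by ordinary least squares in log-log space."""
    n = np.asarray(n, dtype=float)
    loss = np.asarray(loss, dtype=float)
    # log(loss) = log(a) - b * log(n), so a degree-1 fit recovers both terms.
    slope, intercept = np.polyfit(np.log(n), np.log(loss), 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic measurements generated from a known power law (made-up values).
n = np.array([1e2, 1e3, 1e4, 1e5])   # pretraining set sizes
loss = 2.0 * n ** -0.5               # noiseless losses with a=2.0, b=0.5

a, b = fit_power_law(n, loss)
print(f"a = {a:.3f}, b = {b:.3f}")
```

A separate fit per domain, or per source-target pair, would then give the domain-specific exponents that a data allocation procedure could optimize over.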
