Revisiting 2D Foundation Models for Scalable 3D Medical Image Classification
Han Liu, Bogdan Georgescu, Yanbo Zhang, Youngjin Yoo, Michael Baumgartner, Riqiang Gao, Jianing Wang, Gengyan Zhao, Eli Gibson, Dorin Comaniciu, Sasa Grbic

TL;DR
This paper introduces AnyMC3D, a scalable 3D medical image classifier built on 2D foundation models. It addresses three pitfalls in current research (data-regime bias, suboptimal adaptation, and insufficient task coverage) and achieves state-of-the-art results across diverse tasks with a unified framework.
Contribution
The paper presents a lightweight, scalable 3D classifier built on a single frozen 2D foundation model, with support for multi-view inputs and auxiliary pixel-level supervision, and provides a systematic analysis of 3D classification techniques on a 12-task benchmark.
Findings
Effective adaptation unlocks foundation model potential.
General-purpose FMs can match medical-specific FMs with proper adaptation.
2D-based methods outperform 3D architectures in 3D classification.
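To make the idea of classifying a 3D volume with a 2D model concrete, here is a minimal sketch under stated assumptions. It is not the AnyMC3D method: `encode_slice` is a hypothetical stand-in for a frozen 2D encoder, mean pooling over slices is an assumed aggregation rule, and the linear head and its weights are illustrative only.

```python
from statistics import fmean


def encode_slice(slice_2d):
    """Hypothetical stand-in for a frozen 2D encoder: one slice -> 3 features."""
    pixels = [p for row in slice_2d for p in row]
    return [fmean(pixels), max(pixels), min(pixels)]


def encode_volume(volume):
    """Encode each slice independently, then mean-pool over the slice axis.

    Mean pooling is an assumption made for this sketch.
    """
    per_slice = [encode_slice(s) for s in volume]
    n_features = len(per_slice[0])
    return [fmean([f[i] for f in per_slice]) for i in range(n_features)]


def classify(volume, weights, bias):
    """Small trainable head (here a linear layer) on the pooled feature."""
    feature = encode_volume(volume)
    return sum(w * x for w, x in zip(weights, feature)) + bias


# Toy volume: 2 slices of 2x2 pixels.
volume = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[4.0, 5.0], [6.0, 7.0]],
]
pooled = encode_volume(volume)  # [3.5, 5.0, 2.0]
logit = classify(volume, weights=[1.0, 0.0, -1.0], bias=0.5)  # 2.0
```

In this sketch only the head would be trained, while the slice encoder stays fixed.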
Abstract
3D medical image classification is essential for modern clinical workflows. Medical foundation models (FMs) have emerged as a promising approach for scaling to new tasks, yet current research suffers from three critical pitfalls: data-regime bias, suboptimal adaptation, and insufficient task coverage. In this paper, we address these pitfalls and introduce AnyMC3D, a scalable 3D classifier adapted from 2D FMs. Our method scales efficiently to new tasks by adding only lightweight plugins (about 1M parameters per task) on top of a single frozen backbone. This versatile framework also supports multi-view inputs, auxiliary pixel-level supervision, and interpretable heatmap generation. We establish a comprehensive benchmark of 12 tasks covering diverse pathologies, anatomies, and modalities, and systematically analyze state-of-the-art 3D classification techniques. Our analysis reveals key…
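The scaling claim in the abstract can be illustrated with simple parameter accounting. The sketch below uses the figures the abstract states (about 1M plugin parameters per task, 12 benchmark tasks). The backbone size is a hypothetical placeholder, since the abstract does not give one, and the baseline of one fully fine-tuned copy per task is an assumed comparison rather than one taken from the paper.

```python
PLUGIN_PARAMS = 1_000_000      # "about 1M parameters per task" (from the abstract)
NUM_TASKS = 12                 # benchmark size (from the abstract)
BACKBONE_PARAMS = 86_000_000   # ASSUMPTION: hypothetical backbone size, not from the paper


def shared_backbone_total(num_tasks, backbone=BACKBONE_PARAMS, plugin=PLUGIN_PARAMS):
    """Parameters stored when one frozen backbone serves every task."""
    return backbone + num_tasks * plugin


def separate_models_total(num_tasks, backbone=BACKBONE_PARAMS):
    """Parameters stored when each task keeps its own fine-tuned copy (assumed baseline)."""
    return num_tasks * backbone


shared = shared_backbone_total(NUM_TASKS)    # 98_000_000
separate = separate_models_total(NUM_TASKS)  # 1_032_000_000
added_per_new_task = shared_backbone_total(NUM_TASKS + 1) - shared  # 1_000_000
```

Under these assumptions, adding a task costs only the plugin parameters, whereas the per-task baseline grows by a full backbone each time.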
