Benchmarking Foundation Models for Renal Lesion Stratification in CT
Hartmut Häntze, Sarah de Boer, Myrthe Buser, Alessa Hering, Bram van Ginneken, Mathias Prokop, Jawed Nawabi, Sebastian Ziegelmayer, Lisa Adams, Keno Bressem

TL;DR
This study benchmarks three medical foundation models (FMs) for six-class renal lesion classification in CT. Frozen FM embeddings match a deep learning model trained from scratch but are outperformed by handcrafted radiomics, which points to current limitations in how well FMs capture fine-grained features.
Contribution
The paper provides a comparative analysis of foundation models versus a radiomics classifier and a from-scratch 3D ResNet-50 for renal lesion classification, emphasizing the current performance gap.
Findings
FM embeddings match deep learning models in AUC but require less computation.
Radiomics significantly outperforms all deep learning approaches in AUC.
Current FMs do not yet capture the texture and shape details critical for histological differentiation.
Abstract
The rapid proliferation of open-source medical foundation models (FMs) raises a practical question: how well do their pre-trained representations transfer to clinically relevant but data-scarce classification tasks? Particularly in CT-based renal lesion classification, a push toward greater generalizability would be meaningful, as the field is constrained by inherently limited training data. We addressed this through a benchmark of three medical FMs on this specific task. This six-class problem spans common entities like cysts and clear cell renal cell carcinoma, alongside rare subtypes. Using a frozen feature-probing protocol, we compared FM embeddings against a handcrafted radiomics classifier and a 3D ResNet-50 trained from scratch. Models were trained on a composite dataset of 2,854 lesions and evaluated on an external test set of 234 lesions from The Cancer Imaging Archive. Our…
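The frozen feature-probing protocol mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' implementation: the embeddings are synthetic stand-ins for precomputed FM features, the probe is a plain softmax-regression head trained with gradient descent, and the metric is a macro-averaged one-vs-rest AUC. The function names (`train_linear_probe`, `predict_proba`, `binary_auc`, `macro_ovr_auc`) and all hyperparameters are hypothetical.

```python
import numpy as np


def train_linear_probe(features, labels, num_classes, lr=0.1, epochs=300, l2=1e-3):
    """Fit a softmax-regression head on frozen embeddings (the encoder is never updated)."""
    n, d = features.shape
    # Standardize features using training statistics only.
    mean = features.mean(axis=0)
    std = features.std(axis=0) + 1e-8
    x = (features - mean) / std
    w = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = x @ w + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of mean cross-entropy w.r.t. logits
        w -= lr * (x.T @ grad + l2 * w)
        b -= lr * grad.sum(axis=0)
    return {"w": w, "b": b, "mean": mean, "std": std}


def predict_proba(probe, features):
    """Class probabilities for new embeddings, reusing the training normalization."""
    x = (features - probe["mean"]) / probe["std"]
    logits = x @ probe["w"] + probe["b"]
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def binary_auc(scores, positives):
    """AUC as the probability that a positive outranks a negative (ties count one half)."""
    pos = scores[positives]
    neg = scores[~positives]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


def macro_ovr_auc(probs, labels, num_classes):
    """Unweighted mean of one-vs-rest AUCs over all classes."""
    aucs = [binary_auc(probs[:, c], labels == c) for c in range(num_classes)]
    return float(np.mean(aucs))


# Synthetic stand-in for precomputed FM embeddings: six well-separated classes.
# These numbers are made up and say nothing about the paper's data or results.
rng = np.random.default_rng(0)
num_classes, dim = 6, 32
centers = rng.normal(size=(num_classes, dim)) * 3.0
labels = rng.integers(0, num_classes, size=600)
features = centers[labels] + rng.normal(size=(600, dim))

# Simple split into a "training" part and a held-out part.
probe = train_linear_probe(features[:400], labels[:400], num_classes)
probs = predict_proba(probe, features[400:])
auc = macro_ovr_auc(probs, labels[400:], num_classes)
print(f"macro one-vs-rest AUC on held-out synthetic data: {auc:.3f}")
```

In a real setting, `features` would be replaced by embeddings extracted once from a frozen FM for each lesion, so only the small probe head is trained. This is one reason such a protocol needs less computation than training a 3D network from scratch.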
