Overcoming Data Scarcity in Biomedical Imaging with a Foundational Multi-Task Model
Raphael Schäfer, Till Nicke, Henning Höfener, Annkristin Lange, Dorit Merhof, Friedrich Feuerhake, Volkmar Schulz, Johannes Lotz, Fabian Kiessling

TL;DR
This paper introduces UMedPT, a multi-task foundational model trained on diverse biomedical imaging datasets, which achieves high performance with minimal data and enhances cross-center transferability.
Contribution
The authors develop UMedPT, a multi-task foundational model for biomedical imaging that reduces data requirements and improves transferability across different imaging domains.
Findings
UMedPT outperforms ImageNet pretraining and the previous state-of-the-art models.
On tasks related to the pretraining database, it maintains performance with only 1% of the original training data and without fine-tuning.
It requires no more than 50% of training data for out-of-domain tasks.
Abstract
Foundational models, pretrained on a large scale, have demonstrated substantial success across non-medical domains. However, training these models typically requires large, comprehensive datasets, which contrasts with the smaller and more heterogeneous datasets common in biomedical imaging. Here, we propose a multi-task learning strategy that decouples the number of training tasks from memory requirements. We trained a Universal bioMedical PreTrained model (UMedPT) on a multi-task database including tomographic, microscopic, and X-ray images, with various labelling strategies such as classification, segmentation, and object detection. The UMedPT foundational model outperformed ImageNet pretraining and the previous state-of-the-art models. For tasks related to the pretraining database, it maintained its performance with only 1% of the original training data and without fine-tuning. For…
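The abstract states that the training strategy decouples the number of training tasks from memory requirements, but this excerpt does not say how. One common way to get that property is to visit tasks one at a time and accumulate gradients for the shared parameters before a single update. The sketch below illustrates that idea on a toy model, and it is an assumption rather than the authors' implementation. All names (`task_gradients`, `multitask_step`, the task labels, the data) are hypothetical.

```python
# Toy sketch (not the authors' code): one shared weight `w` stands in for a
# shared encoder, and one scalar head per task stands in for task-specific layers.
# Tasks are processed one at a time, so only one task's intermediate values are
# alive at once; gradients for the shared weight are accumulated across tasks.

def task_gradients(w, head, batch):
    """Gradients of the mean squared error of pred = head * w * x for one task."""
    grad_w = 0.0
    grad_head = 0.0
    for x, y in batch:
        err = head * w * x - y
        grad_w += 2.0 * err * head * x / len(batch)
        grad_head += 2.0 * err * w * x / len(batch)
    return grad_w, grad_head


def total_loss(w, heads, batches):
    """Sum over tasks of the per-task mean squared error."""
    return sum(
        sum((heads[name] * w * x - y) ** 2 for x, y in batch) / len(batch)
        for name, batch in batches.items()
    )


def multitask_step(w, heads, batches, lr=0.01):
    """One gradient step on the summed loss, computed task by task."""
    accumulated_grad_w = 0.0
    new_heads = dict(heads)
    for name, batch in batches.items():
        grad_w, grad_head = task_gradients(w, heads[name], batch)
        accumulated_grad_w += grad_w                     # shared weight: accumulate
        new_heads[name] = heads[name] - lr * grad_head   # head: task-local update
    return w - lr * accumulated_grad_w, new_heads


# Hypothetical toy tasks; labels and data are illustrative only.
batches = {
    "classification": [(1.0, 2.0), (2.0, 4.0)],
    "segmentation": [(1.0, -1.0), (3.0, -3.0)],
}
heads = {"classification": 1.0, "segmentation": 1.0}
w = 0.5
initial_loss = total_loss(w, heads, batches)
for _ in range(200):
    w, heads = multitask_step(w, heads, batches)
final_loss = total_loss(w, heads, batches)
```

Because each task contributes its gradient before the next task is touched, adding more tasks lengthens a training step but does not raise the peak memory needed for any single task in this sketch.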
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · COVID-19 diagnosis using AI · AI in cancer detection
