General Methods Make Great Domain-specific Foundation Models: A Case-study on Fetal Ultrasound

Jakob Ambsdorf; Asbjørn Munk; Sebastian Llambias; Anders Nymark Christensen; Kamil Mikolaj; Randall Balestriero; Martin Tolsgaard; Aasa Feragen; Mads Nielsen

arXiv:2506.19552·cs.CV·June 25, 2025

Open Access

TL;DR

Pretraining a domain-specific foundation model on a large fetal ultrasound dataset (2M images) with the well-established DINOv2 method achieves state-of-the-art results on three fetal ultrasound datasets, without extensive methodological changes.

Contribution

The paper shows that standard computer vision pretraining methods are effective for medical domain models, reducing the need for novel techniques and emphasizing the value of domain-specific pretraining.

Findings

01

Pretraining on custom medical data improves performance over natural image pretraining.

02

Scaling natural image models does not necessarily enhance ultrasound performance.

03

Minimal methodological adaptation suffices for effective medical domain model training.

Abstract

With access to large-scale, unlabeled medical datasets, researchers are confronted with two questions: Should they attempt to pretrain a custom foundation model on this medical data, or use transfer-learning from an existing generalist model? And, if a custom model is pretrained, are novel methods required? In this paper we explore these questions by conducting a case-study, in which we train a foundation model on a large regional fetal ultrasound dataset of 2M images. By selecting the well-established DINOv2 method for pretraining, we achieve state-of-the-art results on three fetal ultrasound datasets, covering data from different countries, classification, segmentation, and few-shot tasks. We compare against a series of models pretrained on natural images, ultrasound images, and supervised baselines. Our results demonstrate two key insights: (i) Pretraining on custom data is worth it,…

Taxonomy

Topics: Speech Recognition and Synthesis · Fetal and Pediatric Neurological Disorders