USF-MAE: Ultrasound Self-Supervised Foundation Model with Masked Autoencoding

Youssef Megahed, Robin Ducharme, Aylin Erman, Mark Walker, Steven Hawken, and Adrian D. C. Chan

arXiv:2510.22990 · eess.IV · November 10, 2025

TL;DR

USF-MAE is a large-scale self-supervised ultrasound model pretrained with masked autoencoding. It improves ultrasound image classification accuracy over CNN and ViT baselines on three benchmarks without requiring labeled pretraining data.

Contribution

This work is the first to develop a large-scale self-supervised MAE framework exclusively for ultrasound data, leveraging 370,000 images from diverse sources to learn modality-specific representations.

Findings

01. Outperforms CNN and ViT baselines on three benchmarks.

02. Approaches supervised model performance without using labels during pretraining.

03. Demonstrates strong cross-anatomical generalization.

Abstract

Ultrasound imaging is one of the most widely used diagnostic modalities, offering real-time, radiation-free assessment across diverse clinical domains. However, interpretation of ultrasound images remains challenging due to high noise levels, operator dependence, and limited field of view, resulting in substantial inter-observer variability. Current deep learning approaches are hindered by the scarcity of large labeled datasets and by the domain gap between general and sonographic images, which limits the transferability of models pretrained on non-medical data. To address these challenges, we introduce the Ultrasound Self-Supervised Foundation Model with Masked Autoencoding (USF-MAE), the first large-scale self-supervised MAE framework pretrained exclusively on ultrasound data. The model was pretrained on 370,000 2D and 3D ultrasound images curated from 46 open-source datasets,…
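For readers unfamiliar with masked autoencoding, the core idea can be sketched as follows: an image is split into patches, a random subset is hidden from the encoder, and the reconstruction loss is computed only on the hidden patches. The sketch below is a generic illustration, not the authors' implementation. The 75% mask ratio, the 224×224 input size, the 16×16 patch size, and the function names `random_patch_mask` and `masked_mse` are all assumptions taken from common MAE practice; none are stated in the text above.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) sets for one image.

    mask_ratio=0.75 is an assumed default, not a value confirmed by the source.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # random permutation of patch indices
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])    # hidden from the encoder
    visible = sorted(order[num_masked:])   # the only patches the encoder sees
    return visible, masked


def masked_mse(target, prediction, masked_idx):
    """Mean squared error averaged over the values of masked patches only."""
    total, count = 0.0, 0
    for i in masked_idx:
        for t, p in zip(target[i], prediction[i]):
            total += (t - p) ** 2
            count += 1
    return total / count if count else 0.0


# Assumed geometry: a 224x224 image with 16x16 patches gives 14 * 14 = 196 patches.
visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # 49 147
```

Because no labels appear anywhere in this objective, pretraining of this kind can use unlabeled images; labels are only needed later, when the encoder is fine-tuned on a downstream classification task.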
