MedSapiens: Taking a Pose to Rethink Medical Imaging Landmark Detection
Marawan Elbatel, Anbang Wang, Keyuan Liu, Kaouther Mouheb, Enrique Almar-Munoz, Lizhuo Lin, Yanqi Yang, Karim Lekadir, and Xiaomeng Li

TL;DR
MedSapiens adapts a human pose estimation foundation model for medical imaging landmark detection, achieving state-of-the-art results and demonstrating strong performance even with limited annotated data.
Contribution
This work repurposes a human-centric foundation model for anatomical landmark detection, establishing a new baseline and demonstrating its effectiveness across multiple datasets.
Findings
MedSapiens outperforms existing models by up to 5.26% in SDR.
It achieves up to 21.81% improvement over specialist models.
It shows strong few-shot performance, with a 2.69% SDR improvement in limited-data settings.
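The findings above are reported in SDR, a detection-rate metric. In landmark detection this is commonly computed as the percentage of predicted landmarks that fall within a fixed distance of the ground-truth position. The Python sketch below illustrates that common definition only. The function name, the 2 mm default threshold, and the pixel-spacing parameter are assumptions made for illustration and are not taken from the paper, whose exact evaluation protocol may differ.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) landmark position in pixels


def detection_rate(
    predicted: Sequence[Point],
    ground_truth: Sequence[Point],
    threshold_mm: float = 2.0,       # assumed threshold, not from the paper
    pixel_spacing_mm: float = 1.0,   # assumed isotropic pixel spacing
) -> float:
    """Return the percentage of landmarks predicted within threshold_mm."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have equal length")
    if not predicted:
        raise ValueError("at least one landmark is required")

    hits = 0
    for (px, py), (tx, ty) in zip(predicted, ground_truth):
        # Euclidean error in pixels, converted to millimetres.
        error_mm = math.hypot(px - tx, py - ty) * pixel_spacing_mm
        if error_mm <= threshold_mm:
            hits += 1
    return 100.0 * hits / len(predicted)


# Hypothetical landmarks: errors are 1, 3, 0, and 5 pixels respectively.
pred = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
truth = [(10.0, 11.0), (20.0, 23.0), (30.0, 30.0), (45.0, 40.0)]

print(detection_rate(pred, truth))                    # 2 of 4 within 2 mm
print(detection_rate(pred, truth, threshold_mm=4.0))  # 3 of 4 within 4 mm
```

Rates like this are usually reported at several thresholds, so a stated improvement depends on which threshold, or which average over thresholds, is being compared.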
Abstract
This paper does not introduce a novel architecture; instead, it revisits a fundamental yet overlooked baseline: adapting human-centric foundation models for anatomical landmark detection in medical imaging. While landmark detection has traditionally relied on domain-specific models, the emergence of large-scale pre-trained vision models presents new opportunities. In this study, we investigate the adaptation of Sapiens, a human-centric foundation model designed for pose estimation, to medical imaging through multi-dataset pretraining, establishing a new state of the art across multiple datasets. Our proposed model, MedSapiens, demonstrates that human-centric foundation models, inherently optimized for spatial pose localization, provide strong priors for anatomical landmark detection, yet this potential has remained largely untapped. We benchmark MedSapiens against existing…
Taxonomy
Topics: Face recognition and analysis · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
