Are General-Purpose Vision Models All We Need for 2D Medical Image Segmentation? A Cross-Dataset Empirical Study

Vanessa Borst; Samuel Kounev

arXiv:2603.13044·cs.CV·March 16, 2026

Open Access

TL;DR

This study empirically compares general-purpose vision models (GP-VMs) with specialized medical segmentation architectures (SMAs) for 2D medical image segmentation. It finds that GP-VMs often outperform the specialized models across multiple datasets and produce clinically relevant explanations.

Contribution

The paper provides a comprehensive empirical evaluation of GP-VMs against specialized models for 2D medical image segmentation (MIS), demonstrates the effectiveness of GP-VMs, and analyzes their explainability in clinical contexts.

Findings

01

GP-VMs outperform most specialized models on multiple datasets.

02

GP-VMs capture clinically relevant structures without domain-specific design.

03

Explainability analysis shows that GP-VM explanations align with clinical features.

Abstract

Medical image segmentation (MIS) is a fundamental component of computer-assisted diagnosis and clinical decision support systems. Over the past decade, numerous architectures specifically tailored to medical imaging have emerged to address domain-specific challenges such as low contrast, small anatomical structures, and limited annotated data. In parallel, rapid progress in computer vision has produced highly capable general-purpose vision models (GP-VMs) originally designed for natural images. Despite their strong performance on standard vision benchmarks, their effectiveness for MIS remains insufficiently understood. In this work, we conduct a controlled empirical study to examine whether specialized medical segmentation architectures (SMAs) provide systematic advantages over modern GP-VMs for 2D MIS. We compare eleven SMAs and GP-VMs using a unified training and evaluation protocol.…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications · Multimodal Machine Learning Applications