Comparing SAM 2 and SAM 3 for Zero-Shot Segmentation of 3D Medical Data
Satrajit Chakrabarty, Ravi Soni

TL;DR
This study compares SAM 2 and SAM 3 for zero-shot segmentation of 3D medical data and finds that SAM 3 generally outperforms SAM 2 across modalities and prompting strategies.
Contribution
First controlled comparison of SAM 2 and SAM 3 for 3D medical segmentation, evaluating performance and failure modes across multiple datasets and modalities.
Findings
SAM 3 performs better with click prompts across modalities.
SAM 3 has fewer over-segmentation failures and slower decay of predictions.
Performance differences narrow with bounding-box and mask prompts for some structures.
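The findings above distinguish click, bounding-box, and mask prompts. As a rough illustration of how such prompts are often simulated from a ground-truth annotation in zero-shot benchmarks, the sketch below derives all three from a single 2D binary mask. This summary does not describe the authors' actual prompt-generation procedure, so the function name `simulate_prompts`, the choice of the foreground pixel nearest the centroid as the click, and the `(x_min, y_min, x_max, y_max)` box convention are all assumptions made for illustration. No SAM 2 or SAM 3 API is called.

```python
import numpy as np


def simulate_prompts(gt_mask: np.ndarray) -> dict:
    """Derive click, box, and mask prompts from a 2D binary mask.

    Hypothetical helper for illustration only; not the paper's procedure.
    """
    ys, xs = np.nonzero(gt_mask)
    if ys.size == 0:
        raise ValueError("ground-truth mask is empty; no prompt can be derived")

    # Click prompt: the foreground pixel closest to the centroid.
    # Snapping to a foreground pixel keeps the click inside the object
    # even when the shape is non-convex and the centroid falls outside it.
    cy, cx = ys.mean(), xs.mean()
    nearest = np.argmin((ys - cy) ** 2 + (xs - cx) ** 2)
    click_xy = (int(xs[nearest]), int(ys[nearest]))

    # Box prompt: tight bounding box as (x_min, y_min, x_max, y_max).
    box_xyxy = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))

    # Mask prompt: the annotation itself as a boolean array.
    return {"click_xy": click_xy, "box_xyxy": box_xyxy, "mask": gt_mask.astype(bool)}


# Toy example: a 3x4 rectangle inside an 8x8 slice.
toy_mask = np.zeros((8, 8), dtype=np.uint8)
toy_mask[2:5, 3:7] = 1
toy_prompts = simulate_prompts(toy_mask)
```

In a 3D setting, prompts of this kind would typically be placed on one slice or frame and the model left to propagate the prediction through the rest of the volume; which slice is chosen and whether boxes are tight or jittered are design choices this sketch does not settle.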
Abstract
Foundation models, such as the Segment Anything Model (SAM), have heightened interest in promptable zero-shot segmentation. Although these models perform strongly on natural images, their behavior on medical data remains insufficiently characterized. While SAM 2 has been widely adopted for annotation in 3D medical workflows, the recently released SAM 3 introduces a new architecture that may change how visual prompts are interpreted and propagated. Therefore, to assess whether SAM 3 can serve as an out-of-the-box replacement for SAM 2 for zero-shot segmentation of 3D medical data, we present the first controlled comparison of both models by evaluating SAM 3 in its Promptable Visual Segmentation (PVS) mode using a variety of prompting strategies. We benchmark on 16 public datasets (CT, MRI, ultrasound, endoscopy) covering 54 anatomical structures, pathologies, and surgical instruments. We…
