TAGS: 3D Tumor-Adaptive Guidance for SAM

Sirui Li; Linkai Peng; Zheyuan Zhang; Gorkem Durak; Ulas Bagci

arXiv:2505.17096 · eess.IV · August 28, 2025

TL;DR

TAGS is a framework that adapts 2D foundation models such as SAM and CLIP to 3D medical tumor segmentation, improving accuracy through multi-prompt fusion while preserving most of the pre-trained weights.

Contribution

The paper introduces TAGS, a framework that adapts 2D foundation models to 3D medical imaging, addressing the domain gap between natural images and medical volumes through multi-prompt fusion with minimal weight modification.

Findings

01

Outperforms state-of-the-art models on tumor segmentation, with a reported gain of +46.88%.

02

Surpasses existing medical FMs and interactive segmentation methods.

03

Demonstrates robustness across diverse datasets and tasks.

Abstract

Foundation models (FMs) such as CLIP and SAM have recently shown great promise in image segmentation tasks, yet their adaptation to 3D medical imaging, particularly for pathology detection and segmentation, remains underexplored. A critical challenge arises from the domain gap between natural images and medical volumes: existing FMs, pre-trained on 2D data, struggle to capture 3D anatomical context, limiting their utility in clinical applications like tumor segmentation. To address this, we propose an adaptation framework called TAGS: Tumor Adaptive Guidance for SAM, which unlocks 2D FMs for 3D medical tasks through multi-prompt fusion. By preserving most of the pre-trained weights, our approach enhances SAM's spatial feature extraction using CLIP's semantic insights and anatomy-specific prompts. Extensive experiments on three open-source tumor segmentation datasets prove that our model…

Taxonomy

Methods: Contrastive Language-Image Pre-training · Segment Anything Model