Proxy Prompt: Endowing SAM and SAM 2 with Auto-Interactive-Prompt for Medical Segmentation
Wang Xinyi, Kang Hongyu, Wei Peishan, Shuai Li, Yu Sun, Sai Kit Lam, Yongping Zheng

TL;DR
This paper introduces Proxy Prompt, an automated prompting method for SAM and SAM2 that uses non-target data and a novel context-selection strategy to improve medical image segmentation, achieving state-of-the-art results with minimal training data.
Contribution
The paper proposes Proxy Prompt, a new auto-generated prompting technique that enhances SAM models with adaptive context selection and interactive colorization for medical segmentation.
Findings
Achieves state-of-the-art performance on four datasets.
Performs comparably to fully trained models while using only 16 masks as training data.
Enhances human-model interaction through contextual colorization.
Abstract
In this paper, we aim to address the unmet demand for automated prompting and enhanced human-model interactions of SAM and SAM2 for the sake of promoting their widespread clinical adoption. Specifically, we propose Proxy Prompt (PP), auto-generated by leveraging non-target data with a pre-annotated mask. We devise a novel 3-step context-selection strategy for adaptively selecting the most representative contextual information from non-target data via vision mamba and selective maps, empowering the guiding capability of non-target image-mask pairs for segmentation on target image/video data. To reinforce human-model interactions in PP, we further propose a contextual colorization module via a dual-reverse cross-attention to enhance interactions between target features and contextual-embedding with amplifying distinctive features of user-defined object(s). Via extensive evaluations, our…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Image Retrieval and Classification Techniques
Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces · Colorization · Segment Anything Model
