PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation
Minjae Lee, Sungwoo Hur, Soojin Hwang, Won Hwa Kim

TL;DR
PR-MaGIC is a training-free, test-time framework that refines the automatically generated prompts of SAM-based in-context segmentation models, improving the quality of the resulting masks.
Contribution
It introduces a training-free prompt refinement method that uses gradient flow derived from SAM's mask decoder to improve in-context (one/few-shot) segmentation.
Findings
Consistently improves segmentation quality across benchmarks.
Mitigates sub-optimal prompts caused by visual inconsistencies between support and query images, without additional training.
Seamlessly integrates into existing in-context segmentation frameworks.
Abstract
Visual Foundation Models (VFMs) such as the Segment Anything Model (SAM) have significantly advanced broad use of image segmentation. However, SAM and its variants necessitate substantial manual effort for prompt generation and additional training for specific applications. Recent approaches address these limitations by integrating SAM into in-context (one/few shot) segmentation, enabling auto-prompting through semantic alignment between query and support images. Despite these efforts, they still generate sub-optimal prompts that degrade segmentation quality due to visual inconsistencies between support and query images. To tackle this limitation, we introduce PR-MaGIC (Prompt Refinement via Mask Decoder Gradient Flow for In-Context Segmentation), a training-free test-time framework that refines prompts via gradient flow derived from SAM's mask decoder. PR-MaGIC seamlessly integrates…
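The abstract does not give the objective or the update rule, so the following is only a minimal sketch of the general idea of test-time prompt refinement: keep the model frozen and move the prompt along the gradient of a decoder-derived signal. Everything here is an assumption made for illustration: `toy_decoder_score` is a hypothetical stand-in for a frozen mask decoder (a smooth bump that peaks where a point prompt is best placed), the prompt is a single 2D point, and the gradient is estimated by finite differences rather than by backpropagating through SAM.

```python
import math


def toy_decoder_score(prompt, center=(0.6, 0.4), sigma=0.25):
    """Hypothetical stand-in for a frozen decoder's quality signal.

    Returns a value in (0, 1] that is highest when the 2D point prompt
    sits at `center`. This is NOT SAM's mask decoder.
    """
    dx = prompt[0] - center[0]
    dy = prompt[1] - center[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def numerical_grad(f, prompt, eps=1e-5):
    """Central finite-difference gradient of f with respect to the prompt."""
    grad = []
    for i in range(len(prompt)):
        plus = list(prompt)
        minus = list(prompt)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2.0 * eps))
    return grad


def refine_prompt(prompt, score_fn=toy_decoder_score, step=0.05, n_steps=200):
    """Euler discretization of the flow dp/dt = grad score(p).

    No model weights are updated; only the prompt moves.
    """
    p = list(prompt)
    for _ in range(n_steps):
        g = numerical_grad(score_fn, p)
        p = [pi + step * gi for pi, gi in zip(p, g)]
    return p


initial = [0.3, 0.7]              # a poorly placed starting prompt
refined = refine_prompt(initial)  # prompt after following the gradient
```

In this toy setting the refined point drifts toward the peak of the score, which mirrors the abstract's claim that prompts can be improved at inference time without any training. A real implementation would differentiate through the actual mask decoder and use whatever objective the paper defines.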
