Visual Prompt Selection for In-Context Learning Segmentation

Wei Suo; Lanqing Lai; Mengyang Sun; Hanwang Zhang; Peng Wang; Yanning Zhang

arXiv:2407.10233 · cs.CV · July 16, 2024

TL;DR

This paper proposes a new context search method for selecting visual prompts in in-context learning segmentation, improving segmentation performance while reducing annotation costs.

Contribution

It introduces a stepwise context search strategy that adaptively selects diverse and well-matched visual prompts, enhancing segmentation accuracy.

Findings

01. The proposed method outperforms existing prompt selection strategies.

02. Diversity of prompts significantly impacts segmentation quality.

03. The approach reduces annotation costs by narrowing the search space.

Abstract

As a fundamental and extensively studied task in computer vision, image segmentation aims to locate and identify different semantic concepts at the pixel level. Recently, inspired by In-Context Learning (ICL), several generalist segmentation frameworks have been proposed, providing a promising paradigm for segmenting specific objects. However, existing works mostly ignore the value of visual prompts or simply apply similarity sorting to select contextual examples. In this paper, we focus on rethinking and improving the example selection strategy. Through comprehensive comparisons, we first demonstrate that ICL-based segmentation models are sensitive to different contexts. Furthermore, empirical evidence indicates that the diversity of contextual prompts plays a crucial role in guiding segmentation. Based on the above insights, we propose a new stepwise context search method. Different from…
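To make the contrast in the abstract concrete, the sketch below compares plain similarity sorting with a generic diversity-aware greedy selection over prompt embeddings. It is not the paper's stepwise context search, whose details are not given in this excerpt. The function names, the `trade_off` parameter, and the use of cosine similarity over fixed embedding vectors are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_by_similarity(query, candidates, k):
    """Baseline: keep the k candidates most similar to the query."""
    order = sorted(range(len(candidates)),
                   key=lambda i: cosine(query, candidates[i]),
                   reverse=True)
    return order[:k]


def select_diverse(query, candidates, k, trade_off=0.5):
    """Hypothetical diversity-aware selection (illustration only).

    Greedily adds the candidate that balances similarity to the query
    against redundancy with the prompts already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            # Redundancy: highest similarity to any already-chosen prompt.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance - (1.0 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With two near-duplicate candidates and one distinct candidate, similarity sorting returns both near-duplicates, while the diversity-aware variant keeps one of them and adds the distinct prompt.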

Code & Models

Repositories

lanqingl/scs (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection

Methods: Focus