Towards Global Optimal Visual In-Context Learning Prompt Selection

Chengming Xu; Chen Liu; Yikai Wang; Yuan Yao; Yanwei Fu

arXiv:2405.15279·cs.CV·October 11, 2024·1 cites

TL;DR

This paper introduces Partial2Global, a transformer-based framework for selecting in-context examples in visual in-context learning (VICL). The authors report better prompt selection and downstream performance across several vision tasks.

Contribution

The paper proposes a novel list-wise ranking approach with a consistency-aware aggregator to approximate the global optimal prompt in visual in-context learning, outperforming existing methods.

Findings

01. Partial2Global outperforms previous prompt selection methods.

02. It achieves state-of-the-art results in segmentation, detection, and colorization.

03. The framework demonstrates consistent improvement across diverse visual tasks.

Abstract

Visual In-Context Learning (VICL) is a prevailing way to transfer visual foundation models to new tasks: contextual information contained in in-context examples is used to improve learning and prediction for the query sample. The fundamental problem in VICL is how to select the prompt that best activates this ability. This is equivalent to a ranking problem: test the in-context behavior of each candidate in the alternative set and select the best one. To use a more appropriate ranking metric and to leverage more comprehensive information from the alternative set, we propose a novel in-context example selection framework that approximately identifies the global optimal prompt, i.e., it chooses the best-performing in-context examples from all alternatives for each query sample. Our method, dubbed Partial2Global, adopts a transformer-based list-wise ranker to provide a more…
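The ranking view described in the abstract can be illustrated with a small sketch. It assumes a ranker has already produced several best-first rankings, each covering only a subset of the candidate prompts, and merges them into one global order. The merge rule is a plain Borda-style mean score, swapped in for illustration. It is not the paper's consistency-aware aggregator, and the function name and the toy candidate labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def aggregate_partial_rankings(
    partial_rankings: Sequence[Sequence[Hashable]],
) -> List[Hashable]:
    """Merge best-first partial rankings into one global ranking.

    Assumption: a simple Borda-style rule stands in for the paper's
    consistency-aware aggregator, which is not reproduced here.
    """
    total_points: Dict[Hashable, float] = defaultdict(float)
    appearances: Dict[Hashable, int] = defaultdict(int)

    for ranking in partial_rankings:
        list_length = len(ranking)
        for position, candidate in enumerate(ranking):
            # The top entry of a list of length k earns k - 1 points,
            # the last entry earns 0.
            total_points[candidate] += list_length - 1 - position
            appearances[candidate] += 1

    # Average over appearances so that candidates seen in more lists
    # are not favored merely for being sampled more often.
    mean_points = {c: total_points[c] / appearances[c] for c in total_points}

    # Highest mean score first; ties are broken by label for determinism.
    return sorted(mean_points, key=lambda c: (-mean_points[c], str(c)))


# Hypothetical toy input: three partial rankings over candidates a-d.
partial = [
    ["a", "b", "c"],
    ["b", "c", "d"],
    ["a", "c", "d"],
]
global_ranking = aggregate_partial_rankings(partial)
best_prompt = global_ranking[0]
```

In this toy input, candidate "a" tops both lists it appears in, so it ends up first in the merged order and would be the selected prompt for the query.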

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Face and Expression Recognition

Methods: Sparse Evolutionary Training