Instruct Me More! Random Prompting for Visual In-Context Learning
Jiahao Zhang, Bowen Wang, Liangzhi Li, Yuta Nakashima, Hajime Nagahara

TL;DR
This paper introduces InMeMo, a learnable prompt augmentation method for visual in-context learning, which significantly improves performance on tasks like segmentation and detection without extensive retraining.
Contribution
InMeMo is a novel, lightweight approach that enhances visual in-context learning by augmenting prompts with learnable perturbations, surpassing current state-of-the-art results.
Findings
InMeMo improves mIoU by 7.35 points for segmentation.
InMeMo improves mIoU by 15.13 points for object detection.
The method is versatile and requires minimal training.
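For readers unfamiliar with the metric, the sketch below shows one common way to compute mean intersection-over-union (mIoU) for binary masks and report it on a 0 to 100 scale, so a gain such as 7.35 is a difference between two such scores. The function names `iou` and `mean_iou` are illustrative, and the exact evaluation protocol used in the paper is not stated on this page, so treat this as an assumption rather than the authors' evaluation code.

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Convention (assumption): two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(preds, targets):
    """Average IoU over a set of samples, scaled to the 0-100 range."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return 100.0 * sum(scores) / len(scores)


# Toy example: one half-correct prediction and one perfect prediction.
preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
targets = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
print(mean_iou(preds, targets))
```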
Abstract
Large-scale models trained on extensive datasets have emerged as the preferred approach due to their high generalizability across various tasks. In-context learning (ICL), a popular strategy in natural language processing, uses such models for different tasks by providing instructive prompts but without updating model parameters. This idea is now being explored in computer vision, where an input-output image pair (called an in-context pair) is supplied to the model with a query image as a prompt to exemplify the desired output. The efficacy of visual ICL often depends on the quality of the prompts. We thus introduce a method coined Instruct Me More (InMeMo), which augments in-context pairs with a learnable perturbation (prompt), to explore its potential. Our experiments on mainstream tasks reveal that InMeMo surpasses the current state-of-the-art performance. Specifically, compared to…
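The idea of augmenting an in-context pair with a learnable perturbation can be illustrated with a small sketch. Everything below is an assumption made for illustration: the border-shaped perturbation, the single-channel images stored as nested lists with values in [0, 1], the choice to perturb both images of the pair, and the helper names `border_mask`, `add_perturbation`, and `build_prompt`. The sketch only shows how a perturbation is applied to the pair while the query is left untouched; the frozen large-scale model and the training loop that would update the perturbation are omitted.

```python
def border_mask(height, width, pad):
    """Return a mask that is 1 on a border of thickness `pad`, else 0.

    Restricting the perturbation to a border is an assumption of this sketch.
    """
    return [
        [1 if (r < pad or r >= height - pad or c < pad or c >= width - pad) else 0
         for c in range(width)]
        for r in range(height)
    ]


def add_perturbation(image, delta):
    """Add a perturbation to an image pixel-wise, clipping to [0, 1]."""
    return [
        [min(1.0, max(0.0, px + d)) for px, d in zip(row_i, row_d)]
        for row_i, row_d in zip(image, delta)
    ]


def build_prompt(pair_image, pair_label, query, delta):
    """Perturb the in-context pair only; the query image is passed through."""
    return {
        "in_context_image": add_perturbation(pair_image, delta),
        "in_context_label": add_perturbation(pair_label, delta),
        "query": query,
    }


# Toy 3x3 example. In a real setup `delta` would hold trainable parameters
# updated by gradient descent while the large model stays frozen.
mask = border_mask(3, 3, pad=1)
delta = [[0.25 * m for m in row] for row in mask]
image = [[0.5] * 3 for _ in range(3)]
label = [[0.9] * 3 for _ in range(3)]
query = [[0.1] * 3 for _ in range(3)]
prompt = build_prompt(image, label, query, delta)
```

Because only `delta` would be trained, the number of learnable values in such a setup is bounded by the image size, independent of the size of the frozen model.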
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
