Efficient and Effective In-context Demonstration Selection with Coreset
Zihua Wang, Jiarui Wang, Haiyang Xu, Ming Yan, Fei Huang, Xu Yang, Xiu-Shen Wei, Siya Mi, Yu Zhang

TL;DR
This paper introduces CoDR, a coreset-based dual retrieval framework that improves demonstration selection in in-context learning for LVLMs, balancing efficiency and effectiveness through diversity and mutual information maximization.
Contribution
The paper proposes a novel coreset construction and dual retrieval method for demonstration selection, outperforming existing strategies in efficiency and effectiveness.
Findings
CoDR improves significantly on existing demonstration selection methods.
The selected demonstrations balance diversity and relevance.
In-context learning results improve on large visual language models.
Abstract
In-context learning (ICL) has emerged as a powerful paradigm for Large Visual Language Models (LVLMs), enabling them to leverage a few examples directly from the input context. However, the effectiveness of this approach relies heavily on the selection of demonstrations, a problem that is NP-hard. Traditional strategies, including random, similarity-based, and infoscore-based sampling, often lead to inefficiencies or suboptimal performance and struggle to balance efficiency and effectiveness in demonstration selection. In this paper, we propose a novel demonstration selection framework named Coreset-based Dual Retrieval (CoDR). We show that samples within a diverse subset achieve a higher expected mutual information. To implement this, we introduce a cluster-pruning method to construct a diverse coreset that aligns more effectively with the query while maintaining…
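As a rough illustration of the general idea of building a diverse coreset by clustering and pruning, then retrieving demonstrations from that smaller pool, a minimal sketch follows. It is not the authors' CoDR implementation: the k-means clustering, the keep-nearest-to-centroid pruning rule, the cosine-similarity retrieval, and the function names `kmeans`, `build_coreset`, and `retrieve` are all assumptions made for illustration. It does not model the paper's dual retrieval step or its mutual information analysis.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means on the rows of x (illustrative, not the paper's clustering)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (fancy indexing copies).
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(0)
    # Recompute labels so they match the final centroids.
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return dists.argmin(1), centroids


def build_coreset(embeddings, n_clusters, keep_per_cluster):
    """Assumed pruning rule: keep the samples closest to each cluster centroid."""
    labels, centroids = kmeans(embeddings, n_clusters)
    kept = []
    for j in range(n_clusters):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue  # an empty cluster contributes nothing
        d = np.linalg.norm(embeddings[members] - centroids[j], axis=1)
        kept.extend(members[np.argsort(d)[:keep_per_cluster]].tolist())
    return np.array(sorted(kept))


def retrieve(query, embeddings, coreset_idx, n_shots):
    """Return indices (into embeddings) of the coreset items most similar to query."""
    cand = embeddings[coreset_idx]
    norms = np.linalg.norm(cand, axis=1) * np.linalg.norm(query) + 1e-12
    sims = cand @ query / norms  # cosine similarity against the coreset only
    return coreset_idx[np.argsort(-sims)[:n_shots]]
```

In this sketch the coreset is computed once, offline, so each query is compared against only `n_clusters * keep_per_cluster` candidates rather than the full training pool. That is one plausible way the efficiency gain described above could arise.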
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Data Visualization and Analytics
