Incomplete In-context Learning
Wenqiang Wang, Yangshijie Zhang

TL;DR
This paper introduces IJIP, a two-stage framework that improves vision-language model classification when the retrieval database lacks annotated examples for some labels, and reports that the approach remains accurate across label-completeness conditions and extends to other settings and domains.
Contribution
The paper proposes IJIP, a two-stage method that reformulates multi-class classification into binary tasks and refines predictions, addressing incomplete retrieval databases in vision language models.
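The idea of turning a multi-class decision into binary tasks and then refining the result can be illustrated with a small sketch. This is not the authors' implementation: the `judge` and `score` callables, the function names, and the fallback rule are all assumptions made for illustration, standing in for whatever yes/no and refinement queries a vision-language model would actually answer.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical types: a "sample" here is just a dict of per-label scores,
# standing in for an image that a vision-language model would inspect.
Sample = Dict[str, float]
Judge = Callable[[Sample, str], bool]    # assumed yes/no query: "is this sample class X?"
Scorer = Callable[[Sample, str], float]  # assumed refinement score for a candidate label


def binary_judgments(sample: Sample, labels: Sequence[str], judge: Judge) -> List[str]:
    """Stage 1 (illustrative): ask one binary question per candidate label."""
    return [label for label in labels if judge(sample, label)]


def integrated_prediction(
    sample: Sample, labels: Sequence[str], judge: Judge, score: Scorer
) -> str:
    """Stage 2 (illustrative): refine among the labels accepted in stage 1.

    Assumption: if no label is accepted, fall back to scoring every label.
    """
    accepted = binary_judgments(sample, labels, judge)
    candidates = accepted if accepted else list(labels)
    return max(candidates, key=lambda label: score(sample, label))


# Toy stand-ins for model calls (assumptions, not a real model API).
def toy_judge(sample: Sample, label: str) -> bool:
    return sample.get(label, 0.0) > 0.5


def toy_score(sample: Sample, label: str) -> float:
    return sample.get(label, 0.0)


labels = ["cat", "dog", "bird"]
example = {"cat": 0.7, "dog": 0.6, "bird": 0.1}
prediction = integrated_prediction(example, labels, toy_judge, toy_score)
```

In this toy run, both `cat` and `dog` pass the binary check, and the refinement step then chooses between only those two. A binary question of this kind can be asked about a label even when no demonstration for it exists in the retrieval database, which is the situation the paper targets.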
Findings
Achieves up to 93.9% accuracy on benchmark datasets.
Outperforms six baseline methods across different label completeness conditions.
Applicable to prompt learning and text domain tasks.
Abstract
Large vision language models (LVLMs) achieve remarkable performance through Vision In-context Learning (VICL), a process that depends significantly on demonstrations retrieved from an extensive collection of annotated examples (retrieval database). Existing studies often assume that the retrieval database contains annotated examples for all labels. However, in real-world scenarios, delays in database updates or incomplete data annotation may result in the retrieval database containing labeled samples for only a subset of classes. We refer to this phenomenon as an incomplete retrieval database and define the in-context learning under this condition as Incomplete In-context Learning (IICL). To address this challenge, we propose Iterative Judgments and Integrated Prediction (IJIP), a two-stage framework designed to mitigate the limitations of IICL. The Iterative…
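The incomplete retrieval database condition defined in the abstract can be made concrete with a short check. This is a minimal sketch under assumed names: the `(example_id, label)` pair layout and the `label_coverage` helper are hypothetical and do not come from the paper.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def label_coverage(
    database: Iterable[Tuple[str, str]], all_labels: Sequence[str]
) -> Dict[str, object]:
    """Report which task labels have at least one annotated example.

    `database` is assumed to be an iterable of (example_id, label) pairs.
    """
    present = {label for _, label in database}
    covered: List[str] = [label for label in all_labels if label in present]
    missing: List[str] = [label for label in all_labels if label not in present]
    ratio = len(covered) / len(all_labels) if all_labels else 0.0
    return {"covered": covered, "missing": missing, "ratio": ratio}


# Toy database: annotated examples exist for only two of four classes.
toy_database = [("img_001", "cat"), ("img_002", "dog"), ("img_003", "cat")]
task_labels = ["cat", "dog", "bird", "fish"]
report = label_coverage(toy_database, task_labels)
is_incomplete = len(report["missing"]) > 0
```

Here `bird` and `fish` have no demonstrations to retrieve, so a retrieval step could only ever return examples for `cat` or `dog`.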
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
