Manipulating the Label Space for In-Context Classification

Haokun Chen; Xu Yang; Yuhang Huang; Zihan Wu; Jing Wang; Xin Geng

arXiv:2312.00351 · cs.CV · December 8, 2023 · 1 citation

TL;DR

This paper introduces label space manipulation strategies to enhance in-context classification in vision-language models, achieving higher accuracy with fewer examples compared to existing methods.

Contribution

It proposes two novel techniques, Label Distribution Enhancement and Visual Descriptions Enhancement, to increase knowledge density in in-context examples, improving classification performance.
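
This summary does not spell out how either technique is implemented, so the following is only a minimal sketch of one way an in-context example could carry a label distribution and a visual description instead of a single label. It assumes that per-label similarity scores for the image are already available from some image-text model. The function names (`softmax`, `label_distribution_text`, `build_ice`), the `<image>` placeholder, the prompt wording, and the `top_k` and `temperature` parameters are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_distribution_text(similarities, top_k=3, temperature=1.0):
    """Render the top-k candidate labels as a textual distribution.

    `similarities` maps label -> score. The scores are assumed to come
    from an image-text similarity model; this is an assumption of the
    sketch, not a detail given in the source.
    """
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:top_k]
    probs = softmax([score for _, score in top], temperature)
    return ", ".join(f"{label}: {p:.0%}" for (label, _), p in zip(top, probs))


def build_ice(image_token, similarities, description=None, top_k=3):
    """Assemble one in-context example (hypothetical prompt format)."""
    dist = label_distribution_text(similarities, top_k)
    parts = [image_token, f"Label distribution: {dist}."]
    if description:
        # Optional free-text description of the label's visual features.
        parts.append(f"Visual description: {description}")
    return " ".join(parts)


if __name__ == "__main__":
    scores = {
        "indigo bunting": 0.9,
        "blue grosbeak": 0.8,
        "lazuli bunting": 0.7,
        "house sparrow": 0.1,
    }
    print(build_ice("<image>", scores,
                    description="small bird with deep blue plumage"))
```

The intent of such a format is that a single example tells the model about several similar classes at once, which is one plausible reading of "increasing knowledge density" per in-context example.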

Findings

1. Achieved 76.21% accuracy on ImageNet with 2 shots, surpassing CLIP.

2. Raised 1-shot accuracy on CUB-200 from 48.86% to 69.05%.

3. Demonstrated effectiveness across diverse datasets.

Abstract

After pre-training to generate the next word conditioned on the previous words, a Language Model (LM) acquires the ability of In-Context Learning (ICL): it can learn a new task conditioned on the given in-context examples (ICEs). Similarly, visually-conditioned language modelling is used to train Vision-Language Models (VLMs) with ICL ability. However, such VLMs typically classify less accurately than contrastive learning-based models like CLIP, since the language modelling objective does not directly contrast whether an object is paired with a text. A straightforward way to improve ICL for classification is to use more ICEs to provide more knowledge. However, this may substantially increase the selection time and, more importantly, including additional in-context images tends to extend the length of the in-context sequence beyond the…

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI

Methods: Sparse Evolutionary Training · Contrastive Language-Image Pre-training