In-Context Learning for Text Classification with Many Labels
Aristides Milios, Siva Reddy, Dzmitry Bahdanau

TL;DR
This paper uses a pre-trained dense retriever to select in-context examples for text classification with many labels, so the large language model sees only a partial view of the label space on each inference call. Without fine-tuning, the approach sets a new few-shot state of the art on three intent classification datasets.
Contribution
The paper proposes using a pre-trained dense retrieval model to work around the context window limit in in-context learning (ICL), improving performance on classification tasks with many labels without any fine-tuning.
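The retrieval step can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a real dense retriever, and the names `retrieve`, `build_prompt`, and `POOL`, the prompt template, and the example data are all hypothetical. It shows only the core idea that each prompt contains the k pool examples most similar to the query, and therefore only a subset of the labels.

```python
import math
from collections import Counter

# Hypothetical few-shot pool of (text, label) pairs.
POOL = [
    ("what is my account balance", "check_balance"),
    ("transfer money to savings", "transfer"),
    ("i lost my credit card", "report_lost_card"),
    ("show the balance on my account", "check_balance"),
    ("book a flight to paris", "book_flight"),
]

def embed(text):
    # Toy stand-in for a dense encoder: token counts instead of a learned vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, pool, k):
    # Return the k pool examples most similar to the query.
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)
    return ranked[:k]

def build_prompt(query, pool, k=2):
    # The prompt holds only the retrieved demonstrations, so the model
    # sees a partial view of the label space. The template is an assumption.
    demos = retrieve(query, pool, k)
    blocks = [f"Text: {text}\nLabel: {label}" for text, label in demos]
    blocks.append(f"Text: {query}\nLabel:")
    return "\n\n".join(blocks)

if __name__ == "__main__":
    print(build_prompt("what is the balance of my account", POOL, k=2))
```

The resulting string would be passed to the LLM, and the generated continuation read off as the predicted label. In practice the toy `embed` would be replaced by a pre-trained sentence encoder, and k chosen to fill the available context window.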
Findings
Larger models better utilize increased context lengths.
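Retrieval-based ICL sets a new few-shot state of the art on three common intent classification datasets, and in certain cases surpasses fine-tuned performance on fine-grained sentiment classification.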
Model performance depends on example similarity, class name semantics, and label-example correspondence.
Abstract
In-context learning (ICL) using large language models for tasks with many labels is challenging due to the limited context window, which makes it difficult to fit a sufficient number of examples in the prompt. In this paper, we use a pre-trained dense retrieval model to bypass this limitation, giving the model only a partial view of the full label space for each inference call. Testing with recent open-source LLMs (OPT, LLaMA), we set new state of the art performance in few-shot settings for three common intent classification datasets, with no finetuning. We also surpass fine-tuned performance on fine-grained sentiment classification in certain cases. We analyze the performance across number of in-context examples and different model scales, showing that larger models are necessary to effectively and consistently make use of larger context lengths for ICL. By running several ablations,…
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Text and Document Classification Technologies
