Ambiguity-Aware In-Context Learning with Large Language Models
Lingyu Gao, Aditi Chaudhary, Krishna Srinivasan, Kazuma Hashimoto, Karthik Raman, Michael Bendersky

TL;DR
This paper proposes an ambiguity-aware demonstration selection method for in-context learning with large language models, improving performance by considering the model's existing knowledge and label ambiguity.
Contribution
It introduces a novel demonstration selection strategy that accounts for the LLM's knowledge and label ambiguity, outperforming traditional semantic similarity approaches.
Findings
Selecting demonstrations that resolve label ambiguity improves accuracy.
Including demonstrations previously misclassified by the LLM enhances performance.
Considering the model's knowledge about the task leads to better demonstration choices.
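The three findings above can be read as a filtering-and-ranking recipe. The sketch below is a minimal illustration of that reading, not the authors' implementation. It makes several assumptions that do not come from the source: the "ambiguous label set" is taken to be the two labels the model scores highest for the test input, and the names `Candidate`, `model_pred`, `similarity`, `ambiguous_label_set`, and `select_demonstrations` are hypothetical. It keeps candidates whose gold label falls in the ambiguous set, ranks previously misclassified ones first, and breaks ties by retriever similarity.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Candidate:
    text: str
    gold_label: str
    model_pred: str    # assumed: the LLM's own prediction for this training example
    similarity: float  # assumed: retriever similarity to the test input


def ambiguous_label_set(label_scores: Dict[str, float], size: int = 2) -> List[str]:
    """Return the `size` labels the model scores highest for the test input."""
    ranked = sorted(label_scores, key=label_scores.get, reverse=True)
    return ranked[:size]


def select_demonstrations(
    candidates: Sequence[Candidate],
    label_scores: Dict[str, float],
    k: int = 4,
    prefer_misclassified: bool = True,
) -> List[Candidate]:
    """Pick up to k demonstrations whose gold label lies in the ambiguous set."""
    ambiguous = set(ambiguous_label_set(label_scores))
    pool = [c for c in candidates if c.gold_label in ambiguous]

    def rank_key(c: Candidate):
        # Misclassified examples sort first when requested; similarity breaks ties.
        misclassified = prefer_misclassified and c.model_pred != c.gold_label
        return (misclassified, c.similarity)

    return sorted(pool, key=rank_key, reverse=True)[:k]
```

With `prefer_misclassified=False` the function reduces to plain similarity ranking restricted to the ambiguous labels, which makes it easy to compare the two behaviours on the same candidate pool.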
Abstract
In-context learning (ICL), i.e., showing LLMs only a few task-specific demonstrations, has led to downstream gains with no task-specific fine-tuning required. However, LLMs are sensitive to the choice of prompts, so a crucial research question is how to select good demonstrations for ICL. One effective strategy is to leverage semantic similarity between the ICL demonstrations and test inputs using a text retriever; this is sub-optimal, however, because it does not consider the LLM's existing knowledge about the task. Prior work (Lyu et al., 2023) has shown that the labels paired with the demonstrations bias the model's predictions. This leads us to hypothesize that considering the LLM's existing knowledge about the task, especially with respect to the output label space, can lead to a better demonstration selection strategy. Through extensive experimentation on three text…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
