Active Learning Principles for In-Context Learning with Large Language Models

Katerina Margatina; Timo Schick; Nikolaos Aletras; Jane Dwivedi-Yu

arXiv:2305.14264 · cs.CL · November 23, 2023 · 1 citation

Open Access

TL;DR

This paper explores how active learning algorithms can improve the selection of demonstrations for in-context learning with large language models, showing that similarity-based methods outperform uncertainty-based ones across various tasks and models.

Contribution

The paper applies active learning principles to the selection of in-context demonstrations and shows that similarity-based methods outperform uncertainty sampling.

Findings

01. Similarity-based AL methods outperform other strategies.

02. Uncertainty sampling performs poorly in in-context demonstration selection.

03. Effective demonstration selection improves LLM performance across tasks.
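
For readers unfamiliar with uncertainty sampling, the sketch below shows one common form of it: rank the candidate pool by the entropy of the model's predicted label distribution and keep the highest-entropy examples. This is an illustration under assumptions, not the paper's implementation. The function names and the toy probabilities in `pool_probs` are hypothetical, and in practice the probabilities would come from the LLM's own predictions on each pool example.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k pool examples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical label distributions for four pool examples (binary task).
pool_probs = [
    [0.50, 0.50],  # maximally uncertain
    [0.90, 0.10],
    [0.60, 0.40],
    [0.99, 0.01],  # almost certain
]

selected = select_most_uncertain(pool_probs, k=2)
print(selected)
```

Note that this criterion depends only on the pool, so the same demonstrations are selected for every test input.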

Abstract

The remarkable advancements in large language models (LLMs) have significantly enhanced the performance in few-shot learning settings. By using only a small number of labeled examples, referred to as demonstrations, LLMs can effectively grasp the task at hand through in-context learning. However, the process of selecting appropriate demonstrations has received limited attention in prior work. This paper addresses the issue of identifying the most informative demonstrations for few-shot learning by approaching it as a pool-based Active Learning (AL) problem over a single iteration. Our objective is to investigate how AL algorithms can serve as effective demonstration selection methods for in-context learning. We compare various standard AL algorithms based on uncertainty, diversity, and similarity, and consistently observe that the latter outperforms all other methods, including random…
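
As a rough illustration of similarity-based demonstration selection, the sketch below picks the k pool examples whose embeddings are closest (by cosine similarity) to the test input and places them in a few-shot prompt. It is a minimal sketch under assumptions, not the authors' code: the helper names, the prompt template, and the hand-written two-dimensional embeddings are hypothetical stand-ins for the output of a real sentence encoder.

```python
import math
from typing import List, Sequence, Tuple

# A pool example: (input text, label, embedding vector).
Example = Tuple[str, str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_similar(test_emb: Sequence[float], pool: List[Example], k: int) -> List[Tuple[str, str]]:
    """Return the k labeled pool examples most similar to the test input."""
    ranked = sorted(pool, key=lambda ex: cosine_similarity(test_emb, ex[2]), reverse=True)
    return [(text, label) for text, label, _ in ranked[:k]]


def build_prompt(demos: List[Tuple[str, str]], test_text: str) -> str:
    """Format demonstrations and the test input with a hypothetical template."""
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in demos]
    blocks.append(f"Input: {test_text}\nLabel:")
    return "\n\n".join(blocks)


# Toy pool with made-up 2-D embeddings (a real setup would use an encoder).
pool: List[Example] = [
    ("great movie", "positive", [1.0, 0.1]),
    ("terrible plot", "negative", [0.1, 1.0]),
    ("loved it", "positive", [0.9, 0.2]),
    ("boring", "negative", [0.2, 0.9]),
]

demos = select_similar([1.0, 0.0], pool, k=2)
prompt = build_prompt(demos, "a wonderful film")
print(prompt)
```

Unlike a pool-only criterion, this selection is recomputed for each test input, so different inputs can receive different demonstrations.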

Taxonomy

Topics: Machine Learning and Algorithms · Topic Modeling · Machine Learning and Data Classification

Methods: Attention Is All You Need · Cosine Annealing · Weight Decay · Residual Connection · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning · Softmax · Layer Normalization