Exploring Demonstration Ensembling for In-context Learning

Muhammad Khalifa; Lajanugen Logeswaran; Moontae Lee; Honglak Lee; Lu Wang

arXiv:2308.08780·cs.CL·August 22, 2023

TL;DR

This paper introduces Demonstration Ensembling (DENSE), a method that combines outputs from subsets of demonstrations to improve in-context learning performance in language models, addressing limitations of standard concatenation approaches.

Contribution

The paper proposes DENSE, an ensembling technique for in-context learning that enhances performance by combining outputs from demonstration subsets, outperforming traditional concatenation methods.

Findings

1. Weighted max ensembling improves accuracy by up to 2.4 points.
2. DENSE effectively handles irrelevant demonstrations and input length constraints.
3. Experiments on 12 language tasks demonstrate its robustness.

Abstract

In-context learning (ICL) operates by showing language models (LMs) examples of input-output pairs for a given task, i.e., demonstrations. The standard approach for ICL is to prompt the LM with concatenated demonstrations followed by the test input. This approach suffers from some issues. First, concatenation offers almost no control over the contribution of each demo to the model prediction. This can be sub-optimal when some demonstrations are irrelevant to the test example. Second, due to the input length limit of some transformer models, it might be infeasible to fit many examples into the context, especially when dealing with long-input tasks. In this work, we explore Demonstration Ensembling (DENSE) as an alternative to simple concatenation. DENSE predicts outputs using subsets (i.e., buckets) of the demonstrations and then combines the output probabilities resulting from each…
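
The bucketing-and-combining idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `make_buckets`, `combine`, and `dense_predict` are hypothetical, the `label_probs` callable stands in for whatever LM call returns a label distribution for one bucket, and the "mean" and "max" combination rules (with optional per-bucket weights) are assumptions about what combining output probabilities might look like, since the text above is truncated before it defines them.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Demo = Tuple[str, str]      # (input text, output label); hypothetical representation
Probs = Dict[str, float]    # label -> probability


def make_buckets(demos: Sequence[Demo], bucket_size: int) -> List[List[Demo]]:
    """Split the demonstrations into consecutive buckets of at most bucket_size."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    return [list(demos[i:i + bucket_size]) for i in range(0, len(demos), bucket_size)]


def combine(per_bucket: Sequence[Probs],
            weights: Optional[Sequence[float]] = None,
            how: str = "mean") -> Probs:
    """Combine per-bucket label distributions into one normalized distribution.

    Assumed rules (not taken from the paper text above):
      "mean": weighted average of the bucket probabilities
      "max":  largest weighted bucket probability per label
    """
    if weights is None:
        weights = [1.0] * len(per_bucket)
    total = sum(weights)
    norm_w = [w / total for w in weights]
    labels = list(per_bucket[0].keys())
    if how == "mean":
        scores = {y: sum(w * p[y] for w, p in zip(norm_w, per_bucket)) for y in labels}
    elif how == "max":
        scores = {y: max(w * p[y] for w, p in zip(norm_w, per_bucket)) for y in labels}
    else:
        raise ValueError(f"unknown combination rule: {how}")
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}


def dense_predict(demos: Sequence[Demo],
                  test_input: str,
                  label_probs: Callable[[List[Demo], str], Probs],
                  bucket_size: int,
                  how: str = "mean") -> str:
    """Query the model once per bucket, combine the distributions, return the top label."""
    buckets = make_buckets(demos, bucket_size)
    per_bucket = [label_probs(bucket, test_input) for bucket in buckets]
    combined = combine(per_bucket, how=how)
    return max(combined, key=combined.get)
```

In this sketch each bucket yields a short prompt of its own, so no single prompt has to hold every demonstration, and the per-bucket weights give a place to down-weight buckets judged less relevant to the test input. How such weights would be computed is left open here.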

Code & Models

Repositories

mukhal/icl-ensembling (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification