Dynamic Demonstrations Controller for In-Context Learning

Fei Zhao; Taotian Pang; Zhen Wu; Zheng Ma; Shujian Huang; Xinyu Dai

arXiv:2310.00385 · cs.CL · December 12, 2024

TL;DR

This paper introduces D$^2$Controller, a dynamic method to optimize the number of demonstrations in in-context learning, leading to significant performance improvements across various language models and datasets.

Contribution

It challenges the assumption that more demonstrations always improve ICL performance and proposes a novel dynamic controller to adaptively select demonstrations.

Findings

01. D$^2$Controller improves ICL performance by 4.6% across multiple LLMs.

02. Increasing demonstrations does not always enhance performance, contrary to common belief.

03. The method achieves competitive results when extended to previous ICL models.

Abstract

In-context learning (ICL) is a new paradigm for natural language processing (NLP), where a large language model (LLM) observes a small number of demonstrations and a test instance as its input, and directly makes predictions without updating model parameters. Previous studies have revealed that ICL is sensitive to the selection and the ordering of demonstrations. However, there are few studies regarding the impact of the number of demonstrations on ICL performance within the limited input length of an LLM, because it is commonly believed that the number of demonstrations is positively correlated with model performance. In this paper, we find that this conclusion does not always hold true. Through pilot experiments, we discover that increasing the number of demonstrations does not necessarily lead to improved performance. Building upon this insight, we propose a Dynamic Demonstrations Controller…
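The point that more demonstrations do not always help can be illustrated with a small sketch. The code below is not the D$^2$Controller procedure, which the truncated abstract does not describe. It only shows the generic idea of treating the number of demonstrations k as a quantity to be chosen on held-out examples. The names `build_prompt`, `select_k`, and `toy_predict`, the prompt template, and the toy data are all assumptions made for illustration; `toy_predict` is a stand-in for an LLM call that simply echoes the majority label among the demonstrations.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]  # (input text, label)


def build_prompt(demos: Sequence[Example], query: str) -> str:
    """Concatenate the demonstrations, then the unlabeled query (assumed template)."""
    blocks = [f"Input: {x}\nLabel: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


def select_k(
    predict: Callable[[str], str],
    demo_pool: Sequence[Example],
    validation: Sequence[Example],
    candidate_ks: Sequence[int],
) -> int:
    """Return the candidate k with the highest held-out accuracy (ties go to the smaller k)."""
    best_k, best_acc = None, -1.0
    for k in sorted(candidate_ks):
        demos = demo_pool[:k]  # first k demonstrations from the pool
        correct = sum(predict(build_prompt(demos, x)) == y for x, y in validation)
        acc = correct / len(validation)
        if acc > best_acc:  # strict comparison keeps the smaller k on ties
            best_k, best_acc = k, acc
    return best_k


def toy_predict(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: predict the majority label seen in the demonstrations."""
    labels = [
        line.split("Label: ", 1)[1]
        for line in prompt.splitlines()
        if line.startswith("Label: ")  # the query line ends in "Label:" and is skipped
    ]
    return Counter(labels).most_common(1)[0][0] if labels else "pos"


# Toy data (made up): the pool becomes label-imbalanced as k grows.
demo_pool = [("good", "pos"), ("bad", "neg"), ("awful", "neg"), ("terrible", "neg")]
validation = [("great", "pos"), ("nice", "pos"), ("poor", "neg")]

best = select_k(toy_predict, demo_pool, validation, [1, 2, 4])
print(best)
```

In this toy setup, adding demonstrations skews the label distribution in the prompt and lowers held-out accuracy, so the smallest candidate is selected. A real experiment would replace `toy_predict` with an actual model call and would need a principled way of choosing the held-out examples.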

Peer Reviews

Decision · Submitted to ICLR 2024

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 4

Strengths

The paper is mostly clearly written and methodologically sound. The research question makes sense, and the set of baselines is large and appropriate. However, one crucial baseline may be missing (see the Weaknesses section).

Weaknesses

This is basically a hyperparameter selection paper, and as such it is missing a key baseline: what if one uses as many examples as possible for selecting k? That would correspond to the classic setting of having your dataset split into training, validation and test sets. While it would be more computationally expensive at the hyperparameter selection time, the key concern in practical applications of LLMs is the inference speed at test time, which would not be affected by using more validation e

Reviewer 02 · Rating 8 (accept, good paper) · Confidence 4

Strengths

1. Useful topic: As the authors describe, there is little work studying how the number of demonstrations impacts an LLM's performance in the ICL setting. I agree this is an important topic because empirically the study could benefit millions of LLM practitioners.
2. Neat idea: I think the method is well designed. I especially like the IICScore part, which takes both inter- and intra-class similarity into consideration.
3. Experiments and results: The authors study their methods on a wide ra

Weaknesses

Please see my questions and concerns below.

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

1. The authors have done an excellent job of motivating the problem and providing a thorough description of their research. The paper is written to a high standard and is easy to understand.
2. The authors have conducted extensive experiments to demonstrate that increasing the length of the in-context learning examples does not necessarily help. Furthermore, the experimental evaluation shows that their proposed method has promising performance.
3. Validation set selection is critical to in-context learning

Weaknesses

The novelty of this paper is my main concern. The idea of minimizing intra-class distance and maximizing inter-class distance has been widely used in previous machine learning works [1][2]. Similarly, the paradigm of using a validation set to choose in-context learning examples or tune in-context learning hyperparameters has also been well explored in previous works [3][4]. If the authors can provide more content to illustrate their unique contribution, I will consider raising my score. [1] Nadag

Code & Models

Repositories

tjtp/d2controller
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification