Understanding In-Context Learning from Repetitions

Jianhao Yan; Jin Xu; Chiyu Song; Chenming Wu; Yafu Li; Yue Zhang

arXiv:2310.00297 · cs.CL · February 22, 2024 · 1 citation

TL;DR

This paper investigates how surface repetitions influence in-context learning in Large Language Models, introducing the concept of token co-occurrence reinforcement to explain internal mechanisms and limitations.

Contribution

It offers a novel perspective by analyzing in-context learning through surface repetitions and empirically demonstrates token co-occurrence reinforcement as a key factor.

Findings

01

Token co-occurrence reinforcement exists and influences in-context learning.

02

Surface features significantly impact text generation.

03

The analysis offers insight into why in-context learning sometimes fails.

Abstract

This paper explores the elusive mechanism underpinning in-context learning in Large Language Models (LLMs). Our work provides a novel perspective by examining in-context learning via the lens of surface repetitions. We quantitatively investigate the role of surface features in text generation, and empirically establish the existence of token co-occurrence reinforcement, a principle that strengthens the relationship between two tokens based on their contextual co-occurrences. By investigating the dual impacts of these features, our research illuminates the internal workings of in-context learning and expounds on the reasons for its failures. This paper provides an essential contribution to the understanding of in-context learning and its potential limitations, providing a fresh perspective on this exciting capability.
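To make the idea of co-occurrence reinforcement concrete, the following is a minimal toy sketch, not the authors' measurement procedure and not an LLM. It assumes a hypothetical add-alpha smoothed bigram estimator computed only from the prompt, and shows that under this assumption the estimated probability of one token following another rises as the pair is repeated in context. The function names, the filler tokens, and the vocabulary size are all illustrative assumptions.

```python
def cooccurrence_prob(context, first, second, vocab_size, alpha=1.0):
    """Add-alpha smoothed estimate of P(second | first) from in-context bigram counts.

    Toy stand-in for a model's next-token probability (an assumption, not the paper's method).
    """
    pair_count = 0
    first_count = 0
    # Walk over adjacent token pairs; the final token has no successor and is not counted.
    for prev, nxt in zip(context, context[1:]):
        if prev == first:
            first_count += 1
            if nxt == second:
                pair_count += 1
    return (pair_count + alpha) / (first_count + alpha * vocab_size)


def build_context(first, second, repeats, filler=("the", "cat", "sat")):
    """Repeat a demonstration containing the (first, second) pair, then end on `first`."""
    demo = list(filler) + [first, second]
    return demo * repeats + list(filler) + [first]


def reinforcement_curve(first, second, max_repeats, vocab_size=50):
    """Estimated P(second | first) for 0..max_repeats in-context repetitions of the pair."""
    return [
        cooccurrence_prob(build_context(first, second, k), first, second, vocab_size)
        for k in range(max_repeats + 1)
    ]


if __name__ == "__main__":
    for k, p in enumerate(reinforcement_curve("A", "B", 5)):
        print(f"repeats={k}  P(B | A) ~ {p:.3f}")
```

In this toy setting the estimate grows with every repetition regardless of whether the pairing is meaningful, which loosely mirrors the abstract's point that the same surface feature can both help and mislead in-context learning.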

Peer Reviews

Decision: ICLR 2024 poster

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

This paper provides a novel framework for understanding in-context learning via the notion of token co-occurrence reinforcement. Through various experiments, the authors show how token reinforcement causes spurious correlations in in-context learning.

Weaknesses

Although there is novelty in showing experimentally that token reinforcement can cause problems in in-context learning, the paper lacks the important perspective of analyzing why token reinforcement exists and why it causes problems. For example, the following paper, which is only briefly mentioned in this paper, analyzes the impact of repetition structures in a corpus on in-context learning from an information-theoretic perspective. A Theory of Emergent In-Context Learning as Implicit Structur

Reviewer 02 · Rating 5 (marginally below the acceptance threshold) · Confidence 4

Strengths

- This work provides meaningful explanations on the possible failures of ICL.
- The experiments seem comprehensive and convincing.

Weaknesses

- This work tries to understand the inherent ICL behavior of LLMs, yet it lacks theoretical analysis. For example, how is such token co-occurrence reinforcement established? This may involve the detailed interactions between prompts and the self-attention mechanism, etc., which I would like the authors to delve into.
- As an experiment-oriented work, the authors should examine their assumptions on more LLMs; otherwise, it is hard to reach a general conclusion.
- The findings of this work are not

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 1

Strengths

The authors conducted extensive experiments to explore the nuances of token repetition and its effects on ICL.

Weaknesses

1. **Lack of Clear Motivation**: The rationale behind investigating token repetition patterns is not adequately clear, leaving the reader uncertain about the study's purpose. Additionally, the abstract does not effectively capture the paper's core idea, necessitating further clarification to provide a concise summary of the work.
2. **Need for Enhanced Clarity**: The manuscript's presentation is complex, making it challenging for readers to navigate through the content. A more detailed explanati

Code & Models

Repositories

elliottyan/understand-icl-from-repetition
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis