Technical Debt in In-Context Learning: Diminishing Efficiency in Long Context
Taejong Joo, Diego Klabjan

TL;DR
This paper investigates the limits of in-context learning (ICL) in transformers, showing that its sample efficiency diminishes as the context grows longer, and offers an information-theoretic explanation for the decline.
Contribution
The paper introduces a meta ICL framework for benchmarking the sample complexity of ICL against principled learning algorithms, and demonstrates an inherent decline in ICL efficiency as context length increases.
Findings
ICL matches Bayes optimal efficiency at short context lengths
Efficiency significantly deteriorates in long contexts
Information-theoretic analysis explains the diminishing efficiency
Abstract
Transformers have demonstrated remarkable in-context learning (ICL) capabilities, adapting to new tasks by simply conditioning on demonstrations without parameter updates. Compelling empirical and theoretical evidence suggests that ICL, as a general-purpose learner, could outperform task-specific models. However, it remains unclear to what extent the transformers optimally learn in-context compared to principled learning algorithms. To investigate this, we employ a meta ICL framework in which each prompt defines a distinctive regression task whose target function is drawn from a hierarchical distribution, requiring inference over both the latent model class and task-specific parameters. Within this setup, we benchmark sample complexity of ICL against principled learning algorithms, including the Bayes optimal estimator, under varying performance requirements. Our findings reveal a…
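The setup described in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the model classes are nested linear models with different numbers of active features (`dims`), weights have a standard Gaussian prior, the noise level is `sigma`, and the function names are hypothetical. Under those assumptions, the Bayes optimal predictor for squared error is Bayesian model averaging over the latent class, and sample complexity is read off as the smallest context length whose estimated error meets a target.

```python
import numpy as np

def sample_task(rng, dims=(2, 8), n=20, sigma=0.5):
    """Draw one prompt: latent class m, weights w, and n noisy demonstrations."""
    d_max = max(dims)
    m = rng.integers(len(dims))                      # latent model class (assumed uniform)
    w = np.zeros(d_max)
    w[:dims[m]] = rng.standard_normal(dims[m])       # task-specific parameters
    X = rng.standard_normal((n, d_max))
    y = X @ w + sigma * rng.standard_normal(n)
    return X, y, w, m

def log_evidence(X, y, sigma):
    """log p(y | X, class), where y ~ N(0, X X^T + sigma^2 I) under a N(0, I) prior."""
    n = len(y)
    K = X @ X.T + sigma**2 * np.eye(n)
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * (y @ np.linalg.solve(K, y) + logdet + n * np.log(2 * np.pi))

def bayes_predict(X, y, x_query, dims=(2, 8), sigma=0.5):
    """Posterior-mean prediction averaged over classes (uniform class prior)."""
    log_ev, preds = [], []
    for d in dims:
        Xd = X[:, :d]
        log_ev.append(log_evidence(Xd, y, sigma))
        # Posterior mean of the weights for this class.
        w_post = np.linalg.solve(Xd.T @ Xd + sigma**2 * np.eye(d), Xd.T @ y)
        preds.append(x_query[:d] @ w_post)
    log_ev = np.array(log_ev)
    post = np.exp(log_ev - log_ev.max())             # stabilised softmax over classes
    post /= post.sum()
    return float(post @ np.array(preds)), post

def mse_curve(ns, trials=200, seed=0, dims=(2, 8), sigma=0.5):
    """Monte Carlo estimate of the excess squared error at each context length."""
    rng = np.random.default_rng(seed)
    curve = []
    for n in ns:
        errs = []
        for _ in range(trials):
            X, y, w, _ = sample_task(rng, dims, n + 1, sigma)
            pred, _ = bayes_predict(X[:n], y[:n], X[n], dims, sigma)
            errs.append((pred - X[n] @ w) ** 2)      # error against the noiseless target
        curve.append(np.mean(errs))
    return np.array(curve)

def sample_complexity(ns, curve, eps):
    """Smallest context length in ns whose estimated error is at most eps, else None."""
    hits = [n for n, e in zip(ns, curve) if e <= eps]
    return hits[0] if hits else None
```

Running `mse_curve` for a trained transformer's predictions in place of `bayes_predict`, and comparing the two values of `sample_complexity` at several targets `eps`, would give the kind of efficiency comparison the abstract describes; the transformer side is not sketched here.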
Taxonomy
Topics: Energy Efficiency and Management
