Understanding Task Vectors in In-Context Learning: Emergence, Functionality, and Limitations
Yuxin Dong, Jiachen Jiang, Zhihui Zhu, Xia Ning

TL;DR
This paper investigates the emergence and functionality of task vectors in in-context learning, proposing a linear combination hypothesis supported by theoretical and empirical evidence, and exploring their limitations and potential enhancements.
Contribution
The paper introduces the Linear Combination Conjecture for task vectors and supports it with theoretical analysis and experiments, advancing the understanding of their role in in-context learning.
Findings
Task vectors naturally emerge in linear transformers trained on triplet prompts.
Task vectors fail to represent high-rank mappings, a prediction confirmed on practical LLMs.
Injecting multiple task vectors into prompts can enhance in-context learning performance.
Abstract
Task vectors offer a compelling mechanism for accelerating inference in in-context learning (ICL) by distilling task-specific information into a single, reusable representation. Despite their empirical success, the underlying principles governing their emergence and functionality remain unclear. This work proposes the Linear Combination Conjecture, positing that task vectors act as single in-context demonstrations formed through linear combinations of the original ones. We provide both theoretical and empirical support for this conjecture. First, we show through loss landscape analysis that task vectors naturally emerge in linear transformers trained on triplet-formatted prompts. Next, we predict that task vectors fail to represent high-rank mappings and confirm this on practical LLMs. Our findings are further validated through saliency analyses and parameter visualization,…
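The conjecture and its high-rank prediction can be illustrated with a toy NumPy sketch. This is not the paper's experimental setup: the simplified estimator of the form (1/n) Σ y_i x_iᵀ (used here as a stand-in for a single linear-attention layer), the helper names `icl_estimate` and `collapse_demos`, the dimensions, and the random combination weights are all assumptions chosen for illustration. The sketch shows that a single demonstration built as a linear combination of the originals can only induce a rank-one estimate of a linear map, whereas the full set of demonstrations can induce a full-rank one.

```python
import numpy as np

def icl_estimate(X, Y):
    """Assumed stand-in for one linear-attention layer:
    estimate W as (1/n) * sum_i y_i x_i^T from n demonstrations."""
    return Y.T @ X / X.shape[0]

def collapse_demos(X, Y, weights):
    """Hypothetical helper: form one demonstration (x_bar, y_bar)
    as a linear combination of the original demonstrations."""
    return weights @ X, weights @ Y

rng = np.random.default_rng(0)
d, n = 8, 64                          # illustrative sizes, not from the paper

X = rng.standard_normal((n, d))       # demonstration inputs, one per row
W_full = rng.standard_normal((d, d))  # a generic full-rank linear task
Y = X @ W_full.T                      # demonstration outputs y_i = W x_i

# Collapse all demonstrations into a single one.
weights = rng.standard_normal(n) / n
x_bar, y_bar = collapse_demos(X, Y, weights)

# Estimate induced by the single collapsed demonstration: y_bar x_bar^T.
W_single = np.outer(y_bar, x_bar)

rank_single = np.linalg.matrix_rank(W_single)
rank_multi = np.linalg.matrix_rank(icl_estimate(X, Y))

print("rank from one collapsed demonstration:", rank_single)
print("rank from all demonstrations:", rank_multi)
```

Because the task is linear, the collapsed pair is itself a valid demonstration (`y_bar` equals `W_full @ x_bar`), yet the estimate it induces is an outer product and therefore has rank one by construction. Under these assumptions, no choice of combination weights lets a single demonstration capture a map of higher rank.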
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis
