Task Vectors in In-Context Learning: Emergence, Formation, and Benefit
Liu Yang, Ziqian Lin, Kangwook Lee, Dimitris Papailiopoulos, Robert Nowak

TL;DR
This paper investigates how task vectors emerge in transformer models during in-context learning: it demonstrates their formation, identifies conditions that affect their strength, and proposes a training method to enhance their robustness and utility.
Contribution
The study reveals the conditions for task vector emergence and introduces TVP-loss, an auxiliary training method to strengthen and localize task vectors within models.
Findings
Task vectors naturally emerge under certain training conditions.
TVP-loss improves robustness and generalization of task vectors.
Without intervention, task vectors may be weakly or non-locally encoded.
Abstract
In-context learning is a remarkable capability of transformers, referring to their ability to adapt to specific tasks based on a short history or context. Previous research has found that task-specific information is locally encoded within models as task vectors, though how these vectors emerge and function remains unclear due to opaque pre-training processes. In this work, we investigate the formation of task vectors in a controlled setting, using models trained from scratch on synthetic datasets. Our findings confirm that task vectors naturally emerge under certain conditions, but the tasks may be relatively weakly and/or non-locally encoded within the model. To promote strong task vectors encoded at a prescribed location within the model, we propose an auxiliary training mechanism based on a task vector prompting loss (TVP-loss). This method eliminates the need to search for task-correlated encodings…
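The abstract describes TVP-loss only at a high level, so the following is a minimal sketch of the general idea rather than the paper's implementation. It assumes a toy linear "model" in which the in-context pass can route task information either through a prescribed hidden state (the task vector, via a hypothetical matrix `A`) or through a bypass path (hypothetical matrix `B`). The auxiliary term is assumed to be the error of a zero-shot pass that sees only the query plus the injected task vector, added to the usual in-context loss with an assumed weight `lam`. All function names, the model, and the way the two losses are combined are illustrative assumptions.

```python
import numpy as np

def icl_forward(xs, ys, x_query, A, B):
    """Toy in-context pass (hypothetical model, not the paper's transformer).

    The context summary can reach the readout through the prescribed
    location (A @ summary, the "task vector") or through a bypass (B @ summary).
    """
    summary = (xs * ys[:, None]).mean(axis=0)  # crude estimate of the task weights
    tv = A @ summary                           # hidden state at the prescribed location
    pred = (tv + B @ summary) @ x_query
    return pred, tv

def zero_shot_forward(x_query, injected_tv):
    """Zero-shot pass: no demonstrations, only the injected task vector."""
    return injected_tv @ x_query

def combined_loss(xs, ys, x_q, y_q, x_new, y_new, A, B, lam=1.0):
    """Assumed objective: in-context loss + lam * task-vector-prompting loss."""
    pred_icl, tv = icl_forward(xs, ys, x_q, A, B)
    icl_loss = (pred_icl - y_q) ** 2
    tvp_loss = (zero_shot_forward(x_new, tv) - y_new) ** 2
    return icl_loss + lam * tvp_loss, icl_loss, tvp_loss

# Synthetic linear-regression task y = w . x (all values are made up for illustration).
rng = np.random.default_rng(0)
d, n = 4, 2000
w = np.array([1.0, -2.0, 0.5, 3.0])
xs = rng.standard_normal((n, d))
ys = xs @ w
x_q = rng.standard_normal(d)
y_q = w @ x_q
x_new = np.ones(d)
y_new = w @ x_new

eye, zero = np.eye(d), np.zeros((d, d))
# Task information stored at the prescribed location.
local_total, local_icl, local_tvp = combined_loss(xs, ys, x_q, y_q, x_new, y_new, eye, zero)
# Same in-context prediction, but the information bypasses the task vector.
bypass_total, bypass_icl, bypass_tvp = combined_loss(xs, ys, x_q, y_q, x_new, y_new, zero, eye)
```

In this toy setup both configurations make the same in-context prediction, so the in-context loss alone cannot tell them apart. Only the auxiliary term penalizes the bypass configuration, which is the intuition behind using such a loss to encourage a strong task vector at a chosen location.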
Taxonomy
Topics: Reinforcement Learning in Robotics · Online Learning and Analytics
