Task Vectors in In-Context Learning: Emergence, Formation, and Benefit

Liu Yang; Ziqian Lin; Kangwook Lee; Dimitris Papailiopoulos; Robert Nowak

arXiv:2501.09240·cs.LG·January 17, 2025

TL;DR

This paper investigates how task vectors emerge in transformer models during in-context learning. It studies how they form, identifies conditions that affect their strength, and proposes an auxiliary training method that makes them more robust and useful.

Contribution

The study identifies conditions under which task vectors emerge and introduces the task vector prompting loss (TVP-loss), an auxiliary training method that strengthens task vectors and localizes them at a prescribed location within the model.

Findings

1. Task vectors naturally emerge under certain training conditions.
2. TVP-loss improves robustness and generalization of task vectors.
3. Task vectors can be weakly or non-locally encoded without intervention.

Abstract

In-context learning is a remarkable capability of transformers, referring to their ability to adapt to specific tasks based on a short history or context. Previous research has found that task-specific information is locally encoded within models as task vectors, though the emergence and functionality of these vectors remain unclear due to opaque pre-training processes. In this work, we investigate the formation of task vectors in a controlled setting, using models trained from scratch on synthetic datasets. Our findings confirm that task vectors naturally emerge under certain conditions, but the tasks may be relatively weakly and/or non-locally encoded within the model. To promote strong task vectors encoded at a prescribed location within the model, we propose an auxiliary training mechanism based on a task vector prompting loss (TVP-loss). This method eliminates the need to search for task-correlated encodings…

Taxonomy

Topics: Reinforcement Learning in Robotics · Online Learning and Analytics