Label Words as Local Task Vectors in In-Context Learning

Bowen Zheng; Ming Ma; Zhongqiao Lin; Tianming Yang

arXiv:2406.16007 · cs.CL · December 23, 2025

Open Access

TL;DR

This paper investigates how large language models perform in-context learning by analyzing local and global task vectors. It finds that local task vectors encode rule abstractions and that a global task vector does not exist in every task, especially tasks whose rule can only be inferred from multiple demonstrations.

Contribution

The study introduces the concept of local task vectors in ICL, showing their role in rule encoding and their convergence into global vectors in certain tasks, providing new insights into LLM mechanisms.

Findings

01 Local task vectors encode rule abstractions.

02 Global task vectors may not exist in all tasks.

03 ICL operates through an information aggregation mechanism.

Abstract

Large Language Models (LLMs) have demonstrated remarkable abilities, one of the most important being in-context learning (ICL). With ICL, LLMs can derive the underlying rule from a few demonstrations and provide answers that comply with the rule. Previous work hypothesized that the network creates a task vector in specific positions during ICL. The task vector can be computed by averaging across the dataset. It conveys the overall task information and can thus be considered global. Patching the global task vector allows LLMs to achieve zero-shot performance with dummy inputs comparable to few-shot learning. However, we find that such a global task vector does not exist in all tasks, especially in tasks that rely on rules that can only be inferred from multiple demonstrations, such as categorization tasks. Instead, the information provided by each demonstration is first transmitted to…
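
The averaging and patching steps described in the abstract can be sketched in plain Python on toy data. This is an illustration only, not the authors' code: the helper names `mean_vector` and `patch_hidden_state`, the three-dimensional vectors, and the patched position are all assumptions. In a real model, the hidden states would be read from a chosen layer and token position.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors (hypothetical helper)."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must share one dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def patch_hidden_state(
    hidden_states: Sequence[Vector], position: int, task_vector: Vector
) -> List[Vector]:
    """Return a copy of hidden_states with one position overwritten.

    The input is left untouched so the unpatched run stays available
    for comparison.
    """
    patched = [list(h) for h in hidden_states]
    patched[position] = list(task_vector)
    return patched


# Toy stand-ins for hidden states taken at one fixed position across
# several few-shot prompts (values are made up for illustration).
few_shot_states = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]

# "Global" task vector: the average over the dataset of few-shot prompts.
global_task_vector = mean_vector(few_shot_states)

# Toy hidden states of a zero-shot run with a dummy input; the last
# position is assumed to be where the task vector would be patched in.
zero_shot_run = [
    [0.0, 0.0, 0.0],
    [0.5, 0.5, 0.5],
    [9.0, 9.0, 9.0],
]
patched_run = patch_hidden_state(
    zero_shot_run, position=2, task_vector=global_task_vector
)
```

In this toy setup, `patched_run` differs from `zero_shot_run` only at the patched position. The abstract's claim is that such a single averaged vector is not sufficient in every task, which is what motivates looking at the information carried by each demonstration separately.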

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Activation Patching