How Few-Shot Examples Add Up: A Causal Decomposition of Function Vectors in In-Context Learning

Entang Wang; Yiwei Wang; Aleksandra Bakalova; Michael Hahn

arXiv:2605.16591·cs.LG·May 19, 2026

TL;DR

This paper provides a mechanistic, causal explanation of how few-shot prompts influence in-context learning by decomposing function vectors into additive, context-dependent components, highlighting the roles of attention and representation updates.

Contribution

The paper introduces a causal decomposition framework that explains how few-shot examples shape model behavior through additive and contextualized function vectors, unifying superposition and attention reweighting.

Findings

01. Function vectors are well-approximated by linear combinations of example sub-vectors.

02. Models reweight demonstrations based on informativeness and ambiguity, affecting the function vector.

03. Query-Key alignment primarily drives the quality of the function vector in ambiguous settings.
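The first finding can be illustrated with a small least-squares fit. This is a minimal sketch on synthetic vectors, not the authors' procedure: the names `sub_fvs`, `n_shot_fv`, and `true_weights`, the dimensions, and the noise level are all assumptions made for illustration. It shows how one might check whether a vector is well-approximated by a linear combination of per-example vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_shots = 64, 5  # hypothetical sizes, chosen only for the demo

# Synthetic stand-ins for example-level sub-FVs, one row per demonstration.
sub_fvs = rng.normal(size=(n_shots, d_model))

# Synthetic "n-shot FV": a weighted sum of the sub-FVs plus a small residual.
true_weights = np.array([0.10, 0.30, 0.20, 0.25, 0.15])
n_shot_fv = true_weights @ sub_fvs + 0.01 * rng.normal(size=d_model)

# Solve min_w || sub_fvs.T @ w - n_shot_fv ||_2 for the combination weights.
weights, *_ = np.linalg.lstsq(sub_fvs.T, n_shot_fv, rcond=None)

# Compare the fitted linear combination against the target vector.
reconstruction = weights @ sub_fvs
cosine = reconstruction @ n_shot_fv / (
    np.linalg.norm(reconstruction) * np.linalg.norm(n_shot_fv)
)
rel_error = np.linalg.norm(reconstruction - n_shot_fv) / np.linalg.norm(n_shot_fv)

print("fitted weights:", np.round(weights, 3))
print("cosine similarity:", cosine)
print("relative error:", rel_error)
```

With real activations, the rows of `sub_fvs` and the target `n_shot_fv` would have to be extracted from a model; a high cosine similarity and a low relative error would then be consistent with an additive picture, though they would not by themselves establish a causal role.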

Abstract

In-context learning (ICL) excels at new tasks from minimal examples, yet we still lack a mechanistic explanation of how few-shot prompts shape a model's function vector (FV), a causal activation direction that drives task behavior on the ICL query. Across tasks and models, an $n$-shot FV is well-approximated by a linear combination of example-level sub-FVs, suggesting additive and composable contributions from individual demonstrations. Beyond additivity, we show that models contextualize individual examples' representations based on prior examples to adaptively reweight which demonstrations dominate the FV: attention shifts toward examples that are more informative and less ambiguous under the context. Finally, a causal decomposition separates Query-Key routing from Value updates, finding that contextualization's most consistent contributions to FV quality arise from Query-Key…
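One way to picture the separation of Query-Key routing from Value updates is a toy single-head attention readout in which keys and values are swapped independently between two encodings of the demonstrations. The sketch below is an illustration under stated assumptions, not the paper's method: the "isolated" and "contextualized" keys and values are random synthetic arrays, and `head_output`, `keys_iso`, `keys_ctx`, `values_iso`, and `values_ctx` are hypothetical names.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - x.max())
    return z / z.sum()

def head_output(query, keys, values):
    """Single-head readout: attention weights softmax(K q / sqrt(d)) applied to values."""
    attn = softmax(keys @ query / np.sqrt(query.shape[0]))
    return attn @ values, attn

rng = np.random.default_rng(0)
d, n_shots = 16, 4  # hypothetical sizes
query = rng.normal(size=d)

# Assumption: "isolated" = each demonstration encoded alone,
# "contextualized" = encoded after prior demonstrations. Both are synthetic here.
keys_iso = rng.normal(size=(n_shots, d))
values_iso = rng.normal(size=(n_shots, d))
keys_ctx = keys_iso + 0.5 * rng.normal(size=(n_shots, d))
values_ctx = values_iso + 0.5 * rng.normal(size=(n_shots, d))

# 2x2 design: swap keys and values independently between the two encodings.
base, attn_iso = head_output(query, keys_iso, values_iso)
qk_only, attn_ctx = head_output(query, keys_ctx, values_iso)
v_only, _ = head_output(query, keys_iso, values_ctx)
full, _ = head_output(query, keys_ctx, values_ctx)

qk_effect = qk_only - base   # change from rerouting attention only
v_effect = v_only - base     # change from updated values only
interaction = full - base - qk_effect - v_effect  # remainder when both change

print("attention (isolated):      ", np.round(attn_iso, 3))
print("attention (contextualized):", np.round(attn_ctx, 3))
print("||QK effect||:", np.linalg.norm(qk_effect))
print("||V effect||: ", np.linalg.norm(v_effect))
```

In this toy, the three terms sum exactly to the total change `full - base` by construction. Comparing their sizes, or their effect on a downstream task metric, is one plausible way to ask which pathway matters more; how the paper actually defines and measures each contribution is not recoverable from the truncated abstract.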
