Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering

Eric Bigelow; Daniel Wurgaft; YingQiao Wang; Noah Goodman; Tomer Ullman; Hidenori Tanaka; Ekdeep Singh Lubana

arXiv:2511.00617·cs.LG·March 13, 2026

TL;DR

This paper presents a Bayesian framework that unifies in-context learning and activation steering in large language models. Both are treated as changes to the model's belief in latent concepts, and the resulting model predicts how behavior shifts under each kind of intervention.

Contribution

The paper introduces a closed-form, predictive Bayesian model that unifies prompt-based and activation-based control of LLMs, both explaining and forecasting model behavior under these interventions.

Findings

01 Sigmoidal learning curves are explained by evidence accumulation

02 The model predicts that interventions are additive in log-belief space

03 Small changes to an intervention can induce behavioral shifts

Abstract

Large language models (LLMs) can be controlled at inference time through prompts (in-context learning) and internal activations (activation steering). Different accounts have been proposed to explain these methods, yet their common goal of controlling model behavior raises the question of whether these seemingly disparate methodologies can be seen as specific instances of a broader framework. Motivated by this, we develop a unifying, predictive account of LLM control from a Bayesian perspective. Specifically, we posit that both context- and activation-based interventions impact model behavior by altering its belief in latent concepts: steering operates by changing concept priors, while in-context learning leads to an accumulation of evidence. This results in a closed-form Bayesian model that is highly predictive of LLM behavior across context- and activation-based interventions in a set…
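The belief-update picture in the abstract can be illustrated with a toy two-hypothesis model. This is a minimal sketch, not the paper's closed-form model: it assumes that steering adds a term to the log prior odds of a concept, that each in-context example adds a fixed amount of log-likelihood evidence, and that both contributions are linear. The function names and the parameters `prior`, `evidence_per_example`, and `steer_gain` are hypothetical, chosen only for illustration.

```python
import math


def log_odds(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    """Map log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def posterior_belief(n_examples, steer, prior=0.05,
                     evidence_per_example=0.5, steer_gain=1.0):
    """Toy posterior belief in a concept (versus one alternative).

    Assumptions (illustrative, not taken from the paper):
      - steering shifts the log prior odds by steer_gain * steer
      - each in-context example adds evidence_per_example to the log odds
    """
    z = (log_odds(prior)
         + steer_gain * steer
         + evidence_per_example * n_examples)
    return sigmoid(z)


def crossover_examples(steer, prior=0.05,
                       evidence_per_example=0.5, steer_gain=1.0):
    """Number of examples at which the toy belief reaches 0.5."""
    return -(log_odds(prior) + steer_gain * steer) / evidence_per_example


if __name__ == "__main__":
    # Belief as a function of context length, with and without steering.
    for n in range(0, 13, 2):
        unsteered = posterior_belief(n, steer=0.0)
        steered = posterior_belief(n, steer=2.0)
        print(f"n={n:2d}  unsteered={unsteered:.3f}  steered={steered:.3f}")
```

In this toy model the belief traces a sigmoid in the number of examples, the two interventions add in log-odds space, and steering moves the point at which the belief crosses 0.5, so a small change in either input near that point produces a large change in belief.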

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Embodied and Extended Cognition