Accumulating Context Changes the Beliefs of Language Models
Jiayi Geng, Howard Chen, Ryan Liu, Manoel Horta Ribeiro, Robb Willer, Graham Neubig, Thomas L. Griffiths

TL;DR
This paper investigates how accumulating context through interactions and reading can alter the beliefs and behaviors of language models, highlighting potential risks to their reliability and alignment.
Contribution
It demonstrates that extended interactions and reading can significantly change models' belief profiles and behaviors, revealing a hidden risk in current language model usage.
Findings
GPT-5's beliefs shift by 54.7% after 10 discussion rounds
Grok 4's beliefs shift by 27.2% after reading opposing texts
Behavioral changes align with belief shifts in tool-use tasks
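As a rough illustration of how a percentage like those above could be computed, the sketch below counts the share of belief items whose rating changed between a pre-context and a post-context survey. This summary does not give the paper's exact metric, so the function name `belief_shift_rate`, the 1-5 rating scale, the `threshold` parameter, and the sample ratings are all assumptions made for illustration.

```python
from typing import Sequence


def belief_shift_rate(
    before: Sequence[int], after: Sequence[int], threshold: int = 1
) -> float:
    """Fraction of items whose rating moved by at least `threshold` points.

    Hypothetical metric: the paper's actual definition may differ.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    if not before:
        raise ValueError("need at least one item")
    changed = sum(1 for b, a in zip(before, after) if abs(a - b) >= threshold)
    return changed / len(before)


# Made-up ratings on an assumed 1-5 agreement scale (not data from the paper).
before = [1, 2, 4, 5, 3, 3, 2, 4]
after = [1, 4, 4, 3, 3, 2, 2, 4]

rate = belief_shift_rate(before, after)
print(f"{rate:.1%} of items shifted")  # prints "37.5% of items shifted"
```

Running the same comparison on two surveys taken with no added context would give a noise floor, which is one way to check whether a reported shift exceeds ordinary sampling variation.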
Abstract
Language model (LM) assistants are increasingly used in applications such as brainstorming and research. Improvements in memory and context size have allowed these models to become more autonomous, which has also resulted in more text accumulation in their context windows without explicit user intervention. This comes with a latent risk: the belief profiles of models -- their understanding of the world as manifested in their responses or actions -- may silently change as context accumulates. This can lead to subtly inconsistent user experiences, or shifts in behavior that deviate from the original alignment of the models. In this paper, we explore how accumulating context by engaging in interactions and processing text -- talking and reading -- can change the beliefs of language models, as manifested in their responses and behaviors. Our results reveal that models' belief profiles are…
Peer Reviews
Decision·Submitted to ICLR 2026
- Designing an experimental framework to measure the effect of intentional/non-intentional shifts that can be introduced by long-context input.
- Sufficient analysis that possibly uncovers some pathways by which intentional/non-intentional shifts affect LLMs' beliefs.
- Despite the depth of analysis, the experimental setup does not seem systematic. Because the current conditions might introduce several unwanted confounding effects, the results cannot be fully attributed to the experimental conditions (i.e., the designed shifts). See Question A.
1. The setup of measuring both stated belief and action is nice.
2. The task collection could serve as a benchmark for future work.
3. The setup is relatively simple and reproducible.
1. The evaluation appears to run each condition only once per model, without repetitions of the sampling process. All the reported models exhibit stochastic behavior. It is therefore unclear whether the reported belief shifts exceed the models' inherent stochastic variance. Without repeated baselines, we cannot tell if these are systematic updates or simply natural output fluctuations.
2. The authors frame belief change as a “risk,” but for non-safety-sensitive or open-ended topics, such change
- The paper addresses a problem of growing importance in model deployment: context management and the subtle influence of information in the context.
- Both self-reported beliefs and behavioral evidence (revealed beliefs) are accounted for in the experiments, which adds to the soundness and generalizability of the results.
- The paper is clearly structured and easy to follow.
- Originality: Model under persuasion [1,2,3,4,5], model debate [6,7], and model reading [8], are all well-studied topics. The only setup not covered by these topics is belief change during deep research. However, the main finding in such a case is that such change exists and is large, which is very unsurprising (deep research was designed to make the model find new evidence that shifts its belief), and there are analogous results from evaluation on domains such as maths [9] and science [10].
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI
