Probing the Lack of Stable Internal Beliefs in LLMs

Yifan Luo; Kangping Xu; Yanzhen Lu; Yang Yuan; Andrew Chi-Chih Yao

arXiv:2603.25187·cs.CL·March 27, 2026

Open Access

TL;DR

This paper investigates whether large language models can maintain stable internal goals during multi-turn interactions, revealing significant challenges in achieving consistent persona-driven behavior without explicit goal reinforcement.

Contribution

It introduces a novel riddle game paradigm to evaluate implicit goal consistency in LLMs and demonstrates their difficulty in maintaining stable internal representations over extended dialogues.

Findings

01

LLMs often fail to preserve latent goals across turns

02

Explicit context is necessary for LLMs to maintain goal consistency

03

This instability is a key limitation for realistic personality modeling in LLMs

Abstract

Persona-driven large language models (LLMs) require consistent behavioral tendencies across interactions to simulate human-like personality traits, such as persistence or reliability. However, current LLMs often lack stable internal representations that anchor their responses over extended dialogues. This work explores whether LLMs can maintain "implicit consistency", defined as persistent adherence to an unstated goal in multi-turn interactions. We designed a 20-question-style riddle game paradigm in which an LLM is tasked with secretly selecting a target and responding to users' guesses with "yes/no" answers. Through evaluations, we find that LLMs struggle to preserve latent consistency: their implicit "goals" shift across turns unless the selected target is explicitly provided in context. These findings highlight critical limitations in the building of persona-driven LLMs and underscore…
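
As a rough illustration of how latent consistency in such a game could be checked, the Python sketch below tracks which candidate targets remain compatible with a sequence of yes/no answers. The candidate pool, the attribute names, and both function names are hypothetical and are not taken from the paper; the authors' actual evaluation procedure may differ. An empty compatible set means no single target explains every answer, which is one possible signal that the implicit goal shifted.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical candidate pool: each target maps to the attributes it has.
# These entries are illustrative assumptions, not data from the paper.
CANDIDATES: Dict[str, Set[str]] = {
    "cat": {"animal", "has_fur"},
    "eagle": {"animal", "can_fly"},
    "airplane": {"can_fly", "man_made"},
    "chair": {"man_made"},
}

# One turn of the game: (attribute asked about, the model's yes/no answer).
Turn = Tuple[str, bool]


def consistent_targets(transcript: List[Turn]) -> Set[str]:
    """Return the candidates compatible with every answer in the transcript."""
    remaining = set(CANDIDATES)
    for attribute, answer in transcript:
        remaining = {
            name for name in remaining
            if (attribute in CANDIDATES[name]) == answer
        }
    return remaining


def first_inconsistent_turn(transcript: List[Turn]) -> Optional[int]:
    """Return the 0-based index of the first turn after which no candidate
    fits all answers so far, or None if some candidate always fits."""
    for end in range(1, len(transcript) + 1):
        if not consistent_targets(transcript[:end]):
            return end - 1
    return None


if __name__ == "__main__":
    # Answers that a single fixed target ("eagle") can explain.
    stable = [("animal", True), ("can_fly", True)]
    # Adding a "yes" to "has_fur" leaves no candidate that fits all answers.
    drifting = stable + [("has_fur", True)]
    print(consistent_targets(stable))
    print(first_inconsistent_turn(drifting))
```

A check of this kind can only detect contradictions relative to the assumed attribute table; a model could change its target silently without ever producing a contradictory answer, so this is a lower bound on inconsistency rather than a full measure.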

Taxonomy

Topics: Persona Design and Applications · Social Robot Interaction and HRI · AI in Service Interactions