LLMs Can't Play Hangman: On the Necessity of a Private Working Memory for Language Agents

Davide Baldelli; Ali Parviz; Amal Zouaq; Sarath Chandar

arXiv:2601.06973 · cs.CL · January 13, 2026

TL;DR

This paper demonstrates that current language models lack the ability to maintain private, hidden information during interactive tasks, and proposes a private working memory as a necessary solution for reliable agent behavior.

Contribution

The paper introduces Private State Interactive Tasks (PSITs), proves an impossibility theorem for agents restricted to the public conversation history, and proposes a private working memory architecture that lets an agent maintain its private state consistently.

Findings

1. Standard chat-based LLMs fail to maintain hidden secrets across dialogue branches.

2. Semantic retrieval does not enable true private state maintenance.

3. A private working memory restores consistency in interactive language agents.

Abstract

As LLMs move from text completion toward autonomous agents, they remain constrained by the standard chat interface, which lacks private working memory. This raises a fundamental question: can agents reliably perform interactive tasks that depend on hidden state? We define Private State Interactive Tasks (PSITs), which require agents to generate and maintain hidden information while producing consistent public responses. We show theoretically that any agent restricted to the public conversation history cannot simultaneously preserve secrecy and consistency in PSITs, yielding an impossibility theorem. To empirically validate this limitation, we introduce a self-consistency testing protocol that evaluates whether agents can maintain a hidden secret across forked dialogue branches. Standard chat-based LLMs and retrieval-based memory baselines fail this test regardless of scale,…
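
The forked-branch idea in the abstract can be illustrated with a toy Hangman simulation. This is only a minimal sketch under stated assumptions, not the paper's protocol or its agents: the vocabulary `WORDS`, the two agent classes, and the helper names are all hypothetical. The public-only agent stands in for a chat model with no hidden state, so it re-derives a plausible secret from the public history on every call. The private-memory agent commits to a secret once and carries it into both branches.

```python
import random

# Hypothetical toy vocabulary; the paper's actual task setup may differ.
WORDS = ["apple", "angle", "amber", "adobe"]


def positions(word, letter):
    """Indices at which `letter` occurs in `word` (the public Hangman reply)."""
    return [i for i, c in enumerate(word) if c == letter]


def compatible(history):
    """Words that agree with every (guess, reply) pair in a public history."""
    return [w for w in WORDS if all(positions(w, g) == r for g, r in history)]


class PublicOnlyAgent:
    """No hidden state: each reply samples a secret that fits the public history."""

    def __init__(self, rng):
        self.rng = rng

    def reply(self, history, guess):
        secret = self.rng.choice(compatible(history))
        return positions(secret, guess)


class PrivateMemoryAgent:
    """Commits to a secret once and keeps it in memory the player never sees."""

    def __init__(self, rng):
        self.secret = rng.choice(WORDS)

    def reply(self, history, guess):
        return positions(self.secret, guess)


def fork_is_consistent(agent, guess_a="p", guess_b="g"):
    """Fork one shared prefix into two branches and check that a single
    secret could explain the replies given in both branches."""
    prefix = []  # the shared public history before the fork (empty here)
    branch_a = prefix + [(guess_a, agent.reply(prefix, guess_a))]
    branch_b = prefix + [(guess_b, agent.reply(prefix, guess_b))]
    return bool(compatible(branch_a + branch_b))


def inconsistency_rate(make_agent, trials=500, seed=0):
    """Fraction of fresh games in which the two forked branches contradict."""
    rng = random.Random(seed)
    failures = sum(not fork_is_consistent(make_agent(rng)) for _ in range(trials))
    return failures / trials


if __name__ == "__main__":
    print("public-only   :", inconsistency_rate(PublicOnlyAgent))
    print("private memory:", inconsistency_rate(PrivateMemoryAgent))
```

In this toy setting the private-memory agent never contradicts itself across the fork, while the public-only agent does so in some fraction of games, because nothing in the public history pins down which secret it "chose". The sketch covers only the consistency side of the trade-off: a public-only agent could also stay consistent by writing its secret into the conversation, at the cost of secrecy.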

Taxonomy

Topics: Speech and dialogue systems · AI in Service Interactions · Social Robot Interaction and HRI