Playing games with knowledge: AI-Induced delusions need game theoretic interventions

Will Beaumaster; Paul Schrater

arXiv:2605.08409·cs.AI·May 12, 2026

TL;DR

This paper models the problem of AI-induced delusions in conversational agents as a game-theoretic coordination failure and proposes an intervention called Epistemic Mediator with Belief Versioning to improve epistemic safety.

Contribution

The paper introduces a formal game-theoretic framework for understanding AI-induced delusions and proposes a novel intervention mechanism that reduces false belief spirals by a factor of 48 in simulations.

Findings

01

The intervention reduces false belief spirals by a factor of 48 in simulations.

02

Modeling the user-agent interaction as a Crawford-Sobel cheap talk game reveals systemic causes of epistemic entrenchment.

03

Belief Versioning effectively maintains healthy beliefs and enables rollback when needed.

Abstract

Conversational AI has a fundamental flaw as a knowledge interface: sycophantic chatbots induce epistemic entrenchment and delusional belief spirals even in rational agents. We propose that the problem does not stem from the AI model itself but is instead a systemic consequence of the paradigm shift from user-driven knowledge search to users and agents engaged in strategic, repeated-play communication. We formalize the problem as a Crawford-Sobel cheap talk game, where costless user signals induce a pooling equilibrium. Agents optimized for user satisfaction produce sycophantic strategies that provide identical reinforcement across user types with opposite epistemic incentives: exploratory "Growth-seekers" ($\theta_G$) and confirmatory "Validation-seekers" ($\theta_V$). Under repeated play, this identification failure creates a coordination trap, analogous to a Prisoner's Dilemma, where…
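The identification failure described in the abstract can be illustrated with a small Bayes' rule calculation. The sketch below is not the authors' model: the function name, the prior, and the signal probabilities are all hypothetical values chosen for illustration. It shows only the general property of a pooling equilibrium, namely that when both user types send the same signal with the same probability, an observer's posterior over types equals the prior, so the signal carries no information about whether the user is a Growth-seeker or a Validation-seeker.

```python
from fractions import Fraction


def posterior_growth(prior_growth, p_signal_given_growth, p_signal_given_validation):
    """Return P(theta_G | signal) by Bayes' rule for a two-type population."""
    numerator = prior_growth * p_signal_given_growth
    denominator = numerator + (1 - prior_growth) * p_signal_given_validation
    return numerator / denominator


# Hypothetical prior: half of users are Growth-seekers (theta_G).
prior = Fraction(1, 2)

# Pooling (illustrative numbers): both types send the same costless
# "tell me more" signal with identical probability.
pooled = posterior_growth(prior, Fraction(9, 10), Fraction(9, 10))

# Separating (illustrative numbers): the types send the signal with
# different probabilities, so observing it shifts the posterior.
separated = posterior_growth(prior, Fraction(9, 10), Fraction(1, 10))

print("pooling posterior:", pooled)        # equals the prior
print("separating posterior:", separated)  # differs from the prior
```

Under these assumed numbers, the pooled posterior stays at the prior, while the separating case moves it, which is the sense in which an agent facing pooled signals cannot tell the two user types apart.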
