Observer, Not Player: Simulating Theory of Mind in LLMs through Game Observation

Jerry Wang; Ting Yiu Liu

arXiv:2512.19210·cs.AI·December 23, 2025

Open Access

TL;DR

This paper introduces an interactive framework to evaluate whether large language models can exhibit mind-like reasoning by observing and identifying strategies in the game of Rock-Paper-Scissors, emphasizing interpretability and systematic assessment.

Contribution

It develops a novel benchmark and evaluation metrics for assessing LLMs' understanding of strategic behavior through game observation, focusing on interpretability and stability of strategy identification.

Findings

1. LLMs can identify strategies with moderate accuracy.
2. The framework reveals strengths and limitations in LLM reasoning.
3. Unified metrics effectively measure alignment and calibration.

Abstract

We present an interactive framework for evaluating whether large language models (LLMs) exhibit genuine "understanding" in a simple yet strategic environment. As a running example, we focus on Rock-Paper-Scissors (RPS), which, despite its apparent simplicity, requires sequential reasoning, adaptation, and strategy recognition. Our system positions the LLM as an Observer whose task is to identify which strategies are being played and to articulate the reasoning behind this judgment. The purpose is not to test knowledge of Rock-Paper-Scissors itself, but to probe whether the model can exhibit mind-like reasoning about sequential behavior. To support systematic evaluation, we provide a benchmark consisting of both static strategies and lightweight dynamic strategies specified by well-prompted rules. We quantify alignment between the Observer's predictions and the ground-truth distributions…
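To make the Observer setup concrete, here is a minimal sketch of one way a static strategy and an alignment score could be represented. It is not the paper's implementation: the abstract is truncated before it names its metrics, so the KL divergence used here is an assumed stand-in, and the frequency-counting observer is a hypothetical placeholder for the LLM. All function names are illustrative.

```python
import math
import random
from collections import Counter

MOVES = ("R", "P", "S")  # Rock, Paper, Scissors


def sample_moves(dist, n, seed=0):
    """Draw n moves from a static strategy given as {move: probability}."""
    rng = random.Random(seed)
    weights = [dist[m] for m in MOVES]
    return rng.choices(MOVES, weights=weights, k=n)


def empirical_distribution(moves, smoothing=1.0):
    """Hypothetical stand-in for the Observer: smoothed move frequencies.

    Additive smoothing keeps every probability strictly positive, so the
    logarithm in kl_divergence is always defined.
    """
    counts = Counter(moves)
    total = len(moves) + smoothing * len(MOVES)
    return {m: (counts[m] + smoothing) / total for m in MOVES}


def kl_divergence(truth, predicted):
    """KL(truth || predicted); 0 when the two distributions agree.

    Assumed metric for illustration only, not necessarily the paper's.
    """
    return sum(
        truth[m] * math.log(truth[m] / predicted[m])
        for m in MOVES
        if truth[m] > 0
    )


if __name__ == "__main__":
    # A rock-biased static strategy (illustrative values).
    truth = {"R": 0.6, "P": 0.2, "S": 0.2}
    observed = sample_moves(truth, n=300, seed=0)
    estimate = empirical_distribution(observed)
    print("estimated distribution:", estimate)
    print("KL(truth || estimate):", kl_divergence(truth, estimate))
```

In this sketch, a lower divergence means the observer's predicted distribution is closer to the ground-truth strategy. An LLM Observer would replace `empirical_distribution` with a prediction parsed from the model's output.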

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Games · Multimodal Machine Learning Applications