PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking

Wang Bill Zhu; Qiutong Tony Yi; Robin Jia; Jesse Thomason

arXiv:2604.17819 · cs.CL · April 21, 2026

TL;DR

PDDL-Mind enhances belief reasoning in large language models by explicitly tracking environment states with a neuro-symbolic approach, significantly improving performance on theory-of-mind benchmarks.

Contribution

The paper introduces PDDL-Mind, a neuro-symbolic framework that explicitly models environment states to improve LLMs' belief reasoning capabilities.

Findings

01

PDDL-Mind achieves an absolute accuracy gain of over 5% on the MMToM-QA, MuMA and FanToM benchmarks.

02

Explicit state tracking reduces LLMs' belief reasoning errors.

03

The framework outperforms existing methods on multiple ToM datasets.

Abstract

Large language models (LLMs) perform substantially below human level on existing theory-of-mind (ToM) benchmarks, even when augmented with chain-of-thought prompting or probabilistic belief updates. We argue that these failures primarily arise from unreliable implicit state tracking rather than limitations in high-level reasoning. We introduce PDDL-Mind, a neuro-symbolic framework that decouples environment state evolution from belief inference. By translating narrative descriptions into explicit states and actions expressed in Planning Domain Definition Language (PDDL), and by verifying action-induced state transitions against a predefined domain, PDDL-Mind provides LLMs with a logically consistent and explicit representation of world states for ToM tasks. Experiments on MMToM-QA, MuMA and FanToM show that PDDL-Mind achieves over 5% absolute accuracy gain over the best existing…
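
To make the idea of verifying action-induced state transitions concrete, here is a minimal Python sketch of STRIPS-style state tracking. It is not the authors' implementation: the fact encoding, the `Action` class, and the `move`, `apply_action`, and `track` helpers are hypothetical names chosen for illustration, and the single `move` schema stands in for a predefined PDDL domain.

```python
from dataclasses import dataclass

# A fact is a ground predicate, e.g. ("at", "apple", "fridge").
# This encoding is an assumption for illustration, not taken from the paper.
Fact = tuple[str, ...]


@dataclass(frozen=True)
class Action:
    """A ground action with STRIPS-style preconditions and effects."""
    name: str
    preconditions: frozenset[Fact]
    add_effects: frozenset[Fact]
    del_effects: frozenset[Fact]


def move(obj: str, src: str, dst: str) -> Action:
    """Hypothetical domain schema: move obj from src to dst."""
    return Action(
        name=f"move({obj}, {src}, {dst})",
        preconditions=frozenset({("at", obj, src)}),
        add_effects=frozenset({("at", obj, dst)}),
        del_effects=frozenset({("at", obj, src)}),
    )


def apply_action(state: frozenset[Fact], action: Action) -> frozenset[Fact]:
    """Verify the action against the current state, then apply its effects."""
    missing = action.preconditions - state
    if missing:
        # The proposed transition contradicts the tracked state, so reject it.
        raise ValueError(f"{action.name}: unmet preconditions {sorted(missing)}")
    return (state - action.del_effects) | action.add_effects


def track(initial: frozenset[Fact], actions: list[Action]) -> list[frozenset[Fact]]:
    """Return the explicit state after each step, starting with the initial state."""
    states = [initial]
    for action in actions:
        states.append(apply_action(states[-1], action))
    return states


if __name__ == "__main__":
    initial = frozenset({("at", "apple", "fridge")})
    history = track(initial, [move("apple", "fridge", "table")])
    for step, state in enumerate(history):
        print(step, sorted(state))
```

Keeping one explicit state per step means a downstream belief question can be answered against a specific, verified world state rather than against whatever the model implicitly remembers of the narrative. An inconsistent action, such as moving the apple out of the fridge a second time, raises an error instead of silently corrupting the state.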
