DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic

Yuheng Wu; Jianwen Xie; Denghui Zhang; Zhaozhuo Xu

arXiv:2505.17348 · cs.AI · September 30, 2025

TL;DR

DEL-ToM is an inference-time scaling framework that improves large language models' theory-of-mind (ToM) reasoning by decomposing each task into a sequence of belief updates and using a trained verifier to select the highest-scoring belief trace.

Contribution

The paper proposes an inference-time scaling approach that combines Dynamic Epistemic Logic (DEL) with a trained verifier, the Process Belief Model (PBM), to improve ToM reasoning without retraining the LLM.

Findings

01. DEL-ToM improves ToM performance across models and benchmarks.

02. The verifier effectively scores belief updates, leading to more accurate reasoning.

03. The approach enhances transparency and verifiability of LLM reasoning processes.

Abstract

Theory-of-Mind (ToM) tasks pose a unique challenge for large language models (LLMs), which often lack the capability for dynamic logical reasoning. In this work, we propose DEL-ToM, a framework that improves verifiable ToM reasoning through inference-time scaling rather than architectural changes. Our approach decomposes ToM tasks into a sequence of belief updates grounded in Dynamic Epistemic Logic (DEL), enabling structured and verifiable dynamic logical reasoning. We use data generated automatically via a DEL simulator to train a verifier, which we call the Process Belief Model (PBM), to score each belief update step. During inference, the PBM evaluates candidate belief traces from the LLM and selects the highest-scoring one. This allows LLMs to allocate extra inference-time compute to yield more transparent reasoning. Experiments across model scales and benchmarks show that DEL-ToM…
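
The selection step described in the abstract can be sketched as a best-of-N search over candidate belief traces. This is a minimal illustration under stated assumptions, not the authors' implementation: `score_trace`, `select_best_trace`, and `toy_step_scorer` are hypothetical names, the toy scorer is a stand-in for the trained PBM, and the rule for aggregating per-step scores into a trace score (minimum by default, product as an alternative) is an assumption that the abstract does not specify.

```python
from math import prod
from typing import Callable, Sequence

# A belief trace is an ordered list of belief-update steps (plain strings here).
Trace = Sequence[str]
# A step scorer maps (steps so far, current step) to a score in [0, 1].
# In the paper this role is played by the PBM; here it is a hypothetical stub.
StepScorer = Callable[[Trace, str], float]


def score_trace(trace: Trace, step_scorer: StepScorer,
                aggregate: Callable[[Sequence[float]], float] = min) -> float:
    """Score every belief update in a trace, then aggregate into one number."""
    if not trace:
        return 0.0  # assumption: an empty trace gets the lowest score
    step_scores = [step_scorer(trace[:i], trace[i]) for i in range(len(trace))]
    return aggregate(step_scores)


def select_best_trace(candidates: Sequence[Trace], step_scorer: StepScorer,
                      aggregate: Callable[[Sequence[float]], float] = min) -> Trace:
    """Return the candidate trace with the highest aggregated score."""
    return max(candidates, key=lambda t: score_trace(t, step_scorer, aggregate))


def toy_step_scorer(prefix: Trace, step: str) -> float:
    """Hypothetical stand-in for a trained verifier: penalize one keyword."""
    return 0.2 if "unobserved" in step else 0.9


# Two candidate traces for a Sally-Anne style question, as an LLM might sample them.
candidates = [
    ["Sally sees the ball in the basket",
     "Sally updates her belief after the unobserved move"],
    ["Sally sees the ball in the basket",
     "Sally still believes the ball is in the basket"],
]

best = select_best_trace(candidates, toy_step_scorer)
print(best[-1])
```

Sampling more candidates spends more inference-time compute without touching the LLM's weights. Taking the minimum makes a trace only as good as its weakest belief update; passing `aggregate=prod` instead penalizes every low-scoring step multiplicatively.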
