Simulated Annealing Enhances Theory-of-Mind Reasoning in Autoregressive Language Models
Xucong Hu, Jian-Qiao Zhu

TL;DR
This paper demonstrates that simulated annealing, a sampling-based optimization technique, significantly improves the ability of autoregressive language models to perform Theory of Mind reasoning without additional training.
Contribution
The study introduces a novel application of annealing in power-sampling methods to enhance latent mental state reasoning in language models without retraining.
Findings
Annealing improves ToM reasoning performance over power sampling alone.
Sampling-based methods can recover ToM capabilities that are already latent in the base model.
The improvement requires no additional training or weight updates.
Abstract
Autoregressive language models are next-token predictors and have been criticized for optimizing only surface plausibility (i.e., local coherence) rather than maintaining correct latent-state representations (i.e., global coherence). Because Theory of Mind (ToM) tasks crucially depend on reasoning about latent mental states of oneself and others, such models are often thought to fail at ToM. While post-training methods can improve ToM performance, we show that strong ToM capability can be recovered directly from the base model without any additional weight updates or verifications. Our approach builds on recent power-sampling methods (Karan & Du, 2025) that use Markov chain Monte Carlo (MCMC) to sample from sharpened sequence-level (rather than token-level) probability distributions of autoregressive language models. We further find that incorporating annealing, where the…
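To make the idea of sampling from a sharpened sequence-level distribution concrete, here is a minimal sketch on a toy model. It is not the authors' implementation: the "language model" is a hypothetical three-token Markov chain (INIT, TRANS), the proposal resamples a random suffix from the base model, and the annealed quantity is assumed to be the sharpening exponent alpha, raised linearly from 1 to alpha_final. The abstract is truncated before it says what is actually annealed, so that schedule is an illustration only.

```python
import math
import random

# Hypothetical toy "language model": a first-order Markov chain over 3 tokens.
INIT = [0.5, 0.3, 0.2]
TRANS = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
]


def sample_suffix(prefix, length, rng):
    """Extend prefix to the given length by sampling from the base model."""
    seq = list(prefix)
    while len(seq) < length:
        probs = TRANS[seq[-1]] if seq else INIT
        seq.append(rng.choices(range(len(probs)), weights=probs)[0])
    return seq


def log_prob(seq):
    """Sequence-level log-probability under the base model."""
    lp = math.log(INIT[seq[0]])
    for prev, cur in zip(seq, seq[1:]):
        lp += math.log(TRANS[prev][cur])
    return lp


def annealed_power_sample(length=6, alpha_final=4.0, n_steps=200, seed=0):
    """Metropolis-Hastings targeting p(x)**alpha, with alpha annealed upward.

    The linear schedule for alpha is an assumption made for illustration.
    """
    rng = random.Random(seed)
    x = sample_suffix([], length, rng)
    lp_x = log_prob(x)
    for step in range(n_steps):
        alpha = 1.0 + (alpha_final - 1.0) * (step + 1) / n_steps
        t = rng.randrange(length)              # position where resampling starts
        y = sample_suffix(x[:t], length, rng)  # keep prefix, resample suffix
        lp_y = log_prob(y)
        # The proposal is the base model itself, so the MH ratio reduces to
        # (p(y) / p(x)) ** (alpha - 1).
        log_accept = (alpha - 1.0) * (lp_y - lp_x)
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            x, lp_x = y, lp_y
    return x, lp_x
```

With alpha_final=1.0 every proposal is accepted and the chain simply samples from the base model; larger values concentrate the chain on sequences with high whole-sequence probability, which is the sequence-level (rather than token-level) sharpening the abstract refers to.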
Taxonomy
Topics: Neurobiology of Language and Bilingualism · Topic Modeling · Embodied and Extended Cognition
