Small LLMs Do Not Learn a Generalizable Theory of Mind via Reinforcement Learning

Sneheel Sarangi; Hanan Salam

arXiv:2507.15788·cs.LG·July 22, 2025

TL;DR

This study examines whether small language models can develop a generalizable Theory of Mind through reinforcement learning, finding they tend to overfit training data and fail to transfer understanding to new, unseen tasks.

Contribution

The paper provides a systematic evaluation showing that small LLMs struggle to acquire a generalizable Theory of Mind via RL, highlighting limitations of current training methods.

Findings

01 Small LLMs improve on in-distribution ToM tasks

02 Models overfit training data, failing to generalize to new tasks

03 Prolonged RL leads to overfitting and performance degradation on out-of-distribution data

Abstract

Recent advancements in large language models (LLMs) have demonstrated emergent capabilities in complex reasoning, largely spurred by rule-based Reinforcement Learning (RL) techniques applied during post-training. This has raised the question of whether similar methods can instill more nuanced, human-like social intelligence, such as a Theory of Mind (ToM), in LLMs. This paper investigates whether small-scale LLMs can acquire a robust and generalizable ToM capability through RL with verifiable rewards (RLVR). We conduct a systematic evaluation by training models on various combinations of prominent ToM datasets (HiToM, ExploreToM, FANToM) and testing for generalization on held-out datasets (e.g., OpenToM). Our findings indicate that small LLMs struggle to develop a generic ToM capability. While performance on in-distribution tasks improves, this capability fails to transfer to unseen…
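
The evaluation design described in the abstract (train on combinations of datasets, test on held-out ones) can be sketched roughly as follows. This is only an illustration: the dataset names come from the abstract, but the abstract does not say which combinations were actually used, so enumerating every non-empty subset is an assumption, as is treating the training-pool datasets left out of a mixture as out-of-distribution. The function names `training_mixtures` and `evaluation_plan` are hypothetical.

```python
from itertools import combinations

# Dataset names are taken from the abstract; everything else is illustrative.
TRAIN_POOL = ["HiToM", "ExploreToM", "FANToM"]
HELD_OUT = ["OpenToM"]


def training_mixtures(pool):
    """Return every non-empty combination of the training datasets (assumed design)."""
    return [
        combo
        for size in range(1, len(pool) + 1)
        for combo in combinations(pool, size)
    ]


def evaluation_plan(pool, held_out):
    """Pair each training mixture with in-distribution and out-of-distribution test sets."""
    plan = []
    for mix in training_mixtures(pool):
        # Assumption: pool datasets not in the mixture count as unseen,
        # alongside the datasets that are always held out.
        unseen = [d for d in pool if d not in mix] + list(held_out)
        plan.append({"train": list(mix), "in_dist": list(mix), "out_dist": unseen})
    return plan


plan = evaluation_plan(TRAIN_POOL, HELD_OUT)
for entry in plan:
    print(entry["train"], "->", entry["out_dist"])
```

Under these assumptions, the gap between accuracy on `in_dist` and on `out_dist` for each mixture is the quantity that would indicate overfitting rather than a transferable ToM capability.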
