Probing the Robustness of Theory of Mind in Large Language Models
Christian Nickel, Laura Schrewe, Lucie Flek

TL;DR
This paper introduces a new dataset of 68 tasks to evaluate Theory of Mind in large language models, revealing limited capabilities and specific challenges in understanding agent knowledge and object relationships.
Contribution
The study provides a novel, comprehensive dataset with varied complexity levels for probing ToM in LLMs, and evaluates multiple models to identify specific weaknesses and directions for improvement.
Findings
LLMs show limited ToM capabilities across tasks.
Performance drops on tasks involving agent knowledge of environment changes.
Models struggle with tasks altering object relationships through prepositions.
Abstract
With the success of ChatGPT and other similarly sized SotA LLMs, claims of emergent human-like social reasoning capabilities, especially Theory of Mind (ToM), in these models have appeared in the scientific literature. On the one hand, those ToM capabilities have been successfully tested using tasks styled similarly to those used in psychology (Kosinski, 2023). On the other hand, follow-up studies showed that those capabilities vanished when the tasks were slightly altered (Ullman, 2023). In this work, we introduce a novel dataset of 68 tasks for probing ToM in LLMs, including potentially challenging variations that are assigned to 10 complexity classes. The dataset thereby provides novel insights into the challenges LLMs face with those task variations. We evaluate the ToM performance of four SotA open-source LLMs on our dataset and on the dataset introduced by Kosinski (2023). The overall low…
Taxonomy
Topics: Topic Modeling
