Do Large Language Models Possess a Theory of Mind? A Comparative Evaluation Using the Strange Stories Paradigm
Anna Babarczy, Andras Lukacs, Peter Vedres, Zeteny Bujka

TL;DR
This study evaluates whether large language models can infer mental states such as beliefs, intentions, and emotions. It compares five LLMs with human controls on an adapted version of a standard text-based ToM test and finds that GPT-4o reaches near-human accuracy.
Contribution
The paper provides a comparative analysis of LLMs' Theory of Mind capabilities, showing that GPT-4o is robust and performs close to human levels on mental-state inference tasks.
Findings
GPT-4o performs comparably to humans in ToM tasks.
Smaller models are sensitive to inferential cues and distractions.
Performance varies significantly across different LLMs.
Abstract
The study explores whether current Large Language Models (LLMs) exhibit Theory of Mind (ToM) capabilities -- specifically, the ability to infer others' beliefs, intentions, and emotions from text. Given that LLMs are trained on language data without social embodiment or access to other manifestations of mental representations, their apparent social-cognitive reasoning raises key questions about the nature of their understanding. Are they capable of robust mental-state attribution indistinguishable from human ability in its output, or do their outputs merely reflect superficial pattern completion? To address this question, we tested five LLMs and compared their performance to that of human controls using an adapted version of a text-based tool widely used in human ToM research. The test involves answering questions about the beliefs, intentions, and emotions of story characters. The…
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling · Language and cultural evolution
