From Black Boxes to Transparent Minds: Evaluating and Enhancing the Theory of Mind in Multimodal Large Language Models

Xinyang Li; Siqi Liu; Bochao Zou; Jiansheng Chen; Huimin Ma

arXiv:2506.14224·cs.AI·June 18, 2025

TL;DR

This paper introduces an interpretability-driven method for evaluating and improving Theory of Mind (ToM) in multimodal large language models (MLLMs), built on a new test dataset, GridToM, and an analysis of attention heads.

Contribution

The paper builds a multimodal ToM test dataset, GridToM, shows that attention heads can reveal ToM capabilities, and proposes a training-free technique for enhancing those capabilities.

Findings

01. Attention heads distinguish cognitive information across perspectives.

02. Attention mechanisms can be used to assess ToM in multimodal models.

03. A lightweight method improves the models' ToM abilities without additional training.

Abstract

As large language models evolve, there is growing anticipation that they will emulate human-like Theory of Mind (ToM) to assist with routine tasks. However, existing methods for evaluating machine ToM focus primarily on unimodal models and largely treat these models as black boxes, lacking an interpretative exploration of their internal mechanisms. In response, this study adopts an approach based on internal mechanisms to provide an interpretability-driven assessment of ToM in multimodal large language models (MLLMs). Specifically, we first construct a multimodal ToM test dataset, GridToM, which incorporates diverse belief testing tasks and perceptual information from multiple perspectives. Next, our analysis shows that attention heads in multimodal large models can distinguish cognitive information across perspectives, providing evidence of ToM capabilities. Furthermore, we present a…
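The abstract states that attention heads can distinguish cognitive information across perspectives but does not describe the analysis itself. One common way to test whether a head's activations separate two conditions is a linear probe. The following is a minimal sketch under stated assumptions, not the authors' procedure: the activations are synthetic, the head names and the `signals` values are hypothetical, and a difference-of-means probe stands in for whatever classifier the paper actually uses.

```python
import random

def make_head_activations(n, dim, signal, rng):
    """Synthetic activations for one hypothetical head.

    Each sample is shifted by +signal or -signal on every dimension,
    depending on a binary label (e.g. two belief conditions).
    """
    data = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are balanced
        shift = signal if label == 1 else -signal
        data.append(([rng.gauss(shift, 1.0) for _ in range(dim)], label))
    return data

def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    dim = len(rows[0])
    return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

def probe_accuracy(train, test):
    """Difference-of-means probe: project onto (mu1 - mu0), threshold at the midpoint."""
    mu0 = mean_vector([x for x, y in train if y == 0])
    mu1 = mean_vector([x for x, y in train if y == 1])
    direction = [a - b for a, b in zip(mu1, mu0)]
    midpoint = [(a + b) / 2 for a, b in zip(mu1, mu0)]
    correct = 0
    for x, y in test:
        score = sum(d * (xi - m) for d, xi, m in zip(direction, x, midpoint))
        correct += int((score > 0) == (y == 1))
    return correct / len(test)

def rank_heads(signals, n=400, dim=8, seed=0):
    """Fit one probe per head on a train split and rank heads by held-out accuracy."""
    rng = random.Random(seed)
    results = {}
    for head, signal in signals.items():
        data = make_head_activations(n, dim, signal, rng)
        split = n // 2
        results[head] = probe_accuracy(data[:split], data[split:])
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical heads: head_0 carries no label information, head_2 carries a lot.
ranking = rank_heads({"head_0": 0.0, "head_1": 0.1, "head_2": 1.0})
for head, acc in ranking:
    print(f"{head}: held-out probe accuracy {acc:.2f}")
```

In this toy setup, heads whose activations encode the label score well above chance on held-out samples, while uninformative heads stay near 50%. With a real model, the synthetic generator would be replaced by activations recorded from each attention head on the test inputs.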

Code & Models

Datasets

AnnaLeee/GridToM (dataset, 350 downloads)

Taxonomy

Topics: Topic Modeling

Methods: Focus