MindGames: Targeting Theory of Mind in Large Language Models with Dynamic Epistemic Modal Logic

Damien Sileo; Antoine Lernould

arXiv:2305.03353·cs.CL·November 8, 2023·2 cites

TL;DR

This paper uses dynamic epistemic logic to evaluate Theory of Mind (ToM) in large language models. It finds that scaling alone does not guarantee better epistemic reasoning: GPT-4 outperforms the other models tested but still leaves room for improvement.

Contribution

The paper presents a framework for assessing ToM in language models: controlled problems generated with dynamic epistemic logic, together with new techniques for verbalizing those problems in English.

Findings

01 Scaling models does not consistently improve ToM performance.

02 GPT-4 shows better epistemic reasoning than smaller models.

03 Room for improvement remains in language models' ToM abilities.

Abstract

Theory of Mind (ToM) is a critical component of intelligence but its assessment remains the subject of heated debates. Prior research applied human ToM assessments to natural language processing models using either human-created standardized tests or rule-based templates. However, these methods primarily focus on simplistic reasoning and require further validation. Here, we leverage dynamic epistemic logic to isolate a particular component of ToM and to generate controlled problems. We also introduce new verbalization techniques to express these problems in English natural language. Our findings indicate that some language model scaling (from 70M to 6B and 350M to 174B) does not consistently yield results better than random chance. While GPT-4 demonstrates superior epistemic reasoning capabilities, there is still room for improvement. Our code and datasets are publicly available…
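
To make the idea of generating controlled problems from dynamic epistemic logic more concrete, here is a minimal sketch in Python. It is not the authors' generator: the two-agent "muddy forehead" scenario, the agent names, the function names, and the `entailment` / `not_entailment` label strings are all assumptions chosen for illustration. The sketch enumerates possible worlds, models a public announcement as the removal of worlds where the announced fact is false, and then turns a knowledge query into an English hypothesis with a label.

```python
from itertools import product

AGENTS = ["Alice", "Bob"]  # hypothetical agent names


def all_worlds():
    """Enumerate every assignment of muddy (True) / clean (False) to the agents."""
    return [dict(zip(AGENTS, bits))
            for bits in product([False, True], repeat=len(AGENTS))]


def indistinguishable(agent, w1, w2):
    """Assumption: an agent sees every forehead except their own."""
    return all(w1[a] == w2[a] for a in AGENTS if a != agent)


def knows(agent, prop, actual, worlds):
    """agent knows prop iff prop holds in every world the agent considers possible."""
    return all(prop(w) for w in worlds if indistinguishable(agent, actual, w))


def knows_whether_own(agent, actual, worlds):
    """True iff the agent knows they are muddy or knows they are clean."""
    return (knows(agent, lambda w: w[agent], actual, worlds)
            or knows(agent, lambda w: not w[agent], actual, worlds))


def announce(prop, worlds):
    """Public announcement: keep only the worlds where prop is true."""
    return [w for w in worlds if prop(w)]


def verbalize(agent, entailed):
    """Render one query as an English hypothesis with an (assumed) label."""
    return {
        "hypothesis": f"{agent} can now know whether their own forehead is muddy.",
        "label": "entailment" if entailed else "not_entailment",
    }


# Actual state of affairs: Alice is muddy, Bob is clean.
actual = {"Alice": True, "Bob": False}
worlds = all_worlds()

# Before any announcement, Alice cannot tell whether she is muddy.
alice_before = knows_whether_own("Alice", actual, worlds)

# Publicly announce "at least one forehead is muddy".
worlds_after = announce(lambda w: any(w.values()), worlds)

problems = [verbalize(a, knows_whether_own(a, actual, worlds_after))
            for a in AGENTS]
for p in problems:
    print(p["label"], "|", p["hypothesis"])
```

In this sketch the announcement removes the single world in which nobody is muddy. Alice, who sees that Bob is clean, can then infer that she is muddy, while Bob, who sees that Alice is muddy, still cannot determine his own state. Because the label is computed from the model rather than written by hand, problems of this kind can be generated in a controlled way.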

Code & Models

Repositories

Datasets

sileod/mindgames
dataset· 650 dl

Taxonomy

Topics: Topic Modeling · Language and cultural evolution · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Linear Layer · Label Smoothing · Dropout · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Adam