Understanding Social Reasoning in Language Models with Language Models

Kanishk Gandhi; Jan-Philipp Fränken; Tobias Gerstenberg; Noah D. Goodman

arXiv:2306.15448 · cs.CL · December 6, 2023 · 20 cites

TL;DR

This paper introduces BigToM, a new benchmark for assessing social reasoning and Theory-of-Mind in large language models, revealing GPT-4's human-like inference abilities and highlighting limitations in other models.

Contribution

The paper presents a novel framework for generating social reasoning evaluations and creates BigToM, a comprehensive benchmark for testing LLMs' ToM capabilities.

Findings

1. GPT-4 exhibits ToM capabilities similar to humans.
2. Other LLMs show limited social reasoning skills.
3. Human ratings favor the new benchmark over previous evaluations.

Abstract

As Large Language Models (LLMs) become increasingly integrated into our everyday lives, understanding their ability to comprehend human mental states becomes critical for ensuring effective interactions. However, despite the recent attempts to assess the Theory-of-Mind (ToM) reasoning capabilities of LLMs, the degree to which these models can align with human ToM remains a nuanced topic of exploration. This is primarily due to two distinct challenges: (1) the presence of inconsistent results from previous evaluations, and (2) concerns surrounding the validity of existing evaluation methodologies. To address these challenges, we present a novel framework for procedurally generating evaluations with LLMs by populating causal templates. Using our framework, we create a new social reasoning benchmark (BigToM) for LLMs which consists of 25 controls and 5,000 model-written evaluations. We…
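To make the idea of "populating causal templates" concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `CausalTemplate` class, its field names, the `build_item` helper, and the example story are all hypothetical, and the slots are filled by hand here, whereas the abstract describes filling them with an LLM. The sketch only shows how one filled template could yield paired test conditions that differ in a single causal variable (whether the agent perceives a change).

```python
from dataclasses import dataclass


@dataclass
class CausalTemplate:
    """Hypothetical template; field names are illustrative assumptions."""
    context: str            # who the agent is and what they want
    initial_state: str      # state of the world the agent believes in
    causal_event: str       # event that changes the world state
    aware_percept: str      # sentence used when the agent sees the event
    unaware_percept: str    # sentence used when the agent misses the event
    question: str           # belief question posed to the model
    answer_if_aware: str    # expected answer in the true-belief condition
    answer_if_unaware: str  # expected answer in the false-belief condition


def build_item(t: CausalTemplate, agent_aware: bool) -> dict:
    """Assemble one evaluation item by toggling a single causal variable."""
    percept = t.aware_percept if agent_aware else t.unaware_percept
    story = " ".join([t.context, t.initial_state, t.causal_event, percept])
    return {
        "condition": "true_belief" if agent_aware else "false_belief",
        "story": story,
        "question": t.question,
        "expected": t.answer_if_aware if agent_aware else t.answer_if_unaware,
    }


# Hand-written slot values (an LLM would supply these in the described setup).
template = CausalTemplate(
    context="Mara is baking a cake and wants to sweeten the batter.",
    initial_state="The jar on the counter contains sugar.",
    causal_event="A roommate swaps the sugar in the jar for salt.",
    aware_percept="Mara watches the roommate make the swap.",
    unaware_percept="Mara is out of the room and does not see the swap.",
    question="What does Mara believe is in the jar?",
    answer_if_aware="salt",
    answer_if_unaware="sugar",
)

# One template produces a matched pair of conditions.
items = [build_item(template, aware) for aware in (True, False)]
```

Because both items share every sentence except the percept, a difference in model accuracy between them can be attributed to that one variable, which is the appeal of template-based generation over hand-collected stories.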

Videos

Understanding Social Reasoning in Language Models with Language Models · SlidesLive

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Computational and Text Analysis Methods

Methods: ALIGN