NegotiationToM: A Benchmark for Stress-testing Machine Theory of Mind on Negotiation Surrounding
Chunkit Chan, Cheng Jiayang, Yauwai Yim, Zheye Deng, Wei Fan, Haoran Li, Xin Liu, Hongming Zhang, Weiqi Wang, Yangqiu Song

TL;DR
NegotiationToM is a new benchmark designed to evaluate large language models' ability to understand complex mental states in real-world negotiation scenarios, revealing current models' limitations compared to humans.
Contribution
The paper introduces NegotiationToM, a novel benchmark for stress-testing machine Theory of Mind in real-world negotiations involving multi-dimensional mental states.
Findings
LLMs perform significantly worse than humans on NegotiationToM.
Chain-of-thought methods do not substantially improve LLM performance.
NegotiationToM effectively challenges current state-of-the-art LLMs.
Abstract
Large Language Models (LLMs) have sparked substantial interest and debate concerning the potential emergence of Theory of Mind (ToM) ability. Current ToM evaluations focus on testing models with machine-generated data or game settings that are prone to shortcuts and spurious correlations, and so they do not assess machine ToM ability in real-world human interaction scenarios. This creates a pressing demand for new real-world scenario benchmarks. We introduce NegotiationToM, a new benchmark designed to stress-test machine ToM in real-world negotiation surroundings, covering multi-dimensional mental states (i.e., desires, beliefs, and intentions). Our benchmark builds upon the Belief-Desire-Intention (BDI) agent modeling theory and conducts the necessary empirical experiments to evaluate large language models. Our findings demonstrate that NegotiationToM is challenging for…
Taxonomy
Topics: Action Observation and Synchronization · Neural and Behavioral Psychology Studies
