Are LLMs Smarter Than Chimpanzees? An Evaluation on Perspective Taking and Knowledge State Estimation
Dingyi Yang, Junqi Zhao, Xue Li, Ce Li, Boyang Li

TL;DR
This paper evaluates whether large language models can estimate other individuals' knowledge states and predict their actions. Most current state-of-the-art models perform near chance on both proposed tasks and fall substantially below human performance.
Contribution
The study introduces two novel tasks to assess LLMs' abilities in perspective taking and knowledge state estimation, demonstrating their current limitations.
Findings
Most current state-of-the-art LLMs perform near-random on both tasks
LLMs are substantially inferior to humans at estimating others' knowledge states
Current LLMs lack effective understanding of others' intentions and knowledge
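To make "near-random" concrete, the sketch below scores a model on a multiple-choice task and asks how likely a score at least that high would be under pure guessing. It is an illustration only, not the paper's evaluation code: the two-option format, the chance rate of 0.5, the toy labels, and the function names are all assumptions introduced here.

```python
from math import comb


def accuracy(predictions, gold):
    """Fraction of items where the predicted label matches the gold label."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def p_value_at_least(k, n, chance):
    """One-sided binomial tail P(X >= k) for X ~ Binomial(n, chance)."""
    return sum(
        comb(n, i) * chance**i * (1 - chance) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical toy data: eight two-option items (assumed format, not from the paper).
gold = ["A", "B", "A", "B", "A", "B", "A", "B"]
model = ["A", "A", "A", "B", "B", "B", "A", "A"]

acc = accuracy(model, gold)
n_correct = sum(p == g for p, g in zip(model, gold))
p_val = p_value_at_least(n_correct, len(gold), chance=0.5)
print(f"accuracy={acc:.3f}, P(score >= observed | guessing)={p_val:.3f}")
```

A large tail probability means the observed accuracy is indistinguishable from guessing on that sample; with real data the chance rate should be set to match the actual number of answer options.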
Abstract
Cognitive anthropology suggests that the distinction of human intelligence lies in the ability to infer other individuals' knowledge states and understand their intentions. In comparison, our closest animal relatives, chimpanzees, lack the capacity to do so. With this paper, we aim to evaluate LLM performance in estimating other individuals' knowledge states and their potential actions. We design two tasks to test (1) if LLMs can predict story characters' next actions based on their own knowledge vs. improperly using information unavailable from their perspective, and (2) if LLMs can detect when story characters, through their actions, demonstrate knowledge they should not possess. Results reveal that most current state-of-the-art LLMs achieve near-random performance on both tasks, and are substantially inferior to humans. We argue future LLM research should place more weight on the…
Taxonomy
Topics: Child and Animal Learning Development · Cognitive Abilities and Testing · Action Observation and Synchronization
