Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities

Junqi Wang; Chunhui Zhang; Jiapeng Li; Yuxi Ma; Lixing Niu; Jiaheng Han; Yujia Peng; Yixin Zhu; Lifeng Fan

arXiv:2405.11841·cs.AI·May 21, 2024

TL;DR

This study introduces a benchmark and a theoretical framework for evaluating social intelligence in humans and AI. It finds that humans outperform GPT models and that current LLMs show limited social understanding.

Contribution

The paper develops a new benchmark and computational model for assessing social intelligence, providing a comparative analysis between human and AI capabilities.

Findings

01

Humans outperform GPT models in social intelligence tasks.

02

GPT models exhibit only basic (order 0) social intelligence.

03

Humans demonstrate higher adaptability and generalization in social reasoning.

Abstract

Facing the current debate on whether Large Language Models (LLMs) attain near-human intelligence levels (Mitchell & Krakauer, 2023; Bubeck et al., 2023; Kosinski, 2023; Shiffrin & Mitchell, 2023; Ullman, 2023), the current study introduces a benchmark for evaluating social intelligence, one of the most distinctive aspects of human cognition. We developed a comprehensive theoretical framework for social dynamics and introduced two evaluation tasks: Inverse Reasoning (IR) and Inverse Inverse Planning (IIP). Our approach also encompassed a computational model based on recursive Bayesian inference, adept at elucidating diverse human behavioral patterns. Extensive experiments and detailed analyses revealed that humans surpassed the latest GPT models in overall performance, zero-shot learning, one-shot generalization, and adaptability to multi-modalities. Notably, GPT models demonstrated…
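The recursive Bayesian inference mentioned in the abstract can be illustrated with a generic toy example. The sketch below is not the authors' model: the two-goal, two-action setup, the function names (`observer`, `next_actor`, `recursive_inference`), and the reading of "order" as the number of times an observer and an actor reason about each other are all assumptions made for illustration, and the paper's own definition of order may differ.

```python
from typing import List

Matrix = List[List[float]]


def observer(actor: Matrix, prior: List[float]) -> Matrix:
    """Bayes' rule: turn P(action | goal) into P(goal | action)."""
    n_goals, n_actions = len(actor), len(actor[0])
    posterior = []
    for a in range(n_actions):
        joint = [actor[g][a] * prior[g] for g in range(n_goals)]
        total = sum(joint)  # assumed > 0: every action is possible under some goal
        posterior.append([j / total for j in joint])
    return posterior  # indexed [action][goal]


def next_actor(obs: Matrix) -> Matrix:
    """Hypothetical actor that favors actions revealing its goal to `obs`."""
    n_actions, n_goals = len(obs), len(obs[0])
    policy = []
    for g in range(n_goals):
        scores = [obs[a][g] for a in range(n_actions)]
        total = sum(scores)  # assumed > 0: every goal is signalled by some action
        policy.append([s / total for s in scores])
    return policy  # indexed [goal][action]


def recursive_inference(base_actor: Matrix, prior: List[float], order: int) -> Matrix:
    """Order 0 inverts the base actor; each extra order adds one round of mutual reasoning."""
    obs = observer(base_actor, prior)
    for _ in range(order):
        obs = observer(next_actor(obs), prior)
    return obs


# Illustrative numbers only: goal 0 is served by either action, goal 1 only by action 1.
base_actor = [[0.5, 0.5], [0.0, 1.0]]
prior = [0.5, 0.5]
order0 = recursive_inference(base_actor, prior, 0)
order1 = recursive_inference(base_actor, prior, 1)
```

With these made-up numbers, the order 0 observer assigns probability 2/3 to goal 1 after seeing action 1, while the order 1 observer assigns 0.8, because it assumes the actor would have chosen action 0 to signal goal 0. That is the sense in which deeper recursion sharpens the inference in this toy setting.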

Code & Models

Repositories

bigai-ai/evaluate-n-model-social-intelligence (PyTorch, official)

Taxonomy

Topics: Cognitive Science and Mapping

Methods: Attention Is All You Need · Dense Connections · Cosine Annealing · Linear Layer · Weight Decay · Linear Warmup With Cosine Annealing · Residual Connection · Byte Pair Encoding · Adam