Communicate-Predict-Act: Evaluating Social Intelligence of Agents

David Shoresh; Sarit Kraus; Yonatan Loewenstein

arXiv:2604.08727 · cs.CY · April 13, 2026

TL;DR

This paper introduces a multiplayer social game framework to evaluate LLM social intelligence, analyzing diverse models through gameplay metrics and revealing key sociocognitive factors influencing success.

Contribution

It presents a novel evaluation protocol for LLM social intelligence using multiplayer games and sociocognitive metrics, advancing understanding of AI social capabilities.

Findings

01. Elo-style ratings reveal consistent performance differences across models.

02. Sociocognitive metrics predict game outcomes with high accuracy.

03. Influence, transparency, and adaptability are key success factors.

Abstract

As large language model (LLM) agents become more prevalent in real-world social settings, social intelligence will play an increasingly critical role. But social intelligence is still a poorly defined construct, for both humans and artificial agents. We introduce a multiplayer arena of mixed cooperative and competitive social games to study LLM social intelligence. The controllability of LLM-based agents enables systematic evaluation, which also supports broader inferences about social intelligence per se. We evaluated eight diverse LLMs (24B to 1T parameters) using a Communicate-Predict-Act (COMPACT) interaction protocol and fine-grained probing of social dynamics. Elo-style ratings reveal consistent performance differences across models, but this scalar measure provides only a partial characterization of social intelligence. To address this limitation, we analyze gameplay traces to extract…
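For readers unfamiliar with Elo-style ratings, the sketch below shows one common way such a rating could be updated after a multiplayer game. It is not the paper's procedure, which this summary does not specify. It assumes the standard two-player Elo expected-score formula, treats each multiplayer game as a set of pairwise comparisons on final payoff, and averages the resulting adjustments over opponents. The function names, the K-factor of 32, and the model labels are all hypothetical.

```python
from itertools import combinations


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_pairwise(ratings: dict, payoffs: dict, k: float = 32.0) -> dict:
    """Return new ratings after one multiplayer game.

    ratings: current rating per model (hypothetical labels).
    payoffs: final payoff per participating model in this game.
    Assumption: a multiplayer result is decomposed into all pairwise
    comparisons, and each player's adjustment is averaged over opponents.
    """
    if len(payoffs) < 2:
        raise ValueError("need at least two players in a game")

    deltas = {name: 0.0 for name in payoffs}
    for a, b in combinations(payoffs, 2):
        # Pairwise outcome for a: win = 1, draw = 0.5, loss = 0.
        if payoffs[a] > payoffs[b]:
            s_a = 1.0
        elif payoffs[a] < payoffs[b]:
            s_a = 0.0
        else:
            s_a = 0.5
        e_a = expected_score(ratings[a], ratings[b])
        deltas[a] += k * (s_a - e_a)
        deltas[b] += k * ((1.0 - s_a) - (1.0 - e_a))

    n_opponents = len(payoffs) - 1
    updated = dict(ratings)  # models that did not play keep their rating
    for name, delta in deltas.items():
        updated[name] = ratings[name] + delta / n_opponents
    return updated


if __name__ == "__main__":
    start = {"model_a": 1500.0, "model_b": 1500.0, "model_c": 1500.0}
    game = {"model_a": 3, "model_b": 2, "model_c": 1}
    print(update_pairwise(start, game))
```

Because every pairwise adjustment is zero-sum, the total rating across participants is conserved. A single scalar of this kind ranks models but says nothing about why one outperforms another, which is the limitation the abstract raises.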
