Communicate-Predict-Act: Evaluating Social Intelligence of Agents
David Shoresh, Sarit Kraus, Yonatan Loewenstein

TL;DR
This paper introduces a multiplayer social game framework to evaluate LLM social intelligence, analyzing diverse models through gameplay metrics and revealing key sociocognitive factors influencing success.
Contribution
It presents a novel evaluation protocol for LLM social intelligence using multiplayer games and sociocognitive metrics, advancing understanding of AI social capabilities.
Findings
Elo-style ratings reveal consistent performance differences across models.
Sociocognitive metrics predict game outcomes with high accuracy.
Influence, transparency, and adaptability are key success factors.
Abstract
As large language model (LLM) agents become more prevalent in real-world social settings, social intelligence will play an increasingly critical role. But social intelligence is still a poorly defined construct, for both humans and artificial agents. We introduce a multiplayer arena of mixed cooperative and competitive social games to study LLM social intelligence. The controllability of LLM-based agents enables systematic evaluation, which also supports broader inferences about social intelligence per se. We evaluated eight diverse LLMs (24B to 1T parameters) using a Communicate-Predict-Act (COMPACT) interaction protocol and fine-grained probing of social dynamics. Elo-style ratings reveal consistent performance differences across models, but this scalar measure provides only a partial characterization of social intelligence. To address this limitation, we analyze gameplay traces to extract…
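The abstract mentions Elo-style ratings without giving the exact formula. As a point of reference only, the sketch below shows the classic two-player Elo update. It is not the paper's rating procedure: the function names, the K-factor of 32, and the 400-point scale are illustrative assumptions, and a multiplayer arena would need some extension (for example, treating each game as a set of pairwise comparisons).

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k (assumed here to be 32) controls how fast ratings move.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated agents; A wins one game.
    print(update_elo(1500.0, 1500.0, 1.0))
```

Because a single rating compresses every game into one number, it cannot say why an agent wins, which is the limitation the abstract raises before turning to gameplay traces.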
