LIFELONG SOTOPIA: Evaluating Social Intelligence of Language Agents Over Lifelong Social Interactions

Hitesh Goel; Hao Zhu

arXiv:2506.12666 · cs.AI · June 17, 2025

Open Access

TL;DR

This paper introduces LIFELONG-SOTOPIA, a benchmark for evaluating the social intelligence of language agents over extended interactions, revealing current limitations in goal achievement and believability compared to humans.

Contribution

The paper presents a novel benchmark that simulates multi-episode social interactions between language agents, and reports evaluation results showing that goal achievement and believability decline over the course of the interaction.

Findings

01. Goal achievement declines over interactions.

02. Memory methods improve performance but are insufficient.

03. Agents lag behind humans in complex social understanding.

Abstract

Humans engage in lifelong social interactions through interacting with different people under different scenarios for different social goals. This requires social intelligence to gather information through a long time span and use it to navigate various social contexts effectively. Whether AI systems are also capable of this is understudied in the existing research. In this paper, we present a novel benchmark, LIFELONG-SOTOPIA, to perform a comprehensive evaluation of language agents by simulating multi-episode interactions. In each episode, the language agents role-play characters to achieve their respective social goals in randomly sampled social tasks. With LIFELONG-SOTOPIA, we find that goal achievement and believability of all of the language models that we test decline through the whole interaction. Although using an advanced memory method improves the agents' performance, the…

Taxonomy

Topics: Language and cultural evolution · Speech and dialogue systems