S$^3$IT: A Benchmark for Spatially Situated Social Intelligence Test
Zhe Sun, Xueyuan Yang, Yujie Lu, Zhenliang Zhang

TL;DR
This paper introduces S$^3$IT, a benchmark for evaluating embodied social intelligence in agents through a complex seat-ordering task involving physical and social reasoning in 3D environments.
Contribution
It presents a novel benchmark and framework for assessing embodied social intelligence, combining physical environment perception, social norm understanding, and multi-objective optimization.
Findings
State-of-the-art LLMs perform poorly on S$^3$IT
Humans significantly outperform LLMs in the task
LLMs show deficiencies in spatial reasoning but can resolve conflicts with explicit cues
Abstract
The integration of embodied agents into human environments demands embodied social intelligence: reasoning over both social norms and physical constraints. However, existing evaluations fail to address this integration, as they are limited to either disembodied social reasoning (e.g., in text) or socially-agnostic physical tasks. Both approaches fail to assess an agent's ability to integrate and trade off both physical and social constraints within a realistic, embodied context. To address this challenge, we introduce the Spatially Situated Social Intelligence Test (S$^3$IT), a benchmark specifically designed to evaluate embodied social intelligence. It is centered on a novel and challenging seat-ordering task, requiring an agent to arrange seating in a 3D environment for a group of large language model-driven (LLM-driven) NPCs with diverse identities, preferences, and intricate…
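To make the trade-off between physical and social constraints concrete, here is a toy sketch in Python. It is not the benchmark's environment or scoring: the single row of seats, the `allowed` map of physically usable seats, the `pair_weight` social preferences, and the function name `best_seating` are all assumptions made up for illustration. It treats physical limits as hard constraints and social preferences as a soft score, then searches every assignment by brute force.

```python
from itertools import permutations


def best_seating(guests, n_seats, allowed, pair_weight):
    """Brute-force the best assignment of guests to seats in one row.

    allowed: hypothetical map guest -> set of seats that guest can physically use.
    pair_weight: hypothetical map (guest_a, guest_b) -> score if they sit side by side.
    """
    best_score, best_plan = None, None
    for seats in permutations(range(n_seats), len(guests)):
        plan = dict(zip(guests, seats))
        # Hard (physical) constraint: every guest needs a seat they can use.
        if any(plan[g] not in allowed.get(g, range(n_seats)) for g in guests):
            continue
        # Soft (social) objective: add the weight of each adjacent pair.
        score = sum(
            w for (a, b), w in pair_weight.items()
            if abs(plan[a] - plan[b]) == 1
        )
        if best_score is None or score > best_score:
            best_score, best_plan = score, plan
    return best_plan, best_score


# Made-up example: the elder can only reach seat 0 and should sit by the child.
plan, score = best_seating(
    guests=["host", "elder", "child"],
    n_seats=3,
    allowed={"elder": {0}},
    pair_weight={("elder", "child"): 2, ("host", "child"): 1, ("host", "elder"): -1},
)
print(plan, score)
```

In this example the only feasible plans keep the elder in seat 0, and placing the child next to the elder wins. Exhaustive search is used only because the example is tiny; it grows factorially with the number of guests.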
Taxonomy
Topics: Multimodal Machine Learning Applications · Social Robot Interaction and HRI · Action Observation and Synchronization
