S$^3$IT: A Benchmark for Spatially Situated Social Intelligence Test

Zhe Sun; Xueyuan Yang; Yujie Lu; Zhenliang Zhang

arXiv:2512.19992·cs.AI·December 24, 2025

Open Access

TL;DR

This paper introduces S$^3$IT, a benchmark for evaluating embodied social intelligence in agents through a complex seat-ordering task involving physical and social reasoning in 3D environments.

Contribution

It presents a novel benchmark and framework for assessing embodied social intelligence, combining physical environment perception, social norm understanding, and multi-objective optimization.
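To make the idea of trading off physical and social constraints concrete, here is a minimal toy sketch. It is not the paper's method, environment, or scoring scheme. The seats, guests, attributes, and the functions `physical_ok`, `social_score`, and `best_arrangement` are all hypothetical, chosen only to illustrate treating physical constraints as hard filters and social preferences as a score to maximize.

```python
from itertools import permutations

# Hypothetical toy data: none of these seats, guests, or attributes come from the paper.
SEATS = {
    "A": {"near_window": True, "accessible": True},
    "B": {"near_window": False, "accessible": True},
    "C": {"near_window": False, "accessible": False},
}
GUESTS = {
    "elder": {"needs_accessible": True, "likes_window": False},
    "host": {"needs_accessible": False, "likes_window": False},
    "child": {"needs_accessible": False, "likes_window": True},
}


def physical_ok(guest, seat):
    """Hard constraint (assumed): guests who need an accessible seat must get one."""
    return SEATS[seat]["accessible"] or not GUESTS[guest]["needs_accessible"]


def social_score(guest, seat):
    """Soft preference (assumed): one point when a window lover sits by the window."""
    likes = GUESTS[guest]["likes_window"]
    return 1.0 if likes and SEATS[seat]["near_window"] else 0.0


def best_arrangement():
    """Brute-force search over all seatings; keep the feasible one with the top score."""
    guests = list(GUESTS)
    best, best_score = None, float("-inf")
    for seats in permutations(SEATS, len(guests)):
        assignment = dict(zip(guests, seats))
        if not all(physical_ok(g, s) for g, s in assignment.items()):
            continue  # violates a physical constraint, so it is discarded
        score = sum(social_score(g, s) for g, s in assignment.items())
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score


if __name__ == "__main__":
    print(best_arrangement())
```

Exhaustive search only works for a handful of seats. The benchmark itself places agents in 3D environments with LLM-driven NPCs, which this sketch does not attempt to model.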

Findings

01. State-of-the-art LLMs perform poorly on S$^3$IT

02. Humans significantly outperform LLMs in the task

03. LLMs show deficiencies in spatial reasoning but can resolve conflicts with explicit cues

Abstract

The integration of embodied agents into human environments demands embodied social intelligence: reasoning over both social norms and physical constraints. However, existing evaluations fail to address this integration, as they are limited to either disembodied social reasoning (e.g., in text) or socially agnostic physical tasks. Neither approach assesses an agent's ability to integrate and trade off physical and social constraints within a realistic, embodied context. To address this challenge, we introduce the Spatially Situated Social Intelligence Test (S$^3$IT), a benchmark specifically designed to evaluate embodied social intelligence. It is centered on a novel and challenging seat-ordering task, requiring an agent to arrange seating in a 3D environment for a group of large language model-driven (LLM-driven) NPCs with diverse identities, preferences, and intricate…

Taxonomy

Topics: Multimodal Machine Learning Applications · Social Robot Interaction and HRI · Action Observation and Synchronization