# Social World Models

**Authors:** Xuhui Zhou, Jiarui Liu, Akhila Yerukola, Hyunwoo Kim, Maarten Sap

arXiv: 2509.00559 · 2026-02-02

## TL;DR

This paper introduces social world models (SWMs) and a structured representation formalism, S3AP, that makes implicit social dynamics explicit for LLMs. It reports a +51% improvement on FANToM over OpenAI's o1 and up to +18% on the SOTOPIA multi-turn interaction benchmark.

## Contribution

The paper proposes SWMs and the S3AP formalism, which explicitly represents agents' evolving states, actions, and mental states (including unobserved ones) in place of unstructured free-text inputs.

## Key findings

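- +51% improvement on the FANToM benchmark over OpenAI's o1
- Up to +18% improvement on the SOTOPIA multi-turn social interaction benchmark
- Ablations attribute the gains to explicit modeling of hidden mental states, which outperforms a wide range of baseline methods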

## Abstract

Humans intuitively navigate social interactions by simulating unspoken dynamics and reasoning about others' perspectives, even with limited information. In contrast, AI systems struggle to structure and reason about implicit social contexts, as they lack explicit representations for unobserved dynamics such as intentions, beliefs, and evolving social states. In this paper, we introduce the concept of social world models (SWMs) to characterize the complex social dynamics. To operationalize SWMs, we introduce a novel structured social world representation formalism (S3AP), which captures the evolving states, actions, and mental states of agents, addressing the lack of explicit structure in traditional free-text-based inputs. Through comprehensive experiments across five social reasoning benchmarks, we show that S3AP significantly enhances LLM performance, achieving a +51% improvement on FANToM over OpenAI's o1. Our ablations further reveal that these gains are driven by the explicit modeling of hidden mental states, which proves more effective than a wide range of baseline methods. Finally, we introduce an algorithm for social world models using S3AP, which enables AI agents to build models of their interlocutors and predict their next actions and mental states. Empirically, S3AP-enabled social world models yield up to +18% improvement on the SOTOPIA multi-turn social interaction benchmark. Our findings highlight the promise of S3AP as a powerful, general-purpose representation for social world states, enabling the development of more socially-aware systems that better navigate social interactions.
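
To make the idea of a structured social world representation concrete, here is a minimal Python sketch of how per-agent states, actions, and mental states might be stored over time. It is not the paper's S3AP schema: the class names `AgentStep` and `SocialWorldState`, their fields, the helper `visible_history`, and the false-belief scenario are all hypothetical, chosen only to illustrate why separating each agent's observations from the shared environment helps perspective-taking.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentStep:
    """One agent's slice of a timestep (hypothetical fields, not the S3AP spec)."""
    observation: str        # what this agent can perceive at this step
    action: str             # what this agent does or says
    mental_state: str = ""  # inferred belief or intention; unobserved, so optional


@dataclass
class SocialWorldState:
    """Shared environment description plus a slice for each agent present."""
    timestep: int
    environment: str
    agents: Dict[str, AgentStep] = field(default_factory=dict)


def visible_history(trajectory: List[SocialWorldState], agent: str) -> List[str]:
    """Collect, in order, only the observations this agent actually had."""
    return [s.agents[agent].observation for s in trajectory if agent in s.agents]


# Illustrative false-belief scenario: Sally leaves before the jar is moved.
trajectory = [
    SocialWorldState(0, "kitchen; cookie jar on the table", {
        "Sally": AgentStep("sees the jar on the table", "leaves the kitchen",
                           "believes the jar is on the table"),
        "Anne": AgentStep("sees the jar on the table", "stays"),
    }),
    SocialWorldState(1, "kitchen; cookie jar in the cupboard", {
        "Anne": AgentStep("sees the jar in the cupboard",
                          "moves the jar to the cupboard"),
    }),
]

print(visible_history(trajectory, "Sally"))  # ['sees the jar on the table']
```

Because Sally has no entry at timestep 1, her visible history never includes the move, so a reasoner reading this structure can tell that her belief about the jar is out of date without re-deriving it from free text.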

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00559/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00559/full.md

## References

54 references — full list in the complete paper: https://tomesphere.com/paper/2509.00559/full.md

---
Source: https://tomesphere.com/paper/2509.00559