SVBench: Evaluation of Video Generation Models on Social Reasoning

Wenshuo Peng; Gongxuan Wang; Tianmeng Yang; Chuanhao Li; Xiaojie Xu; Hui He; Kaipeng Zhang

arXiv:2512.21507 · cs.CV · April 1, 2026

TL;DR

This paper introduces SVBench, a comprehensive benchmark for evaluating social reasoning in video generation models, revealing current models' limitations in producing socially coherent behavior.

Contribution

The paper presents the first benchmark for social reasoning in video generation. It is grounded in developmental and social psychology paradigms and comes with a training-free, agent-based pipeline and an evaluation framework.

Findings

1. Seven state-of-the-art models show limited social reasoning capabilities.

2. The benchmark reveals a gap between visual plausibility and social understanding.

3. The framework enables large-scale evaluation of social cognition in generated videos.

Abstract

Recent text-to-video generation models have made remarkable progress in visual realism, motion fidelity, and text-video alignment, yet they still struggle to produce socially coherent behavior. Unlike humans, who readily infer intentions, beliefs, emotions, and social norms from brief visual cues, current models often generate literal scenes without capturing the underlying causal and psychological dynamics. To systematically assess this limitation, we introduce the first benchmark for social reasoning in video generation. Grounded in developmental and social psychology, the benchmark covers thirty classic social cognition paradigms spanning seven core dimensions: mental-state inference, goal-directed action, joint attention, social coordination, prosocial behavior, social norms, and multi-agent strategy. To operationalize these paradigms, we build a fully training-free agent-based…
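
The abstract names seven dimensions that group the thirty paradigms. As a minimal sketch of how per-video scores might be rolled up by dimension, the Python below averages scores within each one. Only the dimension names come from the abstract. The function name `aggregate_by_dimension`, the `(dimension, score)` input format, the [0, 1] score scale, and the example values are all assumptions made for illustration, not the authors' pipeline or results.

```python
from collections import defaultdict
from statistics import mean

# The seven core dimensions listed in the abstract.
DIMENSIONS = (
    "mental-state inference",
    "goal-directed action",
    "joint attention",
    "social coordination",
    "prosocial behavior",
    "social norms",
    "multi-agent strategy",
)


def aggregate_by_dimension(results):
    """Average per-video scores within each dimension.

    `results` is an iterable of (dimension, score) pairs. This input format
    and the [0, 1] score scale are assumptions, not taken from the paper.
    """
    buckets = defaultdict(list)
    for dimension, score in results:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension!r}")
        buckets[dimension].append(score)
    # Dimensions with no results map to None rather than an invented score.
    return {d: (mean(buckets[d]) if buckets[d] else None) for d in DIMENSIONS}


# Made-up scores purely for illustration; these are not results from the paper.
example = [
    ("joint attention", 1.0),
    ("joint attention", 0.0),
    ("social norms", 0.5),
]
summary = aggregate_by_dimension(example)
```

Reporting one value per dimension, rather than a single overall number, keeps a weakness in one dimension from being hidden by strength in another.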

Code & Models

Repositories

Gloria2tt/SVBench-Evaluation (GitHub)
