Social Genome: Grounded Social Reasoning Abilities of Multimodal Models

Leena Mathur; Marian Qian; Paul Pu Liang; Louis-Philippe Morency

arXiv:2502.15109·cs.CL·June 5, 2025

TL;DR

This paper introduces SOCIAL GENOME, a benchmark for evaluating the grounded social reasoning abilities of multimodal models. It pairs 272 videos of interactions with 1,486 human-annotated reasoning traces and reports performance gaps in state-of-the-art models.

Contribution

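SOCIAL GENOME is the first benchmark for fine-grained, grounded social reasoning in multimodal models. It is also the first modeling challenge to study external knowledge in social reasoning, and it computes metrics that evaluate the semantic and structural qualities of model-generated reasoning traces.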

Findings

01

State-of-the-art models show significant performance gaps.

02

The benchmark enables detailed analysis of reasoning quality.

03

External knowledge integration remains a challenge.

Abstract

Social reasoning abilities are crucial for AI systems to effectively interpret and respond to multimodal human communication and interaction within social contexts. We introduce SOCIAL GENOME, the first benchmark for fine-grained, grounded social reasoning abilities of multimodal models. SOCIAL GENOME contains 272 videos of interactions and 1,486 human-annotated reasoning traces related to inferences about these interactions. These traces contain 5,777 reasoning steps that reference evidence from visual cues, verbal cues, vocal cues, and external knowledge (contextual knowledge external to videos). SOCIAL GENOME is also the first modeling challenge to study external knowledge in social reasoning. SOCIAL GENOME computes metrics to holistically evaluate semantic and structural qualities of model-generated social reasoning traces. We demonstrate the utility of SOCIAL GENOME through…
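To make the dataset description concrete, here is a minimal Python sketch of how one annotated reasoning trace might be represented, along with the per-video and per-trace averages implied by the counts in the abstract. The class names, field names, and the example trace are hypothetical illustrations and are not taken from the benchmark's released format; only the four evidence types and the three dataset counts come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Evidence(Enum):
    """The four evidence types named in the abstract."""
    VISUAL = "visual"
    VERBAL = "verbal"
    VOCAL = "vocal"
    EXTERNAL = "external knowledge"


@dataclass
class ReasoningStep:
    # Hypothetical schema: one step cites one evidence type.
    text: str
    evidence: Evidence


@dataclass
class ReasoningTrace:
    # Hypothetical schema: a trace belongs to one video.
    video_id: str
    steps: List[ReasoningStep] = field(default_factory=list)


def evidence_counts(trace: ReasoningTrace) -> Counter:
    """Count how many steps in a trace cite each evidence type."""
    return Counter(step.evidence for step in trace.steps)


# Invented example trace, for illustration only.
example = ReasoningTrace(
    video_id="video_000",
    steps=[
        ReasoningStep("The speaker avoids eye contact.", Evidence.VISUAL),
        ReasoningStep("The speaker says 'I'm fine.'", Evidence.VERBAL),
        ReasoningStep("The speaker's voice trembles.", Evidence.VOCAL),
        ReasoningStep("People often mask distress in public.", Evidence.EXTERNAL),
    ],
)

# Dataset-level counts stated in the abstract.
NUM_VIDEOS = 272
NUM_TRACES = 1486
NUM_STEPS = 5777

traces_per_video = NUM_TRACES / NUM_VIDEOS   # about 5.46
steps_per_trace = NUM_STEPS / NUM_TRACES     # about 3.89

if __name__ == "__main__":
    print(evidence_counts(example))
    print(f"traces per video: {traces_per_video:.2f}")
    print(f"steps per trace:  {steps_per_trace:.2f}")
```

On these counts, each video carries roughly 5.5 traces and each trace roughly 3.9 reasoning steps on average. These are simple ratios of the stated totals, not per-video statistics reported by the authors.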
