TL;DR
This paper introduces PHASE, a dataset of physically grounded abstract social events rendered as 2D animations, designed for evaluating how well machines perceive and predict social interactions in complex scenarios that resemble real life.
Contribution
The paper contributes PHASE, a human-validated dataset, and SIMPLE, a Bayesian model that outperforms neural networks on social recognition tasks.
Findings
Humans perceive rich social interactions in PHASE animations.
SIMPLE outperforms state-of-the-art neural networks in social prediction.
PHASE serves as a challenging benchmark for social perception models.
Abstract
The ability to perceive and reason about social interactions in the context of physical environments is core to human social intelligence and human-machine cooperation. However, no prior dataset or benchmark has systematically evaluated physically grounded perception of complex social interactions that go beyond short actions, such as high-fiving, or simple group activities, such as gathering. In this work, we create a dataset of physically-grounded abstract social events, PHASE, that resemble a wide range of real-life social interactions by including social concepts such as helping another agent. PHASE consists of 2D animations of pairs of agents moving in a continuous space generated procedurally using a physics engine and a hierarchical planner. Agents have a limited field of view, and can interact with multiple objects, in an environment that has multiple landmarks and obstacles.…
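To make the "limited field of view" idea concrete, here is a minimal sketch of a visibility check for a 2D agent. It is an illustration under stated assumptions, not the authors' implementation: the function name `in_field_of_view`, its parameters, and the cone-shaped view model are all hypothetical, and occlusion by obstacles is not modeled.

```python
import math


def in_field_of_view(agent_xy, heading_rad, target_xy, fov_rad, max_range=float("inf")):
    """Return True if target_xy lies inside the agent's viewing cone.

    Hypothetical helper: assumes a cone of total angular width fov_rad
    centered on heading_rad, with an optional distance cutoff max_range.
    Obstacles are ignored.
    """
    dx = target_xy[0] - agent_xy[0]
    dy = target_xy[1] - agent_xy[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return True  # target coincides with the agent
    if dist > max_range:
        return False
    bearing = math.atan2(dy, dx)
    # Wrap the angular difference into [-pi, pi] before comparing.
    diff = math.atan2(math.sin(bearing - heading_rad), math.cos(bearing - heading_rad))
    return abs(diff) <= fov_rad / 2


# Usage: an agent at the origin facing along +x with a 90-degree cone.
ahead = in_field_of_view((0.0, 0.0), 0.0, (1.0, 0.5), math.pi / 2)
behind = in_field_of_view((0.0, 0.0), 0.0, (-1.0, 0.0), math.pi / 2)
```

In this sketch an agent can only react to entities for which the check returns True, which is one simple way partial observability could enter a procedurally generated scene.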