MovieGraphs: Towards Understanding Human-Centric Situations from Videos
Paul Vicol, Makarand Tapaswi, Lluis Castrejon, Sanja Fidler

TL;DR
MovieGraphs introduces a comprehensive dataset with graph-based annotations of social situations in movies, enabling better understanding of human interactions and emotions for socially intelligent AI.
Contribution
The paper presents a novel dataset with detailed, grounded, and temporal graph annotations of social scenes, along with methods for querying and understanding human-centric interactions.
Findings
Graphs effectively summarize and localize scenes.
Subgraphs enable semantic retrieval of situations.
Proposed methods improve understanding of interactions and of the reasons behind them.
Abstract
There is growing interest in artificial intelligence to build socially intelligent robots. This requires machines to have the ability to "read" people's emotions, motivations, and other factors that affect behavior. Towards this goal, we introduce a novel dataset called MovieGraphs which provides detailed, graph-based annotations of social situations depicted in movie clips. Each graph consists of several types of nodes, to capture who is present in the clip, their emotional and physical attributes, their relationships (e.g., parent/child), and the interactions between them. Most interactions are associated with topics that provide additional details, and reasons that give motivations for actions. In addition, most interactions and many attributes are grounded in the video with time stamps. We provide a thorough analysis of our dataset, showing interesting common-sense correlations…
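
To make the annotation structure described in the abstract more concrete, here is a minimal Python sketch of how one clip's graph could be represented in memory. It is an illustration only, not the dataset's actual file format or API: the class names (`ClipGraph`, `Interaction`, `Relationship`), the field names, and the example characters and values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Relationship:
    """A directed relationship between two characters (hypothetical schema)."""
    source: str
    target: str
    label: str  # e.g. "parent"


@dataclass
class Interaction:
    """An interaction between two characters, with optional detail nodes."""
    source: str
    target: str
    label: str
    topic: Optional[str] = None    # additional detail about the interaction
    reason: Optional[str] = None   # motivation for the action
    time_span: Optional[Tuple[float, float]] = None  # (start, end) in seconds


@dataclass
class ClipGraph:
    """All annotations for one movie clip (hypothetical schema)."""
    characters: List[str] = field(default_factory=list)
    attributes: Dict[str, List[str]] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)
    interactions: List[Interaction] = field(default_factory=list)

    def grounded_interactions(self) -> List[Interaction]:
        """Return only the interactions that carry a time stamp."""
        return [i for i in self.interactions if i.time_span is not None]


# Invented example values, for illustration only.
graph = ClipGraph(
    characters=["Anna", "Ben"],
    attributes={"Anna": ["angry", "adult"], "Ben": ["nervous", "kid"]},
    relationships=[Relationship("Anna", "Ben", "parent")],
    interactions=[
        Interaction("Anna", "Ben", "scolds",
                    topic="broken vase", reason="is upset",
                    time_span=(12.0, 15.5)),
        Interaction("Ben", "Anna", "apologizes"),  # not grounded in time
    ],
)

for inter in graph.grounded_interactions():
    print(inter.source, inter.label, inter.target, inter.time_span)
```

Topics, reasons, and time spans are optional fields here because the abstract says that most, not all, interactions carry them.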
