Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning
Lifeng Fan, Wenguan Wang, Siyuan Huang, Xinyu Tang, Song-Chun Zhu

TL;DR
This paper introduces VACATION, a large-scale video dataset, together with a spatio-temporal graph neural network for understanding and predicting human gaze communication in social videos at both the atomic level and the event level.
Contribution
It provides the first comprehensive dataset for gaze communication and proposes a new graph-based model for detailed analysis of gaze interactions in social scenes.
Findings
The proposed model significantly outperforms baselines in predicting gaze communication.
The VACATION dataset covers diverse social scenes with detailed annotations.
The proposed network captures both atomic-level and event-level gaze behaviors.
Abstract
This paper addresses a new problem of understanding human gaze communication in social videos from both atomic-level and event-level, which is significant for studying human social interactions. To tackle this novel and challenging problem, we contribute a large-scale video dataset, VACATION, which covers diverse daily social scenes and gaze communication behaviors with complete annotations of objects and human faces, human attention, and communication structures and labels in both atomic-level and event-level. Together with VACATION, we propose a spatio-temporal graph neural network to explicitly represent the diverse gaze interactions in the social scenes and to infer atomic-level gaze communication by message passing. We further propose an event network with encoder-decoder structure to predict the event-level gaze communication. Our experiments demonstrate that the proposed model…
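To make the phrase "message passing" in the abstract concrete, here is a minimal sketch of one round of message passing over a small scene graph. It is not the authors' network: the node names, the two-dimensional features, the edge weights, and the 50/50 averaging update rule are all assumptions chosen for illustration, and the function `message_passing_step` is hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def message_passing_step(
    node_feats: Dict[str, Vector],
    edges: Dict[Tuple[str, str], float],
) -> Dict[str, Vector]:
    """One illustrative round: each node blends its own feature with the
    weighted mean of the features of nodes that point at it.
    (Assumed update rule, not the one used in the paper.)"""
    updated: Dict[str, Vector] = {}
    for node, feat in node_feats.items():
        # Collect (source, weight) pairs for edges ending at this node.
        incoming = [(src, w) for (src, dst), w in edges.items() if dst == node]
        total_w = sum(w for _, w in incoming)
        if total_w == 0:
            # No incoming edges: keep the feature unchanged.
            updated[node] = list(feat)
            continue
        # Weighted mean of the incoming neighbours' features.
        message = [
            sum(w * node_feats[src][i] for src, w in incoming) / total_w
            for i in range(len(feat))
        ]
        # Blend the old feature and the aggregated message equally.
        updated[node] = [0.5 * f + 0.5 * m for f, m in zip(feat, message)]
    return updated


# Toy scene (hypothetical): two people looking at each other, one also
# looking at a cup. Directed edges point from the gazer to the gaze target.
node_feats = {
    "person_a": [1.0, 0.0],
    "person_b": [0.0, 1.0],
    "cup": [0.0, 0.0],
}
edges = {
    ("person_a", "person_b"): 1.0,
    ("person_b", "person_a"): 1.0,
    ("person_a", "cup"): 0.5,
}
out = message_passing_step(node_feats, edges)
```

In a learned model the fixed averaging step would be replaced by trainable message and update functions, and the graph would be rebuilt per video frame so that information also flows across time.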
Taxonomy
Topics: Multimodal Machine Learning Applications · Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology
Methods: Graph Neural Network
