Gaze Perception in Humans and CNN-Based Model
Nicole X. Han, William Yang Wang, Miguel P. Eckstein

TL;DR
This study compares how humans and a CNN-based gaze model infer where people in a scene are looking. Human estimates are more influenced by scene context than the model's, which points to differences in the mechanisms of social attention.
Contribution
The paper presents a comparative analysis of human and CNN-based gaze inference, emphasizing the role of scene context in human judgments.
Findings
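Humans' estimates of the locus of attention are more influenced by scene context, such as the presence of the attended target and the number of individuals in the image.
The CNN-based model's gaze estimates are less affected by scene context.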
Differences highlight distinct mechanisms in social attention perception.
Abstract
Making accurate inferences about other individuals' locus of attention is essential for human social interactions and will be important for AI to effectively interact with humans. In this study, we compare how a CNN (convolutional neural network) based model of gaze and humans infer the locus of attention in images of real-world scenes with a number of individuals looking at a common location. We show that compared to the model, humans' estimates of the locus of attention are more influenced by the context of the scene, such as the presence of the attended target and the number of individuals in the image.
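One way to picture the comparison described above is to measure how far an observer's estimated locus of attention moves when a piece of scene context changes, for example when the attended target is present versus absent. The sketch below is a minimal illustration under that assumption; the function name `mean_shift`, the paired-condition setup, and all coordinates are hypothetical and are not the metric or data used in the study.

```python
import math


def mean_shift(with_ctx, without_ctx):
    """Mean Euclidean distance between paired (x, y) estimates.

    Each pair holds one observer's estimate for the same scene under two
    context conditions. A larger value means the estimates moved more
    when the context changed.
    """
    if len(with_ctx) != len(without_ctx) or not with_ctx:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(math.dist(a, b) for a, b in zip(with_ctx, without_ctx))
    return total / len(with_ctx)


# Made-up pixel coordinates for illustration only (not data from the study).
human_present = [(10.0, 10.0), (20.0, 5.0)]
human_absent = [(13.0, 14.0), (20.0, 15.0)]
model_present = [(10.0, 10.0), (20.0, 5.0)]
model_absent = [(10.0, 11.0), (20.0, 6.0)]

human_shift = mean_shift(human_present, human_absent)
model_shift = mean_shift(model_present, model_absent)
print(f"human shift: {human_shift:.2f}, model shift: {model_shift:.2f}")
```

With these toy numbers the human estimates shift more than the model's, which is the qualitative pattern the abstract reports; the magnitudes themselves carry no meaning.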
Taxonomy
Topics: Face Recognition and Perception · Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology
