Towards Pixel-Level Prediction for Gaze Following: Benchmark and Approach
Feiyang Liu, Dan Guo, Jingyuan Xu, Zihao He, Shengeng Tang, Kun Li, Meng Wang

TL;DR
This paper introduces GazeSeg, a novel pixel-level gaze target prediction method utilizing a visual foundation model and a new dataset, significantly improving accuracy in natural scene gaze following tasks.
Contribution
The paper presents GazeSeg, a new unified framework for gaze target segmentation and recognition, and releases a comprehensive dataset with pixel-level annotations for gaze following.
Findings
Achieves Dice score of 0.325 in gaze segmentation
Attains 71.7% top-5 recognition accuracy
Outperforms previous state-of-the-art in gaze-following AUC
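For readers unfamiliar with the segmentation metric above, the Dice score measures the overlap between a predicted mask and a ground-truth mask as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition, not the paper's evaluation code; the function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made here for illustration.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks (standard definition).

    Hypothetical helper for illustration; not taken from the paper.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total

# Example: the prediction covers two pixels, the ground truth one of them.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
score = dice_score(pred, target)  # 2 * 1 / (2 + 1)
```

A score of 1 means the masks coincide exactly and 0 means they do not overlap, so the reported 0.325 indicates partial overlap on average.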
Abstract
Following other people's gaze and analyzing the target they are looking at can help us understand what they are thinking and doing, and predict the actions that may follow. Existing gaze-following methods struggle in natural scenes with diverse objects and focus on gaze points rather than objects, which makes it difficult to convey clear semantics and the accurate extent of the targets. To address this shortcoming, we propose a novel gaze target prediction solution named GazeSeg, which fully utilizes the person's spatial visual field as guiding information and leads to a progressive coarse-to-fine gaze target segmentation and recognition process. Specifically, a prompt-based visual foundation model serves as the encoder, working in conjunction with three distinct decoding modules (i.e., FoV perception, heatmap generation, and segmentation) to form the framework…
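To make the idea of a person's field of view (FoV) as spatial guidance concrete, one simple way to represent it is a binary cone mask over the image, opening from the head position along the gaze direction. The sketch below is purely illustrative and is not GazeSeg's FoV perception module; the function name `fov_cone_mask`, its parameters, and the 30-degree default half-angle are all assumptions.

```python
import numpy as np

def fov_cone_mask(height, width, head_xy, gaze_dir, half_angle_deg=30.0):
    """Boolean mask of pixels inside a 2D cone starting at the head.

    Hypothetical illustration only; not the paper's FoV module.
    head_xy is (x, y) in pixel coordinates; gaze_dir is a 2D (x, y) vector.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    dx = xs - head_xy[0]
    dy = ys - head_xy[1]
    dist = np.hypot(dx, dy)

    g = np.asarray(gaze_dir, dtype=float)
    g = g / np.linalg.norm(g)  # unit gaze direction

    # Cosine of the angle between the gaze direction and each pixel offset.
    with np.errstate(invalid="ignore", divide="ignore"):
        cos_angle = (dx * g[0] + dy * g[1]) / dist
    # The head pixel itself has zero offset; treat it as inside the cone.
    cos_angle = np.where(dist == 0, 1.0, cos_angle)

    return cos_angle >= np.cos(np.deg2rad(half_angle_deg))

# Example: head at the left edge of a 5x5 image, looking to the right.
mask = fov_cone_mask(5, 5, head_xy=(0, 2), gaze_dir=(1.0, 0.0))
```

In a coarse-to-fine pipeline, a mask like this could restrict where a gaze heatmap or segmentation is searched for; how the paper actually encodes the FoV is not stated in the abstract.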
Taxonomy
Topics: Video Surveillance and Tracking Methods · Gaze Tracking and Assistive Technology · Visual Attention and Saliency Detection
Methods: Heatmap · Focus
