GazeOnce: Real-Time Multi-Person Gaze Estimation
Mingfang Zhang, Yunfei Liu, Feng Lu

TL;DR
GazeOnce is a real-time, one-stage, end-to-end deep learning framework that estimates 3D gaze directions for multiple people simultaneously from a single image.
Contribution
It introduces the first one-stage multi-person gaze estimation method and a new dataset, MPSGaze, for training and evaluation.
Findings
Faster gaze estimation with lower error than existing methods
Capable of estimating gaze for over 10 people simultaneously
Effective in real-time multi-user applications
Abstract
Appearance-based gaze estimation aims to predict the 3D eye gaze direction from a single image. While recent deep learning-based approaches have demonstrated excellent performance, they usually assume one calibrated face in each input image and cannot output multi-person gaze in real time. However, simultaneous gaze estimation for multiple people in the wild is necessary for real-world applications. In this paper, we propose the first one-stage end-to-end gaze estimation method, GazeOnce, which is capable of simultaneously predicting gaze directions for multiple faces (>10) in an image. In addition, we design a sophisticated data generation pipeline and propose a new dataset, MPSGaze, which contains full images of multiple people with 3D gaze ground truth. Experimental results demonstrate that our unified framework not only offers a faster speed, but also provides a lower gaze…
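As an illustration of what "lower error" can mean for 3D gaze directions, the sketch below computes the angular difference between predicted and ground-truth gaze vectors and averages it over several faces in one image. This page does not state which metric the paper reports, so the angular-error metric, the pitch/yaw convention, and all function names here are assumptions for illustration, not the authors' implementation.

```python
import math

def pitch_yaw_to_vector(pitch, yaw):
    """Convert pitch/yaw (radians) to a unit 3D gaze vector.

    Assumed convention (not taken from the paper): zero pitch and yaw
    means the gaze points along -z, i.e. toward the camera.
    """
    x = -math.cos(pitch) * math.sin(yaw)
    y = -math.sin(pitch)
    z = -math.cos(pitch) * math.cos(yaw)
    return (x, y, z)

def angular_error_deg(pred, truth):
    """Angle in degrees between two 3D gaze vectors."""
    dot = sum(p * t for p, t in zip(pred, truth))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in truth))
    if norm_p == 0.0 or norm_t == 0.0:
        raise ValueError("gaze vectors must be non-zero")
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_angle))

def mean_angular_error_deg(preds, truths):
    """Mean angular error over all faces in one image."""
    if not preds or len(preds) != len(truths):
        raise ValueError("need equally sized, non-empty lists of vectors")
    errors = [angular_error_deg(p, t) for p, t in zip(preds, truths)]
    return sum(errors) / len(errors)
```

A multi-person evaluation would call `mean_angular_error_deg` with one predicted and one ground-truth vector per detected face; how predicted faces are matched to ground-truth faces is a separate step not covered here.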
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Retinal Imaging and Analysis · Retinal and Optic Conditions
