Efficiency in Real-time Webcam Gaze Tracking
Amogh Gudi, Xin Li, Jan van Gemert

TL;DR
This paper explores optimizing both computational speed and calibration ease in real-time webcam gaze tracking, finding that single eye input combined with geometric calibration offers the best balance of efficiency and accuracy.
Contribution
It evaluates the trade-offs between CNN input types and calibration methods, proposing a hybrid geometric regression approach for improved efficiency.
Findings
Single-eye input with geometric calibration offers the best balance of efficiency and accuracy.
Hybrid geometric regression reduces calibration effort.
Fast inference is achieved with acceptable accuracy.
Abstract
Efficiency and ease of use are essential for practical applications of camera-based eye/gaze tracking. Gaze tracking involves estimating where a person is looking on a screen based on face images from a computer-facing camera. In this paper, we investigate two complementary forms of efficiency in gaze tracking: (1) the computational efficiency of the system, which is dominated by the inference speed of a CNN predicting gaze-vectors; (2) the usability efficiency, which is determined by the tediousness of the mandatory calibration of the gaze-vector to a computer screen. To do so, we evaluate the computational speed/accuracy trade-off for the CNN and the calibration effort/accuracy trade-off for screen calibration. For the CNN, we evaluate full-face, two-eye, and single-eye input. For screen calibration, we measure the number of calibration points needed and evaluate three types of…
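To make the idea of mapping a gaze-vector to a screen concrete, here is a minimal geometric sketch. It is not the paper's method. It only illustrates one generic way a predicted gaze direction could be projected onto a screen plane. The function name `gaze_to_screen`, the screen-aligned coordinate frame (screen in the plane z = 0, user at z > 0), the angle convention, and the centimetre units are all assumptions made for illustration.

```python
import math


def gaze_to_screen(eye_pos, yaw, pitch):
    """Intersect a gaze ray with the screen plane z = 0 (hypothetical setup).

    eye_pos: (x, y, z) eye position in cm, in an assumed screen-aligned frame
             where the screen lies in the plane z = 0 and the user sits at z > 0.
    yaw, pitch: gaze angles in radians; (0, 0) means looking straight at the screen.
    Returns the (x, y) point on the screen plane, in cm.
    """
    # Unit gaze direction; the negative z axis points from the user toward the screen.
    dx = math.cos(pitch) * math.sin(yaw)
    dy = math.sin(pitch)
    dz = -math.cos(pitch) * math.cos(yaw)

    if dz >= 0:
        # The ray points away from (or parallel to) the screen plane.
        raise ValueError("gaze ray does not hit the screen plane")

    # Solve eye_pos.z + t * dz = 0 for the ray parameter t.
    t = -eye_pos[2] / dz
    return (eye_pos[0] + t * dx, eye_pos[1] + t * dy)


# Example: an eye 50 cm in front of the screen origin, looking slightly sideways.
point = gaze_to_screen((0.0, 0.0, 50.0), math.atan(0.2), 0.0)
```

In a setup like this, the eye position and the screen's pose relative to the camera are the unknowns that calibration would have to supply, which is why the number of calibration points matters for usability.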
