Real-time Appearance-based Gaze Estimation for Open Domains

Zhenhao Li; Zheng Liu; Seunghyun Lee; Amin Fadaeinejad; Yuanhao Yu

arXiv:2603.26945·cs.CV·March 31, 2026

TL;DR

This paper introduces a robust, real-time appearance-based gaze estimation framework that enhances generalization in unconstrained scenarios through data augmentation, multi-task learning, and new benchmark datasets, enabling mobile device deployment.

Contribution

The paper proposes a novel augmentation and multi-task learning approach that improves the robustness of appearance-based gaze estimation (AGE) without extra human annotations, and it curates new, challenging datasets for evaluation.

Findings

01 Achieves competitive accuracy with fewer than 1% of the parameters of UniGaze-H.

02 Improves the robustness of gaze estimation in unconstrained conditions.

03 Provides new benchmarks for evaluating gaze robustness.

Abstract

Appearance-based gaze estimation (AGE) has achieved remarkable performance in constrained settings, yet we reveal a significant generalization gap where existing AGE models often fail in practical, unconstrained scenarios, particularly those involving facial wearables and poor lighting conditions. We attribute this failure to two core factors: limited image diversity and inconsistent label fidelity across different datasets, especially along the pitch axis. To address these, we propose a robust AGE framework that enhances generalization without requiring additional human-annotated data. First, we expand the image manifold via an ensemble of augmentation techniques, including synthesis of eyeglasses, masks, and varied lighting. Second, to mitigate the impact of anisotropic inter-dataset label deviation, we reformulate gaze regression as a multi-task learning problem, incorporating…
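The abstract lists varied lighting as one of the augmentations used to expand the image manifold. The sketch below is only an illustration of that general idea, not the authors' pipeline: the function names (`random_lighting`, `directional_shading`, `augment_lighting`), the parameter ranges, and the choice of a gamma curve plus a linear shading ramp are all assumptions made for this example. It relies on the fact that a purely photometric change leaves the gaze label untouched, so only the image is modified.

```python
import numpy as np


def random_lighting(image, rng, gamma_range=(0.5, 2.0), gain_range=(0.6, 1.4)):
    """Apply a random gamma curve and a random global gain.

    `image` is a float array with values in [0, 1]. The ranges are
    illustrative assumptions, not values taken from the paper.
    """
    gamma = rng.uniform(*gamma_range)
    gain = rng.uniform(*gain_range)
    return np.clip(gain * np.power(image, gamma), 0.0, 1.0)


def directional_shading(image, rng, strength=0.5):
    """Darken the image along a random direction with a linear ramp.

    This is a crude stand-in for side lighting (hypothetical helper).
    """
    h, w = image.shape[:2]
    angle = rng.uniform(0.0, 2.0 * np.pi)
    ys, xs = np.mgrid[0:h, 0:w]
    # Project centered pixel coordinates onto the random direction.
    proj = (xs - (w - 1) / 2) * np.cos(angle) + (ys - (h - 1) / 2) * np.sin(angle)
    span = max(proj.max() - proj.min(), 1e-8)
    ramp = (proj - proj.min()) / span          # 0 on one side, 1 on the other
    mask = 1.0 - strength * ramp               # brightness factor per pixel
    if image.ndim == 3:
        mask = mask[..., None]                 # broadcast over color channels
    return np.clip(image * mask, 0.0, 1.0)


def augment_lighting(image, rng):
    """Compose the two photometric perturbations; the gaze label is unchanged."""
    return directional_shading(random_lighting(image, rng), rng)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.random((64, 64, 3))             # placeholder for a face crop
    augmented = augment_lighting(face, rng)
    print(augmented.shape, float(augmented.min()), float(augmented.max()))
```

Synthesizing eyeglasses or masks would additionally need facial landmarks or a rendering step, which the abstract does not detail, so those augmentations are left out of this sketch.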

Code & Models

Models

🤗 BcantCode/GazeInceptionLite
model · 128 downloads · 1 like