PicoEyes: Unified Gaze Estimation Framework for Mixed Reality with a Large-Scale Multi-View Dataset
Fuxin Duan, Hui Wang

TL;DR
PicoEyes is a comprehensive gaze estimation framework that predicts multiple gaze attributes from monocular or binocular inputs, supported by a large multi-view dataset, achieving state-of-the-art results in mixed reality applications.
Contribution
The paper introduces PicoEyes, a unified, end-to-end gaze estimation framework with a large-scale multi-view dataset, advancing robustness and accuracy in MR gaze tracking.
Findings
Achieves state-of-the-art performance across settings including no-calibration, calibration, and rewear-after-calibration.
Supports calibration, rewear, and forecasting scenarios.
Provides a large, diverse multi-view dataset for training and evaluation.
Abstract
We present PicoEyes, a unified gaze estimation framework that directly predicts all key attributes of gaze, including 3D eye parameters, eye-region segmentation, optical axis, visual axis, and depth maps, from either monocular or binocular inputs. The framework simultaneously addresses calibration, gaze forecasting, and varying device postures, while also supporting 3D eye reconstruction via joint estimation of eye parameters and depth maps in an end-to-end manner. In addition, we introduce a large-scale multi-view near-eye dataset containing comprehensive 2D and 3D annotations under diverse conditions, including train, test, rewear-test, and calibration sessions. Extensive experiments demonstrate that PicoEyes achieves state-of-the-art performance, consistently outperforming both academic and industrial gaze tracking methods across no-calibration, calibration, rewear-after-calibration,…
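Illustrative sketch (not from the paper)
The abstract distinguishes the optical axis from the visual axis and mentions calibration sessions, but does not say how PicoEyes relates the two. The Python sketch below shows one common textbook simplification: a per-user angular offset between the axes is estimated from a few samples with known target directions, then applied to new optical-axis predictions. All function names, the yaw/pitch convention (camera looking along +z), and the constant-offset model are assumptions made for illustration; they are not the authors' method or API.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / n for c in v)


def to_angles(v):
    """Direction vector -> (yaw, pitch) in radians (assumed convention: +z forward)."""
    x, y, z = normalize(v)
    return math.atan2(x, z), math.asin(max(-1.0, min(1.0, y)))


def from_angles(yaw, pitch):
    """(yaw, pitch) in radians -> unit direction vector."""
    return (math.sin(yaw) * math.cos(pitch),
            math.sin(pitch),
            math.cos(yaw) * math.cos(pitch))


def calibrate_offset(optical_axes, target_dirs):
    """Mean (yaw, pitch) offset from optical axes to known target directions.

    Hypothetical constant-offset model; assumes angles are small enough
    that yaw differences do not wrap around +/- pi.
    """
    if not optical_axes or len(optical_axes) != len(target_dirs):
        raise ValueError("need equally many optical axes and targets (at least one)")
    dyaw = dpitch = 0.0
    for o, t in zip(optical_axes, target_dirs):
        oy, op = to_angles(o)
        ty, tp = to_angles(t)
        dyaw += ty - oy
        dpitch += tp - op
    n = len(optical_axes)
    return dyaw / n, dpitch / n


def visual_axis(optical, offset):
    """Apply a calibrated (yaw, pitch) offset to an optical-axis direction."""
    yaw, pitch = to_angles(optical)
    return from_angles(yaw + offset[0], pitch + offset[1])


def angular_error_deg(a, b):
    """Angle between two direction vectors, in degrees."""
    a, b = normalize(a), normalize(b)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(dot))
```

In this sketch, a calibration session would supply the pairs passed to `calibrate_offset`, and `angular_error_deg` compares the corrected direction with ground truth. Whether PicoEyes uses an explicit offset of this kind or learns the visual axis directly is not stated in the excerpt above.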
