PicoEyes: Unified Gaze Estimation Framework for Mixed Reality with a Large-Scale Multi-View Dataset

Fuxin Duan; Hui Wang

arXiv:2605.07188 · cs.CV · May 14, 2026

TL;DR

PicoEyes is a comprehensive gaze estimation framework that predicts multiple gaze attributes from monocular or binocular inputs, supported by a large multi-view dataset, achieving state-of-the-art results in mixed reality applications.

Contribution

The paper introduces PicoEyes, a unified, end-to-end gaze estimation framework with a large-scale multi-view dataset, advancing robustness and accuracy in MR gaze tracking.

Findings

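01 Achieves state-of-the-art performance across settings including no-calibration, calibration, and rewear-after-calibration.

02 Supports calibration, rewear, and gaze forecasting scenarios within a single framework.

03 Provides a large-scale multi-view near-eye dataset with 2D and 3D annotations, split into train, test, rewear-test, and calibration sessions.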

Abstract

We present PicoEyes, a unified gaze estimation framework that directly predicts all key attributes of gaze, including 3D eye parameters, eye-region segmentation, optical axis, visual axis, and depth maps, from either monocular or binocular inputs. The framework simultaneously addresses calibration, gaze forecasting, and varying device postures, while also supporting 3D eye reconstruction via joint estimation of eye parameters and depth maps in an end-to-end manner. In addition, we introduce a large-scale multi-view near-eye dataset containing comprehensive 2D and 3D annotations under diverse conditions, including train, test, rewear-test, and calibration sessions. Extensive experiments demonstrate that PicoEyes achieves state-of-the-art performance, consistently outperforming both academic and industrial gaze tracking methods across no-calibration, calibration, rewear-after-calibration,…
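
The abstract distinguishes the optical axis from the visual axis and lists calibration as a separate evaluation setting. As a rough illustration of why the two are related, the sketch below treats calibration as estimating one fixed rotation from the optical axis to the visual axis and scores predictions by angular error. This is a generic, simplified model and not the PicoEyes method, which this page does not describe; the function names `angular_error_deg`, `rotation_between`, and the 5-degree offset are all assumptions made for the example.

```python
import numpy as np


def angular_error_deg(pred, target):
    """Angle in degrees between two 3D gaze direction vectors."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    cos = np.dot(pred, target) / (np.linalg.norm(pred) * np.linalg.norm(target))
    # Clip guards against values slightly outside [-1, 1] from rounding.
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))


def rotation_between(a, b):
    """Rotation matrix taking direction a onto direction b (Rodrigues form)."""
    a = np.asarray(a, dtype=float) / np.linalg.norm(a)
    b = np.asarray(b, dtype=float) / np.linalg.norm(b)
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if np.isclose(c, -1.0):
        raise ValueError("opposite vectors: rotation axis is not unique")
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + vx @ vx / (1.0 + c)


# Hypothetical calibration sample: the optical axis points straight ahead,
# and the true visual axis is offset from it by an assumed 5 degrees.
optical = np.array([0.0, 0.0, 1.0])
theta = np.radians(5.0)
visual = np.array([np.sin(theta), 0.0, np.cos(theta)])

# "Calibration" here is just the fixed rotation estimated from that one sample.
offset = rotation_between(optical, visual)

uncalibrated_error = angular_error_deg(optical, visual)
calibrated_error = angular_error_deg(offset @ optical, visual)
```

In this toy setup the uncalibrated error equals the assumed offset and the calibrated error is close to zero. A real system would estimate the offset from many samples and would have to cope with headset slippage between sessions, which is presumably what the rewear-test split is meant to measure.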
