Gaze Beyond the Frame: Forecasting Egocentric 3D Visual Span

Heeseung Yun; Joonil Na; Jaeyeon Kim; Calvin Murdock; Gunhee Kim

arXiv:2511.18470 · cs.CV · November 25, 2025

Open Access

TL;DR

This paper introduces EgoSpanLift, a method that forecasts where a person's gaze will focus next in 3D from egocentric input, drawing on 3D scene understanding, and presents a large new benchmark dataset.

Contribution

EgoSpanLift transforms 2D gaze forecasting into 3D, integrating SLAM, volumetric analysis, and deep learning models, and provides a comprehensive egocentric 3D visual span benchmark.

Findings

01 Outperforms existing 2D gaze prediction baselines

02 Achieves accurate 3D visual span forecasting

03 Works effectively when projected onto 2D images
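To make the third finding concrete, here is a minimal sketch of how 3D span points could be projected onto a 2D image with a standard pinhole camera model. It is an illustration only, not the paper's implementation: the function name `project_to_image`, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the sample points are all assumptions chosen for the example.

```python
def project_to_image(points_cam, fx, fy, cx, cy, width, height):
    """Project 3D points in camera coordinates to pixel coordinates.

    Hypothetical helper using the standard pinhole model; points behind
    the camera or outside the image bounds are dropped.
    """
    pixels = []
    for x, y, z in points_cam:
        if z <= 0:
            continue  # behind the camera, not visible
        u = fx * x / z + cx
        v = fy * y / z + cy
        if 0 <= u < width and 0 <= v < height:
            pixels.append((u, v))
    return pixels


# Assumed sample data: four 3D points and made-up intrinsics for a 640x480 image.
span_points_cam = [(0.0, 0.0, 2.0), (0.5, -0.25, 2.0), (0.0, 0.0, -1.0), (10.0, 0.0, 1.0)]
pixels = project_to_image(span_points_cam, fx=500.0, fy=500.0, cx=320.0, cy=240.0,
                          width=640, height=480)
```

Only the first two points survive: the third lies behind the camera and the fourth projects outside the image.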

Abstract

People continuously perceive and interact with their surroundings based on underlying intentions that drive their exploration and behaviors. While research in egocentric user and scene understanding has focused primarily on motion and contact-based interaction, forecasting human visual perception itself remains less explored despite its fundamental role in guiding human actions and its implications for AR/VR and assistive technologies. We address the challenge of egocentric 3D visual span forecasting, predicting where a person's visual perception will focus next within their three-dimensional environment. To this end, we propose EgoSpanLift, a novel method that transforms egocentric visual span forecasting from 2D image planes to 3D scenes. EgoSpanLift converts SLAM-derived keypoints into gaze-compatible geometry and extracts volumetric visual span regions. We further combine…
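The abstract describes converting SLAM-derived keypoints into gaze-compatible geometry and extracting volumetric visual span regions, but this listing does not give the details. The sketch below shows one plausible reading under stated assumptions: keep the keypoints that fall inside a cone around the gaze ray, then quantize them into voxels. The cone test, the half-angle, the voxel size, and the names `points_in_gaze_cone` and `voxelize` are hypothetical choices for illustration, not the authors' method.

```python
import math


def points_in_gaze_cone(points, origin, direction, half_angle_deg):
    """Keep 3D points whose angle to the gaze ray is within half_angle_deg."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    cos_limit = math.cos(math.radians(half_angle_deg))
    kept = []
    for p in points:
        offset = [p[i] - origin[i] for i in range(3)]
        dist = math.sqrt(sum(o * o for o in offset))
        if dist == 0:
            continue  # a point at the eye position has no defined angle
        cos_angle = sum(offset[i] * unit[i] for i in range(3)) / dist
        if cos_angle >= cos_limit:
            kept.append(tuple(p))
    return kept


def voxelize(points, voxel_size):
    """Map points to the set of integer voxel indices they occupy."""
    return {tuple(math.floor(c / voxel_size) for c in p) for p in points}


# Assumed sample keypoints (metres) with the gaze pointing along +z from the origin.
keypoints = [(0.0, 0.0, 2.0), (0.1, 0.0, 2.0), (2.0, 0.0, 2.0), (0.0, 0.0, -1.0)]
span_points = points_in_gaze_cone(keypoints, origin=(0.0, 0.0, 0.0),
                                  direction=(0.0, 0.0, 1.0), half_angle_deg=10.0)
span_voxels = voxelize(span_points, voxel_size=0.5)
```

With a 10 degree half-angle, only the two keypoints near the gaze axis are kept, and both fall into the same 0.5 m voxel. A wider cone would approximate a broader region of the visual field.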

Taxonomy

Topics: Visual Attention and Saliency Detection · Gaze Tracking and Assistive Technology · Face Recognition and Perception