4D Attention: Comprehensive Framework for Spatio-Temporal Gaze Mapping
Shuji Oishi, Kenji Koide, Masashi Yokozuka, Atsuhiko Banno

TL;DR
This paper introduces 4D Attention, a comprehensive framework for accurately mapping human gaze onto static and dynamic objects in real time, enhancing the analysis of visual attention in dynamic environments.
Contribution
The paper proposes a novel unified framework combining visual localization, IMU data, and dynamic object reconstruction for spatio-temporal gaze mapping.
Findings
Effective gaze mapping in dynamic scenes demonstrated
Framework supports real-world human-robot interaction applications
Quantitative evaluations confirm accuracy and robustness
Abstract
This study presents a framework for capturing human attention in the spatio-temporal domain using eye-tracking glasses. Attention mapping is a key technology for human perceptual activity analysis or Human-Robot Interaction (HRI) to support human visual cognition; however, measuring human attention in dynamic environments is challenging owing to the difficulty in localizing the subject and dealing with moving objects. To address this, we present a comprehensive framework, 4D Attention, for unified gaze mapping onto static and dynamic objects. Specifically, we estimate the glasses pose by leveraging a loose coupling of direct visual localization and Inertial Measurement Unit (IMU) values. Further, by installing reconstruction components into our framework, dynamic objects not captured in the 3D environment map are instantiated based on the input images. Finally, a scene rendering…
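To make the idea of gaze mapping concrete, the following is a minimal sketch, not the authors' implementation. It assumes the glasses pose is already available as a rotation matrix and a translation in the world frame, and it replaces the framework's map and reconstructed objects with simple spheres tested by ray casting. All names (`gaze_ray_world`, `ray_sphere_hit`, `map_gaze`, and the object labels) are hypothetical and chosen only for illustration.

```python
import math

def rotate(R, v):
    """Apply a 3x3 rotation matrix (nested lists) to a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def gaze_ray_world(R_wg, t_wg, gaze_dir_g):
    """Turn a gaze direction in the glasses frame into a world-frame ray.

    R_wg, t_wg: assumed glasses-to-world rotation and translation.
    Returns (origin, unit direction).
    """
    d = rotate(R_wg, gaze_dir_g)
    norm = math.sqrt(sum(c * c for c in d))
    return list(t_wg), [c / norm for c in d]

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along a unit-direction ray to a sphere, or None on a miss.

    A ray starting inside the sphere is treated as a miss in this sketch.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0 else None

def map_gaze(R_wg, t_wg, gaze_dir_g, objects):
    """Return (name, distance) of the nearest object hit by the gaze ray."""
    origin, direction = gaze_ray_world(R_wg, t_wg, gaze_dir_g)
    best = None
    for name, (center, radius) in objects.items():
        t = ray_sphere_hit(origin, direction, center, radius)
        if t is not None and (best is None or t < best[1]):
            best = (name, t)
    return best

# Hypothetical scene: one static object from the map, one dynamic object
# whose position would be updated every frame.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
scene = {
    "shelf (static)": ([0.0, 0.0, 5.0], 1.0),
    "person (dynamic)": ([0.0, 0.0, 3.0], 0.5),
}
hit = map_gaze(IDENTITY, [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], scene)
```

In this toy scene the gaze ray points along +z, so the dynamic object in front occludes the static one behind it. Handling both kinds of object in one nearest-hit query is the sense in which this sketch treats the mapping as unified.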
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Visual Attention and Saliency Detection · Retinal Imaging and Analysis
