Towards an Embodied Semantic Fovea: Semantic 3D scene reconstruction from ego-centric eye-tracker videos

Mickey Li; Noyan Songur; Pavel Orlov; Stefan Leutenegger; A. Aldo Faisal

arXiv:1807.10561·cs.CV·July 30, 2018·1 cites

Open Access

TL;DR

This paper presents a real-time system that combines 3D scene reconstruction, semantic labeling, and gaze estimation from ego-centric RGB-D videos, advancing understanding of human-environment interactions in everyday tasks.

Contribution

It introduces a novel approach augmenting Semantic SLAM with gaze vectors for improved 3D semantic mapping from ego-centric videos.

Findings

01 Successfully produced semantic 3D maps from NYUv2 dataset images

02 Achieved reasonable accuracy in 3D object tracking and gaze estimation

03 Demonstrated real-time 3D mapping with semantic labels and gaze data

Abstract

Incorporating the physical environment is essential for a complete understanding of human behavior in unconstrained everyday tasks. This is especially important in ego-centric tasks, where obtaining three-dimensional information is challenging and current 2D video analysis methods prove insufficient. Here we demonstrate a proof-of-concept system that provides real-time 3D mapping and semantic labeling of the local environment from an ego-centric RGB-D video stream, together with 3D gaze point estimation from head-mounted eye-tracking glasses. We augment existing work in Semantic Simultaneous Localization And Mapping (Semantic SLAM) with collected gaze vectors. Our system can then find and track objects both inside and outside the user's field of view in 3D from multiple perspectives with reasonable accuracy. We validate our concept by producing a semantic map from images of…
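The abstract does not spell out how a 3D gaze point is obtained, so the following is only a minimal sketch of one common way to do it, not the authors' method. It assumes a pinhole camera model, a depth reading at the gaze pixel from the RGB-D stream, and a camera pose supplied by the SLAM system. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the pose representation (`R`, `t`) are all hypothetical and chosen for illustration.

```python
# Illustrative sketch only: pinhole back-projection of a 2D gaze pixel
# into a 3D world point. All names and parameters are assumptions,
# not taken from the paper.

def backproject_gaze(u, v, depth_m, fx, fy, cx, cy):
    """Lift gaze pixel (u, v) with depth (metres) into the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def camera_to_world(p_cam, R, t):
    """Apply the rigid transform p_world = R * p_cam + t.

    R is a 3x3 rotation given as nested lists and t is a 3-vector;
    both are assumed to come from the SLAM pose estimate.
    """
    return tuple(
        sum(R[i][j] * p_cam[j] for j in range(3)) + t[i]
        for i in range(3)
    )


# Hypothetical intrinsics for a 640x480 sensor.
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

# Gaze at the principal point, 2 m away, camera shifted 1 m along x.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
gaze_cam = backproject_gaze(320, 240, 2.0, FX, FY, CX, CY)
gaze_world = camera_to_world(gaze_cam, identity, (1.0, 0.0, 0.0))
```

Once the gaze point is expressed in world coordinates, it could in principle be looked up against a semantic map to find which labeled object is being fixated, which is the kind of query the system described here is meant to support.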

Taxonomy

Topics: Gaze Tracking and Assistive Technology · Robotics and Sensor-Based Localization · Visual Attention and Saliency Detection