Egocentric Pose Estimation from Human Vision Span
Hao Jiang, Vamsi Krishna Ithapu

TL;DR
This paper introduces a deep learning system for egocentric pose estimation from a natural human vision span, combining SLAM features and body imagery to accurately and robustly estimate head and body poses in real time for wearable devices.
Contribution
It presents a novel approach that leverages dynamic SLAM features and body imagery, explicitly enforces geometric consistency, and trains effectively with existing mocap data.
Findings
Accurate 3D head and body pose estimation in real time.
Robust system trained with existing mocap datasets.
Effective egopose estimation from a natural human visual field.
Abstract
Estimating the camera wearer's body pose from an egocentric view (egopose) is a vital task in augmented and virtual reality. Existing approaches either use a narrow field-of-view front-facing camera that barely captures the wearer, or an extruded head-mounted top-down camera for maximal wearer visibility. In this paper, we tackle egopose estimation from a more natural human vision span, where the camera wearer can be seen in the peripheral view and, depending on the head pose, may become invisible or have only a limited partial view. This is a realistic visual field for user-centric wearable devices such as glasses, which have front-facing wide-angle cameras. Existing solutions are not appropriate for this setting, so we propose a novel deep learning system that takes advantage of both the dynamic features from camera SLAM and the body shape imagery. We compute 3D head pose, 3D body pose,…
