EgoHumans: An Egocentric 3D Multi-Human Benchmark
Rawal Khirodkar, Aayush Bansal, Lingni Ma, Richard Newcombe, Minh Vo, Kris Kitani

TL;DR
EgoHumans introduces a comprehensive egocentric multi-human 3D pose and tracking benchmark captured with wearable cameras in real-world scenarios, enabling better evaluation and development of egocentric human understanding algorithms.
Contribution
The paper presents a new multi-view egocentric dataset with diverse activities and occlusion handling, along with EgoFormer, a transformer-based method that improves multi-human tracking accuracy.
Findings
EgoHumans dataset contains over 125k images of multi-human activities.
Existing methods perform poorly on egocentric multi-human tracking.
EgoFormer achieves a 13.6% improvement in IDF1 score over prior methods.
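For readers unfamiliar with the metric in the last finding, IDF1 is the harmonic mean of identity precision and identity recall. The sketch below uses the standard definition, IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN), and assumes the identity-level counts have already been obtained by matching predicted tracks to ground-truth tracks. The function name and the example counts are illustrative only; they are not taken from the paper or its code.

```python
def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Compute IDF1 from identity-level counts.

    idtp: detections assigned to the correct matched identity
    idfp: predicted detections with no correct identity match
    idfn: ground-truth detections with no correct identity match
    (The counts are assumed to come from a prior track-to-track matching step.)
    """
    denom = 2 * idtp + idfp + idfn
    if denom == 0:
        # No detections at all: return 0.0 by convention in this sketch.
        return 0.0
    return 2 * idtp / denom


# Hypothetical counts, chosen only to illustrate the arithmetic.
score = idf1(idtp=80, idfp=10, idfn=30)
print(f"IDF1 = {score:.3f}")
```

With these made-up counts the score is 160 / 200 = 0.8. The summary does not say whether the reported 13.6% gain is absolute or relative, so the example makes no assumption about it.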
Abstract
We present EgoHumans, a new multi-view multi-human video benchmark to advance the state-of-the-art of egocentric human 3D pose estimation and tracking. Existing egocentric benchmarks either capture a single subject or indoor-only scenarios, which limits the generalization of computer vision algorithms for real-world applications. We propose a novel 3D capture setup to construct a comprehensive egocentric multi-human benchmark in the wild with annotations to support diverse tasks such as human detection, tracking, 2D/3D pose estimation, and mesh recovery. We leverage consumer-grade wearable camera-equipped glasses for the egocentric view, which enables us to capture dynamic activities like playing tennis, fencing, volleyball, etc. Furthermore, our multi-view setup generates accurate 3D ground truth even under severe or complete occlusion. The dataset consists of more than 125k egocentric…
Videos
Ego-Humans: An Ego-Centric 3D Multi-Human Benchmark · YouTube
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Stroke Rehabilitation and Recovery
