TL;DR
EgoEV-HandPose is a stereo event-camera framework, introduced together with a dataset, for egocentric 3D hand pose estimation and gesture recognition. It outperforms prior methods, especially under challenging lighting and occlusion.
Contribution
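The paper presents EgoEV-HandPose, an end-to-end framework for joint 3D bimanual pose estimation and gesture recognition from stereo event streams, and EgoEVHands, which it describes as the first large-scale stereo event-camera dataset for egocentric hand perception. It reports state-of-the-art performance on both tasks.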
Findings
Achieves a mean per-joint position error (MPJPE) of 30.54 mm on 3D hand pose estimation.
Attains 86.87% Top-1 accuracy on gesture recognition.
Outperforms RGB-based and prior event-camera methods in low-light and occlusion scenarios.
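The two headline metrics above can be made concrete with a minimal sketch. This is not the paper's evaluation code: the function names `mpjpe` and `top1_accuracy` are hypothetical, and the summary does not say whether the reported MPJPE uses root alignment or how it is averaged across hands, so the sketch computes the plain, unaligned mean over joints.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: the average Euclidean distance
    between predicted and ground-truth 3D joints.

    pred, gt: equal-length sequences of (x, y, z) tuples in the same
    unit (e.g. millimetres). No root alignment is applied (assumption).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and equally long")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the label.

    scores: one list of per-class scores per sample.
    labels: one integer class index per sample.
    """
    if len(scores) != len(labels) or len(labels) == 0:
        raise ValueError("scores and labels must be non-empty and equally long")
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy data, not taken from the paper.
pred_joints = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt_joints = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(mpjpe(pred_joints, gt_joints))  # joint errors are 0 and 5, mean 2.5

gesture_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
gesture_labels = [1, 0, 0]
print(top1_accuracy(gesture_scores, gesture_labels))  # 2 of 3 correct
```

Lower MPJPE is better and higher Top-1 accuracy is better, so the two numbers are not directly comparable to each other.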
Abstract
Egocentric 3D hand pose estimation and gesture recognition are essential for immersive augmented/virtual reality, human-computer interaction, and robotics. However, conventional frame-based cameras suffer from motion blur and limited dynamic range, while existing event-based methods are hindered by ego-motion interference, monocular depth ambiguity, and the lack of large-scale real-world stereo datasets. To overcome these limitations, we propose EgoEV-HandPose, an end-to-end framework for joint 3D bimanual pose estimation and gesture recognition from stereo event streams. Central to our approach is KeypointBEV, a flexible stereo fusion module that lifts features into a canonical bird's-eye-view space and employs an iterative reprojection-guided refinement loop to progressively resolve depth uncertainty and enforce kinematic consistency. In addition, we introduce EgoEVHands, the first…
