RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video
Jiayi Wang, Franziska Mueller, Florian Bernard, Suzanne Sorli, Oleksandr Sotnychenko, Neng Qian, Miguel A. Otaduy, Dan Casas, Christian Theobalt

TL;DR
This paper introduces a real-time RGB-based method for tracking and reconstructing the 3D pose and geometry of interacting hands, addressing depth ambiguity and outperforming existing RGB approaches.
Contribution
The work presents the first real-time RGB method explicitly designed for two-hand interaction tracking and 3D reconstruction, using a novel multi-task CNN and generative model fitting.
Findings
Outperforms existing RGB two-hand tracking methods.
Achieves comparable performance to depth-based real-time methods.
Provides detailed 3D hand pose and shape estimation from monocular RGB.
Abstract
Tracking and reconstructing the 3D pose and geometry of two hands in interaction is a challenging problem that has a high relevance for several human-computer interaction applications, including AR/VR, robotics, or sign language recognition. Existing works are either limited to simpler tracking settings (e.g., considering only a single hand or two spatially separated hands), or rely on less ubiquitous sensors, such as depth cameras. In contrast, in this work we present the first real-time method for motion capture of skeletal pose and 3D surface geometry of hands from a single RGB camera that explicitly considers close interactions. In order to address the inherent depth ambiguities in RGB data, we propose a novel multi-task CNN that regresses multiple complementary pieces of information, including segmentation, dense matchings to a 3D hand model, and 2D keypoint positions, together…
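The depth ambiguity mentioned in the abstract can be illustrated with a minimal pinhole-camera sketch. This is not the paper's method: the intrinsics (`focal`, `cx`, `cy`), the keypoint coordinates, and the helper names are hypothetical values chosen only for illustration. It shows that a hand and a copy twice as large and twice as far away project to the same pixels, which is why 2D evidence alone cannot fix absolute depth.

```python
def project(point, focal=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-space point (x, y, z) to pixel coordinates.

    The intrinsics are hypothetical defaults, not values from the paper.
    """
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


def scale_along_rays(points, s):
    """Scale points about the camera origin, moving each along its viewing ray."""
    return [(s * x, s * y, s * z) for (x, y, z) in points]


# Three made-up hand keypoints in camera space (metres).
near_hand = [(0.02, -0.01, 0.40), (0.05, 0.03, 0.42), (-0.01, 0.04, 0.45)]
# The same hand, twice as large and twice as far from the camera.
far_hand = scale_along_rays(near_hand, 2.0)

near_2d = [project(p) for p in near_hand]
far_2d = [project(p) for p in far_hand]

# Largest per-coordinate difference between the two sets of projections.
max_diff = max(
    abs(a - b)
    for p, q in zip(near_2d, far_2d)
    for a, b in zip(p, q)
)
print("max 2D difference in pixels:", max_diff)
```

Because both configurations yield the same 2D keypoints, a tracker needs additional cues to choose between them, which is consistent with the abstract's motivation for regressing several complementary outputs rather than keypoints alone.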
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Human Motion and Animation
