Real-time Pose and Shape Reconstruction of Two Interacting Hands With a Single Depth Camera
Franziska Mueller, Micah Davis, Florian Bernard, Oleksandr Sotnychenko, Mickeal Verschoor, Miguel A. Otaduy, Dan Casas, Christian Theobalt

TL;DR
This paper introduces a real-time, marker-less method for reconstructing the pose and shape of two interacting hands using a single depth camera, leveraging deep learning and GPU optimization.
Contribution
It is the first two-hand tracking approach that combines real-time performance, collision handling, shape adaptation, and single-camera input without markers.
Findings
Achieves state-of-the-art accuracy in complex two-hand interactions.
Handles inter- and intra-hand collisions effectively.
Operates in real time on consumer hardware.
Abstract
We present a novel method for real-time pose and shape reconstruction of two strongly interacting hands. Our approach is the first two-hand tracking solution that combines an extensive list of favorable properties, namely it is marker-less, uses a single consumer-level depth camera, runs in real time, handles inter- and intra-hand collisions, and automatically adjusts to the user's hand shape. In order to achieve this, we embed a recent parametric hand pose and shape model and a dense correspondence predictor based on a deep neural network into a suitable energy minimization framework. For training the correspondence prediction network, we synthesize a two-hand dataset based on physical simulations that includes both hand pose and shape annotations while at the same time avoiding inter-hand penetrations. To achieve real-time rates, we phrase the model fitting in terms of a nonlinear…
