TL;DR
This paper introduces a unified framework for capturing complex hand interactions with objects and other hands using RGB-D data, combining generative models, salient points, collision detection, and physics simulation for accurate and plausible motion tracking.
Contribution
It presents a novel, unified approach that handles multi-hand and object interactions in hand motion capture from monocular and multi-view RGB-D data. All components are combined in a single objective function that is almost everywhere differentiable.
Findings
Achieved low tracking error in complex interaction scenarios.
Successfully captured sequences with up to 150 degrees of freedom.
Validated on 29 sequences covering a wide variety of interactions.
Abstract
Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors. However, even the most recent approaches focus on the case of a single isolated hand. In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects. Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error, and with collision detection and physics simulation to achieve physically plausible estimates even in the case of occlusions and missing visual data. Since all components are unified in a single objective function that is almost everywhere differentiable, it can be optimized with standard optimization techniques. Our approach works for monocular RGB-D sequences as well as…
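The idea of folding a data-fitting term and a collision penalty into one objective that standard optimizers can handle can be illustrated with a toy sketch. This is not the paper's energy function: the two spheres, the squared-hinge penetration penalty, the weight `w_collision`, and the finite-difference gradient descent are all assumptions chosen only to keep the example small.

```python
import math

RADIUS = 1.0  # assumed radius of both toy spheres (hypothetical stand-ins for hand/object parts)

def data_term(theta, observed):
    # Squared distance between estimated and observed sphere centers.
    return sum((t - o) ** 2 for t, o in zip(theta, observed))

def collision_term(theta):
    # Squared penetration depth of the two spheres; zero when they do not overlap.
    x1, y1, x2, y2 = theta
    dist = math.hypot(x1 - x2, y1 - y2)
    penetration = max(0.0, 2 * RADIUS - dist)
    return penetration ** 2

def objective(theta, observed, w_collision=10.0):
    # Single objective: weighted sum of the data term and the collision penalty.
    return data_term(theta, observed) + w_collision * collision_term(theta)

def numerical_gradient(f, theta, eps=1e-6):
    # Central finite differences, used here instead of analytic derivatives.
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += eps
        down = list(theta)
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def minimize(f, theta, step=0.02, iters=2000):
    # Plain gradient descent with a fixed step size.
    theta = list(theta)
    for _ in range(iters):
        g = numerical_gradient(f, theta)
        theta = [t - step * gi for t, gi in zip(theta, g)]
    return theta

# Noisy observations place the centers 1.0 apart, which would make the spheres overlap.
observed = [0.0, 0.0, 1.0, 0.0]
start = [-0.5, 0.1, 1.5, -0.1]
estimate = minimize(lambda th: objective(th, observed), start)
final_dist = math.hypot(estimate[0] - estimate[2], estimate[1] - estimate[3])
```

In this sketch the collision penalty pushes the estimate away from the overlapping configuration that the observations alone would suggest. The estimate trades a small data error for a large reduction in penetration, loosely mirroring how a plausibility term can compensate for unreliable visual data.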
