Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction on Monocular RGB Video
Weichao Zhao, Hezhen Hu, Wengang Zhou, Li Li, Houqiang Li

TL;DR
This paper introduces a novel framework that leverages spatial-temporal context and interpenetration detection to improve the accuracy and physical plausibility of interacting hand reconstruction from monocular RGB videos.
Contribution
It proposes a temporal framework with a motion-smoothness constraint and an interpenetration detection module, advancing the state of the art in interacting hand reconstruction.
Findings
Achieves new state-of-the-art results on public benchmarks.
Effectively models temporal context for better reconstruction.
Ensures physically plausible hand interactions without collisions.
Abstract
Reconstructing interacting hands from monocular RGB data is a challenging task, as it involves many interfering factors, e.g. self- and mutual occlusion and similar textures. Previous works only leverage information from a single RGB image without modeling their physically plausible relation, which leads to inferior reconstruction results. In this work, we are dedicated to explicitly exploiting spatial-temporal information to achieve better interacting hand reconstruction. On one hand, we leverage temporal context to complement insufficient information provided by the single frame, and design a novel temporal framework with a temporal constraint for interacting hand motion smoothness. On the other hand, we further propose an interpenetration detection module to produce kinetically plausible interacting hands without physical collisions. Extensive experiments are performed to validate…
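The abstract mentions a temporal constraint for motion smoothness but does not give its form here. As a rough illustration only, one common way to penalize jittery motion is to measure the second-order finite difference (a discrete acceleration) of 3D joint positions across consecutive frames. The sketch below assumes that formulation; the function name `smoothness_penalty` and the input layout (a list of frames, each a list of `(x, y, z)` joints) are hypothetical and are not taken from the paper.

```python
from typing import Sequence

Point = Sequence[float]   # one joint as (x, y, z)
Frame = Sequence[Point]   # all joints of both hands in one video frame


def smoothness_penalty(frames: Sequence[Frame]) -> float:
    """Mean squared discrete acceleration of joints over a clip.

    Illustrative assumption, not the paper's loss: for each joint,
    accel_t = p_{t+1} - 2 * p_t + p_{t-1}, and the penalty is the
    average of ||accel_t||^2 over all joints and interior frames.
    """
    if len(frames) < 3:
        raise ValueError("need at least 3 frames to form a second difference")

    total = 0.0
    count = 0
    # Slide a window of three consecutive frames over the clip.
    for prev, cur, nxt in zip(frames, frames[1:], frames[2:]):
        # Pair up the same joint across the three frames.
        for p, c, n in zip(prev, cur, nxt):
            total += sum((pn - 2.0 * pc + pp) ** 2 for pp, pc, pn in zip(p, c, n))
            count += 1
    return total / count
```

Under this assumed definition, a joint moving at constant velocity incurs zero penalty, while a joint that oscillates from frame to frame is penalized, which is the behavior a smoothness term is meant to encourage.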
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Face recognition and analysis
