DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image
Qingxuan Wu, Zhiyang Dou, Sirui Xu, Soshi Shimada, Chen Wang, Zhengming Yu, Yuan Liu, Cheng Lin, Zeyu Cao, Taku Komura, Vladislav Golyanik, Christian Theobalt, Wenping Wang, Lingjie Liu

TL;DR
DICE is an end-to-end Transformer-based method that accurately reconstructs 3D hand-face interactions with deformations from a single image, outperforming prior approaches in speed and generalization.
Contribution
DICE introduces a novel end-to-end framework with disentangled deformation and contact estimation, along with a weakly-supervised training strategy that improves generalization without requiring 3D annotations.
Findings
Achieves state-of-the-art accuracy on benchmark datasets.
Operates at 20 fps on a high-end GPU, significantly faster than previous methods.
Demonstrates robust performance on in-the-wild images.
Abstract
Reconstructing 3D hand-face interactions with deformations from a single image is a challenging yet crucial task with broad applications in AR, VR, and gaming. The challenges stem from self-occlusions during single-view hand-face interactions, diverse spatial relationships between hands and face, complex deformations, and the ambiguity of the single-view setting. The first and only method for hand-face interaction recovery, Decaf, introduces a global fitting optimization guided by contact and deformation estimation networks trained on studio-collected data with 3D annotations. However, Decaf suffers from a time-consuming optimization process and limited generalization capability due to its reliance on 3D annotations of hand-face interaction data. To address these issues, we present DICE, the first end-to-end method for Deformation-aware hand-face Interaction reCovEry from a single…
Taxonomy
Topics: Hand Gesture Recognition Systems · Gaze Tracking and Assistive Technology · Ergonomics and Musculoskeletal Disorders
Methods: Sparse Evolutionary Training
