Multi-view Hand Reconstruction with a Point-Embedded Transformer
Lixin Yang, Licheng Zhong, Pengxiang Zhu, Xinyu Zhan, Junxiao Kong, Jian Xu, Cewu Lu

TL;DR
This paper introduces POEM, a multi-view hand mesh reconstruction model that embeds a static basis point in stereo space, enabling effective feature fusion and demonstrating strong generalizability in real-world scenarios.
Contribution
The paper proposes embedding a static basis point in stereo space, together with a training strategy that uses diverse multi-view data, to improve the accuracy and generalizability of multi-view hand reconstruction.
Findings
High accuracy in multi-view hand mesh reconstruction.
Strong generalization to diverse camera configurations.
Open-source implementation available.
Abstract
This work introduces a novel and generalizable multi-view Hand Mesh Reconstruction (HMR) model, named POEM, designed for practical use in real-world hand motion capture scenarios. The advances of the POEM model consist of two main aspects. First, concerning the modeling of the problem, we propose embedding a static basis point within the multi-view stereo space. A point represents a natural form of 3D information and serves as an ideal medium for fusing features across different views, given its varied projections across these views. Consequently, our method harnesses a simple yet effective idea: a complex 3D hand mesh can be represented by a set of 3D basis points that 1) are embedded in the multi-view stereo, 2) carry features from the multi-view images, and 3) encompass the hand in it. The second advance lies in the training strategy. We utilize a combination of five large-scale…
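The idea that a 3D point can gather features from several views through its projections can be illustrated with a short sketch. This is not the POEM implementation: the pinhole camera model, the nearest-neighbour sampling, the plain averaging across views, and all function names (`project_points`, `sample_features`, `fuse_point_features`) are assumptions chosen for illustration, whereas the paper fuses features with a transformer.

```python
import numpy as np

def project_points(points_w, K, R, t):
    """Project (N, 3) world points into one view; returns (N, 2) pixels and (N,) depths."""
    cam = points_w @ R.T + t                      # world -> camera coordinates
    depth = cam[:, 2]
    safe_z = np.where(depth > 0, depth, 1.0)      # avoid dividing by non-positive depth
    uvw = cam @ K.T                               # apply intrinsics
    return uvw[:, :2] / safe_z[:, None], depth

def sample_features(feat, uv):
    """Nearest-neighbour lookup in an (H, W, C) feature map, plus an in-bounds mask."""
    h, w, c = feat.shape
    cols = np.rint(uv[:, 0]).astype(int)
    rows = np.rint(uv[:, 1]).astype(int)
    inside = (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
    out = np.zeros((uv.shape[0], c))
    out[inside] = feat[rows[inside], cols[inside]]
    return out, inside

def fuse_point_features(points_w, cameras, feats):
    """Average each point's sampled features over the views that see it.

    cameras: list of hypothetical (K, R, t) tuples, one per view.
    feats:   list of (H, W, C) feature maps, one per view.
    Points seen by no view keep an all-zero feature vector.
    """
    total = np.zeros((points_w.shape[0], feats[0].shape[2]))
    count = np.zeros(points_w.shape[0])
    for (K, R, t), feat in zip(cameras, feats):
        uv, depth = project_points(points_w, K, R, t)
        sampled, inside = sample_features(feat, uv)
        visible = inside & (depth > 0)            # in front of the camera and in frame
        total[visible] += sampled[visible]
        count += visible
    return total / np.maximum(count, 1)[:, None]
```

Each basis point ends up with one feature vector that summarizes what every camera sees at its projection, which is the property that makes a point a convenient medium for cross-view fusion.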
Taxonomy
Topics: Hand Gesture Recognition Systems · Anatomy and Medical Technology
Methods: Sparse Evolutionary Training
