STMR: Spiral Transformer for Hand Mesh Reconstruction
Huilong Xie, Wenwei Song, Wenxiong Kang, Yihong Lin

TL;DR
This paper introduces STMR, a spiral transformer for hand mesh reconstruction that integrates spiral neighbor sampling into the transformer architecture. It adds multi-scale pose feature extraction (MSPFE) and predefined pose-to-vertex lifting (PPVL), and reports state-of-the-art accuracy on FreiHAND together with fast inference.
Contribution
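The paper presents a spiral transformer that uses mesh topology for hand mesh reconstruction. It pairs a single image encoder with a multi-scale pose feature extraction (MSPFE) module for richer pose features and a predefined pose-to-vertex lifting (PPVL) method for better vertex feature representation.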
Findings
Achieves state-of-the-art accuracy on the FreiHAND dataset
Runs inference faster than comparable methods
Improves mesh reconstruction performance with the MSPFE and PPVL modules
Abstract
Recent advancements in both transformer-based methods and spiral neighbor sampling techniques have greatly enhanced hand mesh reconstruction. Transformers excel in capturing complex vertex relationships, and spiral neighbor sampling is vital for utilizing topological structures. This paper ingeniously integrates spiral sampling into the Transformer architecture, enhancing its ability to leverage mesh topology for superior performance in hand mesh reconstruction, resulting in substantial accuracy boosts. STMR employs a single image encoder for model efficiency. To augment its information extraction capability, we design the multi-scale pose feature extraction (MSPFE) module, which facilitates the extraction of rich pose features, ultimately enhancing the model's performance. Moreover, the proposed predefined pose-to-vertex lifting (PPVL) method improves vertex feature representation,…
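To make the idea of combining spiral neighbor sampling with attention concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not STMR's actual design: the function names (build_adjacency, spiral_indices, spiral_attention) are hypothetical, each ring is ordered by vertex index rather than by a true rotational walk around the mesh, and the attention step has a single head with no learned projections.

```python
import numpy as np

def build_adjacency(faces, num_vertices):
    """Vertex adjacency sets from a list of triangles."""
    adj = [set() for _ in range(num_vertices)]
    for a, b, c in faces:
        adj[a].update((b, c))
        adj[b].update((a, c))
        adj[c].update((a, b))
    return adj

def spiral_indices(faces, num_vertices, length):
    """Fixed-length neighbor sequence per vertex: the vertex itself, then
    its 1-ring, 2-ring, and so on. Assumption: rings are sorted by vertex
    index, a simplification of a true spiral, which walks each ring in a
    consistent rotational order."""
    adj = build_adjacency(faces, num_vertices)
    out = np.empty((num_vertices, length), dtype=np.int64)
    for v in range(num_vertices):
        seq, seen, ring = [v], {v}, [v]
        while len(seq) < length and ring:
            nxt = sorted({n for u in ring for n in adj[u]} - seen)
            seq.extend(nxt)
            seen.update(nxt)
            ring = nxt
        # Pad with the center vertex if the mesh is too small, then truncate.
        out[v] = (seq + [v] * length)[:length]
    return out

def spiral_attention(x, idx):
    """Single-head attention in which vertex i attends only to the
    features x[idx[i]] gathered along its neighbor sequence."""
    d = x.shape[1]
    neigh = x[idx]                                    # (V, L, D)
    scores = np.einsum("vd,vld->vl", x, neigh) / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)       # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)                 # softmax over L
    return np.einsum("vl,vld->vd", w, neigh)          # (V, D)

# Toy usage on a tetrahedron with random 8-dimensional vertex features.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
idx = spiral_indices(faces, num_vertices=4, length=3)
x = np.random.default_rng(0).normal(size=(4, 8))
out = spiral_attention(x, idx)
print(idx.shape, out.shape)
```

Restricting each vertex to a fixed-length, topology-ordered neighbor list keeps the attention cost linear in the number of vertices. A full transformer layer would add learned query, key, and value projections and multiple heads.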
Taxonomy
Topics: Reconstructive Surgery and Microvascular Techniques
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Byte Pair Encoding · Layer Normalization · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam
