Enhancing Monocular 3D Hand Reconstruction with Learned Texture Priors
Giorgos Karvounas, Nikolaos Kyriazis, Iason Oikonomidis, Georgios Pavlakos, Antonis A. Argyros

TL;DR
This paper introduces a lightweight texture module that uses dense appearance alignment to improve monocular 3D hand reconstruction accuracy and realism by leveraging texture cues as active supervisory signals.
Contribution
It proposes a lightweight texture module and a novel dense alignment loss that relies on a differentiable rendering pipeline, adding texture-guided supervision to existing 3D hand reconstruction models.
Findings
Improved accuracy in 3D hand pose estimation.
Enhanced realism in reconstructed hand models.
Effective integration with existing pipelines.
Abstract
We revisit the role of texture in monocular 3D hand reconstruction, not as an afterthought for photorealism, but as a dense, spatially grounded cue that can actively support pose and shape estimation. Our observation is simple: even in high-performing models, the overlay between predicted hand geometry and image appearance is often imperfect, suggesting that texture alignment may be an underused supervisory signal. We propose a lightweight texture module that embeds per-pixel observations into UV texture space and enables a novel dense alignment loss between predicted and observed hand appearances. Our approach assumes access to a differentiable rendering pipeline and a model that maps images to 3D hand meshes with known topology, allowing us to back-project a textured hand onto the image and perform pixel-based alignment. The module is self-contained and easily pluggable into existing…
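
To make the two ideas in the abstract concrete, here is a minimal NumPy sketch of (a) scattering per-pixel observations into a UV texture and (b) a masked per-pixel alignment loss between a rendered and an observed image. It is an illustration under stated assumptions, not the authors' implementation: the function names, the nearest-texel averaging, the L1 penalty, and the `uv_map` input (a per-pixel UV coordinate image that a rasterizer would supply) are all hypothetical. The sketch is also not differentiable; the method described above assumes a differentiable renderer so that the loss can drive pose and shape updates.

```python
import numpy as np


def unproject_to_uv(image, uv_map, mask, tex_size=8):
    """Scatter observed pixel colors into a UV texture (hypothetical helper).

    image:  (H, W, 3) observed colors
    uv_map: (H, W, 2) per-pixel UV coordinates in [0, 1], assumed to come
            from rasterizing the predicted mesh
    mask:   (H, W) boolean, True where the hand is visible
    Returns the averaged texture and a boolean map of texels that were hit.
    """
    tex_sum = np.zeros((tex_size, tex_size, 3), dtype=np.float64)
    tex_cnt = np.zeros((tex_size, tex_size), dtype=np.float64)

    ys, xs = np.nonzero(mask)
    uv = uv_map[ys, xs]
    # Nearest-texel lookup: u indexes columns, v indexes rows.
    cols = np.clip(np.round(uv[:, 0] * (tex_size - 1)).astype(int), 0, tex_size - 1)
    rows = np.clip(np.round(uv[:, 1] * (tex_size - 1)).astype(int), 0, tex_size - 1)

    # np.add.at accumulates correctly when several pixels hit the same texel.
    np.add.at(tex_sum, (rows, cols), image[ys, xs])
    np.add.at(tex_cnt, (rows, cols), 1.0)

    valid = tex_cnt > 0
    texture = np.zeros_like(tex_sum)
    texture[valid] = tex_sum[valid] / tex_cnt[valid][:, None]
    return texture, valid


def dense_alignment_loss(rendered, observed, mask):
    """Mean L1 color difference over masked pixels (assumed loss form)."""
    diff = np.abs(rendered - observed).sum(axis=-1)
    n = float(mask.sum())
    return float((diff * mask).sum() / max(n, 1.0))
```

In a real pipeline, `rendered` would be produced by re-rendering the textured mesh with the predicted pose and shape, so a lower loss corresponds to a tighter overlay between the predicted hand and the image.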
