TL;DR
MeshLAM is a fast, one-shot method for creating high-quality, animatable 3D head avatars from a single image, without lengthy test-time optimization or multi-view input.
Contribution
It introduces a novel feed-forward framework with a dual shape-texture architecture and an iterative GRU-based decoding mechanism for coherent mesh and texture reconstruction.
Findings
Outperforms state-of-the-art methods in reconstruction quality.
Achieves real-time animation from a single image.
Demonstrates superior computational efficiency.
Abstract
We introduce MeshLAM, a feed-forward framework for one-shot animatable mesh head reconstruction that generates high-fidelity, animatable 3D head avatars from a single image. Unlike previous work that relies on time-consuming test-time optimization or extensive multi-view data, our method produces complete mesh representations with inherent animatability from a single image in a single forward pass. Our approach employs a dual shape and texture map architecture that simultaneously processes mesh vertices and a texture map with image features extracted from a shared transformer backbone, allowing for coherent shape carving and appearance modeling. To prevent mesh collapse and ensure topological integrity during feed-forward deformation, we propose an iterative GRU-based decoding mechanism with progressive geometry deformation and texture refinement, coupled with a novel reprojection-based…
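The iterative GRU-based decoding described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `TinyGRUCell` and `iterative_deform`, the random weights, the feature dimensions, and the tanh-bounded step size `max_step` are all assumptions made for illustration. It only shows the general idea of refining template vertices through several small GRU-driven updates instead of one large jump, which is one plausible way to keep a deformation from collapsing the mesh.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyGRUCell:
    """Standard GRU cell with random weights (hypothetical, untrained)."""

    def __init__(self, input_dim, hidden_dim, rng):
        shape = (input_dim + hidden_dim, hidden_dim)
        self.Wz = rng.normal(0.0, 0.1, shape)  # update gate weights
        self.Wr = rng.normal(0.0, 0.1, shape)  # reset gate weights
        self.Wh = rng.normal(0.0, 0.1, shape)  # candidate state weights

    def __call__(self, x, h):
        xh = np.concatenate([x, h], axis=-1)
        z = sigmoid(xh @ self.Wz)
        r = sigmoid(xh @ self.Wr)
        xrh = np.concatenate([x, r * h], axis=-1)
        h_tilde = np.tanh(xrh @ self.Wh)
        return (1.0 - z) * h + z * h_tilde


def iterative_deform(template_vertices, vertex_features,
                     num_iters=4, hidden_dim=16, max_step=0.01, seed=0):
    """Refine vertex positions with several small residual updates.

    Assumption: each vertex carries its own feature vector (for example,
    sampled from image features). The per-step bound `max_step` is an
    illustrative safeguard, not a mechanism taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, feat_dim = vertex_features.shape
    cell = TinyGRUCell(feat_dim + 3, hidden_dim, rng)
    W_out = rng.normal(0.0, 0.1, (hidden_dim, 3))  # hidden state -> xyz offset

    vertices = template_vertices.copy()
    h = np.zeros((n, hidden_dim))
    history = [vertices.copy()]
    for _ in range(num_iters):
        # Condition each step on the features and the current geometry.
        x = np.concatenate([vertex_features, vertices], axis=-1)
        h = cell(x, h)
        # tanh keeps every coordinate of the offset within [-max_step, max_step].
        delta = max_step * np.tanh(h @ W_out)
        vertices = vertices + delta
        history.append(vertices.copy())
    return vertices, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    template = rng.normal(size=(5, 3))   # toy template mesh with 5 vertices
    features = rng.normal(size=(5, 8))   # toy per-vertex features
    final_vertices, steps = iterative_deform(template, features)
    print(final_vertices.shape, len(steps))
```

In a trained system the weights would be learned and texture would be refined alongside geometry; the sketch covers only the geometry loop and leaves the truncated reprojection-based component out entirely.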
