Realistic One-shot Mesh-based Head Avatars
Taras Khakhulin, Vanessa Sklyarova, Victor Lempitsky, Egor Zakharov

TL;DR
The paper introduces ROME, a system that builds a realistic, rigged, mesh-based 3D head avatar from a single photo. The avatar is rendered with a neural network and can be used for reenactment.
Contribution
It proposes a novel method for estimating person-specific head meshes and neural textures from one image, enabling high-quality, rigged avatars with neural rendering.
Findings
Competitive head geometry recovery performance
High-quality neural rendering for cross-person reenactment
Mesh estimator, texture estimator, and renderer trained jointly on a dataset of in-the-wild videos
Abstract
We present a system for realistic one-shot mesh-based human head avatars creation, ROME for short. Using a single photograph, our model estimates a person-specific head mesh and the associated neural texture, which encodes both local photometric and geometric details. The resulting avatars are rigged and can be rendered using a neural network, which is trained alongside the mesh and texture estimators on a dataset of in-the-wild videos. In the experiments, we observe that our system performs competitively both in terms of head geometry recovery and the quality of renders, especially for the cross-person reenactment. See results https://samsunglabs.github.io/rome/
Taxonomy
Topics: Human Pose and Action Recognition · Face recognition and analysis · Generative Adversarial Networks and Image Synthesis
Methods: Neural textures · Neural rendering
