TL;DR
UIKA is a fast, universal head avatar model that creates animatable 3D heads from pose-free images, using UV-guided modeling and a synthetic dataset to outperform existing methods.
Contribution
The paper introduces a novel UV-guided avatar modeling strategy and a large synthetic dataset for training, enabling rapid head avatar creation from various pose-free inputs.
Findings
Outperforms existing approaches in monocular and multi-view settings.
Uses UV correspondence to decouple camera pose and expression.
Employs learnable UV tokens for effective attention and decoding.
Abstract
We present UIKA, a feed-forward animatable Gaussian head model that works from an arbitrary number of pose-free inputs, including a single image, multi-view captures, and smartphone-captured videos. Unlike traditional avatar methods, which require a studio-level multi-view capture system and reconstruct a person-specific model through a lengthy optimization process, we rethink the task through the lenses of model representation, network design, and data preparation. First, we introduce a UV-guided avatar modeling strategy, in which each input image is associated with a pixel-wise facial correspondence estimate. This correspondence allows us to reproject each valid pixel color from screen space to UV space, which is independent of camera pose and character expression. Furthermore, we design learnable UV tokens on which the attention mechanism can be applied at both the screen and…
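The screen-to-UV reprojection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `reproject_to_uv`, the use of `None` to mark pixels without a valid correspondence, the floor-based texel binning, and the averaging of colors that land on the same texel are all assumptions made for illustration. The sketch assumes each pixel already carries an estimated (u, v) coordinate in [0, 1].

```python
from typing import List, Optional, Tuple

Color = Tuple[float, float, float]
# Hypothetical convention: None marks a pixel with no valid facial correspondence.
UV = Optional[Tuple[float, float]]


def reproject_to_uv(
    image: List[List[Color]],
    uv_map: List[List[UV]],
    tex_size: int,
) -> Tuple[List[List[Color]], List[List[int]]]:
    """Scatter screen-space pixel colors into a tex_size x tex_size UV texture.

    Colors that land on the same texel are averaged. Returns the texture and a
    per-texel hit count (0 means the texel was not observed in this image).
    """
    sums = [[[0.0, 0.0, 0.0] for _ in range(tex_size)] for _ in range(tex_size)]
    counts = [[0] * tex_size for _ in range(tex_size)]

    for row_colors, row_uvs in zip(image, uv_map):
        for color, uv in zip(row_colors, row_uvs):
            if uv is None:
                continue  # skip pixels outside the face region
            u, v = uv
            # Floor (u, v) in [0, 1] to integer texel indices, clamping u = 1 or v = 1.
            tx = min(int(u * tex_size), tex_size - 1)
            ty = min(int(v * tex_size), tex_size - 1)
            for c in range(3):
                sums[ty][tx][c] += color[c]
            counts[ty][tx] += 1

    texture: List[List[Color]] = []
    for y in range(tex_size):
        row: List[Color] = []
        for x in range(tex_size):
            n = counts[y][x]
            if n == 0:
                row.append((0.0, 0.0, 0.0))  # unobserved texel
            else:
                r, g, b = sums[y][x]
                row.append((r / n, g / n, b / n))
        texture.append(row)
    return texture, counts
```

Because the texel index depends only on the estimated (u, v) and not on where the pixel sits on screen, the same facial point seen from different cameras or under different expressions would land on the same texel, which is the pose and expression independence the abstract refers to. The hit count doubles as a validity mask for texels that a given image never observed.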
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Gaze Tracking and Assistive Technology
