TL;DR
OMG-Avatar is a fast, one-shot method that reconstructs an animatable 3D head avatar from a single image using a multi-LOD (Level-of-Detail) Gaussian representation, so that one unified model can serve hardware with different capabilities.
Contribution
It introduces a novel multi-LOD Gaussian representation and a unified model for efficient, high-quality head avatar reconstruction from a single image, supporting various levels of detail.
Findings
Outperforms state-of-the-art methods in reconstruction quality.
Reconstructs an animatable head avatar from a single image in 0.2 seconds.
Supports diverse hardware with multi-LOD modeling.
Abstract
We propose OMG-Avatar, a novel One-shot method that leverages a Multi-LOD (Level-of-Detail) Gaussian representation for animatable 3D head reconstruction from a single image in 0.2s. Our method enables LOD head avatar modeling using a unified model that accommodates diverse hardware capabilities and inference speed requirements. To capture both global and local facial characteristics, we employ a transformer-based architecture for global feature extraction and projection-based sampling for local feature acquisition. These features are effectively fused under the guidance of a depth buffer, ensuring occlusion plausibility. We further introduce a coarse-to-fine learning paradigm to support Level-of-Detail functionality and enhance the perception of hierarchical details. To address the limitations of 3DMMs in modeling non-head regions such as the shoulders, we introduce a multi-region…
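The projection-based sampling and depth-guided fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`sample_local_features`, `fuse`), the pinhole camera with a centred principal point, nearest-pixel lookup, the tolerance `eps`, and the hard visibility mask are all assumptions made for illustration. The paper's fusion is presumably learned rather than a hard switch.

```python
import numpy as np

def sample_local_features(points, feat_map, depth_buf, focal, eps=1e-2):
    """Project camera-space 3D points (z > 0) into the image, fetch the
    nearest-pixel feature, and flag points hidden behind the depth buffer.

    All names and the camera model are hypothetical, for illustration only.
    """
    h, w, _ = feat_map.shape
    # Pinhole projection, principal point assumed at the image centre.
    u = focal * points[:, 0] / points[:, 2] + (w - 1) / 2.0
    v = focal * points[:, 1] / points[:, 2] + (h - 1) / 2.0
    ui = np.clip(np.rint(u).astype(int), 0, w - 1)
    vi = np.clip(np.rint(v).astype(int), 0, h - 1)
    local = feat_map[vi, ui]  # (N, C) features at the projected pixels
    # A point is visible if it is not farther than the rendered surface.
    visible = points[:, 2] <= depth_buf[vi, ui] + eps
    return local, visible

def fuse(global_feat, local_feat, visible):
    """Hard-mask fusion (an assumption): local features for visible points,
    global features for occluded ones."""
    mask = visible[:, None].astype(local_feat.dtype)
    return mask * local_feat + (1.0 - mask) * global_feat
```

In this sketch, occluded points fall back entirely to the global feature, which is one simple way to keep sampled image features from leaking onto surfaces the camera cannot see.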
