InstantAvatar: Efficient 3D Head Reconstruction via Surface Rendering
Antonio Canela, Pol Caselles, Ibrar Malik, Eduard Ramon, Jaime García, Jordi Sánchez-Riera, Gil Triginer, Francesc Moreno-Noguer

TL;DR
InstantAvatar reconstructs full-head 3D avatars from a few images (down to a single one) in a few seconds on commodity hardware, combining a voxel-grid neural field with a surface renderer and a learned statistical prior.
Contribution
Introduces a fast 3D head reconstruction system that combines a voxel-grid neural field with a surface renderer and a learned statistical prior, bringing reconstruction time down to a few seconds where prior neural-field methods take minutes or hours.
Findings
Achieves 3D head reconstructions in seconds.
Maintains comparable accuracy to state-of-the-art methods.
Uses a learned prior to stabilize and accelerate optimization.
Abstract
Recent advances in full-head reconstruction have been obtained by optimizing a neural field through differentiable surface or volume rendering to represent a single scene. While these techniques achieve an unprecedented accuracy, they take several minutes, or even hours, due to the expensive optimization process required. In this work, we introduce InstantAvatar, a method that recovers full-head avatars from few images (down to just one) in a few seconds on commodity hardware. In order to speed up the reconstruction process, we propose a system that combines, for the first time, a voxel-grid neural field representation with a surface renderer. Notably, a naive combination of these two techniques leads to unstable optimizations that do not converge to valid solutions. In order to overcome this limitation, we present a novel statistical model that learns a prior distribution over 3D head…
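To make the pairing of a voxel grid with a surface renderer concrete, the following is a minimal sketch and not the authors' implementation. It assumes the grid stores signed-distance values directly (the paper's grid may instead hold features decoded by a network), uses an analytic sphere as a stand-in for a head, and finds the surface by sphere tracing. All function names and parameters are hypothetical.

```python
import numpy as np

def make_sphere_sdf_grid(res=64, radius=0.5):
    """Sample the signed distance of a sphere on a res^3 grid over [-1, 1]^3."""
    axis = np.linspace(-1.0, 1.0, res)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.sqrt(x**2 + y**2 + z**2) - radius

def query_grid(grid, p):
    """Trilinearly interpolate the grid at point p.

    Points outside [-1, 1]^3 are clamped to the grid boundary.
    """
    res = grid.shape[0]
    # Map world coordinates to continuous voxel indices.
    idx = (np.clip(p, -1.0, 1.0) + 1.0) * 0.5 * (res - 1)
    i0 = np.minimum(np.floor(idx).astype(int), res - 2)
    t = idx - i0
    value = 0.0
    # Blend the 8 corners of the enclosing voxel.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1.0 - t[0]) *
                     (t[1] if dy else 1.0 - t[1]) *
                     (t[2] if dz else 1.0 - t[2]))
                value += w * grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return value

def sphere_trace(grid, origin, direction, max_steps=64, eps=1e-3, t_max=4.0):
    """March along a ray, stepping by the queried distance.

    Returns the ray depth of the first surface hit, or None on a miss.
    """
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    t = 0.0
    for _ in range(max_steps):
        d = query_grid(grid, origin + t * direction)
        if d < eps:
            return t
        t += d
        if t > t_max:
            break
    return None

grid = make_sphere_sdf_grid()
hit_depth = sphere_trace(grid, [0.0, 0.0, -0.9], [0.0, 0.0, 1.0])
miss_depth = sphere_trace(grid, [0.9, -0.9, 0.0], [0.0, 1.0, 0.0])
```

In this toy setup the first ray starts 0.4 units in front of the sphere and should stop close to that depth, while the second ray passes beside the sphere and returns None. Each grid lookup is a cheap interpolation rather than a deep network evaluation, which is the general reason grid-based fields are faster to optimize and render.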
Taxonomy
Topics: Advanced Vision and Imaging · Face recognition and analysis · 3D Shape Modeling and Analysis
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
