InstantAvatar: Efficient 3D Head Reconstruction via Surface Rendering

Antonio Canela; Pol Caselles; Ibrar Malik; Eduard Ramon; Jaime García; Jordi Sánchez-Riera; Gil Triginer; Francesc Moreno-Noguer

arXiv:2308.04868·cs.CV·April 8, 2024

Open Access

TL;DR

InstantAvatar reconstructs full-head 3D avatars from a few images (down to just one) in a few seconds on commodity hardware, by combining a voxel-grid neural field with a surface renderer and a learned statistical prior.

Contribution

The paper introduces a fast 3D head reconstruction system that combines a voxel-grid neural field with a surface renderer, stabilized by a statistical prior, cutting reconstruction time from minutes or hours to a few seconds.

Findings

01. Achieves 3D head reconstructions in seconds.

02. Maintains comparable accuracy to state-of-the-art methods.

03. Uses a learned prior to stabilize and accelerate optimization.

Abstract

Recent advances in full-head reconstruction have been obtained by optimizing a neural field through differentiable surface or volume rendering to represent a single scene. While these techniques achieve unprecedented accuracy, they take several minutes, or even hours, due to the expensive optimization process required. In this work, we introduce InstantAvatar, a method that recovers full-head avatars from few images (down to just one) in a few seconds on commodity hardware. To speed up the reconstruction process, we propose a system that combines, for the first time, a voxel-grid neural field representation with a surface renderer. Notably, a naive combination of these two techniques leads to unstable optimizations that do not converge to valid solutions. To overcome this limitation, we present a novel statistical model that learns a prior distribution over 3D head…
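
As a rough illustration of the combination the abstract describes, the sketch below stores a signed distance field in a dense voxel grid and locates the surface along a camera ray by sphere tracing. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say that the field is a signed distance function, that queries use trilinear interpolation, or that the renderer uses sphere tracing, and the names make_sphere_sdf_grid, query_sdf, and sphere_trace are hypothetical. A sphere stands in for a head, and the learned statistical prior is not modeled.

```python
import numpy as np


def make_sphere_sdf_grid(res=32, radius=0.5):
    """Fill a res^3 grid over [-1, 1]^3 with the signed distance to a sphere.

    Assumption: a sphere is a toy stand-in for a head surface.
    """
    axis = np.linspace(-1.0, 1.0, res)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.sqrt(x**2 + y**2 + z**2) - radius


def query_sdf(grid, point):
    """Trilinearly interpolate the grid at a point (clamped to [-1, 1]^3)."""
    res = grid.shape[0]
    # Map world coordinates to continuous voxel indices, clamped so that
    # the upper neighbour (index + 1) always stays inside the grid.
    idx = (np.asarray(point, dtype=float) + 1.0) * 0.5 * (res - 1)
    idx = np.clip(idx, 0.0, res - 1 - 1e-6)
    i0 = np.floor(idx).astype(int)
    t = idx - i0
    value = 0.0
    # Blend the 8 corner values of the enclosing voxel.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((t[0] if dx else 1.0 - t[0])
                          * (t[1] if dy else 1.0 - t[1])
                          * (t[2] if dz else 1.0 - t[2]))
                value += weight * grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return value


def sphere_trace(grid, origin, direction, max_steps=64, eps=1e-3):
    """March along a ray, stepping by the queried distance each time.

    Returns the first point whose interpolated distance falls below eps,
    or None if no surface is found within max_steps.
    """
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    travelled = 0.0
    for _ in range(max_steps):
        point = origin + travelled * direction
        distance = query_sdf(grid, point)
        if distance < eps:
            return point
        travelled += distance
    return None


grid = make_sphere_sdf_grid()
hit = sphere_trace(grid, origin=[0.0, 0.0, -1.0], direction=[0.0, 0.0, 1.0])
miss = sphere_trace(grid, origin=[0.9, 0.9, -1.0], direction=[0.0, 0.0, 1.0])
print("hit point:", hit)
print("miss result:", miss)
```

In this sketch each field query is an array lookup plus interpolation rather than a network evaluation, which is the usual motivation for grid-based fields. In an optimization setting the grid values would be the trainable parameters, updated through a differentiable version of the ray-surface intersection.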

Taxonomy

Topics: Advanced Vision and Imaging · Face recognition and analysis · 3D Shape Modeling and Analysis

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings