HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting

Helisa Dhamo; Yinyu Nie; Arthur Moreau; Jifei Song; Richard Shaw; Yiren Zhou; Eduardo Pérez-Pellitero

arXiv:2312.02902 · cs.CV · August 14, 2024 · 1 citation

Open Access

TL;DR

HeadGaS introduces a real-time 3D head animation method using 3D Gaussian Splats that achieves high-quality rendering and expressive control, significantly outperforming previous approaches in speed and visual fidelity.

Contribution

The paper presents a hybrid 3D Gaussian Splatting model with learnable features that enables real-time, high-quality 3D head animation with expression control.

Findings

01 Achieves real-time inference at high frame rates

02 Surpasses baseline quality by up to 2 dB

03 Accelerates rendering speed by over 10 times

Abstract

3D head animation has seen major quality and runtime improvements over the last few years, particularly empowered by the advances in differentiable rendering and neural radiance fields. Real-time rendering is a highly desirable goal for real-world applications. We propose HeadGaS, a model that uses 3D Gaussian Splats (3DGS) for 3D head reconstruction and animation. In this paper we introduce a hybrid model that extends the explicit 3DGS representation with a base of learnable latent features, which can be linearly blended with low-dimensional parameters from parametric head models to obtain expression-dependent color and opacity values. We demonstrate that HeadGaS delivers state-of-the-art results at real-time inference frame rates, surpassing baselines by up to 2 dB while accelerating rendering speed by more than 10×.
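The linear blending described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the array shapes, the function names `blend_features` and `decode_color_opacity`, and the single linear layer followed by a sigmoid used as a decoder are all assumptions made for illustration. The only idea taken from the abstract is that each Gaussian carries a base of latent features that is combined linearly with low-dimensional expression parameters, and that the result yields expression-dependent color and opacity.

```python
import numpy as np


def sigmoid(x):
    """Squash values into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-x))


def blend_features(feature_basis, expression):
    """Linearly blend each Gaussian's feature basis with expression weights.

    feature_basis: (N, B, F) array, one (B, F) basis per Gaussian (assumed layout).
    expression:    (B,) array of expression parameters from a parametric head model.
    Returns an (N, F) array with one blended feature vector per Gaussian.
    """
    return np.einsum("nbf,b->nf", feature_basis, expression)


def decode_color_opacity(features, w_color, w_opacity):
    """Hypothetical decoder: one linear map plus sigmoid per output.

    This stands in for whatever decoder the paper uses; it is an assumption.
    """
    color = sigmoid(features @ w_color)      # (N, 3) RGB values
    opacity = sigmoid(features @ w_opacity)  # (N, 1) opacity values
    return color, opacity


# Toy sizes chosen only for the example.
rng = np.random.default_rng(0)
num_gaussians, num_expr, feat_dim = 4, 5, 8
feature_basis = rng.normal(size=(num_gaussians, num_expr, feat_dim))
expression = rng.normal(size=num_expr)
w_color = rng.normal(size=(feat_dim, 3))
w_opacity = rng.normal(size=(feat_dim, 1))

features = blend_features(feature_basis, expression)
color, opacity = decode_color_opacity(features, w_color, w_opacity)
print(color.shape, opacity.shape)
```

Because the blend is linear, changing the expression vector only re-weights a fixed per-Gaussian basis, which keeps the per-frame cost low; in a trained model the basis and decoder weights would be learned rather than sampled at random as they are here.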

Taxonomy

Topics: Face recognition and analysis · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Balanced Selection