Instant Expressive Gaussian Head Avatar via 3D-Aware Expression Distillation

Kaiwen Jiang; Xueting Li; Seonwook Park; Ravi Ramamoorthi; Shalini De Mello; Koki Nagano

arXiv:2512.16893·cs.CV·December 19, 2025

TL;DR

This paper introduces a fast, expressive, 3D-consistent portrait animation method that distills knowledge from a 2D diffusion-based method into a feed-forward encoder, enabling real-time, high-quality animation from a single image.

Contribution

It presents a novel distillation approach that combines the speed of 3D-aware methods with the expressive detail of 2D diffusion models, avoiding reliance on parametric face models.

Findings

01. Runs at 107.31 FPS for animation and pose control

02. Achieves comparable quality to state-of-the-art methods

03. Uses an efficient local fusion strategy for 3D structural and animation information
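
To put the reported throughput in context, the short sketch below converts frames per second into per-frame latency and compares it against a 30 FPS display budget. It assumes the 107.31 FPS figure is steady-state throughput for one frame at a time; the hardware and measurement setup are not stated in this summary, and the function name `frame_time_ms` and the 30 FPS reference are illustrative choices, not taken from the paper.

```python
def frame_time_ms(fps: float) -> float:
    """Convert throughput (frames per second) to per-frame latency in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Figure quoted in the findings above; measurement conditions are not given here.
reported_fps = 107.31

# Hypothetical reference point: a 30 FPS real-time display budget.
reference_fps = 30.0

per_frame = frame_time_ms(reported_fps)
budget = frame_time_ms(reference_fps)
headroom = reported_fps / reference_fps

print(f"{per_frame:.2f} ms per frame")        # 9.32 ms per frame
print(f"{budget:.2f} ms budget at 30 FPS")    # 33.33 ms budget at 30 FPS
print(f"{headroom:.2f}x faster than 30 FPS")  # 3.58x faster than 30 FPS
```

Under these assumptions, each frame takes roughly 9.3 ms, which leaves about 24 ms of a 30 FPS budget for capture, transfer, and display.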

Abstract

Portrait animation has witnessed tremendous quality improvements thanks to recent advances in video diffusion models. However, these 2D methods often compromise 3D consistency and speed, limiting their applicability in real-world scenarios, such as digital twins or telepresence. In contrast, 3D-aware feed-forward facial animation methods -- built upon explicit 3D representations, such as neural radiance fields or Gaussian splatting -- ensure 3D consistency and achieve faster inference, but produce inferior expression details. In this paper, we aim to combine their strengths by distilling knowledge from a 2D diffusion-based method into a feed-forward encoder, which instantly converts an in-the-wild single image into a 3D-consistent, fast yet expressive animatable representation. Our animation representation is decoupled from the face's 3D representation and learns motion…

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Emotion and Mood Recognition