Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis

Shuai Shen; Wanhua Li; Zheng Zhu; Yueqi Duan; Jie Zhou; Jiwen Lu

arXiv:2207.11770·cs.CV·July 26, 2022·5 cites

TL;DR

This paper introduces Dynamic Facial Radiance Fields (DFRF), a method for few-shot, high-quality talking head synthesis. DFRF conditions the face radiance field on 2D appearance images and warps faces according to the audio signal, which reduces the amount of training data needed for a new identity.

Contribution

The paper proposes DFRF, which generalizes to unseen identities with minimal data by conditioning on appearance images and using a differentiable face warping module based on audio signals.

Findings

01. DFRF synthesizes natural, high-quality talking head videos with only tens of seconds of data.

02. Adapting DFRF to a new identity requires only 40,000 training iterations.

03. DFRF outperforms existing methods in few-shot talking head synthesis.

Abstract

Talking head synthesis is an emerging technology with wide applications in film dubbing, virtual avatars and online education. Recent NeRF-based methods generate more natural talking videos, as they better capture the 3D structural information of faces. However, a specific model needs to be trained for each identity with a large dataset. In this paper, we propose Dynamic Facial Radiance Fields (DFRF) for few-shot talking head synthesis, which can rapidly generalize to an unseen identity with little training data. Unlike existing NeRF-based methods, which directly encode the 3D geometry and appearance of a specific person into the network, our DFRF conditions the face radiance field on 2D appearance images to learn the face prior. Thus the facial radiance field can be flexibly adjusted to the new identity with few reference images. Additionally, for better modeling of the facial…
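To make "conditioning a radiance field on 2D appearance images" concrete, here is a minimal NumPy sketch of one common way to do it: project each 3D query point into a reference image, look up the pixel feature there, and concatenate it with the point and an audio feature to form the input of a radiance-field network. This is an illustration only. The summary above does not specify how DFRF implements the conditioning, and every name here (`project_points`, `sample_features`, `conditioned_input`, the pinhole camera parameters `K`, `R`, `t`) is hypothetical and not taken from the sstzal/DFRF repository.

```python
import numpy as np

def project_points(points, K, R, t):
    """Project world-space points (N, 3) to pixel coordinates (N, 2).

    Assumes a simple pinhole camera: intrinsics K (3, 3), rotation R (3, 3),
    translation t (3,). All points are assumed to lie in front of the camera.
    """
    cam = points @ R.T + t            # world -> camera coordinates
    uvw = cam @ K.T                   # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide

def sample_features(image, uv):
    """Nearest-neighbor lookup in an (H, W, C) feature map at (N, 2) pixel coords."""
    h, w = image.shape[:2]
    cols = np.clip(np.rint(uv[:, 0]).astype(int), 0, w - 1)  # u indexes columns
    rows = np.clip(np.rint(uv[:, 1]).astype(int), 0, h - 1)  # v indexes rows
    return image[rows, cols]

def conditioned_input(points, audio_feat, image, K, R, t):
    """Build per-point inputs: [xyz | sampled image feature | audio feature].

    A radiance-field MLP (not shown) would map each row to color and density.
    """
    uv = project_points(points, K, R, t)
    pix = sample_features(image, uv)
    audio = np.broadcast_to(audio_feat, (points.shape[0], audio_feat.shape[0]))
    return np.concatenate([points, pix, audio], axis=1)
```

Because the identity-specific appearance enters through the sampled image features rather than through the network weights, swapping the reference image changes the conditioning without retraining from scratch, which is the intuition behind the few-shot claim. A real system would typically use learned feature maps, bilinear sampling, and several reference views instead of raw pixels and nearest-neighbor lookup.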

Code & Models

Repositories

sstzal/DFRF (PyTorch, official)

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition

Methods: Contrastive Language-Image Pre-training