DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering

Shunyu Yao; RuiZhe Zhong; Yichao Yan; Guangtao Zhai; Xiaokang Yang

arXiv:2201.00791 · cs.CV · January 4, 2022 · 43 cites

Open Access

TL;DR

DFA-NeRF introduces a neural radiance field framework that disentangles lip movements from other facial attributes, enabling high-fidelity, personalized talking head generation synchronized with audio.

Contribution

The paper presents a novel neural radiance field approach that separately models lip movements and personalized attributes, improving realism and synchronization in talking head synthesis.

Findings

01. Outperforms state-of-the-art methods on benchmark datasets.

02. Achieves high lip synchronization accuracy.

03. Generates natural head movements and eye blinks.

Abstract

While recent advances in deep neural networks have made it possible to render high-quality images, generating photo-realistic and personalized talking heads remains challenging. Given an input audio, the key to tackling this task is to synchronize lip movement while simultaneously generating personalized attributes such as head movement and eye blinks. In this work, we observe that the input audio is highly correlated with lip motion but less correlated with other personalized attributes (e.g., head movements). Inspired by this, we propose a novel framework based on a neural radiance field to pursue high-fidelity and personalized talking head generation. Specifically, the neural radiance field takes lip movement features and personalized attributes as two disentangled conditions, where lip movements are predicted directly from the audio input to achieve lip-synchronized generation. In the meanwhile,…
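To make the idea of "two disentangled conditions" concrete, here is a minimal NumPy sketch of a radiance field queried with a lip feature vector and a separate personalized attribute vector. It is an illustration under stated assumptions, not the architecture from the paper: the class name `ConditionalRadianceField`, the feature sizes `lip_dim` and `attr_dim`, the single hidden layer, and the random (untrained) weights are all hypothetical choices made for this example.

```python
import numpy as np


def positional_encoding(x, num_freqs):
    """NeRF-style encoding: [x, sin(2^k x), cos(2^k x)] for k in [0, num_freqs)."""
    feats = [x]
    for k in range(num_freqs):
        feats.append(np.sin((2.0 ** k) * x))
        feats.append(np.cos((2.0 ** k) * x))
    return np.concatenate(feats, axis=-1)


class ConditionalRadianceField:
    """Toy one-hidden-layer field conditioned on two separate vectors.

    All sizes and weights are assumptions for illustration only.
    """

    def __init__(self, lip_dim=16, attr_dim=8, hidden=64, num_freqs=4, seed=0):
        rng = np.random.default_rng(seed)
        self.num_freqs = num_freqs
        pos_dim = 3 * (1 + 2 * num_freqs)
        in_dim = pos_dim + lip_dim + attr_dim
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w_sigma = rng.normal(0.0, 0.1, (hidden, 1))
        # The color head also sees the viewing direction (3 extra inputs).
        self.w_rgb = rng.normal(0.0, 0.1, (hidden + 3, 3))

    def __call__(self, points, view_dirs, lip_feat, attr_feat):
        n = points.shape[0]
        # The two conditions stay separate inputs and are only concatenated
        # here, so either one can be swapped without touching the other.
        cond = np.concatenate([lip_feat, attr_feat])
        cond = np.broadcast_to(cond, (n, cond.shape[0]))
        x = np.concatenate(
            [positional_encoding(points, self.num_freqs), cond], axis=-1
        )
        h = np.maximum(x @ self.w1 + self.b1, 0.0)          # ReLU
        sigma = np.log1p(np.exp(h @ self.w_sigma))[:, 0]    # softplus, > 0
        logits = np.concatenate([h, view_dirs], axis=-1) @ self.w_rgb
        rgb = 1.0 / (1.0 + np.exp(-logits))                 # sigmoid, in (0, 1)
        return rgb, sigma


field = ConditionalRadianceField()
points = np.zeros((5, 3))                     # 5 sample points along a ray
view_dirs = np.tile([0.0, 0.0, 1.0], (5, 1))  # shared viewing direction
lip = np.zeros(16)    # stand-in for an audio-predicted lip feature
attr = np.zeros(8)    # stand-in for head-pose / blink attributes
rgb, sigma = field(points, view_dirs, lip, attr)
```

In this sketch, driving the same field with a new `lip` vector while holding `attr` fixed changes the output without altering the personalized condition, which is the property the disentangled design is meant to provide.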

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Speech and Audio Processing

Methods: Gaussian Process