DT-NeRF: Decomposed Triplane-Hash Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

Yaoyu Su; Shaohui Wang; Haoqian Wang

arXiv:2309.07752 · cs.CV · September 15, 2023

Open Access

TL;DR

DT-NeRF is a decomposed triplane-hash neural radiance field framework for photorealistic talking portrait synthesis. It represents the mouth and the broader face with separate triplanes, integrates audio features through an audio-mouth-face transformer, and enriches the volumetric representation of the whole face through additive NeRF volumetric rendering.

Contribution

The paper proposes a novel decomposed triplane-hash NeRF architecture with specialized facial region representations and audio integration for high-fidelity talking portrait synthesis.

Findings

01

Achieves state-of-the-art photorealistic rendering of talking faces.

02

Decomposes the facial region into two specialized triplanes, one for the mouth and one for the broader facial features.

03

Demonstrates superior performance on key evaluation datasets.

Abstract

In this paper, we present the decomposed triplane-hash neural radiance fields (DT-NeRF), a framework that significantly improves the photorealistic rendering of talking faces and achieves state-of-the-art results on key evaluation datasets. Our architecture decomposes the facial region into two specialized triplanes: one for the mouth and the other for the broader facial features. We introduce audio features as residual terms and integrate them as query vectors into our model through an audio-mouth-face transformer. Additionally, our method leverages the capabilities of Neural Radiance Fields (NeRF) to enrich the volumetric representation of the entire face through additive volumetric rendering techniques. Comprehensive experimental evaluations corroborate the effectiveness and superiority of our proposed approach.
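
To make the two-triplane decomposition and the additive volumetric rendering mentioned in the abstract more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes dense feature planes in place of hash-encoded ones, sums the features sampled from the three planes, uses a hypothetical linear decoder (`decode`), and interprets "additive" as summing the densities of a mouth field and a face field and blending their colors by density before standard NeRF compositing. The audio-mouth-face transformer is left out entirely. All function names, shapes, and parameters are illustrative assumptions.

```python
import numpy as np


def bilinear_sample(plane, u, v):
    """Bilinearly sample a (R, R, C) feature plane at coordinates u, v in [0, 1]."""
    res = plane.shape[0]
    x = np.clip(u, 0.0, 1.0) * (res - 1)
    y = np.clip(v, 0.0, 1.0) * (res - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, res - 1)
    y1 = np.minimum(y0 + 1, res - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = plane[x0, y0] * (1 - wx) + plane[x1, y0] * wx
    bottom = plane[x0, y1] * (1 - wx) + plane[x1, y1] * wx
    return top * (1 - wy) + bottom * wy


def triplane_features(planes, pts):
    """planes: (3, R, R, C); pts: (N, 3) in [0, 1]. Sums XY, XZ and YZ features."""
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    return (bilinear_sample(planes[0], x, y)
            + bilinear_sample(planes[1], x, z)
            + bilinear_sample(planes[2], y, z))


def decode(feats, w_sigma, w_rgb):
    """Hypothetical linear decoder: softplus density, sigmoid color."""
    sigma = np.log1p(np.exp(feats @ w_sigma))
    rgb = 1.0 / (1.0 + np.exp(-(feats @ w_rgb)))
    return sigma, rgb


def composite_additive(sigmas, colors, deltas):
    """Composite several fields along one ray (assumed additive scheme).

    sigmas: list of (N,) densities; colors: list of (N, 3); deltas: (N,) step sizes.
    """
    sigma = sum(sigmas)                                   # densities add
    weighted = sum(s[:, None] * c for s, c in zip(sigmas, colors))
    color = weighted / np.maximum(sigma, 1e-8)[:, None]   # density-weighted color
    alpha = 1.0 - np.exp(-sigma * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans                               # standard NeRF weights
    return (weights[:, None] * color).sum(axis=0), weights


# Toy usage: one ray through a mouth triplane and a face triplane.
rng = np.random.default_rng(0)
C, R, N = 4, 8, 16
mouth_planes = rng.normal(size=(3, R, R, C))
face_planes = rng.normal(size=(3, R, R, C))
w_sigma, w_rgb = rng.normal(size=C), rng.normal(size=(C, 3))
pts = np.linspace([0.1, 0.2, 0.0], [0.9, 0.8, 1.0], N)   # N samples along the ray
deltas = np.full(N, 1.0 / N)

sig_m, rgb_m = decode(triplane_features(mouth_planes, pts), w_sigma, w_rgb)
sig_f, rgb_f = decode(triplane_features(face_planes, pts), w_sigma, w_rgb)
pixel, weights = composite_additive([sig_m, sig_f], [rgb_m, rgb_f], deltas)
```

Summing densities keeps the composite a valid volume: the per-sample weights stay non-negative and sum to at most one, so regions covered by both fields are blended, not double-counted. Whether DT-NeRF combines its fields in exactly this way is not stated in the abstract.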

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging