Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

Jiahe Li; Jiawei Zhang; Xiao Bai; Jun Zhou; Lin Gu

arXiv:2307.09323 · cs.CV · August 25, 2023 · 5 cites

TL;DR

This paper introduces ER-NeRF, a region-aware neural radiance field architecture for fast, high-fidelity talking portrait synthesis. It achieves real-time rendering with a small model size by exploiting the unequal contribution of spatial regions and by connecting audio features to those regions explicitly.

Contribution

The paper proposes a novel ER-NeRF architecture with a Tri-Plane Hash Representation, a Region Attention Module, and Adaptive Pose Encoding for improved talking portrait synthesis.

Findings

01

Synthesizes high-fidelity talking portrait videos with state-of-the-art quality.

02

Enables real-time rendering with small model size.

03

Demonstrates superior performance in accuracy and efficiency.

Abstract

This paper presents ER-NeRF, a novel conditional Neural Radiance Fields (NeRF) based architecture for talking portrait synthesis that can concurrently achieve fast convergence, real-time rendering, and state-of-the-art performance with small model size. Our idea is to explicitly exploit the unequal contribution of spatial regions to guide talking portrait modeling. Specifically, to improve the accuracy of dynamic head reconstruction, a compact and expressive NeRF-based Tri-Plane Hash Representation is introduced by pruning empty spatial regions with three planar hash encoders. For speech audio, we propose a Region Attention Module to generate a region-aware condition feature via an attention mechanism. Different from existing methods that utilize an MLP-based encoder to learn the cross-modal relation implicitly, the attention mechanism builds an explicit connection between audio features…
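To make the "three planar hash encoders" idea concrete, here is a minimal pure-Python sketch, not the authors' implementation. It assumes a 3D point with coordinates in [0, 1] is projected onto the XY, YZ, and XZ planes, that each projection is encoded by its own small 2D hash grid with bilinear interpolation, and that the three plane features are concatenated. The class names, the single resolution level, the table size, and the hashing prime are all hypothetical choices made for illustration.

```python
import random

# Large prime used to mix the second grid index (an assumed, commonly used choice).
HASH_PRIME = 2654435761


class PlaneHashEncoder:
    """Toy single-resolution 2D hash grid (hypothetical, for illustration only)."""

    def __init__(self, resolution=16, table_size=64, feat_dim=2, seed=0):
        rng = random.Random(seed)
        self.resolution = resolution
        self.table_size = table_size
        self.feat_dim = feat_dim
        # In a real model these entries would be learnable parameters.
        self.table = [
            [rng.uniform(-1e-4, 1e-4) for _ in range(feat_dim)]
            for _ in range(table_size)
        ]

    def _lookup(self, ix, iy):
        # Hash the integer grid vertex into the fixed-size feature table.
        return self.table[(ix ^ (iy * HASH_PRIME)) % self.table_size]

    def encode(self, u, v):
        # Map [0, 1] coordinates to grid space and find the enclosing cell.
        x = u * (self.resolution - 1)
        y = v * (self.resolution - 1)
        x0 = min(int(x), self.resolution - 2)
        y0 = min(int(y), self.resolution - 2)
        tx, ty = x - x0, y - y0
        # Bilinearly interpolate the features of the four cell corners.
        out = [0.0] * self.feat_dim
        for dx, wx in ((0, 1.0 - tx), (1, tx)):
            for dy, wy in ((0, 1.0 - ty), (1, ty)):
                feat = self._lookup(x0 + dx, y0 + dy)
                for k in range(self.feat_dim):
                    out[k] += wx * wy * feat[k]
        return out


class TriPlaneHashEncoder:
    """Three independent 2D hash encoders, one per axis-aligned plane."""

    def __init__(self, **plane_kwargs):
        self.xy = PlaneHashEncoder(seed=0, **plane_kwargs)
        self.yz = PlaneHashEncoder(seed=1, **plane_kwargs)
        self.xz = PlaneHashEncoder(seed=2, **plane_kwargs)

    def encode(self, x, y, z):
        # Concatenate the per-plane features (assumed fusion rule).
        return self.xy.encode(x, y) + self.yz.encode(y, z) + self.xz.encode(x, z)


encoder = TriPlaneHashEncoder(feat_dim=2)
feature = encoder.encode(0.2, 0.5, 0.9)
print(len(feature))  # three planes times feat_dim values
```

In this sketch each table is indexed by only two coordinates, so the XY feature of a point does not change when the point moves along z. The resulting vector would then be passed, together with the audio condition, to the downstream NeRF network, which is outside the scope of this example.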

Code & Models

Repositories

fictionarry/er-nerf (PyTorch, official)

Videos

Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis · YouTube

Taxonomy

Topics: Advanced Vision and Imaging · Human Motion and Animation · Face recognition and analysis

Methods: Pruning