NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior
Gihoon Kim, Kwanggyoon Seo, Sihun Cha, Junyong Noh

TL;DR
NeRFFaceSpeech synthesizes high-quality, 3D-consistent, audio-driven talking heads from a single image, using generative priors rather than the large per-identity audio-visual datasets that NeRF-based methods typically require.
Contribution
The paper combines NeRF with generative models to achieve 3D-consistent facial animation from a single image, and introduces a spatial synchronization technique and LipaintNet for inner-mouth detail.
Findings
Outperforms previous methods in 3D consistency and quality.
Introduces a quantitative measure of robustness to pose variations.
Enables high-quality, one-shot audio-driven 3D head synthesis.
Abstract
Audio-driven talking head generation is advancing from 2D to 3D content. Notably, Neural Radiance Field (NeRF) is in the spotlight as a means to synthesize high-quality 3D talking head outputs. Unfortunately, this NeRF-based approach typically requires a large amount of paired audio-visual data for each identity, thereby limiting the scalability of the method. Although there have been attempts to generate audio-driven 3D talking head animations with a single image, the results are often unsatisfactory due to insufficient information on obscured regions in the image. In this paper, we mainly focus on addressing the overlooked aspect of 3D consistency in the one-shot, audio-driven domain, where facial animations are synthesized primarily in front-facing perspectives. We propose a novel method, NeRFFaceSpeech, which can produce high-quality 3D-aware talking heads. Using prior…
Taxonomy
Topics: Face recognition and analysis
Methods: Focus
