Audio-Driven 3D Facial Animation from In-the-Wild Videos

Liying Lu; Tianke Zhang; Yunfei Liu; Xuangeng Chu; Yu Li

arXiv:2306.11541 · cs.CV · June 21, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a novel audio-driven 3D facial animation method that leverages in-the-wild 2D videos for training, resulting in improved generalization, lip synchronization, and personalized speaking styles.

Contribution

The method combines abundant 2D talking-head videos with existing 3D face reconstruction methods to train an audio-driven 3D facial animation model, and it outperforms prior approaches that train on limited public 3D datasets.
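As a rough illustration of how 2D video could supply 3D training data, the sketch below pairs each video frame's reconstructed face parameters with the audio samples that cover that frame. It is a minimal sketch under stated assumptions, not the paper's pipeline: `build_training_pairs`, `toy_reconstruct`, the parameter format, and the frame-aligned windowing are all hypothetical choices made for this example.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]    # stand-in for an image (assumption for this sketch)
FaceParams = List[float]   # stand-in for per-frame 3D face coefficients


def build_training_pairs(
    audio: Sequence[float],
    frames: Sequence[Frame],
    sample_rate: int,
    fps: float,
    reconstruct: Callable[[Frame], FaceParams],
) -> List[Tuple[List[float], FaceParams]]:
    """Pair each frame's reconstructed 3D parameters with its audio window."""
    window_len = int(round(sample_rate / fps))
    pairs: List[Tuple[List[float], FaceParams]] = []
    for i, frame in enumerate(frames):
        # Recompute the start from the frame index to avoid cumulative drift.
        start = int(round(i * sample_rate / fps))
        window = list(audio[start:start + window_len])
        if len(window) < window_len:
            break  # drop trailing frames that lack a full audio window
        pairs.append((window, reconstruct(frame)))
    return pairs


def toy_reconstruct(frame: Frame) -> FaceParams:
    # Placeholder only: a real pipeline would run a monocular 3D face
    # reconstruction model on the frame here.
    return [sum(frame) / len(frame)]


if __name__ == "__main__":
    audio = [0.0] * 16000        # one second of silence at 16 kHz
    frames = [[1.0, 3.0]] * 30   # 30 dummy frames at 25 fps
    pairs = build_training_pairs(audio, frames, 16000, 25.0, toy_reconstruct)
    print(len(pairs), len(pairs[0][0]))
```

The resulting (audio window, face parameters) pairs are the kind of supervision an audio-to-3D model could be trained on. Because the targets come from reconstruction rather than 3D scans, their quality depends on the reconstruction method used.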

Findings

01. Outperforms existing methods in lip synchronization quality.

02. Effectively captures individual speaking styles.

03. Demonstrates superior generalization on diverse videos.

Abstract

Given an arbitrary audio clip, audio-driven 3D facial animation aims to generate lifelike lip motions and facial expressions for a 3D head. Existing methods typically rely on training their models using limited public 3D datasets that contain a restricted number of audio-3D scan pairs. Consequently, their generalization capability remains limited. In this paper, we propose a novel method that leverages in-the-wild 2D talking-head videos to train our 3D facial animation model. The abundance of easily accessible 2D talking-head videos equips our model with a robust generalization capability. By combining these videos with existing 3D face reconstruction methods, our model excels in generating consistent and high-fidelity lip synchronization. Additionally, our model proficiently captures the speaking styles of different individuals, allowing it to generate 3D talking-heads with distinct…

Taxonomy

Topics: Face recognition and analysis · Speech and Audio Processing