Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation

Hui Fu; Zeqing Wang; Ke Gong; Keze Wang; Tianshui Chen; Haojie Li; Haifeng Zeng; Wenxiong Kang

arXiv:2312.10877 · cs.CV · December 19, 2023 · 2 cites

TL;DR

This paper introduces Mimic, a novel framework for disentangling speaking style and content in speech-driven 3D facial animation, leading to more realistic and diverse facial animations by modeling subject-specific speaking styles.

Contribution

According to the authors, this is the first attempt to explicitly disentangle speaking style from semantic content in facial motion, which enables arbitrary-subject speaking style encoding and more realistic animation.

Findings

01. Outperforms state-of-the-art methods in qualitative and quantitative evaluations.

02. Effectively captures diverse speaking styles across datasets.

03. Enables realistic and style-consistent facial animations.

Abstract

Speech-driven 3D facial animation aims to synthesize vivid facial animations that accurately synchronize with speech and match the unique speaking style. However, existing works primarily focus on achieving precise lip synchronization while neglecting to model the subject-specific speaking style, often resulting in unrealistic facial animations. To the best of our knowledge, this work makes the first attempt to explore the coupled information between the speaking style and the semantic content in facial motions. Specifically, we introduce an innovative speaking style disentanglement method, which enables arbitrary-subject speaking style encoding and leads to a more realistic synthesis of speech-driven facial animations. Subsequently, we propose a novel framework called Mimic to learn disentangled representations of the speaking style and content from facial motions by building…

Taxonomy

Topics: Face recognition and analysis · Speech and Audio Processing · Facial Nerve Paralysis Treatment and Research

Methods: Focus