CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation

Xiangyu Liang; Wenlin Zhuang; Tianyong Wang; Guangxing Geng; Guangyue Geng; Haifeng Xia; Siyu Xia

arXiv:2404.18604·cs.CV·April 30, 2024

Open Access

TL;DR

CSTalk is a novel method that models correlations among facial regions and supervises training to generate realistic, emotion-conforming 3D facial animations driven by speech, addressing naturalness and expressiveness issues.

Contribution

The paper introduces CSTalk, a correlation-supervised generative approach that improves naturalness and emotional expressiveness in speech-driven 3D facial animation.

Findings

01 Outperforms existing state-of-the-art methods
02 Generates more natural and expressive facial animations
03 Effectively models correlations among facial regions

Abstract

Speech-driven 3D facial animation technology has been developed for years, but its practical application still falls short of expectations. The main challenges lie in data limitations, lip alignment, and the naturalness of facial expressions. Although lip alignment has been widely studied, existing methods struggle to synthesize natural and realistic expressions, so the resulting facial animations look mechanical and stiff. Even though some research extracts emotional features from speech, the randomness of facial movements limits how effectively emotions are expressed. To address this issue, this paper proposes a method called CSTalk (Correlation Supervised) that models the correlations among different regions of facial movements and supervises the training of the generative model so that it generates realistic expressions conforming to human facial motion patterns. To generate more…
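The abstract does not say how the correlations are modeled or which loss supervises training, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes each animation frame is a list of facial control values (one channel per facial region), measures co-movement with Pearson correlation, and penalizes the squared difference from a reference correlation matrix. The names `pearson_matrix` and `correlation_loss` and the toy data are hypothetical.

```python
import math

def pearson_matrix(frames):
    """Pearson correlation between channels.

    frames: list of frames, each a list of per-channel values (assumed layout).
    Pairs involving a zero-variance channel are left at 0.0.
    """
    n_frames = len(frames)
    n_ch = len(frames[0])
    means = [sum(f[c] for f in frames) / n_frames for c in range(n_ch)]
    centered = [[f[c] - means[c] for c in range(n_ch)] for f in frames]
    norms = [math.sqrt(sum(row[c] ** 2 for row in centered)) for c in range(n_ch)]
    corr = [[0.0] * n_ch for _ in range(n_ch)]
    for i in range(n_ch):
        for j in range(n_ch):
            denom = norms[i] * norms[j]
            if denom > 0.0:
                dot = sum(row[i] * row[j] for row in centered)
                corr[i][j] = dot / denom
    return corr

def correlation_loss(pred_frames, ref_corr):
    """Mean squared difference between predicted and reference correlations.

    This penalty is an assumption made for illustration only.
    """
    pred_corr = pearson_matrix(pred_frames)
    n_ch = len(ref_corr)
    total = sum(
        (pred_corr[i][j] - ref_corr[i][j]) ** 2
        for i in range(n_ch)
        for j in range(n_ch)
    )
    return total / (n_ch * n_ch)

# Toy data: 4 frames, 2 channels that move together in the reference motion.
reference = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
ref_corr = pearson_matrix(reference)

coherent = [[0.0, 0.0], [0.5, 1.0], [1.0, 2.0], [1.5, 3.0]]    # same co-movement
incoherent = [[0.0, 3.0], [1.0, 2.0], [2.0, 1.0], [3.0, 0.0]]  # opposite co-movement

loss_coherent = correlation_loss(coherent, ref_corr)
loss_incoherent = correlation_loss(incoherent, ref_corr)
```

In this toy setup the motion whose channels move against the reference pattern receives the larger penalty, which is the kind of signal a correlation-based supervision term could add alongside a standard reconstruction loss.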

Taxonomy

Topics: Face recognition and analysis

Methods: Sparse Evolutionary Training