Expressive Speech-driven Facial Animation with controllable emotions

Yutong Chen; Junhong Zhao; Wei-Qiang Zhang

arXiv:2301.02008 · cs.CV · May 8, 2025 · 1 cites

TL;DR

This paper introduces a deep learning approach for speech-driven facial animation that allows controllable emotional expressions, achieving realistic lip sync and expressive diversity, surpassing existing methods.

Contribution

A novel emotion controller module enabling continuous and flexible emotion control in speech-driven facial animation.

Findings

01 The generated animations exhibit rich emotional expressiveness.

02 The method maintains accurate lip synchronization.

03 The method outperforms state-of-the-art methods in qualitative and quantitative evaluations.

Abstract

Realistic facial animation is in high demand, but generating it remains a challenging task. Existing approaches to speech-driven facial animation can produce satisfactory mouth movement and lip synchronization, but they are weak at dramatic emotional expression and offer little flexibility in emotion control. This paper presents a novel deep learning-based approach for generating expressive facial animation from speech, which can exhibit a wide spectrum of facial expressions with controllable emotion type and intensity. We propose an emotion controller module that learns the relationship between emotion variations (e.g., type and intensity) and the corresponding facial expression parameters. It enables emotion-controllable facial animation, in which the target expression can be continuously adjusted as desired. The qualitative and quantitative evaluations show that the animation generated by…
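To make the idea of continuously adjustable emotion type and intensity concrete, here is a minimal sketch in plain Python. It is not the authors' emotion controller, which is a learned module; the emotion labels, the 4-dimensional expression space, the offset values, and the function names `emotion_condition` and `apply_emotion` are all hypothetical and chosen only for illustration. The sketch encodes the emotion as a one-hot vector scaled by intensity and adds an intensity-weighted offset to speech-driven expression parameters.

```python
from typing import Dict, List

# Hypothetical emotion labels and a toy 4-dimensional expression space.
# Neither the labels nor the numbers come from the paper.
EMOTION_OFFSETS: Dict[str, List[float]] = {
    "happy": [0.8, 0.1, 0.0, 0.3],
    "sad": [-0.5, 0.6, 0.2, 0.0],
    "angry": [-0.2, -0.7, 0.5, 0.1],
}


def emotion_condition(emotion: str, intensity: float) -> List[float]:
    """Encode (type, intensity) as a one-hot vector scaled by intensity."""
    if emotion not in EMOTION_OFFSETS:
        raise ValueError(f"unknown emotion: {emotion}")
    level = min(max(intensity, 0.0), 1.0)  # clamp intensity to [0, 1]
    return [level if name == emotion else 0.0 for name in EMOTION_OFFSETS]


def apply_emotion(speech_expr: List[float], condition: List[float]) -> List[float]:
    """Add an intensity-weighted emotion offset to speech-driven parameters.

    A fixed linear blend stands in for the learned mapping (an assumption).
    """
    result = list(speech_expr)
    for weight, offsets in zip(condition, EMOTION_OFFSETS.values()):
        for i, offset in enumerate(offsets):
            result[i] += weight * offset
    return result
```

Sweeping `intensity` from 0 to 1 moves the output smoothly from the speech-only parameters toward the full emotion offset, which is the kind of continuous adjustment the abstract describes. A learned controller would replace the fixed offset table with a trained network.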

Code & Models

Repositories

on1262/facialanimation
PyTorch · Official

Taxonomy

Topics: Face recognition and analysis · Speech and Audio Processing