Text-Driven Emotionally Continuous Talking Face Generation

Hao Yang; Yanyan Zhao; Tian Zheng; Hongbo Zhang; Bichen Wang; Di Wu; Xing Fu; Xuda Zhi; Yongbo Huang; Hao He

arXiv:2603.06071·cs.CV·March 9, 2026

TL;DR

This paper introduces a novel task and model for generating talking face videos that reflect continuous emotional changes driven by text and emotion descriptions, improving realism and emotional expressiveness.

Contribution

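The paper proposes the Emotionally Continuous Talking Face Generation (EC-TFG) task and the Temporal-Intensive Emotion Modulated Talking Face Generation (TIE-TFG) model, which let a generated talking face change its emotional expression over the course of a video, driven by the input text and emotion description.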

Findings

01. Produces smooth emotion transitions in videos

02. Maintains high-quality visuals and motion authenticity

03. Handles diverse emotional states effectively

Abstract

Talking Face Generation (TFG) strives to create realistic and emotionally expressive digital faces. While previous TFG works have mastered the creation of naturalistic facial movements, they typically express a fixed target emotion in synthetic videos and lack the ability to exhibit continuously changing and natural expressions like humans do when conveying information. To synthesize realistic videos, we propose a novel task called Emotionally Continuous Talking Face Generation (EC-TFG), which takes a text segment and an emotion description with varying emotions as driving data, aiming to generate a video where the person speaks the text while reflecting the emotional changes within the description. Alongside this, we introduce a customized model, i.e., Temporal-Intensive Emotion Modulated Talking Face Generation (TIE-TFG), which innovatively manages dynamic emotional variations by…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Emotion and Mood Recognition