EDTalk++: Full Disentanglement for Controllable Talking Head Synthesis

Shuai Tan; Bin Ji

arXiv:2508.13442 · cs.CV · August 20, 2025

TL;DR

EDTalk++ introduces a full disentanglement framework for controllable talking head synthesis. It allows mouth shape, head pose, eye movement, and emotional expression to be manipulated independently, conditioned on video or audio inputs.

Contribution

The paper presents a full disentanglement approach for controllable talking head generation. It combines four lightweight modules, one each for mouth, pose, eye, and expression, with orthogonality constraints and an audio-to-motion module.
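To make the idea of orthogonality constraints across four latent spaces concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: this summary does not give the actual loss, module architecture, or dimensions, so the names `orthogonality_penalty`, `make_orthonormal_banks`, and `coefficients`, the bank size `k`, and the feature dimension `d` are all hypothetical. The sketch treats each facial component as a small bank of basis vectors and penalizes any overlap between banks.

```python
import numpy as np

# Hypothetical component names, taken from the four spaces listed in the summary.
COMPONENTS = ("mouth", "pose", "eye", "expression")


def orthogonality_penalty(banks):
    """Squared Frobenius distance between the Gram matrix of all stacked
    basis vectors and the identity. It is zero only when every basis vector
    has unit norm and is orthogonal to all others, within and across banks."""
    stacked = np.concatenate([banks[name] for name in COMPONENTS], axis=0)  # (4k, d)
    gram = stacked @ stacked.T
    return float(np.sum((gram - np.eye(gram.shape[0])) ** 2))


def make_orthonormal_banks(k, d, seed=0):
    """Build four mutually orthogonal banks of k basis vectors in R^d via QR.
    A trained model would instead learn the banks under the penalty above."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((d, len(COMPONENTS) * k)))
    rows = q.T  # (4k, d), orthonormal rows
    return {name: rows[i * k:(i + 1) * k] for i, name in enumerate(COMPONENTS)}


def coefficients(banks, motion):
    """Project a motion feature onto each bank to read off its coefficients."""
    return {name: banks[name] @ motion for name in COMPONENTS}


banks = make_orthonormal_banks(k=3, d=32)
motion = np.random.default_rng(1).standard_normal(32)
before = coefficients(banks, motion)

# Edit only the mouth: add a combination of mouth basis vectors.
mouth_edit = np.array([0.5, -1.0, 2.0])
edited = motion + mouth_edit @ banks["mouth"]
after = coefficients(banks, edited)
```

With orthogonal banks, the mouth edit shifts only the mouth coefficients, while the pose, eye, and expression coefficients stay the same. That is the "no mutual interference" property the constraint is meant to encourage.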

Findings

01. Effective disentanglement of facial features demonstrated

02. Independent control of mouth, pose, eye, and expression achieved

03. Enhanced realism and flexibility in talking head synthesis

Abstract

Disentangled control over multiple facial motions, together with support for diverse input modalities, greatly enhances the applicability and entertainment value of talking head generation. This calls for a deep exploration of the decoupling space for facial features, ensuring that the features a) operate independently without mutual interference and b) can be preserved and shared across inputs of different modalities, two aspects often neglected in existing methods. To address this gap, this paper proposes EDTalk++, a novel full disentanglement framework for controllable talking head generation. Our framework enables individual manipulation of mouth shape, head pose, eye movement, and emotional expression, conditioned on video or audio inputs. Specifically, we employ four lightweight modules to decompose the facial dynamics into four distinct latent spaces representing mouth, pose, eye, and expression,…
