EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis

Shuai Tan; Bin Ji; Mengxiao Bi; Ye Pan

arXiv:2404.01647 · cs.CV · April 3, 2024 · 1 citation

TL;DR

EDTalk introduces a novel framework for disentangled control of facial motions in talking head synthesis, enabling independent manipulation of mouth, pose, and expression conditioned on audio or video inputs, with improved efficiency and flexibility.

Contribution

The paper proposes a lightweight, orthogonality-enforced disentanglement framework that allows independent control of facial features and lets the learned priors be shared with audio-driven synthesis.
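To make the idea of orthogonality-enforced disentanglement concrete, here is a minimal linear-algebra sketch. It is not the paper's implementation: the function names, the basis sizes, and the penalty form are all assumptions chosen for illustration, and this listing does not describe the actual loss or network. The sketch only shows why mutually orthogonal bases let one component (here, mouth) be swapped without disturbing the others.

```python
import numpy as np


def orthogonality_penalty(bases):
    """Squared Frobenius distance between the Gram matrix of all basis
    vectors and the identity. Zero when every vector is unit length and
    orthogonal to every other, within and across spaces.
    (Hypothetical penalty, not taken from the paper.)"""
    stacked = np.concatenate(bases, axis=0)  # (total_vectors, dim)
    gram = stacked @ stacked.T
    identity = np.eye(stacked.shape[0])
    return float(np.sum((gram - identity) ** 2))


def project(latent, basis):
    """Coefficients of `latent` along each row of `basis`."""
    return basis @ latent


def swap_component(target, source, basis):
    """Replace the part of `target` lying in span(basis) with the
    corresponding part of `source`."""
    own = basis.T @ project(target, basis)
    new = basis.T @ project(source, basis)
    return target - own + new


# Hypothetical sizes: a 16-d latent with 4 mouth, 3 pose, 5 expression directions.
dim = 16
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))  # orthonormal columns
mouth_basis = q[:, :4].T
pose_basis = q[:, 4:7].T
expr_basis = q[:, 7:12].T

target_latent = rng.standard_normal(dim)
source_latent = rng.standard_normal(dim)

# Take the mouth component from the source, keep everything else from the target.
edited = swap_component(target_latent, source_latent, mouth_basis)
```

With orthonormal bases, `project(edited, mouth_basis)` matches the source while the pose and expression coefficients of `edited` stay equal to those of the target. With non-orthogonal bases the same swap would leak into the other components, which is the interference an orthogonality constraint is meant to prevent.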

Findings

01. Effective disentanglement of facial motions demonstrated
02. Independent control of mouth, pose, and expression achieved
03. Enhanced synthesis quality and flexibility

Abstract

Achieving disentangled control over multiple facial motions and accommodating diverse input modalities greatly enhances the applicability and entertainment value of talking head generation. This necessitates a deep exploration of the decoupling space for facial features, ensuring that they a) operate independently without mutual interference and b) can be preserved and shared across inputs of different modalities, both aspects often neglected in existing methods. To address this gap, this paper proposes a novel Efficient Disentanglement framework for Talking head generation (EDTalk). Our framework enables individual manipulation of mouth shape, head pose, and emotional expression, conditioned on video or audio inputs. Specifically, we employ three lightweight modules to decompose the facial dynamics into three distinct latent spaces representing mouth, pose, and expression, respectively. Each space is…

Code & Models

Models

🤗 tanshuai0219/EDTalk · model · ♡ 3

Taxonomy

Topics: Handwritten Text Recognition Techniques · Face recognition and analysis

Methods: Sparse Evolutionary Training