EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions

Zhiyuan Chen; Jiajiong Cao; Zhiquan Chen; Yuming Li; Chenguang Ma

arXiv:2407.08136 · cs.CV · July 15, 2024 · 2 cites

TL;DR

EchoMimic is a portrait animation method trained jointly on audio and facial landmarks, which yields more stable and natural-looking video than methods driven by either signal alone.

Contribution

It introduces a training strategy that concurrently uses audio and facial landmarks, enabling flexible and improved portrait animation.

Findings

01. Outperforms existing methods on quantitative metrics
02. Produces more natural and stable portrait videos
03. Works effectively with combined audio and landmark inputs

Abstract

Portrait image animation driven by audio input has made notable progress in generating lifelike, dynamic portraits. Conventional methods drive an image into video using either audio or facial key points alone. Although they can yield satisfactory results, certain issues remain. Methods driven solely by audio can be unstable at times because the audio signal is relatively weak, while methods driven exclusively by facial key points, although more stable, can produce unnatural results because the key point information exerts excessive control. To address these challenges, this paper introduces a novel approach named EchoMimic. EchoMimic is trained concurrently on both audio and facial landmarks. Through a novel training strategy, EchoMimic is capable of…

Code & Models

Repositories

antgroup/echomimic
PyTorch · Official

Videos

EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions · underline

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Digital Humanities and Scholarship · Subtitles and Audiovisual Media