A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

K R Prajwal; Rudrabha Mukhopadhyay; Vinay Namboodiri; C V Jawahar

arXiv:2008.10010 · cs.CV · August 25, 2020

TL;DR

This paper introduces Wav2Lip, a model that improves lip-sync accuracy in unconstrained talking face videos by learning from a powerful lip-sync discriminator, and it proposes new evaluation benchmarks and metrics for measuring lip synchronization.

Contribution

We propose Wav2Lip, a lip-sync model that outperforms existing methods on dynamic, unconstrained talking face videos of arbitrary identities, and we introduce rigorous benchmarks and metrics for measuring lip synchronization.

Findings

01. Wav2Lip achieves near real-video lip-sync accuracy.

02. The model outperforms prior methods on new benchmarks.

03. Extensive evaluations validate the effectiveness of Wav2Lip.

Abstract

In this work, we investigate the problem of lip-syncing a talking face video of an arbitrary identity to match a target speech segment. Current works excel at producing accurate lip movements on a static image or videos of specific people seen during the training phase. However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio. We identify key reasons pertaining to this and hence resolve them by learning from a powerful lip-sync discriminator. Next, we propose new, rigorous evaluation benchmarks and metrics to accurately measure lip synchronization in unconstrained videos. Extensive quantitative evaluations on our challenging benchmarks show that the lip-sync accuracy of the videos generated by our Wav2Lip model is almost as good as real…
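The abstract does not spell out how the generator learns from the lip-sync discriminator. The sketch below is one plausible reading, not the authors' implementation: it assumes the discriminator produces one embedding for a window of video frames and one for the matching audio segment, scores their agreement with cosine similarity, and penalizes low agreement with a negative log loss. The names `cosine_similarity`, `sync_probability`, and `sync_loss`, the clamping constants, and the toy embeddings are all hypothetical.

```python
import math


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / max(norm, eps)


def sync_probability(video_emb, audio_emb, floor=1e-7):
    """Treat the similarity as a probability that the pair is in sync.

    Assumption: the value is clamped into [floor, 1] so that the log
    in sync_loss is always defined.
    """
    return min(max(cosine_similarity(video_emb, audio_emb), floor), 1.0)


def sync_loss(pairs):
    """Mean negative log sync probability over (video_emb, audio_emb) pairs."""
    return sum(-math.log(sync_probability(v, a)) for v, a in pairs) / len(pairs)


# Toy embeddings (made up for illustration): aligned vs. orthogonal pairs.
in_sync = [([1.0, 0.0], [1.0, 0.0])]
out_of_sync = [([1.0, 0.0], [0.0, 1.0])]
loss_in = sync_loss(in_sync)
loss_out = sync_loss(out_of_sync)
```

Under these assumptions, a generator trained to minimize `sync_loss` on its own output frames is pushed toward mouth shapes whose embedding agrees with the audio embedding. In this toy case the aligned pair gives a loss near zero and the orthogonal pair gives a much larger one.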

Code & Models

Videos

Making Talking Memes With Voice DeepFakes! · YouTube