Talking Face Generation by Conditional Recurrent Adversarial Network
Yang Song, Jingwen Zhu, Dawei Li, Xiaolong Wang, Hairong Qi

TL;DR
This paper introduces a novel conditional recurrent adversarial network for talking face generation that ensures smooth facial movements and accurate lip synchronization across diverse videos, outperforming existing methods.
Contribution
The work presents a new conditional video generation model incorporating temporal dependencies and a multi-task adversarial training scheme for improved realism and lip sync accuracy.
Findings
Outperforms state-of-the-art in visual quality and lip sync accuracy
Ensures smooth facial and lip movements over entire videos
Effective dataset reduction without quality loss
Abstract
Given an arbitrary face image and an arbitrary speech clip, the proposed work attempts to generate a talking face video with accurate lip synchronization while maintaining smooth transitions of both lip and facial movement over the entire video clip. Existing works either do not consider temporal dependency on face images across different video frames, and so easily yield noticeable or abrupt facial and lip movement, or are limited to generating talking face video for a specific person, and so lack generalization capacity. We propose a novel conditional video generation network in which the audio input is treated as a condition for the recurrent adversarial network, so that temporal dependency is incorporated to realize smooth transitions of lip and facial movement. In addition, we deploy a multi-task adversarial training scheme in the context of video generation to improve…
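The idea of treating audio as a per-frame condition on a recurrent generator can be illustrated with a toy sketch. This is not the authors' architecture: the function name `recurrent_generator`, the helper functions, the vector sizes, and the fixed random weights are all assumptions made for illustration, and the adversarial training and discriminators are left out entirely. It only shows how a hidden state carried from frame to frame makes each output depend on earlier audio as well as the current audio.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.5):
    """Build a rows x cols matrix of small random weights (hypothetical helper)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def recurrent_generator(identity_code, audio_features, hidden_size=4, seed=0):
    """Toy conditional recurrent generator (illustrative assumption only).

    identity_code: list of floats standing in for an encoded face image.
    audio_features: list of per-frame audio feature vectors (the condition).
    Returns one "frame" vector per audio step.
    """
    rng = random.Random(seed)
    audio_dim = len(audio_features[0])
    w_hh = make_matrix(hidden_size, hidden_size, rng)         # hidden -> hidden
    w_ah = make_matrix(hidden_size, audio_dim, rng)           # audio -> hidden
    w_out = make_matrix(len(identity_code), hidden_size, rng)  # hidden -> frame

    hidden = [0.0] * hidden_size
    frames = []
    for audio_t in audio_features:
        # The new hidden state mixes the previous state with the current audio,
        # so every frame depends on the audio history, not just on audio_t.
        pre = [a + b for a, b in zip(matvec(w_hh, hidden), matvec(w_ah, audio_t))]
        hidden = [math.tanh(p) for p in pre]
        # The frame is the identity code plus an audio-driven offset.
        offset = matvec(w_out, hidden)
        frames.append([i + o for i, o in zip(identity_code, offset)])
    return frames


if __name__ == "__main__":
    face = [0.5, 0.5, 0.5]
    audio = [[0.1, 0.2], [0.3, 0.1], [0.0, 0.4]]
    for t, frame in enumerate(recurrent_generator(face, audio)):
        print(t, [round(v, 3) for v in frame])
```

In a real system the weights would be learned, and the identity code and audio features would come from trained encoders; here they are placeholders so the recurrence itself is easy to follow.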
Taxonomy
Topics: Speech and Audio Processing · Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques
