Identity-Preserving Talking Face Generation with Landmark and Appearance Priors

Weizhi Zhong; Chaowei Fang; Yinqi Cai; Pengxu Wei; Gangming Zhao; Liang Lin; Guanbin Li

arXiv:2305.08293·cs.CV·May 16, 2023·1 cites

TL;DR

This paper introduces a two-stage framework for generating realistic, lip-synced talking face videos that preserve identity, using a Transformer-based landmark generator and a landmark-to-video rendering model with prior appearance information.

Contribution

The paper presents a novel two-stage approach combining audio-to-landmark generation and landmark-to-video rendering with prior face appearance, improving realism and identity preservation.

Findings

01

Outperforms existing person-generic methods in realism and lip-sync accuracy.

02

Effectively preserves identity using prior appearance information from static reference images.

03

Achieves better synchronization between audio and generated face videos.

Abstract

Generating talking face videos from audio attracts lots of research interest. A few person-specific methods can generate vivid videos but require the target speaker's videos for training or fine-tuning. Existing person-generic methods have difficulty in generating realistic and lip-synced videos while preserving identity information. To tackle this problem, we propose a two-stage framework consisting of audio-to-landmark generation and landmark-to-video rendering procedures. First, we devise a novel Transformer-based landmark generator to infer lip and jaw landmarks from the audio. Prior landmark characteristics of the speaker's face are employed to make the generated landmarks coincide with the facial outline of the speaker. Then, a video rendering model is built to translate the generated landmarks into face images. During this stage, prior appearance information is extracted from the…
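The data flow of the two stages described in the abstract can be sketched as follows. This is only an illustrative skeleton under stated assumptions, not the authors' implementation: the function names, the `Frame` type, and the toy arithmetic standing in for the Transformer-based landmark generator and the rendering model are all hypothetical. It shows only which inputs feed which stage: audio plus the speaker's prior landmarks go into stage 1, and the generated landmarks plus static reference images go into stage 2.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Frame:
    """Hypothetical stand-in for one rendered video frame."""
    landmarks: List[Point]     # lip/jaw landmarks that drove this frame
    reference_ids: List[int]   # reference images that supplied appearance priors


def generate_landmarks(audio_features: Sequence[float],
                       prior_landmarks: Sequence[Point]) -> List[List[Point]]:
    """Stage 1 placeholder: audio -> one landmark set per audio step.

    Assumption: a real system would use a learned landmark generator here.
    This toy version just shifts the speaker's prior landmarks vertically
    by the audio value so the output stays tied to the prior outline.
    """
    return [[(x, y + a) for (x, y) in prior_landmarks] for a in audio_features]


def render_frames(landmark_seq: Sequence[List[Point]],
                  reference_images: Sequence[str]) -> List[Frame]:
    """Stage 2 placeholder: landmarks + reference appearance -> frames.

    Assumption: a real renderer would synthesize pixels. This toy version
    only records which landmarks and which references each frame used.
    """
    ref_ids = list(range(len(reference_images)))
    return [Frame(landmarks=list(lm), reference_ids=ref_ids) for lm in landmark_seq]


def talking_face(audio_features: Sequence[float],
                 prior_landmarks: Sequence[Point],
                 reference_images: Sequence[str]) -> List[Frame]:
    """Chain the two stages: audio-to-landmark, then landmark-to-video."""
    landmark_seq = generate_landmarks(audio_features, prior_landmarks)
    return render_frames(landmark_seq, reference_images)


# Two audio steps, one prior landmark, two reference images (all made up).
frames = talking_face([0.0, 0.5], [(1.0, 2.0)], ["ref0", "ref1"])
```

Keeping the stages as separate functions mirrors the split the abstract describes: the landmark stage never sees appearance, and the rendering stage never sees raw audio.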

Code & Models

Repositories

Weizhi-Zhong/IP_LAP
PyTorch · Official

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis

Methods: ALIGN