A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis

Yichen Han; Ya Li; Yingming Gao; Jinlong Xue; Songpo Wang; Lei Yang

arXiv:2210.03335·cs.CV·October 10, 2022

TL;DR

This paper introduces a Keypoint Based Enhancement (KPBE) method that improves the naturalness and visual quality of audio-driven free-view talking head synthesis. It targets two common artifacts: the "cut feeling" in the mouth mapping, where the synthesized mouth region does not blend with the surrounding face, and the lack of skin highlights.

Contribution

KPBE treats existing synthesis methods as a backend and refines their output using keypoint decomposition and motion field-based image generation, producing more natural-looking talking head videos.

Findings

01

Improved mean opinion scores for video quality.

02

Reduced mouth cut feeling and enhanced skin highlights.

03

Better synchronization and naturalness in synthesized videos.

Abstract

Audio-driven talking head synthesis is a challenging task that has attracted increasing attention in recent years. Although existing methods based on 2D landmarks or 3D face models can synthesize accurate lip synchronization and rhythmic head pose for arbitrary identities, they still have limitations, such as a cut feeling in the mouth mapping and a lack of skin highlights. The morphed region is also blurry compared to the surrounding face. A Keypoint Based Enhancement (KPBE) method is proposed for audio-driven free-view talking head synthesis to improve the naturalness of the generated video. First, existing methods are used as the backend to synthesize intermediate results. Then keypoint decomposition is used to extract video synthesis controlling parameters from the backend output and the source image. After that, the controlling parameters are composited to the source keypoints and the…
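The abstract is cut off before it describes the compositing step in detail, so the following is only an illustrative sketch of what "compositing controlling parameters to the source keypoints" could look like. It assumes, without confirmation from the text here, a decomposition into canonical 3D keypoints (from the source image) plus a head rotation, a translation, and per-keypoint expression offsets (from the backend output). The function names, the Euler-angle convention, and the sample values are all hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(yaw, pitch, roll):
    """Build R = Rz(roll) * Ry(yaw) * Rx(pitch) from angles in radians.

    The axis order is an assumption made for this sketch.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    rx = [[1, 0, 0], [0, cp, -sp], [0, sp, cp]]
    ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rz = [[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]]
    return matmul(rz, matmul(ry, rx))


def compose_keypoints(canonical, rotation, translation, deltas):
    """Apply pose and expression offsets to canonical keypoints.

    Each output point is rotation * point + translation + delta.
    """
    composed = []
    for point, delta in zip(canonical, deltas):
        rotated = [sum(rotation[i][k] * point[k] for k in range(3))
                   for i in range(3)]
        composed.append([rotated[i] + translation[i] + delta[i]
                         for i in range(3)])
    return composed


# Hypothetical values: two canonical keypoints taken from the source image,
# with pose and expression parameters taken from a backend-generated frame.
source_canonical = [[0.1, 0.0, 0.0], [0.0, 0.2, 0.0]]
pose = rotation_matrix(yaw=0.3, pitch=0.0, roll=0.0)
expression = [[0.0, 0.05, 0.0], [0.0, 0.0, 0.0]]
driven = compose_keypoints(source_canonical, pose, [0.0, 0.0, 0.0], expression)
```

In a keypoint-driven generator, the source keypoints and the composed keypoints would then be used to estimate a motion field that warps the source image; that stage is not sketched here.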

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis