A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing

Ming Meng; Yufei Zhao; Bo Zhang; Yonggui Zhu; Weimin Shi; Maxwell Wen; and Zhaoxin Fan

arXiv:2406.10553·cs.CV·June 19, 2024

Open Access

TL;DR

This paper provides a comprehensive review of talking head synthesis, covering generation, driving mechanisms, and editing techniques, with analysis of datasets, performance metrics, and future research directions.

Contribution

It offers a systematic taxonomy, critical analysis, and organized datasets for talking head synthesis, highlighting recent advances and identifying research gaps.

Findings

01 Summarizes key milestones and innovations in the field.

02 Provides performance analysis based on various metrics.

03 Explores diverse application scenarios and future directions.

Abstract

Talking head synthesis, an advanced method for generating portrait videos from a still image driven by specific content, has garnered widespread attention in virtual reality, augmented reality, and game production. Recently, significant breakthroughs have been made with the introduction of novel models such as the transformer and the diffusion model. Current methods can not only generate new content but also edit the generated material. This survey systematically reviews the technology, categorizing it into three pivotal domains: portrait generation, driving mechanisms, and editing techniques. We summarize milestone studies and critically analyze their innovations and shortcomings within each domain. Additionally, we organize an extensive collection of datasets and provide a thorough performance analysis of current methodologies based on various evaluation metrics, aiming to furnish a…

Taxonomy

Topics: Social Robot Interaction and HRI

Methods: Diffusion