From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications

Shreyank N Gowda; Dheeraj Pandey; Shashank Narayana Gowda

arXiv:2308.16041·cs.CV·August 31, 2023·2 cites

TL;DR

This comprehensive survey reviews recent deep learning methods for talking head generation, categorizing approaches, analyzing their strengths and limitations, and comparing models on key performance metrics to guide future research.

Contribution

It systematically categorizes and analyzes state-of-the-art talking head generation techniques, providing a clear overview and identifying promising future directions.

Findings

01. The survey groups methods into four categories: image-driven, audio-driven, video-driven, and others.

02. Publicly available models are compared on inference time and human-rated output quality.

03. The survey highlights key strengths and limitations of current approaches.

Abstract

Recent advancements in deep learning and computer vision have led to a surge of interest in generating realistic talking heads. This paper presents a comprehensive survey of state-of-the-art methods for talking head generation. We systematically categorize them into four main approaches: image-driven, audio-driven, video-driven, and others (including neural radiance fields (NeRF) and 3D-based methods). We provide an in-depth analysis of each method, highlighting its unique contributions, strengths, and limitations. Furthermore, we thoroughly compare publicly available models, evaluating them on key aspects such as inference time and human-rated quality of the generated outputs. Our aim is to provide a clear and concise overview of the current landscape in talking head generation, elucidating the relationships between different approaches and identifying promising directions for…

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis