# Realistic Speech-Driven Facial Animation with GANs

**Authors:** Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

arXiv: 1906.06337 · 2019-06-18

## TL;DR

This paper introduces an end-to-end GAN-based system for realistic speech-driven facial animation that produces synchronized lip movements and natural expressions from a single image and speech audio.

## Contribution

The paper presents a temporal GAN that generates talking-head videos from a single still image and a speech clip, without handcrafted intermediate features. Three discriminators target detailed frames, audio-visual synchronization, and realistic expressions.

## Key findings

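- Generated videos have lip movements that are in sync with the audio.
- The system produces natural facial expressions such as blinks and eyebrow movements.
- Generated videos are evaluated on sharpness, reconstruction quality, lip-reading accuracy, synchronization, and the ability to generate natural blinks.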

## Abstract

Speech-driven facial animation is the process that automatically synthesizes talking characters based on speech signals. The majority of work in this domain creates a mapping from audio features to visual features. This approach often requires post-processing using computer graphics techniques to produce realistic albeit subject-dependent results. We present an end-to-end system that generates videos of a talking head, using only a still image of a person and an audio clip containing speech, without relying on handcrafted intermediate features. Our method generates videos which have (a) lip movements that are in sync with the audio and (b) natural facial expressions such as blinks and eyebrow movements. Our temporal GAN uses 3 discriminators focused on achieving detailed frames, audio-visual synchronization, and realistic expressions. We quantify the contribution of each component in our model using an ablation study and we provide insights into the latent representation of the model. The generated videos are evaluated based on sharpness, reconstruction quality, lip-reading accuracy, synchronization as well as their ability to generate natural blinks.
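
To make the three-discriminator idea concrete, the sketch below shows one way a generator's adversarial objective could combine feedback from three discriminators. It is an illustration under stated assumptions, not the paper's implementation: the discriminator names (`frame_detail`, `av_sync`, `expression`), the equal weights, and the choice of a non-saturating loss are all hypothetical, since this summary gives neither the loss formulation nor the weights.

```python
import math

# Hypothetical names and weights: this summary does not state the values
# or the loss formulation used in the paper.
WEIGHTS = {"frame_detail": 1.0, "av_sync": 1.0, "expression": 1.0}


def generator_adversarial_loss(d_scores, weights=WEIGHTS, eps=1e-12):
    """Weighted sum of non-saturating GAN losses, one term per discriminator.

    d_scores maps each discriminator name to the probability (between 0 and 1)
    that it assigns to the generated video being real.
    """
    total = 0.0
    for name, weight in weights.items():
        # Clamp to avoid log(0) when a discriminator is fully confident.
        p_real = min(max(d_scores[name], eps), 1.0)
        total += weight * -math.log(p_real)
    return total


# All three discriminators are mostly fooled.
fooled = generator_adversarial_loss(
    {"frame_detail": 0.9, "av_sync": 0.9, "expression": 0.9}
)
# Frames look fine, but the synchronization discriminator rejects the video.
caught = generator_adversarial_loss(
    {"frame_detail": 0.9, "av_sync": 0.1, "expression": 0.9}
)
```

In this sketch, a video with sharp frames but poor lip sync still incurs a high loss, which is the motivation for giving each quality aspect its own discriminator rather than relying on a single realism score.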

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.06337/full.md

## Figures

39 figures with captions in the complete paper: https://tomesphere.com/paper/1906.06337/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/1906.06337/full.md

---
Source: https://tomesphere.com/paper/1906.06337