Neural Face Video Compression using Multiple Views

Anna Volokitin; Stefan Brugger; Ali Benlalah; Sebastian Martin; Brian Amberg; Michael Tschannen

arXiv:2203.15401 · cs.CV · April 14, 2022

TL;DR

This paper introduces a neural face video compression method that uses multiple source views of the face to improve reconstruction accuracy. It builds on keypoint-based neural codecs, which use an order of magnitude less bandwidth than engineered codecs.

Contribution

It proposes a multi-view neural face video compression approach that enhances reconstruction quality by using multiple source frames instead of a single view.

Findings

01

Improved face reconstruction accuracy with multiple views

02

Keypoint-based neural codecs use an order of magnitude less bandwidth than engineered codecs

03

Encouraging experimental results for reconstruction from multiple source frames

Abstract

Recent advances in deep generative models led to the development of neural face video compression codecs that use an order of magnitude less bandwidth than engineered codecs. These neural codecs reconstruct the current frame by warping a source frame and using a generative model to compensate for imperfections in the warped source frame. Thereby, the warp is encoded and transmitted using a small number of keypoints rather than a dense flow field, which leads to massive savings compared to traditional codecs. However, by relying on a single source frame only, these methods lead to inaccurate reconstructions (e.g. one side of the head becomes unoccluded when turning the head and has to be synthesized). Here, we aim to tackle this issue by relying on multiple source frames (views of the face) and present encouraging results.
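
To make the keypoints-versus-dense-flow argument in the abstract concrete, here is a minimal back-of-the-envelope sketch of the raw, uncompressed per-frame payload in each case. Every number in it is an assumption chosen for illustration (10 keypoints, a 256x256 frame, 2 bytes per value), and the function names are hypothetical; none of it is taken from the paper. The resulting ratio is not the paper's measured saving, since engineered codecs compress motion information heavily.

```python
def keypoint_payload_bytes(num_keypoints: int, bytes_per_coord: int = 2) -> int:
    """Raw size of a warp described by 2D keypoints (x, y per keypoint)."""
    return num_keypoints * 2 * bytes_per_coord


def dense_flow_payload_bytes(height: int, width: int, bytes_per_component: int = 2) -> int:
    """Raw size of a dense flow field (dx, dy per pixel)."""
    return height * width * 2 * bytes_per_component


# Assumed, illustrative settings (not from the paper).
NUM_KEYPOINTS = 10
FRAME_HEIGHT, FRAME_WIDTH = 256, 256

kp_bytes = keypoint_payload_bytes(NUM_KEYPOINTS)
flow_bytes = dense_flow_payload_bytes(FRAME_HEIGHT, FRAME_WIDTH)

print(f"keypoints:  {kp_bytes} bytes per frame")
print(f"dense flow: {flow_bytes} bytes per frame")
print(f"raw ratio:  {flow_bytes / kp_bytes:.1f}x")
```

Under these assumptions the keypoint description is tiny relative to a raw flow field, which is the intuition behind transmitting keypoints and letting the decoder reconstruct the warp. With multiple source frames, each source view also has to reach the decoder at some point, so the trade-off depends on how often views are refreshed; that cost is not modeled here.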

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Speech and Audio Processing · Digital Media Forensic Detection