Head2Head++: Deep Facial Attributes Re-Targeting

Michail Christos Doukas; Mohammad Rami Koujan; Viktoriia Sharmanska; Anastasios Roussos

arXiv:2006.10199 · cs.CV · September 29, 2021

TL;DR

Head2Head++ is a deep learning system that combines 3D face geometry with GANs to perform photo-realistic facial attribute re-targeting and head reenactment from monocular videos at nearly real-time speed.

Contribution

The paper introduces an architecture that combines 3D face modeling with GANs to produce temporally consistent, high-quality facial reenactment in nearly real time.

Findings

01 Successfully transfers facial expressions, head pose, and eye gaze.

02 Achieves photo-realistic and faithful reenactment.

03 Runs in nearly real time, at about 18 fps.

Abstract

Facial video re-targeting is a challenging problem aiming to modify the facial attributes of a target subject in a seamless manner by a driving monocular sequence. We leverage the 3D geometry of faces and Generative Adversarial Networks (GANs) to design a novel deep learning architecture for the task of facial and head reenactment. Our method is different to purely 3D model-based approaches, or recent image-based methods that use Deep Convolutional Neural Networks (DCNNs) to generate individual frames. We manage to capture the complex non-rigid facial motion from the driving monocular performances and synthesise temporally consistent videos, with the aid of a sequential Generator and an ad-hoc Dynamics Discriminator network. We conduct a comprehensive set of quantitative and qualitative tests and demonstrate experimentally that our proposed method can successfully transfer facial…
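
The abstract mentions a sequential Generator that produces temporally consistent video. As a rough illustration only, the sketch below shows the general idea of sequential (autoregressive) frame synthesis: each output frame depends on a short window of conditioning inputs and on frames generated earlier. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the names `synthesize_video` and `toy_generator`, the `history` length of 2, and the use of plain lists of floats in place of image tensors and a trained network.

```python
from collections import deque
from typing import Callable, List, Sequence

Frame = List[float]  # hypothetical stand-in for an image tensor


def synthesize_video(
    conditions: Sequence[Frame],
    generator: Callable[[Sequence[Frame], Sequence[Frame]], Frame],
    history: int = 2,  # assumed window length, not a value from the paper
) -> List[Frame]:
    """Generate frames one at a time, feeding earlier outputs back in."""
    past: deque = deque(maxlen=history)  # most recent generated frames
    outputs: List[Frame] = []
    for t in range(len(conditions)):
        # Conditioning inputs for the current step and up to `history` earlier steps.
        window = conditions[max(0, t - history): t + 1]
        frame = generator(window, list(past))
        outputs.append(frame)
        past.append(frame)
    return outputs


def toy_generator(cond_window: Sequence[Frame], past_frames: Sequence[Frame]) -> Frame:
    """Toy stand-in for a trained network: blend the current condition with the last output."""
    current = cond_window[-1]
    if not past_frames:
        return list(current)
    previous = past_frames[-1]
    return [0.5 * c + 0.5 * p for c, p in zip(current, previous)]


if __name__ == "__main__":
    driving = [[0.0], [1.0], [1.0]]
    print(synthesize_video(driving, toy_generator))
```

Because each frame is blended with the previous output, an abrupt change in the conditioning input is smoothed over several frames. That is the intuition behind conditioning on past outputs, although a real system would learn this behaviour rather than hard-code it.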

Code & Models

Repositories

michaildoukas/head2head (PyTorch, official)