TL;DR
This paper introduces a novel latent transformer model for high-quality, disentangled, and identity-preserving face editing in images and videos, addressing limitations of previous methods in control and generalization.
Contribution
It proposes a dedicated latent transformation network with explicit disentanglement and identity preservation, extending face editing capabilities to videos.
Findings
Outperforms state-of-the-art methods in visual quality
Achieves effective disentangled and controllable face editing
Successfully generalizes to real images and videos
Abstract
High-quality facial image editing is a challenging problem in the movie post-production industry, requiring a high degree of control and identity preservation. Previous works that attempt to tackle this problem may suffer from the entanglement of facial attributes and the loss of the person's identity. Furthermore, many algorithms are limited to a certain task. To tackle these limitations, we propose to edit facial attributes via the latent space of a StyleGAN generator, by training a dedicated latent transformation network and incorporating explicit disentanglement and identity preservation terms in the loss function. We further introduce a pipeline to generalize our face editing to videos. Our model achieves disentangled, controllable, and identity-preserving facial attribute editing, even in the challenging case of real (i.e., non-synthetic) images and videos. We conduct extensive…
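The abstract describes a latent transformation trained with explicit disentanglement and identity preservation terms, but it does not give their form. The following is a minimal sketch of how such a loss could be assembled, not the authors' formulation. The linear `transform`, the helper names, the squared-error penalties, the use of latent distance as an identity proxy, and the weights `lam_attr` and `lam_rec` are all illustrative assumptions.

```python
import math

def transform(w, direction, alpha):
    """Hypothetical edit: shift latent code w along a direction scaled by alpha."""
    return [wi + alpha * di for wi, di in zip(w, direction)]

def bce(p, target):
    """Binary cross-entropy for a single predicted probability p."""
    eps = 1e-7
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def editing_loss(w, w_edit, probs_before, probs_after, k, target,
                 lam_attr=1.0, lam_rec=10.0):
    """Combine three assumed terms for editing attribute k.

    probs_before / probs_after: attribute-classifier probabilities for the
    original and edited latent codes (the classifier itself is assumed).
    """
    # Edit term: push the target attribute toward the desired label.
    l_cls = bce(probs_after[k], target)
    # Disentanglement term (assumed form): other attributes should not move.
    others = [i for i in range(len(probs_before)) if i != k]
    l_attr = sum((probs_after[i] - probs_before[i]) ** 2 for i in others)
    l_attr /= max(len(others), 1)
    # Identity proxy (assumed form): keep the edited code near the original.
    l_rec = sum((a - b) ** 2 for a, b in zip(w_edit, w)) / len(w)
    return l_cls + lam_attr * l_attr + lam_rec * l_rec
```

In this sketch, an edit that flips the target attribute while leaving the other classifier outputs and the latent code nearly unchanged receives the lowest loss, which is the behavior the abstract calls disentangled and identity-preserving.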
Taxonomy
Methods
Dense Connections · Feedforward Network · Convolution · Adaptive Instance Normalization · R1 Regularization
