TL;DR
This paper introduces SMILE, a versatile method for multi-attribute facial image editing that effectively handles both random and reference-guided transformations using a multimodal representation, outperforming existing approaches.
Contribution
We propose a novel multimodal representation for facial attribute editing that supports both random and reference-based transformations with a unified approach.
Findings
Outperforms existing methods in qualitative and quantitative evaluations.
Capable of fine-grained and coarse attribute editing using references or style space.
Extensible to head-swapping and face-reenactment without video training.
Abstract
Attribute image manipulation has been a very active topic since the introduction of Generative Adversarial Networks (GANs). Exploring the disentangled attribute space within a transformation is very challenging because facial images are multi-label and mutually inclusive: different labels (eyeglasses, hats, hair, identity, etc.) can co-exist in the same image. Several works address this issue either by exploiting the modality of each domain/attribute using a conditional random noise vector, or by extracting the modality from an exemplar image. However, existing methods cannot handle both random and reference transformations for multiple attributes, which limits the generality of the solutions. In this paper, we successfully exploit a multimodal representation that handles all attributes, whether guided by random noise or exemplar images, while only using the underlying…
