DRIT++: Diverse Image-to-Image Translation via Disentangled Representations
Hsin-Ying Lee, Hung-Yu Tseng, Qi Mao, Jia-Bin Huang, Yu-Ding Lu, Maneesh Singh, Ming-Hsuan Yang

TL;DR
DRIT++ introduces a novel disentangled representation framework for diverse image-to-image translation that operates effectively without paired training data, enabling the generation of multiple realistic outputs from a single input.
Contribution
The paper proposes a disentangled representation approach with domain-invariant content and domain-specific attribute spaces for unpaired image translation, incorporating a cross-cycle consistency loss.
Findings
Generates diverse, realistic images across various tasks without paired data
Scores well on realism, as measured by user studies and Fréchet inception distance (FID)
Demonstrates significant diversity using perceptual distance and Jensen-Shannon divergence
Abstract
Image-to-image translation aims to learn the mapping between two visual domains. There are two main challenges for this task: 1) lack of aligned training pairs and 2) multiple possible outputs from a single input image. In this work, we present an approach based on disentangled representation for generating diverse outputs without paired training images. To synthesize diverse outputs, we propose to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space. Our model takes the encoded content features extracted from a given input and attribute vectors sampled from the attribute space to synthesize diverse outputs at test time. To handle unpaired training data, we introduce a cross-cycle consistency loss based on disentangled representations. Qualitative results show that our model can generate diverse…
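The data flow described in the abstract can be sketched with toy linear maps. This is a minimal illustration, not the authors' implementation: orthogonal matrices stand in for trained networks, "images" are flat vectors, the L1 reconstruction distance and the standard normal attribute prior are assumptions, and every name (`make_domain`, `cross_cycle_loss`, `translate_diverse`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
CONTENT_DIM, ATTR_DIM = 5, 3
IMAGE_DIM = CONTENT_DIM + ATTR_DIM  # toy "images" are flat vectors


def make_domain(rng):
    """Return (encode_content, encode_attr, generate) for one toy domain.

    Assumption: an orthogonal matrix stands in for a trained generator, and
    its transpose (which is its inverse) stands in for the two encoders.
    """
    q, _ = np.linalg.qr(rng.standard_normal((IMAGE_DIM, IMAGE_DIM)))

    def encode_content(img):
        return (q.T @ img)[:CONTENT_DIM]

    def encode_attr(img):
        return (q.T @ img)[CONTENT_DIM:]

    def generate(content, attr):
        return q @ np.concatenate([content, attr])

    return encode_content, encode_attr, generate


def cross_cycle_loss(x, y, dom_x, dom_y):
    """Swap content codes across domains twice and compare with the inputs."""
    ec_x, ea_x, g_x = dom_x
    ec_y, ea_y, g_y = dom_y
    # First translation: each generator receives the other image's content.
    u = g_x(ec_y(y), ea_x(x))  # lives in domain X
    v = g_y(ec_x(x), ea_y(y))  # lives in domain Y
    # Second translation: swap the content codes back.
    x_hat = g_x(ec_y(v), ea_x(u))
    y_hat = g_y(ec_x(u), ea_y(v))
    # L1 distance is an assumption made for this sketch.
    return np.abs(x_hat - x).mean() + np.abs(y_hat - y).mean()


def translate_diverse(x, dom_x, dom_y, n_samples, rng):
    """Keep x's content fixed and sample several attribute vectors."""
    content = dom_x[0](x)
    g_y = dom_y[2]
    # Assumption: attribute vectors are drawn from a standard normal prior.
    attrs = rng.standard_normal((n_samples, ATTR_DIM))
    return np.stack([g_y(content, a) for a in attrs])


dom_x, dom_y = make_domain(rng), make_domain(rng)
x = rng.standard_normal(IMAGE_DIM)
y = rng.standard_normal(IMAGE_DIM)

loss = cross_cycle_loss(x, y, dom_x, dom_y)
outputs = translate_diverse(x, dom_x, dom_y, n_samples=4, rng=rng)
print(f"cross-cycle loss: {loss:.2e}")
print("translated outputs:", outputs.shape)
```

Because the toy encoders exactly invert the toy generators, the loss here is zero up to floating-point error. In a trained model the encoders and generators are learned, and this term would be one of several objectives pushing them toward that behavior.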
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Digital Media Forensic Detection
