DRIT++: Diverse Image-to-Image Translation via Disentangled Representations

Hsin-Ying Lee; Hung-Yu Tseng; Qi Mao; Jia-Bin Huang; Yu-Ding Lu; Maneesh Singh; Ming-Hsuan Yang

arXiv:1905.01270·cs.CV·December 19, 2019·187 cites

TL;DR

DRIT++ is a disentangled-representation framework for image-to-image translation that trains without paired data and can generate multiple realistic outputs from a single input.

Contribution

The paper proposes a disentangled representation approach with domain-invariant content and domain-specific attribute spaces for unpaired image translation, incorporating a cross-cycle consistency loss.
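The cross-cycle idea can be sketched as two rounds of attribute swapping followed by a reconstruction penalty. This is a toy illustration, not the authors' implementation: the encoders and generators are hypothetical stand-ins that slice and concatenate plain Python lists, and the L1 penalty is an assumed choice of reconstruction distance.

```python
from typing import Callable, List

Vec = List[float]
Encoder = Callable[[Vec], Vec]
Generator = Callable[[Vec, Vec], Vec]


def l1(a: Vec, b: Vec) -> float:
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cross_cycle_loss(
    x: Vec, y: Vec,
    enc_c_x: Encoder, enc_a_x: Encoder,
    enc_c_y: Encoder, enc_a_y: Encoder,
    gen_x: Generator, gen_y: Generator,
) -> float:
    # First translation: exchange content codes across the two domains.
    u = gen_x(enc_c_y(y), enc_a_x(x))  # domain X image: content of y, attribute of x
    v = gen_y(enc_c_x(x), enc_a_y(y))  # domain Y image: content of x, attribute of y
    # Second translation: exchange again, which should recover the inputs.
    x_hat = gen_x(enc_c_y(v), enc_a_x(u))
    y_hat = gen_y(enc_c_x(u), enc_a_y(v))
    return l1(x_hat, x) + l1(y_hat, y)


# Hypothetical toy "networks": the first two entries are content, the rest attribute.
def content(img: Vec) -> Vec:
    return img[:2]


def attribute(img: Vec) -> Vec:
    return img[2:]


def combine(c: Vec, a: Vec) -> Vec:
    return list(c) + list(a)


def lossy(c: Vec, a: Vec) -> Vec:
    # A generator that ignores the attribute code, so the cycle cannot close.
    return list(c) + [0.0] * len(a)


x = [1.0, 2.0, 0.1, 0.2]
y = [3.0, 4.0, 0.9, 0.8]
perfect = cross_cycle_loss(x, y, content, attribute, content, attribute, combine, combine)
leaky = cross_cycle_loss(x, y, content, attribute, content, attribute, lossy, lossy)
```

With perfectly disentangled toy codes the loss vanishes, while a generator that drops the attribute code is penalized, which is the behaviour the loss is meant to encourage when no aligned pairs are available.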

Findings

01. Generates diverse, realistic images across various tasks without paired data

02. Achieves strong realism in user studies and as measured by Fréchet inception distance (FID)

03. Demonstrates output diversity, measured with perceptual distance and Jensen-Shannon divergence
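For readers unfamiliar with the Jensen-Shannon divergence mentioned above, the following is a minimal sketch of the quantity itself for two discrete histograms. How the paper bins generated and real images into histograms is not stated on this page, so the inputs here are assumed to be given; the function names are illustrative.

```python
import math
from typing import Sequence


def _normalize(p: Sequence[float]) -> list:
    total = sum(p)
    return [v / total for v in p]


def _kl(p: Sequence[float], q: Sequence[float]) -> float:
    # Terms with p_i == 0 contribute nothing to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """JSD with natural logarithm: 0 for identical histograms, ln 2 for disjoint ones."""
    p, q = _normalize(p), _normalize(q)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


same = jensen_shannon_divergence([2, 1, 1], [2, 1, 1])
disjoint = jensen_shannon_divergence([1, 0], [0, 1])
```

Unlike the KL divergence, this measure is symmetric and bounded, which makes it convenient for comparing a distribution of generated samples against a reference distribution.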

Abstract

Image-to-image translation aims to learn the mapping between two visual domains. There are two main challenges for this task: 1) lack of aligned training pairs and 2) multiple possible outputs from a single input image. In this work, we present an approach based on disentangled representation for generating diverse outputs without paired training images. To synthesize diverse outputs, we propose to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space. Our model takes the encoded content features extracted from a given input and attribute vectors sampled from the attribute space to synthesize diverse outputs at test time. To handle unpaired training data, we introduce a cross-cycle consistency loss based on disentangled representations. Qualitative results show that our model can generate diverse…
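The test-time procedure described in the abstract, one content code combined with several sampled attribute vectors, might look roughly like the sketch below. The standard Gaussian prior, the function names, and the toy encoder and generator are assumptions made for illustration; the page itself only says that attribute vectors are sampled from the attribute space.

```python
import random
from typing import Callable, List

Vec = List[float]


def translate_diverse(
    x: Vec,
    enc_content: Callable[[Vec], Vec],
    generator: Callable[[Vec, Vec], Vec],
    attr_dim: int,
    n_samples: int,
    seed: int = 0,
) -> List[Vec]:
    rng = random.Random(seed)
    content_code = enc_content(x)  # encoded once and reused for every output
    outputs = []
    for _ in range(n_samples):
        # Assumed prior: each attribute entry is drawn from N(0, 1).
        attr = [rng.gauss(0.0, 1.0) for _ in range(attr_dim)]
        outputs.append(generator(content_code, attr))
    return outputs


# Hypothetical toy stand-ins: content is the first two entries of the "image".
def toy_content_encoder(img: Vec) -> Vec:
    return img[:2]


def toy_generator(c: Vec, a: Vec) -> Vec:
    return list(c) + list(a)


samples = translate_diverse(
    [1.0, 2.0, 0.1, 0.2], toy_content_encoder, toy_generator, attr_dim=2, n_samples=3
)
```

Every output shares the input's content code and differs only in the sampled attribute, which is how a single input can yield multiple translations.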

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Digital Media Forensic Detection