Rethinking the Paradigm of Content Constraints in Unpaired Image-to-Image Translation
Xiuding Cai, Yaoyao Zhu, Dong Miao, Linjie Fu, Yu Yao

TL;DR
This paper introduces EnCo, an efficient content constraint method for unpaired image-to-image translation. It improves generation quality by constraining patch-level features in the latent space, and it samples those patches with a discriminative attention-guided strategy.
Contribution
EnCo preserves content by constraining patch-level features in the latent space, and introduces a discriminative attention-guided (DAG) patch sampling strategy. The paper reports state-of-the-art results with this combination.
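To make the idea of a patch-level constraint in the latent space concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `sample_patch_indices` and `patch_similarity_loss` are hypothetical, the loss is assumed to be a cosine distance between features at matching spatial positions, and patches are sampled uniformly at random rather than with the paper's DAG strategy, whose details are not given here.

```python
import numpy as np


def sample_patch_indices(height, width, num_patches, rng):
    """Uniformly sample distinct flat spatial indices.

    Assumption: this is a plain random stand-in, NOT the DAG sampling
    strategy described in the paper.
    """
    return rng.choice(height * width, size=num_patches, replace=False)


def patch_similarity_loss(enc_feat, dec_feat, indices, eps=1e-8):
    """Mean cosine distance between two feature maps at the same positions.

    enc_feat, dec_feat: arrays of shape (C, H, W), assumed to come from the
    same stage of an encoder and a decoder. Cosine distance is an assumed
    choice of similarity measure for illustration.
    """
    channels = enc_feat.shape[0]
    # Flatten the spatial grid and pick the sampled positions: (N, C).
    enc = enc_feat.reshape(channels, -1)[:, indices].T
    dec = dec_feat.reshape(channels, -1)[:, indices].T
    # L2-normalise each patch feature vector.
    enc = enc / (np.linalg.norm(enc, axis=1, keepdims=True) + eps)
    dec = dec / (np.linalg.norm(dec, axis=1, keepdims=True) + eps)
    # Cosine similarity per patch, turned into a distance and averaged.
    cos = np.sum(enc * dec, axis=1)
    return float(np.mean(1.0 - cos))


rng = np.random.default_rng(0)
enc_feat = rng.standard_normal((8, 16, 16))
dec_feat = rng.standard_normal((8, 16, 16))
indices = sample_patch_indices(16, 16, num_patches=32, rng=rng)
loss = patch_similarity_loss(enc_feat, dec_feat, indices)
```

In this sketch the loss is close to zero when the two feature maps agree at the sampled positions and grows toward 2 as they point in opposite directions. Because only a subset of positions is compared, the cost scales with the number of sampled patches rather than with the full feature map.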
Findings
EnCo improves translation quality and training efficiency.
DAG patch sampling enhances model performance with negligible overhead.
EnCo achieves multiple state-of-the-art results on various datasets.
Abstract
In an unpaired setting, lacking sufficient content constraints for image-to-image translation (I2I) tasks, GAN-based approaches are usually prone to model collapse. Current solutions can be divided into two categories, reconstruction-based and Siamese network-based. The former requires that the transformed or transforming image can be perfectly converted back to the original image, which is sometimes too strict and limits the generative performance. The latter involves feeding the original and generated images into a feature extractor and then matching their outputs. This is not efficient enough, and a universal feature extractor is not easily available. In this paper, we propose EnCo, a simple but efficient way to maintain the content by constraining the representational similarity in the latent space of patch-level features from the same stage of the Encoder and…
Taxonomy
Topics: Cancer-related molecular mechanisms research · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications
