Improving Shape Deformation in Unsupervised Image-to-Image Translation
Aaron Gokaslan, Vivek Ramanujan, Daniel Ritchie, Kwang In Kim, James Tompkin

TL;DR
This paper introduces a new discriminator and perceptual loss to improve shape deformation in unsupervised image-to-image translation, enabling better handling of large shape changes across diverse datasets.
Contribution
It proposes a discriminator with dilated convolutions and a multi-scale perceptual loss to enhance shape deformation capabilities in unsupervised translation.
Findings
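More capable shape deformation on a challenging toy dataset.
Shape-changing translation on complex mappings with significant dataset variation: between humans, dolls, and anime faces, and between cats and dogs.
A multi-scale perceptual loss that better represents error in the underlying shape of objects.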
Abstract
Unsupervised image-to-image translation techniques are able to map local texture between two domains, but they are typically unsuccessful when the domains require larger shape change. Inspired by semantic segmentation, we introduce a discriminator with dilated convolutions that is able to use information from across the entire image to train a more context-aware generator. This is coupled with a multi-scale perceptual loss that is better able to represent error in the underlying shape of objects. We demonstrate that this design is more capable of representing shape deformation in a challenging toy dataset, plus in complex mappings with significant dataset variation between humans, dolls, and anime faces, and between cats and dogs.
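To illustrate why dilated convolutions let a discriminator draw on information from across the image, the short sketch below computes the receptive field of a stack of stride-1 convolutions. The kernel size, layer count, and dilation rates are hypothetical values chosen for illustration; they are not taken from the paper, and the function name `receptive_field` is our own.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (pixels per axis) of stacked stride-1 convolutions."""
    rf = 1
    for d in dilations:
        # A k-tap kernel with dilation d spans (k - 1) * d + 1 input pixels,
        # so each stride-1 layer widens the field by (k - 1) * d.
        rf += (kernel_size - 1) * d
    return rf


# Four ordinary 3x3 layers versus four 3x3 layers with growing dilation.
# The dilation rates below are assumptions, not the paper's configuration.
plain = receptive_field(3, [1, 1, 1, 1])
dilated = receptive_field(3, [1, 2, 4, 8])
print(plain, dilated)  # prints "9 31"
```

With the same depth and the same number of weights per layer, the dilated stack in this example sees a 31-pixel window instead of a 9-pixel one, which is the general mechanism by which dilation gives each discriminator output more image context.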
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Video Analysis and Summarization
