DSRGAN: Explicitly Learning Disentangled Representation of Underlying Structure and Rendering for Image Generation without Tuple Supervision

Guang-Yuan Hao; Hong-Xing Yu; Wei-Shi Zheng

arXiv:1909.13501 · cs.LG · October 1, 2019

TL;DR

This paper introduces DSRGAN, a novel GAN-based method that learns to disentangle underlying spatial structure from rendering in images without tuple supervision, enabling independent control of these factors.

Contribution

The paper proposes a new framework with a shared latent space and parallel generative networks to achieve disentangled representation learning without tuple supervision.

Findings

01. DSRGAN outperforms state-of-the-art methods in disentanglability.

02. Introduces the Normalized Disentanglability metric for quantitative evaluation.

03. Demonstrates effective independent control of structure and rendering in image generation.

Abstract

We focus on explicitly learning disentangled representation for natural image generation, where the underlying spatial structure and the rendering on that structure can be controlled independently, yet using no tuple supervision. The setting is significant since tuple supervision is costly and sometimes even unavailable. However, the task is highly unconstrained and thus ill-posed. To address this problem, we propose to introduce an auxiliary domain which shares a common underlying-structure space with the target domain, and we make a partially shared latent space assumption. The key idea is to encourage the partially shared latent variable to represent similar underlying spatial structures in both domains, while the two domain-specific latent variables are left to represent the renderings of the two domains respectively. This is achieved by designing two…
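The partially shared latent space assumption can be illustrated with a toy sketch. This is not the paper's architecture or training procedure: the two "generators" below are untrained random linear maps, and all names and sizes (STRUCT_DIM, RENDER_DIM, OUT_DIM, make_generator, sample_code) are hypothetical choices made for illustration. The sketch only shows the wiring: one structure code is fed to both domains, and each domain has its own rendering code.

```python
import random

# All sizes below are hypothetical, chosen only for illustration.
STRUCT_DIM = 4   # length of the shared structure code
RENDER_DIM = 3   # length of each domain-specific rendering code
OUT_DIM = 6      # length of the toy "image" vector


def make_generator(in_dim, out_dim, rng):
    """Build a toy generator: a fixed random linear map (untrained)."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def generate(z_structure, z_rendering):
        # The generator sees the structure code concatenated with
        # the rendering code of its own domain.
        z = list(z_structure) + list(z_rendering)
        return [sum(w * v for w, v in zip(row, z)) for row in weights]

    return generate


def sample_code(dim, rng):
    """Draw a latent code from a standard normal prior."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


rng = random.Random(0)

# Two parallel generators, one per domain.
generate_target = make_generator(STRUCT_DIM + RENDER_DIM, OUT_DIM, rng)
generate_auxiliary = make_generator(STRUCT_DIM + RENDER_DIM, OUT_DIM, rng)

# One shared structure code, two domain-specific rendering codes.
z_shared = sample_code(STRUCT_DIM, rng)
z_target = sample_code(RENDER_DIM, rng)
z_auxiliary = sample_code(RENDER_DIM, rng)

# The same structure code drives both domains.
x_target = generate_target(z_shared, z_target)
x_auxiliary = generate_auxiliary(z_shared, z_auxiliary)

# Independent control: keep the structure code, resample only the rendering.
x_target_rerendered = generate_target(z_shared, sample_code(RENDER_DIM, rng))
```

In a trained model of this kind, x_target and x_target_rerendered would be expected to share spatial structure while differing in rendering; with the random maps used here, the sketch demonstrates only how the codes are routed.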

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction · Digital Media Forensic Detection