TunaGAN: Interpretable GAN for Smart Editing

Weiquan Mao; Beicheng Lou; Jiyao Yuan

arXiv:1908.06163 · cs.CV · August 20, 2019 · 1 citation

TL;DR

TunaGAN is an interpretable, tunable GAN that edits high-resolution face images according to a user's high-level instructions. It places an auxiliary network on top of a Style-GAN generator and compares two latent spaces for feature disentanglement.

Contribution

The paper introduces TunaGAN, which adds an auxiliary network on top of a Style-GAN generator for controllable, high-quality image editing, and investigates two latent spaces that can be traversed for feature disentanglement.

Findings

01. Effective high-resolution face editing with user instructions.

02. Analysis of mode collapse impacts on model robustness.

03. Insights into latent space traversal for feature disentanglement.

Abstract

In this paper, we introduce a tunable generative adversarial network (TunaGAN) that uses an auxiliary network on top of an existing generator network (Style-GAN) to modify high-resolution face images according to a user's high-level instructions, with good qualitative and quantitative performance. To optimize for feature disentanglement, we also investigate two different latent spaces that could be traversed for modification. The problem of mode collapse is characterized in detail to assess model robustness. This work could be easily extended to content-aware image editors based on other GANs and could provide insight into mode collapse problems in more general settings.
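
The abstract does not spell out how a latent space is traversed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an edit can be approximated by moving a latent vector along a fixed attribute direction; the names `traverse_latent`, `direction`, and `step_sizes` are hypothetical, and the generator that would turn each latent into an image is left out.

```python
import math
from typing import List, Sequence


def normalize(direction: Sequence[float]) -> List[float]:
    """Scale a direction vector to unit length."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [d / norm for d in direction]


def traverse_latent(
    latent: Sequence[float],
    direction: Sequence[float],
    step_sizes: Sequence[float],
) -> List[List[float]]:
    """Return one edited latent per step size: latent + step * unit(direction).

    Assumption: a linear move in latent space stands in for the edit.
    In TunaGAN the edit is produced with an auxiliary network, which this
    sketch does not model.
    """
    if len(latent) != len(direction):
        raise ValueError("latent and direction must have the same length")
    unit = normalize(direction)
    return [[z + s * u for z, u in zip(latent, unit)] for s in step_sizes]


# Hypothetical usage: each returned vector would be passed to a generator
# (for example a Style-GAN synthesis network) to render the edited face.
edited = traverse_latent([1.0, 2.0], [3.0, 4.0], [0.0, 5.0])
```

A step size of zero returns the original latent unchanged, which gives a convenient baseline when comparing how strongly an attribute changes along the direction.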

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Video Analysis and Summarization