TunaGAN: Interpretable GAN for Smart Editing
Weiquan Mao, Beicheng Lou, Jiyao Yuan

TL;DR
TunaGAN is an interpretable, tunable GAN that enables high-resolution face image editing based on user instructions, utilizing auxiliary networks and exploring latent space for feature disentanglement.
Contribution
The paper introduces TunaGAN, an approach that combines an auxiliary network with Style-GAN for controllable, high-quality image editing, and investigates latent space traversal for feature disentanglement.
Findings
Effective high-resolution face editing with user instructions.
Analysis of mode collapse impacts on model robustness.
Insights into latent space traversal for feature disentanglement.
Abstract
In this paper, we introduce a tunable generative adversarial network (TunaGAN) that uses an auxiliary network on top of an existing generator network (Style-GAN) to modify high-resolution face images according to a user's high-level instructions, with good qualitative and quantitative performance. To optimize for feature disentanglement, we also investigate two different latent spaces that could be traversed for modification. The problem of mode collapse is characterized in detail to assess model robustness. This work could easily be extended to content-aware image editors based on other GANs, and it provides insight into mode collapse problems in more general settings.
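To make the idea of traversing a latent space concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the linear attribute predictor, the function names (`normalize`, `traverse`, `linear_score`), and the toy dimensionality are all assumptions made for illustration. The sketch moves a latent code a fixed distance along a direction that raises a hypothetical attribute score; in a real pipeline the edited code would then be passed to the generator to produce the modified image.

```python
import math
import random


def normalize(direction):
    """Scale a vector to unit length so step sizes are comparable."""
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in direction]


def traverse(latent, direction, alpha):
    """Move a latent code `alpha` units along the unit-normalized direction."""
    unit = normalize(direction)
    return [z + alpha * u for z, u in zip(latent, unit)]


def linear_score(latent, weights, bias=0.0):
    """Hypothetical linear attribute predictor (a stand-in, not the paper's auxiliary network)."""
    return sum(w * z for w, z in zip(weights, latent)) + bias


random.seed(0)
dim = 8  # toy latent size, chosen only for illustration
latent = [random.gauss(0.0, 1.0) for _ in range(dim)]
weights = [random.gauss(0.0, 1.0) for _ in range(dim)]

# For a linear predictor, the gradient of the score with respect to the
# latent code is the weight vector, so that is the direction we step along.
edited = traverse(latent, weights, alpha=2.0)

before = linear_score(latent, weights)
after = linear_score(edited, weights)
print(f"attribute score before: {before:.3f}, after: {after:.3f}")
```

With a linear predictor the score rises by exactly `alpha` times the norm of the weight vector. A nonlinear auxiliary network would instead need its gradient recomputed at each step, and moving along one attribute's direction may also change other attributes, which is the disentanglement concern the abstract raises.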
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Video Analysis and Summarization
