Learning Modulated Transformation in GANs
Ceyuan Yang, Qihang Zhang, Yinghao Xu, Jiapeng Zhu, Yujun Shen, Bo Dai

TL;DR
This paper introduces a modulated transformation module (MTM) for GANs that predicts spatial offsets to better model geometric variations, improving generation quality across multiple tasks.
Contribution
The paper proposes a plug-and-play MTM for GANs that enhances geometric modeling by predicting spatial offsets and that is compatible with existing frameworks without hyper-parameter tuning.
Findings
Improves FID from 21.36 to 13.60 on the TaiChi dataset for human generation
Generalizes well to image, 3D-aware, and video synthesis tasks
Compatible with state-of-the-art GAN frameworks
Abstract
The success of style-based generators largely benefits from style modulation, which helps take care of the cross-instance variation within data. However, the instance-wise stochasticity is typically introduced via regular convolution, where kernels interact with features at some fixed locations, limiting its capacity for modeling geometric variation. To alleviate this problem, we equip the generator in generative adversarial networks (GANs) with a plug-and-play module, termed as modulated transformation module (MTM). This module predicts spatial offsets under the control of latent codes, based on which the convolution operation can be applied at variable locations for different instances, and hence offers the model an additional degree of freedom to handle geometry deformation. Extensive experiments suggest that our approach can be faithfully generalized to various generative tasks,…
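The idea of applying convolution at latent-dependent locations can be illustrated with a small NumPy sketch. This is not the authors' implementation: the helper names (`bilinear_sample`, `predict_offsets`, `offset_conv3x3`), the linear offset head `proj`, and the choice of one offset set per instance shared across all spatial positions are assumptions made only to keep the example short. It shows a 3x3 kernel whose nine sampling points are shifted by offsets computed from a latent code, so two latent codes read the same feature map at different locations.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly sample a 2-D map at fractional (y, x); outside the map is zero."""
    h, w = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    for yi, xi in ((y0, x0), (y0, x0 + 1), (y0 + 1, x0), (y0 + 1, x0 + 1)):
        if 0 <= yi < h and 0 <= xi < w:
            weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
            value += weight * feat[yi, xi]
    return value

def predict_offsets(latent, proj):
    """Hypothetical offset head: a linear map from the latent code to
    one (dy, dx) offset for each of the 9 kernel taps."""
    return (proj @ latent).reshape(9, 2)

def offset_conv3x3(feat, kernel, offsets):
    """3x3 convolution (cross-correlation) whose k-th tap samples at the
    regular grid position plus offsets[k]. Zero offsets give a plain conv."""
    h, w = feat.shape
    out = np.zeros((h, w))
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    for i in range(h):
        for j in range(w):
            for k, (dy, dx) in enumerate(grid):
                oy, ox = offsets[k]
                sample = bilinear_sample(feat, i + dy + oy, j + dx + ox)
                out[i, j] += kernel[dy + 1, dx + 1] * sample
    return out

rng = np.random.default_rng(0)
feat = rng.normal(size=(6, 6))           # toy single-channel feature map
kernel = rng.normal(size=(3, 3))         # shared convolution weights
proj = 0.5 * rng.normal(size=(18, 4))    # assumed offset-head weights
latent_a = rng.normal(size=4)            # two different instances
latent_b = rng.normal(size=4)

out_a = offset_conv3x3(feat, kernel, predict_offsets(latent_a, proj))
out_b = offset_conv3x3(feat, kernel, predict_offsets(latent_b, proj))
out_plain = offset_conv3x3(feat, kernel, np.zeros((9, 2)))
```

With all offsets set to zero the function reduces to a regular zero-padded 3x3 convolution, which is the fixed-location behavior the abstract contrasts against. A fuller version would predict offsets per spatial position and per layer, which this sketch deliberately omits.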
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Human Motion and Animation
