Learning Modulated Transformation in GANs

Ceyuan Yang; Qihang Zhang; Yinghao Xu; Jiapeng Zhu; Yujun Shen; Bo Dai

arXiv:2308.15472·cs.CV·August 30, 2023

TL;DR

This paper introduces a modulated transformation module (MTM) for GANs that predicts spatial offsets to better model geometric variations, improving generation quality across multiple tasks.

Contribution

The paper proposes a plug-and-play MTM for GANs that enhances geometric modeling by predicting spatial offsets, compatible with existing frameworks without hyper-parameter tuning.

Findings

1. Improves FID from 21.36 to 13.60 on the TaiChi dataset for human generation
2. Generalizes well to image, 3D-aware, and video synthesis tasks
3. Compatible with state-of-the-art GAN frameworks

Abstract

The success of style-based generators largely benefits from style modulation, which helps take care of the cross-instance variation within data. However, the instance-wise stochasticity is typically introduced via regular convolution, where kernels interact with features at some fixed locations, limiting its capacity for modeling geometric variation. To alleviate this problem, we equip the generator in generative adversarial networks (GANs) with a plug-and-play module, termed the modulated transformation module (MTM). This module predicts spatial offsets under the control of latent codes, based on which the convolution operation can be applied at variable locations for different instances, and hence offers the model an additional degree of freedom to handle geometry deformation. Extensive experiments suggest that our approach can be faithfully generalized to various generative tasks,…
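The core idea in the abstract, a convolution whose sampling locations are shifted by offsets derived from a latent code, can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the function names (`bilinear_sample`, `predict_offsets`, `offset_conv3x3`), the linear offset predictor, and the choice to share one set of nine offsets across all spatial positions are assumptions made only for illustration.

```python
import math

def bilinear_sample(feat, y, x):
    """Sample a 2D feature map at a fractional (y, x); positions outside count as zero."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    val = 0.0
    for yy, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xx, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yy < h and 0 <= xx < w:
                val += wy * wx * feat[yy][xx]
    return val

def predict_offsets(latent, weights):
    """Hypothetical offset predictor: a linear map from the latent code
    to nine (dy, dx) pairs, one per tap of a 3x3 kernel (assumption)."""
    flat = [sum(w_i * z_i for w_i, z_i in zip(row, latent)) for row in weights]
    return [(flat[2 * k], flat[2 * k + 1]) for k in range(9)]

def offset_conv3x3(feat, kernel, offsets):
    """3x3 convolution whose taps are displaced by per-instance offsets.
    With all offsets equal to zero this is a regular zero-padded convolution."""
    h, w = len(feat), len(feat[0])
    taps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for k, (dy, dx) in enumerate(taps):
                oy, ox = offsets[k]
                acc += kernel[k] * bilinear_sample(feat, y + dy + oy, x + dx + ox)
            out[y][x] = acc
    return out

# Usage: two latent codes yield two different sampling patterns for the same kernel.
feat = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
identity = [0.0] * 4 + [1.0] + [0.0] * 4   # only the center tap is active
weights = [[0.0, 0.0] for _ in range(18)]  # 18 = 9 taps x (dy, dx)
weights[9] = [1.0, 0.0]                    # dx of the center tap follows latent[0]
same = offset_conv3x3(feat, identity, predict_offsets([0.0, 1.0], weights))
shifted = offset_conv3x3(feat, identity, predict_offsets([1.0, 0.0], weights))
```

In this toy setup the first latent code produces zero offsets, so the convolution reads features at the usual fixed grid, while the second moves the center tap one pixel to the right, so the same kernel reads a different location. That latent-dependent sampling is the extra degree of freedom the abstract attributes to the MTM.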

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Human Motion and Animation