StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation
Yuhan Wang, Liming Jiang, Chen Change Loy

TL;DR
StyleInV introduces an inversion-based motion generator for unconditional video synthesis. It produces high-quality, temporally consistent, long-duration videos with lower training complexity, and it supports style transfer.
Contribution
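The paper proposes a learning-based GAN inversion network as the motion generator. Its encoder captures rich, smooth priors from mapping images to latents and is modulated temporally to produce future latents, which reduces reliance on heavy 3D convolutional discriminators and enables style transfer.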
Findings
Outperforms existing methods in generating long, high-resolution videos
Achieves high temporal consistency and single-frame quality
Supports style transfer with simple fine-tuning
Abstract
Unconditional video generation is a challenging task that involves synthesizing high-quality videos that are both coherent and of extended duration. To address this challenge, researchers have used pretrained StyleGAN image generators for high-quality frame synthesis and focused on motion generator design. The motion generator is trained in an autoregressive manner using heavy 3D convolutional discriminators to ensure motion coherence during video generation. In this paper, we introduce a novel motion generator design that uses a learning-based inversion network for GAN. The encoder in our method captures rich and smooth priors from encoding images to latents, and given the latent of an initially generated frame as guidance, our method can generate smooth future latent by modulating the inversion encoder temporally. Our method enjoys the advantage of sparse training and naturally…
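The temporal modulation idea in the abstract can be illustrated with adaptive instance normalization (AdaIN), which is listed among the methods for this paper. What follows is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the sinusoidal timestamp embedding, and the way the first-frame latent `w0` is mixed into scales and biases are all hypothetical choices made for illustration.

```python
import math
from statistics import fmean, pstdev


def adain(features, scale, bias, eps=1e-5):
    """Adaptive instance normalization over one feature map.

    features: list of channels, each a flat list of floats.
    scale, bias: one float per channel, supplied by a style vector.
    """
    out = []
    for channel, s, b in zip(features, scale, bias):
        mu = fmean(channel)
        sigma = pstdev(channel)
        # Normalize the channel, then re-scale and shift it with the style.
        out.append([s * (v - mu) / (sigma + eps) + b for v in channel])
    return out


def temporal_style(w0, t, num_channels):
    """Hypothetical temporal style (an assumption, not from the paper).

    Mixes the first-frame latent w0 with a sinusoidal embedding of the
    timestamp t to get a per-channel scale and bias.
    """
    sin_t, cos_t = math.sin(t), math.cos(t)
    scale = [1.0 + 0.1 * w0[c % len(w0)] * sin_t for c in range(num_channels)]
    bias = [0.1 * w0[c % len(w0)] * cos_t for c in range(num_channels)]
    return scale, bias


def modulated_features(features, w0, t):
    """Apply the time-dependent style to encoder features via AdaIN."""
    scale, bias = temporal_style(w0, t, len(features))
    return adain(features, scale, bias)


if __name__ == "__main__":
    feats = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 2.0, 2.0]]
    w0 = [0.5, -0.5]  # stand-in for the latent of the first frame
    for t in (0.0, 0.5, 1.0):
        print(t, modulated_features(feats, w0, t))
```

In this sketch the same encoder features yield different outputs as `t` changes, while `w0` stays fixed as guidance. In the method described above, the modulated encoder would output a latent for a pretrained StyleGAN generator rather than raw features.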
Videos
StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation · YouTube
Taxonomy
Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques
Methods: Convolution · R1 Regularization · Dense Connections · Feedforward Network · Adaptive Instance Normalization · StyleGAN
