Conditional MoCoGAN for Zero-Shot Video Generation
Shun Kimura, Kazuhiko Kawamoto

TL;DR
This paper introduces Conditional MoCoGAN, a GAN-based model for zero-shot video generation that learns disentangled representations in order to generate unseen videos from training data with missing classes.
Contribution
It extends conditional GANs to zero-shot video generation by learning disentangled motion and content representations, enabling the synthesis of unseen videos.
Findings
Effective zero-shot video generation demonstrated on Weizmann and MUG datasets.
Improved disentanglement of motion and content in latent space.
High-quality video synthesis for unseen classes.
Abstract
We propose a conditional generative adversarial network (GAN) model for zero-shot video generation. In this study, we explore the zero-shot conditional generation setting; in other words, we generate unseen videos from training samples with missing classes. The task is an extension of conditional data generation. The key idea is to learn disentangled representations in the latent space of a GAN. To realize this objective, we base our model on the motion and content decomposed GAN (MoCoGAN) and the conditional GAN for image generation. We build the model to find better-disentangled representations and to generate good-quality videos. We demonstrate the effectiveness of our proposed model through experiments on the Weizmann action database and the MUG facial expression database.
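The setting described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, latent dimensions, and the choice to append one-hot labels to the latent codes are assumptions made for illustration. It also simplifies the MoCoGAN design, which produces motion codes with a recurrent network, by drawing independent noise for each frame. The sketch shows two things: how label pairs split into seen and held-out (zero-shot) combinations, and how a content part shared by all frames can be kept separate from a motion part that changes per frame.

```python
import itertools
import random


def split_seen_unseen(content_classes, motion_classes, held_out):
    """Split all (content, motion) label pairs into seen and unseen lists.

    `held_out` is the set of pairs withheld from training (hypothetical setup).
    """
    all_pairs = list(itertools.product(content_classes, motion_classes))
    seen = [p for p in all_pairs if p not in held_out]
    unseen = [p for p in all_pairs if p in held_out]
    return seen, unseen


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def sample_video_latents(content_idx, motion_idx, n_content, n_motion,
                         num_frames=16, dim_content=4, dim_motion=2, rng=None):
    """Build one generator input vector per frame.

    Layout per frame (an assumption for this sketch):
    [content noise | content label | motion noise | motion label]
    """
    rng = rng or random.Random()
    # Content noise is sampled once and reused for every frame of the video.
    z_content = [rng.gauss(0.0, 1.0) for _ in range(dim_content)]
    c_label = one_hot(content_idx, n_content)
    m_label = one_hot(motion_idx, n_motion)
    frames = []
    for _ in range(num_frames):
        # Motion noise is resampled for each frame (i.i.d. here; MoCoGAN
        # itself passes per-frame noise through a recurrent network).
        z_motion = [rng.gauss(0.0, 1.0) for _ in range(dim_motion)]
        frames.append(z_content + c_label + z_motion + m_label)
    return frames


if __name__ == "__main__":
    # Hypothetical labels: two subjects, two actions, one pair held out.
    seen, unseen = split_seen_unseen(
        ["person_a", "person_b"], ["walk", "run"], {("person_b", "run")}
    )
    print("seen pairs:", seen)
    print("unseen pairs:", unseen)
    # Request the held-out combination (content index 1, motion index 1).
    latents = sample_video_latents(1, 1, n_content=2, n_motion=2, num_frames=4)
    print("frames:", len(latents), "vector length:", len(latents[0]))
```

In this sketch, zero-shot generation amounts to requesting a label pair from the unseen list at sampling time. Whether the generator produces a plausible video for that pair depends on how well content and motion are disentangled during training, which is the problem the paper addresses.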
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Video Analysis and Summarization
