ArrowGAN : Learning to Generate Videos by Learning Arrow of Time
Kibeom Hong, Youngjung Uh, Hyeran Byun

TL;DR
ArrowGAN trains the discriminator to classify the arrow of time (whether a clip runs forward or backward) as an auxiliary task, so that the generator is pushed toward forward-running videos. The paper reports state-of-the-art results on categorical video generation.
Contribution
The paper proposes ArrowGAN, a framework that uses arrow-of-time classification as an auxiliary self-supervised task in video GAN training, and a categorical variant that adds recent conditional image generation techniques on top of it.
Findings
ArrowGAN outperforms previous methods on video inception score.
ArrowGAN reduces Fréchet video distance across datasets.
Temporal self-supervision improves video realism.
Abstract
Training GANs on videos is even more sophisticated than on images because videos have a distinguished dimension: time. While recent methods designed a dedicated architecture considering time, generated videos are still far from indistinguishable from real videos. In this paper, we introduce the ArrowGAN framework, where the discriminator learns to classify the arrow of time as an auxiliary task and the generator tries to synthesize forward-running videos. We argue that the auxiliary task should be carefully chosen regarding the target domain. In addition, we explore categorical ArrowGAN with recent techniques in conditional image generation upon the ArrowGAN framework, achieving state-of-the-art performance on categorical video generation. Our extensive experiments validate the effectiveness of arrow of time as a self-supervisory task, and demonstrate that all our components of categorical…
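The auxiliary task in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the 50/50 reversal rate, the binary cross-entropy form, and the `weight` parameter are all assumptions made for illustration, and the adversarial loss is passed in as a plain number. Clips are lists of frames, label 1 means forward and label 0 means reversed.

```python
import math
import random


def reverse_clip(clip):
    """Return the frames of a clip in reverse temporal order."""
    return clip[::-1]


def make_aot_batch(clips, rng):
    """Randomly reverse each clip and record its arrow-of-time label.

    Label 1 = forward-running, label 0 = reversed (assumed convention).
    """
    batch, labels = [], []
    for clip in clips:
        if rng.random() < 0.5:
            batch.append(reverse_clip(clip))
            labels.append(0)
        else:
            batch.append(list(clip))
            labels.append(1)
    return batch, labels


def bce_with_logits(logits, labels):
    """Mean binary cross-entropy on raw logits (numerically stable form)."""
    total = 0.0
    for x, y in zip(logits, labels):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


def discriminator_loss(adv_loss, aot_logits, aot_labels, weight=1.0):
    """Adversarial loss plus the auxiliary arrow-of-time classification loss.

    `weight` is a hypothetical balancing coefficient.
    """
    return adv_loss + weight * bce_with_logits(aot_logits, aot_labels)


def generator_loss(adv_loss, aot_logits_fake, weight=1.0):
    """Adversarial loss plus a term rewarding fakes classified as forward."""
    forward_targets = [1] * len(aot_logits_fake)
    return adv_loss + weight * bce_with_logits(aot_logits_fake, forward_targets)
```

In this sketch the discriminator's auxiliary head is trained on clips with known direction, while the generator is rewarded only when that head reads its samples as forward-running, which mirrors the division of roles stated in the abstract.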
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
