Generating Videos with Scene Dynamics

Carl Vondrick; Hamed Pirsiavash; Antonio Torralba

arXiv:1609.02612·cs.CV·October 27, 2016·848 cites

Open Access

TL;DR

This paper introduces a generative adversarial network that models scene dynamics in videos, enabling both improved video generation and action recognition from unlabeled data.

Contribution

It presents a novel spatio-temporal GAN architecture that separates foreground and background, demonstrating its effectiveness in video generation and representation learning.

Findings

01 The model generates tiny videos, up to a second long at full frame rate, better than simple baselines.

02 It predicts plausible futures for static images.

03 It learns features useful for action recognition with minimal supervision.

Abstract

We capitalize on large amounts of unlabeled video in order to learn a model of scene dynamics for both video recognition tasks (e.g. action classification) and video generation tasks (e.g. future prediction). We propose a generative adversarial network for video with a spatio-temporal convolutional architecture that untangles the scene's foreground from the background. Experiments suggest this model can generate tiny videos up to a second at full frame rate better than simple baselines, and we show its utility at predicting plausible futures of static images. Moreover, experiments and visualizations show the model internally learns useful features for recognizing actions with minimal supervision, suggesting scene dynamics are a promising signal for representation learning. We believe generative video models can impact many applications in video understanding and simulation.
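
The abstract does not spell out how the foreground and background are recombined. A minimal sketch, assuming a mask-weighted blend of a moving foreground stream over a single static background image, might look like the following. The function name `composite_video`, the array shapes, and the random arrays standing in for network outputs are illustrative assumptions, not details taken from the text above.

```python
import numpy as np

def composite_video(foreground, mask, background):
    """Blend a moving foreground over a static background.

    foreground: (T, H, W, C) per-frame foreground pixels
    mask:       (T, H, W, 1) per-pixel blend weights in [0, 1]
    background: (H, W, C) single static image shared by all frames
    """
    # Add a leading time axis so the static image broadcasts over all T frames.
    static = background[np.newaxis, ...]
    # Mask near 1 selects the foreground; mask near 0 selects the background.
    return mask * foreground + (1.0 - mask) * static

# Random arrays stand in for what the generator streams would produce.
rng = np.random.default_rng(0)
T, H, W, C = 32, 64, 64, 3  # assumed clip length and resolution
foreground = rng.uniform(-1.0, 1.0, size=(T, H, W, C))
background = rng.uniform(-1.0, 1.0, size=(H, W, C))
mask = rng.uniform(0.0, 1.0, size=(T, H, W, 1))

video = composite_video(foreground, mask, background)
print(video.shape)  # (32, 64, 64, 3)
```

Under this assumption, any pixel where the mask is zero is identical in every frame, which is one way to encode a stationary background while leaving the foreground stream free to model motion.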

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · Anomaly Detection Techniques and Applications