Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks

Tianfan Xue; Jiajun Wu; Katherine L. Bouman; William T. Freeman

arXiv:1807.09245·cs.CV·August 13, 2019

TL;DR

This paper introduces a probabilistic approach for future frame synthesis from a single image using a novel Cross Convolutional Network, enabling diverse and realistic motion predictions for both synthetic and real-world data.

Contribution

The paper presents a new probabilistic model and a Cross Convolutional Network architecture for generating multiple plausible future frames from a single image.

Findings

01. Model performs well on synthetic and real-world data

02. Network learns compact encoding of object appearance and motion

03. Applications include visual analogy-making and video extrapolation

Abstract

We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods that have tackled this problem in a deterministic or non-parametric way, we propose to model future frames in a probabilistic manner. Our probabilistic model makes it possible for us to sample and synthesize many possible future frames from a single input image. To synthesize realistic movement of objects, we propose a novel network structure, namely a Cross Convolutional Network; this network encodes image and motion information as feature maps and convolutional kernels, respectively. In experiments, our model performs well on synthetic data, such as 2D shapes and animated game sprites, and on real-world video frames. We present analyses of the learned network representations, showing it is implicitly learning a compact encoding of object appearance and…
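
The abstract's central idea, encoding the image as feature maps and the motion as convolutional kernels, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes one kernel per feature-map channel, zero padding, and a stand-in random linear map (`W_dec`, hypothetical) in place of a trained kernel decoder; the function name `cross_convolve` and all shapes are chosen for illustration only.

```python
import numpy as np

def cross_convolve(feature_maps, kernels):
    """Filter each channel of `feature_maps` with its own kernel.

    feature_maps: array of shape (C, H, W), standing in for image features.
    kernels:      array of shape (C, k, k) with odd k, standing in for
                  motion kernels predicted for this particular sample.
    Uses zero padding so the output keeps shape (C, H, W). Like most deep
    learning libraries, this computes cross-correlation (no kernel flip).
    """
    C, H, W = feature_maps.shape
    Ck, k, k2 = kernels.shape
    assert C == Ck and k == k2 and k % 2 == 1
    pad = k // 2
    padded = np.pad(feature_maps, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((C, H, W), dtype=float)
    # Accumulate one shifted, weighted copy of the input per kernel tap.
    for dy in range(k):
        for dx in range(k):
            weight = kernels[:, dy, dx][:, None, None]
            out += weight * padded[:, dy:dy + H, dx:dx + W]
    return out

rng = np.random.default_rng(0)
feature_maps = rng.standard_normal((4, 8, 8))  # hypothetical image features

# Assumption: a latent motion code z is sampled and mapped to kernels.
# W_dec is an untrained random matrix used only to make the sketch run.
W_dec = rng.standard_normal((4 * 3 * 3, 16))

def sample_motion_features():
    z = rng.standard_normal(16)               # a different z gives a different motion
    kernels = (W_dec @ z).reshape(4, 3, 3)    # one 3x3 kernel per channel
    return cross_convolve(feature_maps, kernels)

sample_a = sample_motion_features()
sample_b = sample_motion_features()
```

Because the kernels depend on the sampled code while the feature maps depend only on the image, drawing a new code yields a different transformation of the same image features, which is how a single input can give several distinct outputs in this sketch.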
