Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
Tianfan Xue, Jiajun Wu, Katherine L. Bouman, William T. Freeman

TL;DR
This paper introduces a probabilistic approach for future frame synthesis from a single image, utilizing a novel Cross Convolutional Network to generate multiple plausible future frames, applicable to both synthetic and real-world videos.
Contribution
The paper presents a new probabilistic model and a Cross Convolutional Network architecture for synthesizing multiple future frames from a single image, advancing beyond deterministic methods.
Findings
The model performs well on synthetic data such as 2D shapes and animated game sprites
It is also effective on real-world videos
It enables tasks such as visual analogy-making
Abstract
We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods, which have tackled this problem in a deterministic or non-parametric way, we propose a novel approach that models future frames in a probabilistic manner. Our probabilistic model makes it possible for us to sample and synthesize many possible future frames from a single input image. Future frame synthesis is challenging, as it involves low- and high-level image and motion understanding. We propose a novel network structure, namely a Cross Convolutional Network, to aid in synthesizing future frames; this network structure encodes image and motion information as feature maps and convolutional kernels, respectively. In experiments, our model performs well on synthetic data, such as 2D shapes and animated game sprites, as well as on real-world videos. We also…
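The abstract's central idea, image content held as feature maps and motion held as convolutional kernels, can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes one kernel per feature-map channel, zero "same" padding, and a toy linear map from a Gaussian latent vector to kernels; the names `cross_convolve`, `sample_kernels`, and `decoder_weights` are hypothetical.

```python
import numpy as np


def cross_convolve(feature_maps, kernels):
    """Apply one kernel per channel to the matching feature map.

    feature_maps: array of shape (C, H, W), the image information.
    kernels:      array of shape (C, k, k) with k odd, the motion information.
    Uses zero padding so the output keeps the (C, H, W) shape, and follows the
    deep-learning convention of cross-correlation (no kernel flip).
    """
    C, H, W = feature_maps.shape
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(feature_maps, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((C, H, W), dtype=float)
    for dy in range(k):
        for dx in range(k):
            # Weight a shifted view of every channel by that channel's kernel tap.
            weight = kernels[:, dy, dx][:, None, None]
            out += weight * padded[:, dy:dy + H, dx:dx + W]
    return out


def sample_kernels(rng, decoder_weights, num_channels, kernel_size):
    """Draw a latent vector and map it to per-channel kernels.

    decoder_weights is a hypothetical stand-in for a learned kernel decoder:
    a matrix of shape (num_channels * kernel_size**2, latent_dim).
    """
    z = rng.standard_normal(decoder_weights.shape[1])
    flat = decoder_weights @ z
    return flat.reshape(num_channels, kernel_size, kernel_size)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((2, 8, 8))      # stand-in for encoded image features
    weights = rng.standard_normal((2 * 3 * 3, 4))  # stand-in for a trained decoder
    # The same image features combined with different sampled kernels give
    # different outputs, which is what allows several futures per input.
    for _ in range(3):
        kernels = sample_kernels(rng, weights, num_channels=2, kernel_size=3)
        print(cross_convolve(features, kernels).shape)
```

In this sketch, a kernel whose only nonzero tap is off-center simply translates its feature map, which gives some intuition for why kernels are a natural container for motion. A trained model would presumably learn both the feature maps and the latent-to-kernel mapping rather than using random weights as above.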
Taxonomy
Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization
