Unsupervised Learning of Visual Structure using Predictive Generative Networks
William Lotter, Gabriel Kreiman, David Cox

TL;DR
This paper demonstrates that deep neural networks trained to predict future video frames can learn internal representations of 3D objects, generalize to new tasks, and outperform models trained only with reconstruction loss.
Contribution
It introduces a predictive generative network framework that learns rich, transformation-tolerant representations through unsupervised future frame prediction.
Findings
Achieves state-of-the-art performance in video prediction tasks.
Learns representations that generalize to object classification.
Outperforms reconstruction-only models in generalization.
Abstract
The ability to predict future states of the environment is a central pillar of intelligence. At its core, effective prediction requires an internal model of the world and an understanding of the rules by which the world changes. Here, we explore the internal models developed by deep neural networks trained using a loss based on predicting future frames in synthetic video sequences, using a CNN-LSTM-deCNN framework. We first show that this architecture can achieve excellent performance in visual sequence prediction tasks, including state-of-the-art performance in a standard 'bouncing balls' dataset (Sutskever et al., 2009). Using a weighted mean-squared error and adversarial loss (Goodfellow et al., 2014), the same architecture successfully extrapolates out-of-the-plane rotations of computer-generated faces. Furthermore, despite being trained end-to-end to predict only pixel-level…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques
