Stochastic Variational Video Prediction
Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy H. Campbell,, Sergey Levine

TL;DR
This paper introduces SV2P, a stochastic variational model for multi-frame video prediction that captures multiple plausible futures in real-world videos, outperforming previous deterministic and stochastic methods.
Contribution
The paper presents the first effective stochastic multi-frame video prediction model for real-world videos using variational inference.
Findings
SV2P produces more accurate and detailed future frames.
It outperforms deterministic models in stochastic environments.
The method is validated on multiple real-world datasets.
Abstract
Predicting the future in real-world settings, particularly from raw sensory observations such as images, is exceptionally challenging. Real-world events can be stochastic and unpredictable, and the high dimensionality and complexity of natural images requires the predictive model to build an intricate understanding of the natural world. Many existing methods tackle this problem by making simplifying assumptions about the environment. One common assumption is that the outcome is deterministic and there is only one plausible future. This can lead to low-quality predictions in real-world settings with stochastic dynamics. In this paper, we develop a stochastic variational video prediction (SV2P) method that predicts a different possible future for each sample of its latent variables. To the best of our knowledge, our model is the first to provide effective stochastic multi-frame prediction…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenerative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Image and Signal Denoising Methods
