Long-Term Prediction of Natural Video Sequences with Robust Video Predictors
Luke Ditria, Tom Drummond

TL;DR
This paper introduces Robust Video Predictors (RoViPs) that leverage perceptual and uncertainty-based losses, attention mechanisms, and robustness to prediction errors to improve long-term natural video sequence prediction.
Contribution
The work presents novel improvements including perceptual and uncertainty losses, attention-based skip connections, and robustness to errors for long-term natural video prediction.
Findings
High-quality short-term predictions achieved
Long-term realistic video sequences generated
Robustness to errors enables extended prediction sequences
Abstract
Predicting high-dimensional video sequences is a curiously difficult problem. The number of possible futures for a given video sequence grows exponentially over time due to uncertainty. This is especially evident when trying to predict complicated natural video scenes from a limited snapshot of the world. The inherent uncertainty accumulates the further into the future you predict, making long-term prediction very difficult. In this work we introduce a number of improvements to existing work that aid in creating Robust Video Predictors (RoViPs). We show that with a combination of deep perceptual and uncertainty-based reconstruction losses we are able to create high-quality short-term predictions. Attention-based skip connections are utilised to allow for long-range spatial movement of input features to further improve performance. Finally, we show that by simply making the predictor…
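The abstract does not state the form of the uncertainty-based reconstruction loss. As a rough illustration only, one common choice is a per-pixel Gaussian negative log-likelihood in which the model predicts both a mean and a log-variance for each pixel. The sketch below assumes that form; the function name `gaussian_nll` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def gaussian_nll(target, mean, log_var):
    """Mean per-pixel Gaussian negative log-likelihood.

    ASSUMPTION: this is a generic heteroscedastic loss, not necessarily
    the loss used by the RoViP authors.

    target, mean, log_var: flat sequences of floats of equal length,
    where log_var[i] is the predicted log-variance for pixel i.
    """
    if not (len(target) == len(mean) == len(log_var)):
        raise ValueError("inputs must have the same length")
    if len(target) == 0:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for y, mu, s in zip(target, mean, log_var):
        # 0.5 * (log sigma^2 + (y - mu)^2 / sigma^2 + log(2 * pi))
        total += 0.5 * (s + (y - mu) ** 2 * math.exp(-s) + math.log(2.0 * math.pi))
    return total / len(target)


# A large error is penalised less when the model also reports high variance.
confident = gaussian_nll([2.0], [0.0], [0.0])            # variance 1
uncertain = gaussian_nll([2.0], [0.0], [math.log(4.0)])  # variance 4
```

Under this assumed form, the log-variance term discourages the model from claiming high uncertainty everywhere, while the scaled squared error lets it discount pixels it cannot predict well.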
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Human Pose and Action Recognition
