TL;DR
SLAMP is a stochastic video prediction model that reasons explicitly about both appearance and motion. It conditions on motion history to keep predicted dynamics consistent over long horizons, and its advantage is largest in complex real-world scenes such as autonomous driving.
Contribution
A stochastic approach that models motion and appearance explicitly and uses motion history to improve long-term video prediction.
Findings
Performs comparably to state-of-the-art models on generic video prediction datasets
Significantly outperforms them on autonomous driving datasets
Explicit motion reasoning improves long-term consistency
Abstract
Motion is an important cue for video prediction and is often utilized by separating video content into static and dynamic components. Most of the previous work utilizing motion is deterministic, but there are stochastic methods that can model the inherent uncertainty of the future. Existing stochastic models either do not reason about motion explicitly or make limiting assumptions about the static part. In this paper, we reason about appearance and motion in the video stochastically by predicting the future based on the motion history. Explicit reasoning about motion without history already reaches the performance of current stochastic models. The motion history further improves the results by allowing the model to predict consistent dynamics several frames into the future. Our model performs comparably to the state-of-the-art models on the generic video prediction datasets; however, significantly…
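To make the idea of combining appearance and motion concrete, here is a minimal NumPy sketch under stated assumptions. It is not the authors' implementation: the function names (`warp_nearest`, `blend`, `demo`), the nearest-neighbor warp, and the per-pixel blending mask are all hypothetical choices made for illustration. In a stochastic model the flow and mask would presumably be decoded from sampled latent variables; here they are fixed arrays so the mechanics are easy to follow.

```python
import numpy as np

def warp_nearest(frame, flow):
    """Backward-warp a (H, W) frame with a (H, W, 2) flow field.

    flow[y, x] = (dx, dy) is the offset from an output pixel to the
    pixel it copies from in the previous frame (nearest-neighbor lookup,
    clamped at the image border). This is an illustrative assumption.
    """
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

def blend(appearance_pred, motion_pred, mask):
    """Per-pixel convex combination of the two predictions.

    mask = 1 trusts the appearance prediction, mask = 0 trusts the
    motion-warped prediction (hypothetical convention).
    """
    return mask * appearance_pred + (1.0 - mask) * motion_pred

def demo():
    # Previous frame: a single bright pixel at row 1, column 1.
    prev = np.zeros((4, 4))
    prev[1, 1] = 1.0
    # Every output pixel reads from one pixel to its left,
    # so the content shifts one pixel to the right.
    flow = np.zeros((4, 4, 2))
    flow[..., 0] = -1.0
    motion_pred = warp_nearest(prev, flow)
    appearance_pred = np.full((4, 4), 0.5)  # placeholder appearance output
    mask = np.zeros((4, 4))                 # trust motion everywhere
    return blend(appearance_pred, motion_pred, mask)
```

Calling `demo()` returns a frame in which the bright pixel has moved from column 1 to column 2. Setting the mask to ones instead would return the appearance prediction unchanged, which shows how a mask could let a model fall back on appearance where motion is unreliable.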
