SimVP: Simpler yet Better Video Prediction
Zhangyang Gao, Cheng Tan, Lirong Wu, Stan Z. Li

TL;DR
SimVP is a simple, CNN-based video prediction model trained with MSE loss that achieves state-of-the-art results across multiple datasets without complex strategies, demonstrating strong generalization and lower training costs.
Contribution
This paper introduces SimVP, a straightforward CNN-based approach for video prediction that rivals complex models, serving as a solid baseline and simplifying the training process.
Findings
Achieves state-of-the-art performance on five benchmark datasets.
Demonstrates strong generalization on real-world datasets.
Reduces training cost significantly, making it easier to scale to complex scenarios.
Abstract
From CNN, RNN, to ViT, we have witnessed remarkable advancements in video prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated training strategies. We admire these progresses but are confused about the necessity: is there a simple method that can perform comparably well? This paper proposes SimVP, a simple video prediction model that is completely built upon CNN and trained by MSE loss in an end-to-end fashion. Without introducing any additional tricks and complicated strategies, we can achieve state-of-the-art performance on five benchmark datasets. Through extended experiments, we demonstrate that SimVP has strong generalization and extensibility on real-world datasets. The significant reduction of training cost makes it easier to scale to complex scenarios. We believe SimVP can serve as a solid baseline to stimulate the further development of…
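The abstract states that SimVP is trained with MSE loss end to end. As a rough illustration of that objective only (not the authors' implementation), the sketch below computes the mean squared error between a predicted and a ground-truth frame sequence. The function name `mse_loss`, the nested-list frame layout, and the choice to average over all pixels of all frames are assumptions made for this example; the source does not specify the normalization.

```python
from typing import List

# Hypothetical layout for illustration: a frame is H rows of W pixel values,
# and a clip is a list of T such frames.
Frame = List[List[float]]


def mse_loss(predicted: List[Frame], target: List[Frame]) -> float:
    """Mean squared error averaged over every pixel of every frame.

    Assumption: averaging over all T*H*W values; the paper may instead
    sum per frame or normalize differently.
    """
    if len(predicted) != len(target):
        raise ValueError("sequences must contain the same number of frames")

    total = 0.0
    count = 0
    for p_frame, t_frame in zip(predicted, target):
        for p_row, t_row in zip(p_frame, t_frame):
            for p, t in zip(p_row, t_row):
                total += (p - t) ** 2
                count += 1

    if count == 0:
        raise ValueError("sequences must contain at least one pixel")
    return total / count


# One 2x2 frame in which two of the four pixels are off by 1.0.
predicted_clip = [[[0.0, 1.0], [1.0, 0.0]]]
target_clip = [[[0.0, 0.0], [1.0, 1.0]]]
loss = mse_loss(predicted_clip, target_clip)  # (0 + 1 + 0 + 1) / 4 = 0.5
```

In an actual training loop this scalar would be computed by a deep learning framework on tensors and backpropagated through the CNN; the pure-Python version here only makes the arithmetic explicit.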
Taxonomy
Topics: Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications
