Motion Selective Prediction for Video Frame Synthesis
Veronique Prinet

TL;DR
This paper introduces a novel video prediction model that extends the content and motion of a specific video using a dual network with dynamic and static convolutional kernels, emphasizing interpretability and robustness.
Contribution
The proposed model learns from the initial frames of a single video to predict that video's future frames, unlike conventional methods that train on large datasets.
Findings
Robust performance on challenging in-the-wild videos
Competitive results compared to baseline methods
Enhanced interpretability of the model's functioning
Abstract
Existing conditional video prediction approaches train a network on large databases and generalize to previously unseen data. We take the opposite stance and introduce a model that learns from the first frames of a given video and extends its content and motion, for example to double its length. To this end, we propose a dual network that can flexibly use both dynamic and static convolutional motion kernels to predict future frames. The construction of our model gives us the means to efficiently analyze its functioning and interpret its output. We demonstrate experimentally the robustness of our approach on challenging in-the-wild videos and show that it is competitive with related baselines.
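The abstract distinguishes static from dynamic convolutional motion kernels but does not give the architecture. As a rough illustration of that distinction only, the NumPy sketch below applies one kernel shared by all pixels (static) and one kernel per pixel (dynamic) to a frame. The function names, the 3x3 kernel size, and the edge padding are assumptions made for this example; this is not the author's model, and in a real system the kernels would be produced by a trained network rather than set by hand.

```python
import numpy as np


def apply_static_kernel(frame, kernel):
    """Apply one k x k kernel, shared by every pixel, to a 2-D frame.

    Hypothetical helper for illustration; uses cross-correlation with
    edge padding so the output has the same shape as the input.
    """
    h, w = frame.shape
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(frame, pad, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(k):
        for dx in range(k):
            # Each kernel tap adds a shifted, weighted copy of the frame.
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def apply_dynamic_kernels(frame, kernels):
    """Apply a different k x k kernel at every pixel.

    `kernels` has shape (H, W, k, k). Hypothetical helper for illustration.
    """
    h, w, k, _ = kernels.shape
    pad = k // 2
    padded = np.pad(frame, pad, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(k):
        for dx in range(k):
            # Same shifted copy as above, but the weight varies per pixel.
            out += kernels[:, :, dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


# Static case: a single kernel that moves all content one pixel to the right.
frame = np.zeros((5, 5))
frame[2, 2] = 1.0
shift_right = np.zeros((3, 3))
shift_right[1, 0] = 1.0  # each output pixel copies its left neighbour
next_frame = apply_static_kernel(frame, shift_right)  # bright pixel: column 2 -> 3

# Dynamic case: the top two rows stay still, the remaining rows move right.
per_pixel = np.zeros((5, 5, 3, 3))
per_pixel[:2, :, 1, 1] = 1.0  # identity kernel (no motion)
per_pixel[2:, :, 1, 0] = 1.0  # shift-right kernel
mixed_frame = apply_dynamic_kernels(frame, per_pixel)
```

A static kernel can only express one motion for the whole frame, while per-pixel kernels can express different motions in different regions. That is the kind of flexibility the abstract refers to, under the assumptions stated above.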
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Video Coding and Compression Technologies
