Fourier-based Video Prediction through Relational Object Motion

Malte Mosbach; Sven Behnke

arXiv:2110.05881 · cs.CV · October 13, 2021

TL;DR

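This paper explores a frequency-domain approach to video prediction that explicitly infers object-motion relationships in the observed scene. The resulting predictions are consistent with the observed dynamics and do not suffer from blur, in contrast to deep recurrent models, which often produce blurry predictions and require tedious training on large datasets.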

Contribution

It proposes a Fourier-based approach combined with relational object motion inference, offering an alternative to recurrent neural networks for video prediction.

Findings

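01 Predictions are consistent with the dynamics observed in the scene.

02 Predictions do not suffer from the blur that deep recurrent models often produce.

03 The approach is presented as an alternative to deep recurrent models, which require tedious training on large datasets.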

Abstract

The ability to predict future outcomes conditioned on observed video frames is crucial for intelligent decision-making in autonomous systems. Recently, deep recurrent architectures have been applied to the task of video prediction. However, this often results in blurry predictions and requires tedious training on large datasets. Here, we explore a different approach by (1) using frequency-domain approaches for video prediction and (2) explicitly inferring object-motion relationships in the observed scene. The resulting predictions are consistent with the observed dynamics in a scene and do not suffer from blur.
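
To make the frequency-domain idea concrete, here is a minimal sketch of the textbook Fourier shift theorem applied to frame extrapolation. It is not the authors' method: it assumes a single global translation, integer-pixel motion, and circular image boundaries, and it does not model relationships between objects. The function names `estimate_shift` and `predict_next` are hypothetical and chosen for illustration only.

```python
import numpy as np


def estimate_shift(prev, curr):
    """Estimate the integer (dy, dx) translation from prev to curr
    using phase correlation (assumes one global, circular shift)."""
    f_prev = np.fft.fft2(prev)
    f_curr = np.fft.fft2(curr)
    cross = f_curr * np.conj(f_prev)
    cross /= np.abs(cross) + 1e-12  # keep only the phase difference
    corr = np.fft.ifft2(cross).real  # peaks at the displacement
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Indices past the midpoint correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))


def predict_next(curr, shift):
    """Extrapolate one frame ahead by applying the same shift again,
    implemented as a phase ramp in the frequency domain."""
    height, width = curr.shape
    ky = np.fft.fftfreq(height)[:, None]  # cycles per pixel, vertical
    kx = np.fft.fftfreq(width)[None, :]   # cycles per pixel, horizontal
    dy, dx = shift
    phase = np.exp(-2j * np.pi * (ky * dy + kx * dx))
    return np.fft.ifft2(np.fft.fft2(curr) * phase).real


# Toy example: a random "frame" that moves 3 px down and 2 px left.
rng = np.random.default_rng(0)
frame0 = rng.random((32, 32))
frame1 = np.roll(frame0, (3, -2), axis=(0, 1))

shift = estimate_shift(frame0, frame1)
frame2_pred = predict_next(frame1, shift)
```

Because the prediction only rotates the phase of each frequency component and leaves the magnitudes untouched, the extrapolated frame keeps the sharpness of the input. This is the intuition behind blur-free frequency-domain prediction, here in its simplest form.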
