Markov Decision Process for Video Generation

Vladyslav Yushchenko; Nikita Araslanov; Stefan Roth

arXiv:1909.12400 · cs.CV · September 30, 2019

TL;DR

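This paper reformulates video generation as a Markov Decision Process (MDP) to reduce temporal inconsistencies such as video freezing and looping, and to lift the fixed-length limitation that inhibits long-term modeling. It also proposes metrics for quantifying temporal diversity and shows that the formulation integrates into the existing MoCoGAN framework.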

Contribution

The paper reformulates video generation as an MDP to enable long-term modeling and introduces a class of complementary metrics for quantifying temporal diversity.
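To make the MDP view concrete, here is a minimal sketch of a rollout in which each motion state depends only on the previous state and a sampled action, so the horizon is not fixed in advance. It is an illustration under stated assumptions, not the paper's model: `make_toy_policy`, `toy_transition`, and `rollout` are hypothetical names, and a Gaussian random walk stands in for the learned policy and generator.

```python
import random


def make_toy_policy(seed, scale=0.1):
    """Hypothetical stochastic policy: samples a Gaussian action per state dimension."""
    rng = random.Random(seed)

    def policy(state):
        # The action is sampled from the current state's shape only (Markov property).
        return [rng.gauss(0.0, scale) for _ in state]

    return policy


def toy_transition(state, action):
    """Hypothetical transition: the next motion state is the current state plus the action."""
    return [s + a for s, a in zip(state, action)]


def rollout(policy, transition, initial_state, num_steps):
    """Roll the chain forward for any number of steps; no fixed video length is assumed."""
    state = list(initial_state)
    trajectory = []
    for _ in range(num_steps):
        action = policy(state)
        state = transition(state, action)
        trajectory.append(state)
    return trajectory


# The same process yields a short clip or a longer one by changing num_steps only.
short_clip = rollout(make_toy_policy(0), toy_transition, [0.0, 0.0], 16)
long_clip = rollout(make_toy_policy(0), toy_transition, [0.0, 0.0], 64)
```

In a real system each motion state would be decoded into a frame by a generator network; the sketch omits that step to keep the Markov structure visible.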

Findings

01 Improved video quality on the Human Actions and UCF-101 datasets.

02 More memory-efficient model with better temporal consistency.

03 Effective integration with existing frameworks like MoCoGAN.

Abstract

We identify two pathological cases of temporal inconsistencies in video generation: video freezing and video looping. To better quantify the temporal diversity, we propose a class of complementary metrics that are effective, easy to implement, data agnostic, and interpretable. Further, we observe that current state-of-the-art models are trained on video samples of fixed length thereby inhibiting long-term modeling. To address this, we reformulate the problem of video generation as a Markov Decision Process (MDP). The underlying idea is to represent motion as a stochastic process with an infinite forecast horizon to overcome the fixed length limitation and to mitigate the presence of temporal artifacts. We show that our formulation is easy to integrate into the state-of-the-art MoCoGAN framework. Our experiments on the Human Actions and UCF-101 datasets demonstrate that our MDP-based…
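The two pathological cases named in the abstract, freezing and looping, can be illustrated with simple frame-difference checks. The sketch below is not the class of metrics the paper proposes (the abstract does not define them); `frame_distance`, `freeze_ratio`, and `loop_period` are hypothetical helpers, frames are assumed to be flat lists of pixel values, and the tolerance `tol` is an arbitrary choice.

```python
def frame_distance(a, b):
    """Mean absolute difference between two frames given as flat lists of floats."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def freeze_ratio(frames, tol=1e-3):
    """Fraction of consecutive frame pairs that are nearly identical (freezing)."""
    pairs = list(zip(frames, frames[1:]))
    if not pairs:
        return 0.0
    frozen = sum(1 for a, b in pairs if frame_distance(a, b) < tol)
    return frozen / len(pairs)


def loop_period(frames, tol=1e-3):
    """Smallest period p >= 2 at which the whole clip repeats itself, or None (looping).

    A period is only reported if it fits at least twice into the clip.
    A fully frozen clip also repeats at p = 2, so check freeze_ratio first.
    """
    n = len(frames)
    for p in range(2, n // 2 + 1):
        if all(frame_distance(frames[t], frames[t + p]) < tol for t in range(n - p)):
            return p
    return None


# Toy one-pixel and two-pixel "videos" for illustration.
frozen_clip = [[0.5, 0.5]] * 4
looping_clip = [[0.0], [1.0], [2.0]] * 3
moving_clip = [[float(t)] for t in range(8)]
```

For these toy clips, `freeze_ratio(frozen_clip)` flags every frame pair as frozen, while `loop_period(looping_clip)` recovers the three-frame cycle and `loop_period(moving_clip)` finds none.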
