Hierarchical Model for Long-term Video Prediction

Peter Wang; Zhongxia Yan; Jeff Zhang

arXiv:1706.08665·cs.CV·July 4, 2017·2 cites

TL;DR

This paper introduces a hierarchical approach for long-term video prediction that estimates high-level structure first, then generates realistic frames using an analogy network, improving long-term prediction quality.

Contribution

The paper presents a hierarchical model, largely adopted from Villegas et al., that combines LSTMs and analogy-based convolutional auto-encoder networks with an adversarial loss for long-term video prediction.

Findings

01 Effective high-level structure prediction over long sequences

02 Improved realism in generated video frames

03 Demonstrated on Penn Action dataset with promising results

Abstract

Video prediction has been an active topic of research in the past few years. Many algorithms focus on pixel-level predictions, which generate results that blur and disintegrate within a few frames. In this project, we use a hierarchical approach for long-term video prediction. We first estimate the high-level structure in the input frame, then predict how that structure evolves in the future. Finally, we use an image analogy network to recover a realistic image from the predicted structure. Our method is largely adopted from the work by Villegas et al. It is built from a combination of LSTMs and analogy-based convolutional auto-encoder networks. Additionally, to generate more realistic frame predictions, we adopt an adversarial loss. We evaluate our method on the Penn Action dataset and demonstrate good results on high-level long-term structure prediction.
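
The three stages described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes that the high-level structure is a short vector of keypoint coordinates, that the analogy step adds the difference between two structure encodings to the encoding of the current frame (the formulation we understand Villegas et al. to use), and that random linear maps can stand in for the learned convolutional encoders and decoder. The LSTM is replaced by constant-velocity extrapolation, and the adversarial loss is omitted. All function and variable names are hypothetical.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Return a random linear map (stand-in for a learned encoder or decoder)."""
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(vec):
        return [sum(w * v for w, v in zip(row, vec)) for row in weights]

    return apply


def predict_structure(pose, velocity, steps):
    """Placeholder for the LSTM: extrapolate keypoints at constant velocity."""
    poses = []
    current = list(pose)
    for _ in range(steps):
        current = [p + v for p, v in zip(current, velocity)]
        poses.append(current)
    return poses


def analogy_frame(frame, pose_now, pose_future, enc_img, enc_pose, dec):
    """Assumed analogy rule: decode enc_img(frame) + enc_pose(future) - enc_pose(now)."""
    f_img = enc_img(frame)
    f_now = enc_pose(pose_now)
    f_fut = enc_pose(pose_future)
    latent = [a + b - c for a, b, c in zip(f_img, f_fut, f_now)]
    return dec(latent)


rng = random.Random(0)
FRAME_DIM, POSE_DIM, LATENT_DIM = 16, 4, 8  # toy sizes, chosen arbitrarily

enc_img = make_linear(FRAME_DIM, LATENT_DIM, rng)
enc_pose = make_linear(POSE_DIM, LATENT_DIM, rng)
dec = make_linear(LATENT_DIM, FRAME_DIM, rng)

frame0 = [rng.random() for _ in range(FRAME_DIM)]  # flattened "input frame"
pose0 = [0.0, 0.0, 1.0, 1.0]                       # two (x, y) keypoints

# Stage 1 and 2: structure for the input frame, then its predicted future.
future_poses = predict_structure(pose0, velocity=[0.1, 0.0, 0.1, 0.0], steps=5)

# Stage 3: every future frame is generated from the same input frame,
# so pixel-level errors do not accumulate from one step to the next.
future_frames = [
    analogy_frame(frame0, pose0, p, enc_img, enc_pose, dec) for p in future_poses
]
```

Because each future frame is decoded from the original input frame plus a change in structure, rather than from the previously generated frame, this design avoids feeding generated pixels back into the predictor, which is the blur source the abstract attributes to pixel-level methods.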

Taxonomy

Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging