Efficient training for future video generation based on hierarchical disentangled representation of latent variables

Naoya Fushishita; Antonio Tejero-de-Pablos; Yusuke Mukuta; Tatsuya Harada

arXiv:2106.03502·cs.CV·June 9, 2021

TL;DR

This paper introduces a memory-efficient hierarchical disentangled representation method for future video prediction, decomposing video into background and foreground components to improve quality and reduce computational costs.

Contribution

The method trains in two stages, image reconstruction followed by latent variable prediction, on a hierarchical disentangled representation. This lowers memory usage during training compared with conventional future video prediction methods.

Findings

01 Generates future videos with lower memory consumption during training.

02 Handles datasets more complex than those previous methods could handle.

03 Decomposes video into background and foreground for a more compact representation.

Abstract

Generating videos that predict the future of a given sequence has been an area of active research in recent years. However, an essential problem remains unsolved: most methods incur a large computational cost and high memory usage during training. In this paper, we propose a novel method for generating future prediction videos with less memory usage than conventional methods. This is a critical stepping stone on the path towards generating videos with high image quality, similar to that of the images produced by the latest work in image generation. We achieve high efficiency by training our method in two stages: (1) image reconstruction to encode video frames into latent variables, and (2) latent variable prediction to generate the future sequence. Our method achieves an efficient compression of video into low-dimensional latent variables by decomposing each frame according…
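The two-stage training described in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the neural encoder is replaced by a one-dimensional linear autoencoder fitted with power iteration, the latent predictor by a least-squares linear fit, and the "video" is a synthetic sequence of four-pixel frames. All names (`fit_encoder`, `fit_predictor`, `base`, `motion`) are hypothetical. It is meant only to show why stage two can be cheap: once frames are encoded, the predictor is trained on low-dimensional latents and never touches pixels.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_encoder(frames, iters=50):
    """Stage 1 (stand-in): fit a 1-D linear autoencoder, i.e. the frame mean
    plus the top principal direction, found by power iteration."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[i] for f in frames) / n for i in range(d)]
    centered = [[f[i] - mean[i] for i in range(d)] for f in frames]
    direction = [1.0 / math.sqrt(d)] * d  # arbitrary unit start vector
    for _ in range(iters):
        # v <- C v, with C the (unnormalised) covariance of the frames
        proj = [dot(x, direction) for x in centered]
        new = [sum(p * x[i] for p, x in zip(proj, centered)) for i in range(d)]
        norm = math.sqrt(dot(new, new))
        direction = [c / norm for c in new]
    return mean, direction

def encode(frame, mean, direction):
    return dot([x - m for x, m in zip(frame, mean)], direction)

def decode(z, mean, direction):
    return [m + z * c for m, c in zip(mean, direction)]

def fit_predictor(latents):
    """Stage 2 (stand-in): least-squares fit of z[t+1] = a * z[t] + b,
    using the latent sequence only."""
    xs, ys = latents[:-1], latents[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Synthetic "video" (an assumption for illustration): one pattern that decays over time.
base = [0.5, 0.5, 0.5, 0.5]
motion = [1.0, 2.0, 0.0, -1.0]

def true_frame(t):
    return [b + (0.9 ** t) * m for b, m in zip(base, motion)]

frames = [true_frame(t) for t in range(10)]

mean, direction = fit_encoder(frames)                    # stage 1
latents = [encode(f, mean, direction) for f in frames]   # encoder is frozen from here on
a, b = fit_predictor(latents)                            # stage 2: one scalar per frame
predicted = decode(a * latents[-1] + b, mean, direction) # predicted frame at t = 10
```

In this toy setting each frame is compressed from four values to one, so the predictor in stage two sees a quarter of the data that a pixel-space predictor would. How the paper itself decomposes frames and structures its latent variables is not reproduced here.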

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Advanced Image Processing Techniques