Probabilistic Video Generation using Holistic Attribute Control
Jiawei He, Andreas Lehrmann, Joseph Marino, Greg Mori, Leonid Sigal

TL;DR
This paper introduces a probabilistic video generation framework that uses attribute control and variational autoencoders to produce diverse, consistent video sequences and predict future frames based on learned spatio-temporal patterns.
Contribution
The paper presents a novel generative model combining VAEs and RNNs with attribute control for improved video synthesis and prediction.
Findings
Generated videos are highly consistent and diverse.
Model outperforms state-of-the-art on multiple datasets.
Attribute conditioning enhances control over generated content.
Abstract
Videos express highly structured spatio-temporal patterns of visual data. A video can be thought of as being governed by two factors: (i) temporally invariant (e.g., person identity), or slowly varying (e.g., activity), attribute-induced appearance, encoding the persistent content of each frame, and (ii) an inter-frame motion or scene dynamics (e.g., encoding evolution of the person executing the action). Based on this intuition, we propose a generative framework for video generation and future prediction. The proposed framework generates a video (short clip) by decoding samples sequentially drawn from a latent space distribution into full video frames. Variational Autoencoders (VAEs) are used as a means of encoding/decoding frames into/from the latent space and RNN as a way to model the dynamics in the latent space. We improve the video generation consistency through…
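Illustrative sketch
The generation loop described in the abstract (a recurrent model evolving a latent distribution, samples drawn frame by frame, each sample decoded into a frame) can be sketched roughly as follows. This is a toy illustration with random, untrained weights, not the authors' architecture: the function names init_params and generate_clip, all layer sizes, the single-layer tanh recurrence, and the linear decoder are assumptions made for the example. The attribute vector is held fixed across the clip to mimic the temporally invariant factor.

```python
import numpy as np


def init_params(rng, attr_dim=4, latent_dim=8, hidden_dim=16, frame_shape=(16, 16)):
    """Random, untrained weights for a toy generator (all sizes are hypothetical)."""
    n_pixels = frame_shape[0] * frame_shape[1]
    scale = 0.1
    return {
        # Recurrent update takes [previous hidden state, previous latent, attributes].
        "W_h": rng.normal(0.0, scale, (hidden_dim, hidden_dim + latent_dim + attr_dim)),
        # Heads that map the hidden state to the latent Gaussian's mean and log-variance.
        "W_mu": rng.normal(0.0, scale, (latent_dim, hidden_dim)),
        "W_logvar": rng.normal(0.0, scale, (latent_dim, hidden_dim)),
        # Linear decoder from [latent, attributes] to pixel logits.
        "W_dec": rng.normal(0.0, scale, (n_pixels, latent_dim + attr_dim)),
        "frame_shape": frame_shape,
    }


def generate_clip(params, attributes, num_frames, rng):
    """Sample a short clip by drawing one latent per frame and decoding it."""
    hidden_dim = params["W_h"].shape[0]
    latent_dim = params["W_mu"].shape[0]
    h = np.zeros(hidden_dim)
    z = np.zeros(latent_dim)
    frames = []
    for _ in range(num_frames):
        # Recurrent step: dynamics live in the latent space, conditioned on attributes.
        h = np.tanh(params["W_h"] @ np.concatenate([h, z, attributes]))
        mu = params["W_mu"] @ h
        logvar = params["W_logvar"] @ h
        # Reparameterized sample z_t = mu + sigma * eps, with eps ~ N(0, I).
        eps = rng.standard_normal(latent_dim)
        z = mu + np.exp(0.5 * logvar) * eps
        # Decode the sample (plus the fixed attributes) into pixel intensities in (0, 1).
        logits = params["W_dec"] @ np.concatenate([z, attributes])
        frame = 1.0 / (1.0 + np.exp(-logits))
        frames.append(frame.reshape(params["frame_shape"]))
    return np.stack(frames)


rng = np.random.default_rng(0)
params = init_params(rng)
attributes = np.array([1.0, 0.0, 0.0, 0.0])  # hypothetical one-hot attribute, e.g. an action class
clip = generate_clip(params, attributes, num_frames=5, rng=rng)
print(clip.shape)  # (5, 16, 16)
```

Because each frame's latent is sampled rather than computed deterministically, calling generate_clip again with the same attributes but a different random generator yields a different clip, which is the sense in which generation is probabilistic. A trained model would additionally need an encoder and a variational training objective, both omitted here.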
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · Video Analysis and Summarization
