Mutual Information Based Method for Unsupervised Disentanglement of Video Representation

P Aditya Sreekar; Ujjwal Tiwari; Anoop Namboodiri

arXiv:2011.08614·cs.CV·November 18, 2020

TL;DR

This paper introduces MIPAE, a framework that disentangles video representations into content and pose variables using mutual information, improving future frame prediction in videos.

Contribution

The work proposes a novel mutual information loss for separating content from pose, and a disentanglement metric for evaluating unsupervised video representation learning.

Findings

01 MIPAE effectively disentangles content and pose in video data.

02 Disentangled representations improve future frame prediction quality.

03 MIG scores align with visual and quantitative evaluation metrics.
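
The page does not say how MIG is computed. As a rough illustration, assuming MIG refers to the mutual information gap used in the disentanglement literature, the sketch below scores each ground-truth factor by the gap between the two latents most informative about it, normalised by the factor's entropy. The toy data, the function names (`entropy`, `mutual_information`, `mig`) and the restriction to discrete codes are assumptions made for illustration; the paper's own metric may be defined differently.

```python
import math
from collections import Counter


def entropy(xs):
    """Empirical Shannon entropy (in nats) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * math.log(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y) for paired discrete samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def mig(latents, factors):
    """Mean normalised gap between the two most informative latents per factor.

    latents: dict mapping latent name -> list of discrete codes (at least two latents)
    factors: dict mapping factor name -> list of discrete ground-truth labels
    """
    gaps = []
    for labels in factors.values():
        scores = sorted(
            (mutual_information(codes, labels) for codes in latents.values()),
            reverse=True,
        )
        gaps.append((scores[0] - scores[1]) / entropy(labels))
    return sum(gaps) / len(gaps)


# Hypothetical toy video frames: an object identity (content) and a position (pose).
identity = [0, 0, 1, 1, 0, 0, 1, 1]
position = [0, 1, 0, 1, 0, 1, 0, 1]
factors = {"identity": identity, "position": position}

# Disentangled: each latent carries exactly one factor.
disentangled = {"content": identity, "pose": position}

# Entangled: both latents carry the same mixture of identity and position.
mixed = [2 * i + p for i, p in zip(identity, position)]
entangled = {"content": mixed, "pose": mixed}

score_disentangled = mig(disentangled, factors)
score_entangled = mig(entangled, factors)
```

On this toy data the disentangled codes score close to 1 and the entangled codes close to 0. Continuous latents would first need to be discretised (for example by binning) before this estimator applies.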

Abstract

Video prediction is the challenging task of predicting future frames from a given set of context frames that belong to a video sequence. Video prediction models have found prospective applications in maneuver planning, health care, autonomous navigation and simulation. One of the major challenges in future frame generation is the high dimensional nature of visual data. In this work, we propose the Mutual Information Predictive Auto-Encoder (MIPAE) framework, which simplifies the task of predicting high dimensional video frames by factorising video representations into content and low dimensional pose latent variables that are easy to predict. A standard LSTM network is used to predict these low dimensional pose representations. Content and the predicted pose representations are decoded to generate future frames. Our approach leverages the temporal structure of the latent…
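
The pipeline described in the abstract (encode content and pose, predict pose with an LSTM, decode content plus predicted pose) can be sketched roughly as follows. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation: the class name `TinyPredictiveAE`, the linear encoders and decoder, the flattened-vector frames, all layer sizes, and the choice to take content from the last context frame are hypothetical. The mutual information loss is omitted because its form is not given here.

```python
import torch
from torch import nn


class TinyPredictiveAE(nn.Module):
    """Toy content/pose predictive auto-encoder (all names and sizes are assumptions)."""

    def __init__(self, frame_dim=64, content_dim=16, pose_dim=4, hidden_dim=32):
        super().__init__()
        # Linear layers stand in for the real encoders and decoder.
        self.content_enc = nn.Linear(frame_dim, content_dim)
        self.pose_enc = nn.Linear(frame_dim, pose_dim)
        self.pose_lstm = nn.LSTM(pose_dim, hidden_dim, batch_first=True)
        self.pose_head = nn.Linear(hidden_dim, pose_dim)
        self.decoder = nn.Linear(content_dim + pose_dim, frame_dim)

    def forward(self, context, n_future):
        # context: (batch, time, frame_dim), each frame flattened to a vector.
        content = self.content_enc(context[:, -1])   # one content code per sequence
        poses = self.pose_enc(context)               # one low dimensional pose per frame
        out, state = self.pose_lstm(poses)           # summarise the observed poses
        pose = self.pose_head(out[:, -1])            # first predicted future pose
        frames = []
        for _ in range(n_future):
            # Decode the fixed content together with the current predicted pose.
            frames.append(self.decoder(torch.cat([content, pose], dim=-1)))
            # Feed the predicted pose back in to roll the LSTM forward one step.
            out, state = self.pose_lstm(pose.unsqueeze(1), state)
            pose = self.pose_head(out[:, -1])
        return torch.stack(frames, dim=1)            # (batch, n_future, frame_dim)


model = TinyPredictiveAE()
context_frames = torch.randn(2, 5, 64)               # 2 sequences, 5 context frames
future_frames = model(context_frames, n_future=3)
```

The point of the design is that only the small pose vector is predicted through time, while the content code is held fixed across the predicted frames.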

Code & Models

Repositories

blackPython/mipae (PyTorch, official)

Taxonomy

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory