VDSM: Unsupervised Video Disentanglement with State-Space Modeling and Deep Mixtures of Experts

Matthew J. Vowels; Necati Cihan Camgoz; Richard Bowden

arXiv:2103.07292·cs.CV·December 16, 2021

TL;DR

VDSM is an unsupervised deep state-space model that disentangles identity from dynamics in video, supporting generation and identity/dynamics transfer without supervision.

Contribution

Introduces a hierarchical state-space model with a dynamic prior and a Mixture of Experts decoder for unsupervised video disentanglement, reported to outperform adversarial methods that rely on supervision.

Findings

01 State-of-the-art performance on disentanglement tasks

02 Outperforms adversarial methods, including those that use supervision

03 Effective at identity and dynamics transfer

Abstract

Disentangled representations support a range of downstream tasks including causal reasoning, generative modeling, and fair machine learning. Unfortunately, disentanglement has been shown to be impossible without the incorporation of supervision or inductive bias. Given that supervision is often expensive or infeasible to acquire, we choose to incorporate structural inductive bias and present an unsupervised, deep State-Space-Model for Video Disentanglement (VDSM). The model disentangles latent time-varying and dynamic factors via the incorporation of hierarchical structure with a dynamic prior and a Mixture of Experts decoder. VDSM learns separate disentangled representations for the identity of the object or person in the video, and for the action being performed. We evaluate VDSM across a range of qualitative and quantitative tasks including identity and dynamics transfer, sequence…
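As a rough illustration of the Mixture of Experts decoding idea mentioned in the abstract, the sketch below blends the outputs of several expert decoders using gate weights derived from a sequence-level identity code, while a per-frame dynamic latent drives each expert. This is a generic toy construction under stated assumptions, not the authors' architecture: the names `moe_decode`, `identity_logits`, and `make_expert`, the linear experts, and all dimensions are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def make_expert(out_dim, in_dim, rng):
    """Hypothetical expert: a random linear map standing in for a decoder."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def moe_decode(identity_logits, experts, z_t):
    """Decode one frame as a gate-weighted sum of expert outputs.

    identity_logits: one logit per expert, shared across the sequence
                     (assumed to come from a time-invariant identity code).
    z_t:             dynamic latent for the current time step.
    """
    gates = softmax(identity_logits)
    outputs = [linear(w, z_t) for w in experts]
    out_dim = len(outputs[0])
    return [sum(g * o[i] for g, o in zip(gates, outputs)) for i in range(out_dim)]


rng = random.Random(0)
experts = [make_expert(4, 3, rng) for _ in range(2)]    # 2 experts, 3 -> 4 dims
identity_logits = [2.0, -1.0]                           # fixed for the sequence
sequence = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(5)]
frames = [moe_decode(identity_logits, experts, z) for z in sequence]
```

In this toy setup, swapping `identity_logits` while keeping `sequence` fixed changes which experts dominate without touching the dynamics, which loosely mirrors the identity/dynamics transfer described above.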

Code & Models

Repositories

matthewvowels1/DisentanglingSequences
PyTorch · Official