Sequential Representation Learning via Static-Dynamic Conditional Disentanglement

Mathieu Cyrille Simon; Pascal Frossard; Christophe De Vleeschouwer

arXiv:2408.05599·cs.LG·August 13, 2024

TL;DR

This paper introduces a self-supervised method for disentangling static and dynamic factors in sequential data such as videos. It explicitly models the causal relationship between the two kinds of factors and uses Normalizing Flows to increase model expressivity.

Contribution

It proposes a new model that relaxes the usual independence assumption between static and dynamic factors, explicitly models the causal relationship between them, and introduces a theoretically grounded disentanglement constraint for sequential data representation.

Findings

01

Outperforms state-of-the-art methods in scene dynamics scenarios

02

Provides a formal definition and identifiability conditions for factors

03

Enhances model expressivity with Normalizing Flows

Abstract

This paper explores self-supervised disentangled representation learning within sequential data, focusing on separating time-independent and time-varying factors in videos. We propose a new model that breaks the usual independence assumption between those factors by explicitly accounting for the causal relationship between the static and dynamic variables, and that improves model expressivity through additional Normalizing Flows. A formal definition of the factors is proposed. This formalism leads to the derivation of sufficient conditions for the ground truth factors to be identifiable, and to the introduction of a novel theoretically grounded disentanglement constraint that can be directly and efficiently incorporated into our new framework. The experiments show that the proposed approach outperforms previous complex state-of-the-art techniques in scenarios where the dynamics of a…
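As a rough illustration of the two ideas named in the abstract (a dynamic variable that depends on a static one, and a Normalizing Flow used to evaluate its density), the sketch below computes log p(d | s) for a one-dimensional affine flow whose shift and scale depend on s. This is not the paper's architecture, which the abstract does not specify. The function names, the scalar setting, and the coefficients `a` and `b` are all hypothetical choices made for the example.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def conditional_affine_flow_logpdf(d, s, a=0.5, b=0.1):
    """log p(d | s) under a 1-D affine flow conditioned on s.

    Assumption (illustrative only): shift = a * s and log_scale = b * s.
    The flow maps d to z = (d - shift) / exp(log_scale), and the
    change-of-variables formula gives
        log p(d | s) = log N(z; 0, 1) - log_scale.
    """
    shift = a * s
    log_scale = b * s
    z = (d - shift) * math.exp(-log_scale)
    return standard_normal_logpdf(z) - log_scale


def gaussian_logpdf(x, mean, std):
    """Closed-form Gaussian log-density, used only as a cross-check."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))


if __name__ == "__main__":
    s, d = 2.0, 1.2
    flow_value = conditional_affine_flow_logpdf(d, s)
    reference = gaussian_logpdf(d, 0.5 * s, math.exp(0.1 * s))
    print(abs(flow_value - reference) < 1e-12)
```

A single affine layer only reproduces a Gaussian whose mean and standard deviation depend on s, which is why the cross-check against the closed-form density holds. The expressivity gain that flows are generally used for comes from stacking several invertible layers, which this sketch deliberately leaves out.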

Taxonomy

Topics: Digital Media Forensic Detection · Adversarial Robustness in Machine Learning · Chaos-based Image/Signal Encryption

Methods: Normalizing Flows