Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations

Tal Barami; Nimrod Berman; Ilan Naiman; Amos H. Hason; Rotem Ezra; Omri Azencot

arXiv:2510.17313·cs.LG·October 28, 2025

TL;DR

This paper introduces a comprehensive benchmark and evaluation framework for multi-factor sequential disentanglement in deep learning, addressing the complexity of real-world data involving multiple interacting semantic factors over time.

Contribution

The paper presents the first standardized benchmark for multi-factor sequential disentanglement, together with modular tools, a post-hoc Latent Exploration Stage, and a Koopman-inspired model. It also leverages Vision-Language Models for automated dataset annotation and evaluation.

Findings

1. The Koopman-inspired model achieves state-of-the-art results.
2. Vision-Language Models enable zero-shot disentanglement evaluation.
3. The benchmark spans six diverse datasets across video, audio, and time series.

Abstract

Learning disentangled representations in sequential data is a key goal in deep learning, with broad applications in vision, audio, and time series. While real-world data involves multiple interacting semantic factors over time, prior work has mostly focused on simpler two-factor static and dynamic settings, primarily because such settings make data collection easier, thereby overlooking the inherently multi-factor nature of real-world data. We introduce the first standardized benchmark for evaluating multi-factor sequential disentanglement across six diverse datasets spanning video, audio, and time series. Our benchmark includes modular tools for dataset integration, model development, and evaluation metrics tailored to multi-factor analysis. We additionally propose a post-hoc Latent Exploration Stage to automatically align latent dimensions with semantic factors, and introduce a…
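
The abstract describes the Latent Exploration Stage only as a post-hoc step that aligns latent dimensions with semantic factors. The general idea of such an alignment can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the scoring rule (the correlation ratio, i.e. the fraction of a dimension's variance explained by a discrete factor), the function names `correlation_ratio` and `align_latents`, and the toy data are all hypothetical choices made for this example.

```python
import numpy as np


def correlation_ratio(values, labels):
    """Fraction of the variance of `values` explained by discrete `labels` (eta squared)."""
    mean = values.mean()
    total = np.sum((values - mean) ** 2)
    if total == 0.0:
        return 0.0
    between = 0.0
    for c in np.unique(labels):
        group = values[labels == c]
        between += group.size * (group.mean() - mean) ** 2
    return between / total


def align_latents(latents, factors):
    """Assign each latent dimension to the factor that best explains it.

    latents: (N, D) float array of latent codes (assumed already extracted).
    factors: (N, K) int array of discrete ground-truth factor labels.
    Returns (assignment, scores): assignment[d] is a factor index,
    scores has shape (D, K).
    """
    n_dims, n_factors = latents.shape[1], factors.shape[1]
    scores = np.zeros((n_dims, n_factors))
    for d in range(n_dims):
        for k in range(n_factors):
            scores[d, k] = correlation_ratio(latents[:, d], factors[:, k])
    return scores.argmax(axis=1), scores


# Toy data (hypothetical): two factors with three classes each, and three
# latent dimensions that are noisy copies of factor 0, factor 1, and factor 0.
rng = np.random.default_rng(0)
factors = rng.integers(0, 3, size=(500, 2))
noise = 0.1 * rng.standard_normal((500, 3))
latents = np.stack([factors[:, 0], factors[:, 1], factors[:, 0]], axis=1) + noise

assignment, scores = align_latents(latents, factors)
print(assignment)  # factor index chosen for each latent dimension
```

An exhaustive per-dimension score like this is cheap for small latent spaces and needs no retraining of the model, which is what makes the alignment post-hoc. Other scoring rules, such as mutual information or a small classifier per dimension, would fit the same interface.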

Videos

Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations · SlidesLive

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Adversarial Robustness in Machine Learning