Unsupervised Disentanglement of Content and Style via Variance-Invariance Constraints

Yuxuan Wu; Ziyu Wang; Bhiksha Raj; Gus Xia

arXiv:2407.03824·cs.LG·March 18, 2025

TL;DR

This paper introduces V3, an unsupervised method that disentangles content and style representations across domains by exploiting a statistical difference between them: content varies within a sample but shares a vocabulary across samples, whereas style is stable within a sample but varies across samples.

Contribution

The paper proposes V3, a domain-general, unsupervised approach that disentangles content and style without domain-specific labels or knowledge and applies across multiple modalities.

Findings

01 V3 successfully disentangles content and style in music, images, and animations.

02 V3 outperforms existing unsupervised methods in disentanglement quality.

03 V3 exhibits strong out-of-distribution generalization and interpretability.

Abstract

We contribute an unsupervised method that effectively learns disentangled content and style representations from sequences of observations. Unlike most disentanglement algorithms that rely on domain-specific labels or knowledge, our method is based on the insight of domain-general statistical differences between content and style -- content varies more among different fragments within a sample but maintains an invariant vocabulary across data samples, whereas style remains relatively invariant within a sample but exhibits more significant variation across different samples. We integrate such inductive bias into an encoder-decoder architecture and name our method after V3 (variance-versus-invariance). Experimental results show that V3 generalizes across multiple domains and modalities, successfully learning disentangled content and style representations, such as pitch and timbre from…
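The statistical intuition in the abstract can be illustrated with a small toy sketch. This is not the V3 model or its training objective; the data, the `within_and_across` helper, and the five-symbol vocabulary are all assumptions made up for illustration. The sketch only shows that a "content-like" signal (symbols drawn from a shared vocabulary) spreads more within a sample than across samples, while a "style-like" signal (one value per sample) does the opposite.

```python
import random
from statistics import mean, pstdev


def within_and_across(codes):
    """codes[i][j] is a scalar code for fragment j of sample i.

    Returns (within, across):
      within: spread among the fragments of one sample, averaged over samples
      across: spread of the per-sample means across samples
    """
    within = mean(pstdev(sample) for sample in codes)
    across = pstdev([mean(sample) for sample in codes])
    return within, across


# Toy data, invented for this illustration only.
rng = random.Random(0)
n_samples, n_fragments = 64, 32
vocabulary = [-1.0, -0.5, 0.0, 0.5, 1.0]  # shared by every sample

# Content-like signal: each fragment picks a symbol from the shared vocabulary.
content = [
    [rng.choice(vocabulary) for _ in range(n_fragments)]
    for _ in range(n_samples)
]

# Style-like signal: one value per sample, nearly constant over its fragments.
style = []
for _ in range(n_samples):
    s = rng.gauss(0.0, 1.0)
    style.append([s + rng.gauss(0.0, 0.01) for _ in range(n_fragments)])

content_within, content_across = within_and_across(content)
style_within, style_across = within_and_across(style)

print(f"content: within={content_within:.3f} across={content_across:.3f}")
print(f"style:   within={style_within:.3f} across={style_across:.3f}")
```

In this toy setting the content-like signal has a larger within-sample spread than across-sample spread, and the style-like signal shows the reverse. How V3 turns such contrasts into constraints on an encoder-decoder is described in the paper itself, not here.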

Videos

Unsupervised Disentanglement of Content and Style via Variance-Invariance Constraints (SlidesLive)

Taxonomy

Topics: Natural Language Processing Techniques