An Information Criterion for Controlled Disentanglement of Multimodal Data
Chenyu Wang, Sharut Gupta, Xinyi Zhang, Sana Tonekaboni, Stefanie Jegelka, Tommi Jaakkola, Caroline Uhler

TL;DR
This paper introduces DisentangledSSL, a self-supervised method for learning disentangled multimodal representations, improving interpretability and downstream task performance by separating shared and modality-specific information.
Contribution
The paper proposes a novel self-supervised approach for disentangling shared and modality-specific features in multimodal data, with theoretical analysis and empirical validation.
Findings
DisentangledSSL effectively learns shared and modality-specific features.
It outperforms baseline methods on vision-language prediction tasks.
It improves molecule-phenotype retrieval accuracy.
Abstract
Multimodal representation learning seeks to relate and decompose information inherent in multiple modalities. By disentangling modality-specific information from information that is shared across modalities, we can improve interpretability and robustness and enable downstream tasks such as the generation of counterfactual outcomes. Separating the two types of information is challenging since they are often deeply entangled in many real-world applications. We propose Disentangled Self-Supervised Learning (DisentangledSSL), a novel self-supervised approach for learning disentangled representations. We present a comprehensive analysis of the optimality of each disentangled representation, particularly focusing on the scenario not covered in prior work where the so-called Minimum Necessary Information (MNI) point is not attainable. We demonstrate that DisentangledSSL successfully learns…
Taxonomy
Topics: Advanced Computational Techniques in Science and Engineering
