Identifiable Multimodal Causal Representation Learning under Partial Latent Sharing

Manal Benhamza; Marianne Clausel; Myriam Tami

arXiv:2605.19135·cs.LG·May 20, 2026

TL;DR

This paper establishes theoretical guarantees for identifying causal latent variables in multimodal data with partial shared structure, using a Wasserstein-based method validated by extensive experiments.

Contribution

It provides the first component-wise identifiability results for multimodal causal representation learning with partial sharing, without assuming parametric distributions.

Findings

01

Proves component-wise identifiability under flexible assumptions.

02

Introduces a Wasserstein-based module for latent structure recovery.

03

Demonstrates superior performance over state-of-the-art methods on synthetic and real data.
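
For readers unfamiliar with the distance named in finding 02, the following is a minimal sketch of the empirical 1-Wasserstein distance between two equal-size one-dimensional samples. It illustrates the generic quantity only. How the paper's module actually uses a Wasserstein distance is not described on this page, and the function name `wasserstein_1d` is a hypothetical choice for this example.

```python
def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    For equal-size samples on the real line, the optimal transport plan
    matches sorted values in order, so the distance is the mean absolute
    difference between the sorted samples.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Shifting every point by 1 moves the whole sample a distance of 1.
shifted = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])

# The distance ignores sample order, so a permutation has distance 0.
permuted = wasserstein_1d([3.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

The sorted-matching shortcut only holds in one dimension with equal sample sizes; higher-dimensional latent spaces need a general optimal transport solver or an approximation.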

Abstract

Causal representation learning (CRL) seeks to uncover meaningful latent variables and their corresponding causal structure from high-dimensional observational data. Identifiability is a crucial property in CRL, as it ensures the recovery of the mechanisms behind the data-generating process, and hence the interpretability and robustness of the representation. Proving identifiability in CRL is intrinsically difficult, and in this work we address an even more challenging setting: multimodality. We consider multimodal observed data with a partially shared latent structure. Each modality is generated, through nonlinear mixing functions, from a specific subset of causal latent variables. Under flexible assumptions and without imposing any parametric distribution on the latent variables, we establish component-wise identifiability guarantees for the causal latent…
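
The generative setting described in the abstract can be sketched as a toy simulation. This is an illustration under stated assumptions, not the paper's model: the three-variable causal chain, the uniform noise, the one-layer leaky-ReLU mixing, the modality names, and the index subsets in `LATENT_SUBSETS` are all hypothetical choices made for this example.

```python
import random


def leaky_relu(v, slope=0.2):
    # Simple elementwise nonlinearity used by the toy mixing function.
    return v if v >= 0 else slope * v


def sample_latents(rng):
    """Draw one sample from a toy causal chain z0 -> z1 -> z2 (assumed)."""
    z0 = rng.uniform(-1.0, 1.0)
    z1 = 0.8 * z0 + rng.uniform(-1.0, 1.0)
    z2 = -0.5 * z1 + rng.uniform(-1.0, 1.0)
    return [z0, z1, z2]


def make_mixing(rng, n_in, n_out):
    """Build a one-layer nonlinear mixing function with random weights."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

    def mix(z_subset):
        return [
            leaky_relu(sum(w * z for w, z in zip(row, z_subset)))
            for row in weights
        ]

    return mix


# Hypothetical assignment: each modality sees only a subset of the latents.
LATENT_SUBSETS = {"modality_a": [0, 1], "modality_b": [1, 2]}


def shared_indices(subsets):
    """Latent indices that every modality depends on."""
    return sorted(set.intersection(*(set(idx) for idx in subsets.values())))


def generate(n_samples, obs_dim=4, seed=0):
    """Sample latents and the per-modality observations they generate."""
    rng = random.Random(seed)
    mixers = {
        name: make_mixing(rng, len(idx), obs_dim)
        for name, idx in LATENT_SUBSETS.items()
    }
    latents = []
    data = {name: [] for name in LATENT_SUBSETS}
    for _ in range(n_samples):
        z = sample_latents(rng)
        latents.append(z)
        for name, idx in LATENT_SUBSETS.items():
            data[name].append(mixers[name]([z[i] for i in idx]))
    return latents, data


latents, data = generate(n_samples=5)
```

In this toy setup only the latent at index 1 is shared between the two modalities, which is the kind of partial overlap the abstract refers to. A learner would see only `data`, never `latents`.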
