Causal Representation Learning from Multimodal Biomedical Observations
Yuewen Sun, Lingjing Kong, Guangyi Chen, Loka Li, Gongxu Luo, Zijian Li, Yixuan Zhang, Yujia Zheng, Mengyue Yang, Petar Stojanov, Eran Segal, Eric P. Xing, Kun Zhang

TL;DR
This paper introduces a flexible causal representation learning framework for multimodal biomedical data, providing theoretical guarantees and practical methods to identify interpretable causal variables, validated on synthetic and real-world datasets.
Contribution
It develops nonparametric identification conditions for multimodal data with causal relationships, extending subspace identification and emphasizing structural sparsity in biomedical systems.
Findings
Theoretical guarantees for latent causal variable identifiability.
Effective framework demonstrated on synthetic and real datasets.
Results align with established biomedical knowledge.
Abstract
Prevalent in biomedical applications (e.g., human phenotype research), multimodal datasets can provide valuable insights into the underlying physiological mechanisms. However, current machine learning (ML) models designed to analyze these datasets often lack interpretability and identifiability guarantees, which are essential for biomedical research. Recent advances in causal representation learning have shown promise in identifying interpretable latent causal variables with formal theoretical guarantees. Unfortunately, most current work on multimodal distributions either relies on restrictive parametric assumptions or yields only coarse identification results, limiting their applicability to biomedical research that favors a detailed understanding of the mechanisms. In this work, we aim to develop flexible identification conditions for multimodal data and principled methods to…
Peer Reviews
Decision: ICLR 2025 Poster
This paper tackles an important and yet not well-addressed problem. It is well-motivated by biomedical applications. The proposed framework is more general compared to the prior work (as shown in Table 1).
In general, I find several parts that need further clarification (see the questions). My main concern is that the simulated tasks focus on low-dimensional data with simple, sparse causal structures. It is unclear whether the method scales and whether it generalizes to more complex causal structures.
1. The contribution is very clear and useful for future work on identifiability. More strengths are included in the summary. 2. The writing is lucid. The examples are cleverly used to contrast the proposed method with existing work.
I did not find any significant weaknesses. However, I do have a few questions for the sake of clarity. Questions in the following section.
1. The paper is well-written and well-motivated. 2. The authors provide identifiability guarantees for each latent component. This helps characterize the interactions among all latent components across modalities in biological applications. 3. The assumptions of the theoretical results look reasonable to me. 4. A real-world dataset analysis is provided, demonstrating the method's usefulness.
1. A1 indicates that the neural network needs to be invertible. How do the authors achieve this? The authors use a normalizing flow, an invertible generative model, as part of their network; what about others? Also, what is the computational efficiency? 2. In the real world, a user-defined number of latent variables can be biased. Have the authors analyzed this?
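For readers unfamiliar with the invertibility question raised in the last review, here is a minimal sketch of one common way normalizing flows obtain invertibility by construction: an affine coupling layer. It is an illustration only, not the paper's architecture; the function names and the toy conditioner `_scale_and_shift` are hypothetical, and the input is assumed to have even length.

```python
import math

def _scale_and_shift(x_a):
    # Hypothetical conditioner: it may be any function of x_a,
    # because it is only ever evaluated, never inverted.
    s = [math.tanh(0.5 * v) for v in x_a]   # log-scales
    t = [0.1 * v * v for v in x_a]          # shifts
    return s, t

def coupling_forward(x):
    """Map x -> y, returning y and log|det Jacobian| of the map."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch assumes an even-length input")
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    s, t = _scale_and_shift(x_a)
    # The first half passes through unchanged; the second half is
    # rescaled and shifted using values computed from the first half.
    y_b = [b * math.exp(si) + ti for b, si, ti in zip(x_b, s, t)]
    log_det = sum(s)  # the Jacobian is triangular with diagonal (1, exp(s))
    return x_a + y_b, log_det

def coupling_inverse(y):
    """Exact inverse of coupling_forward."""
    if len(y) % 2 != 0:
        raise ValueError("this sketch assumes an even-length input")
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    s, t = _scale_and_shift(y_a)  # y_a equals x_a, so s and t are recovered
    x_b = [(b - ti) * math.exp(-si) for b, si, ti in zip(y_b, s, t)]
    return y_a + x_b

x = [0.3, -1.2, 2.0, 0.7]
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
```

Because the untouched half is available on both sides of the map, the inverse needs only one more evaluation of the conditioner. This suggests that invertibility by itself need not add much computational cost, though the cost of the paper's actual model is not stated here.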
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Fault Detection and Control Systems
