Causal Structure Discovery between Clusters of Nodes Induced by Latent Factors
Chandler Squires, Annie Yun, Eshaan Nichani, Raj Agrawal, and Caroline Uhler

TL;DR
This paper introduces a novel method for learning causal structures involving latent variables by identifying clusters of observed variables and their relationships, with applications to gene networks and protein data.
Contribution
The paper proposes a new class of latent factor causal models (LFCMs), proves identifiability results for them, and develops a consistent three-stage algorithm for structure discovery that outperforms existing methods.
Findings
Almost perfect recovery of variable clusters in synthetic data.
High accuracy in identifying edges from observed to latent variables.
Effective application to protein mass spectrometry data.
Abstract
We consider the problem of learning the structure of a causal directed acyclic graph (DAG) model in the presence of latent variables. We define latent factor causal models (LFCMs) as a restriction on causal DAG models with latent variables, which are composed of clusters of observed variables that share the same latent parent and connections between these clusters given by edges pointing from the observed variables to latent variables. LFCMs are motivated by gene regulatory networks, where regulatory edges, corresponding to transcription factors, connect spatially clustered genes. We show identifiability results on this model and design a consistent three-stage algorithm that discovers clusters of observed nodes, a partial ordering over clusters, and finally, the entire structure over both observed and latent nodes. We evaluate our method in a synthetic setting, demonstrating its…
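The structure described in the abstract can be made concrete with a small sketch. The code below is an illustration only, not the authors' implementation: it builds a toy graph in which each observed node has exactly one latent parent (its cluster) and clusters are linked by edges from an observed node to a latent node, then draws samples from it. The linear Gaussian mechanism, the edge-weight range, and all names (`build_lfcm`, `topological_order`, `sample_linear_gaussian`, `L0`, `X0`, ...) are assumptions made for this example.

```python
import random


def build_lfcm(clusters, cross_edges):
    """Return a parent map for a toy LFCM-style graph (hypothetical helper).

    clusters: dict mapping each latent node to its observed children.
    cross_edges: list of (observed, latent) pairs, i.e. observed -> latent edges.
    """
    parents = {}
    for latent, children in clusters.items():
        parents.setdefault(latent, [])
        for child in children:
            if child in parents:
                raise ValueError(f"{child} must belong to exactly one cluster")
            parents[child] = [latent]  # the single latent parent of this node
    for observed, latent in cross_edges:
        if latent not in clusters or observed not in parents or observed in clusters:
            raise ValueError(f"invalid observed -> latent edge: {observed} -> {latent}")
        parents[latent].append(observed)
    return parents


def topological_order(parents):
    """Order nodes so that parents come first; raise ValueError on a cycle."""
    order, done = [], set()

    def visit(node, stack):
        if node in stack:
            raise ValueError("graph contains a cycle")
        if node in done:
            return
        stack.add(node)
        for parent in parents[node]:
            visit(parent, stack)
        stack.discard(node)
        done.add(node)
        order.append(node)

    for node in parents:
        visit(node, set())
    return order


def sample_linear_gaussian(parents, n_samples, seed=0):
    """Sample from a linear Gaussian model on the graph (assumed mechanism)."""
    rng = random.Random(seed)
    order = topological_order(parents)
    # One fixed weight per edge, drawn from an arbitrary illustrative range.
    weights = {(p, node): rng.uniform(0.5, 1.5) for node in order for p in parents[node]}
    samples = []
    for _ in range(n_samples):
        row = {}
        for node in order:
            noise = rng.gauss(0.0, 1.0)
            row[node] = noise + sum(weights[(p, node)] * row[p] for p in parents[node])
        samples.append(row)
    return samples


# Two clusters of three observed nodes, linked by the edge X1 -> L1.
clusters = {"L0": ["X0", "X1", "X2"], "L1": ["X3", "X4", "X5"]}
parents = build_lfcm(clusters, cross_edges=[("X1", "L1")])
samples = sample_linear_gaussian(parents, n_samples=50)
```

In a structure-discovery setting only the `X` columns of `samples` would be available; the latent `L` columns and the parent map are what an algorithm of the kind described above would try to recover.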
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Gene expression and cancer classification · Bioinformatics and Genomic Networks
