Dependence versus Conditional Dependence in Local Causal Discovery from Gene Expression Data
Eric V. Strobl, Shyam Visweswaran

TL;DR
This paper compares dependence measures (DMs) and conditional dependence measures (CDMs) for local causal discovery from gene expression data, showing that CDMs are more sample-efficient but require larger datasets for optimal performance.
Contribution
The paper introduces a new algorithm to systematically compare DMs and CDMs in causal discovery, highlighting their relative strengths and sample size requirements.
Findings
CDMs outperform DMs in small sample sizes.
CDMs require at least several hundred samples for effective causal discovery.
The proposed algorithm is publicly available for further research.
Abstract
Motivation: Algorithms that discover variables which are causally related to a target may inform the design of experiments. With observational gene expression data, many methods discover causal variables by measuring each variable's degree of statistical dependence with the target using dependence measures (DMs). However, other methods measure each variable's ability to explain the statistical dependence between the target and the remaining variables in the data using conditional dependence measures (CDMs), since this strategy is guaranteed to find the target's direct causes, direct effects, and direct causes of the direct effects in the infinite sample limit. In this paper, we design a new algorithm in order to systematically compare the relative abilities of DMs and CDMs in discovering causal variables from gene expression data.
Results: The proposed algorithm using a CDM is sample…
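The distinction the abstract draws between a DM and a CDM can be illustrated with a toy simulation. The sketch below is not the paper's algorithm: it assumes a linear-Gaussian chain (indirect cause `b`, direct cause `a`, target `t`), uses absolute Pearson correlation as a stand-in DM and absolute partial correlation as a stand-in CDM, and all function and variable names are hypothetical.

```python
import numpy as np

def marginal_dependence(x, t):
    """Toy DM (assumption): absolute Pearson correlation between x and t."""
    return abs(np.corrcoef(x, t)[0, 1])

def conditional_dependence(x, t, z):
    """Toy CDM (assumption): absolute partial correlation of x and t given z,
    computed from the residuals of linear regressions on z."""
    Z = np.column_stack([np.ones(len(z)), z])           # intercept + conditioning variable
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # part of x not explained by z
    rt = t - Z @ np.linalg.lstsq(Z, t, rcond=None)[0]   # part of t not explained by z
    return abs(np.corrcoef(rx, rt)[0, 1])

# Simulated chain b -> a -> t (illustrative data, not gene expression measurements).
rng = np.random.default_rng(0)
n = 2000
b = rng.normal(size=n)             # indirect cause of the target
a = b + 0.5 * rng.normal(size=n)   # direct cause of the target
t = a + 0.5 * rng.normal(size=n)   # target

dm_b = marginal_dependence(b, t)         # strong: b and t are marginally dependent
cdm_b = conditional_dependence(b, t, a)  # near zero: a explains the b-t dependence
dm_a = marginal_dependence(a, t)         # strong
cdm_a = conditional_dependence(a, t, b)  # stays strong: a is a direct cause

print(f"b: DM={dm_b:.2f}, CDM given a={cdm_b:.2f}")
print(f"a: DM={dm_a:.2f}, CDM given b={cdm_a:.2f}")
```

In this toy setting the DM ranks both `a` and `b` as strongly related to the target, while the CDM keeps only the direct cause `a`, which is the behavior the abstract attributes to CDMs in the infinite sample limit.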
Taxonomy
Topics: Gene expression and cancer classification · Bayesian Modeling and Causal Inference · Bioinformatics and Genomic Networks
