TL;DR
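This paper introduces two active learning strategies for choosing interventions in causal DAG discovery: a greedy single-vertex strategy that maximizes the number of edges oriented after each intervention, and a polynomial-time strategy that yields a minimum set of intervention targets guaranteeing full identifiability. Both are evaluated in a simulation study.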
Contribution
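It presents two active learning algorithms for optimal intervention design in causal discovery. The second runs in polynomial time, returns a minimum set of intervention targets of arbitrary size that guarantees full identifiability, and proves a conjecture of Eberhardt (2008) on the number of unbounded intervention targets that is sufficient and, in the worst case, necessary.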
Findings
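The greedy approach uses single-vertex interventions and maximizes the number of edges that can be oriented after each intervention.
The polynomial-time method yields a minimum set of intervention targets of arbitrary size that guarantees full identifiability.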
Simulation results show improved performance over random and existing methods.
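To make the greedy finding concrete, here is a minimal brute-force sketch for very small graphs. It is not the paper's algorithm. It rests on stated assumptions: an intervention at vertex v reveals the direction of every edge incident to v; an edge counts as oriented when it has the same direction in all DAGs still compatible with that outcome; and the score of v is the worst case over outcomes. The function names (`equivalence_class`, `worst_case_oriented`, `greedy_target`) are hypothetical.

```python
from itertools import combinations, product

def is_acyclic(nodes, edges):
    # Repeatedly strip vertices with no incoming edge; failure means a cycle.
    remaining, edges = set(nodes), set(edges)
    while remaining:
        sources = {n for n in remaining if not any(v == n for _, v in edges)}
        if not sources:
            return False
        remaining -= sources
        edges = {(u, v) for u, v in edges if u not in sources}
    return True

def v_structures(edges):
    # Colliders a -> c <- b whose parents a and b are not adjacent.
    adjacent = {frozenset(e) for e in edges}
    found = set()
    for (a, c1), (b, c2) in combinations(sorted(edges), 2):
        if c1 == c2 and frozenset((a, b)) not in adjacent:
            found.add((frozenset((a, b)), c1))
    return found

def equivalence_class(nodes, dag):
    # Enumerate every orientation of the skeleton (exponential, illustration only)
    # and keep the acyclic ones with the same v-structures as `dag`.
    skeleton = sorted(tuple(sorted(e)) for e in dag)
    target = v_structures(dag)
    members = []
    for flips in product((False, True), repeat=len(skeleton)):
        cand = frozenset((v, u) if f else (u, v)
                         for (u, v), f in zip(skeleton, flips))
        if is_acyclic(nodes, cand) and v_structures(cand) == target:
            members.append(cand)
    return members

def worst_case_oriented(members, v):
    # Assumption: intervening on v separates DAGs by the parent set of v.
    groups = {}
    for dag in members:
        parents = frozenset(u for u, w in dag if w == v)
        groups.setdefault(parents, []).append(dag)
    # Edges shared by all DAGs in a group are oriented for that outcome.
    return min(len(frozenset.intersection(*g)) for g in groups.values())

def greedy_target(nodes, dag):
    members = equivalence_class(nodes, dag)
    return max(sorted(nodes), key=lambda v: worst_case_oriented(members, v))

nodes = ["a", "b", "c"]
chain = [("a", "b"), ("b", "c")]
best = greedy_target(nodes, chain)  # selects the middle vertex "b"
```

For the chain a, b, c, intervening on the middle vertex orients both edges under every outcome, while an end vertex orients only one edge in the worst case, so the sketch selects "b".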
Abstract
From observational data alone, a causal DAG is only identifiable up to Markov equivalence. Interventional data generally improves identifiability; however, the gain of an intervention strongly depends on the intervention target, that is, the intervened variables. We present active learning (that is, optimal experimental design) strategies calculating optimal interventions for two different learning goals. The first one is a greedy approach using single-vertex interventions that maximizes the number of edges that can be oriented after each intervention. The second one yields in polynomial time a minimum set of targets of arbitrary size that guarantees full identifiability. This second approach proves a conjecture of Eberhardt (2008) indicating the number of unbounded intervention targets which is sufficient and in the worst case necessary for full identifiability. In a simulation study,…
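The notion of Markov equivalence used in the abstract can be illustrated with a short check. The sketch below relies on the standard graphical criterion (two DAGs on the same vertices are Markov equivalent exactly when they share the same skeleton and the same v-structures). The edge-list representation and the function names are assumptions made for illustration, and both inputs are assumed to be valid DAGs.

```python
from itertools import combinations

def skeleton(edges):
    # Undirected version of the graph: each edge as an unordered pair.
    return {frozenset(e) for e in edges}

def v_structures(edges):
    # Triples (a, child, b) with a -> child <- b and a, b not adjacent.
    skel = skeleton(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    result = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                result.add((a, child, b))
    return result

def markov_equivalent(edges1, edges2):
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))

chain = [("a", "b"), ("b", "c")]     # a -> b -> c
fork = [("b", "a"), ("b", "c")]      # a <- b -> c
collider = [("a", "b"), ("c", "b")]  # a -> b <- c

same_class = markov_equivalent(chain, fork)           # True
different_class = markov_equivalent(chain, collider)  # False
```

The chain and the fork cannot be told apart from observational data, whereas the collider forms its own equivalence class. Interventional data is what separates DAGs such as the chain and the fork.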
