Theoretical Guarantees for Causal Discovery on Large Random Graphs
Mathieu Chevalley, Arash Mehrjou, Patrick Schwab

TL;DR
This paper provides the first dimension-adaptive theoretical guarantees for causal discovery accuracy in large random graphs, showing that error rates concentrate and diminish with increasing network size, even under complex topologies.
Contribution
It introduces novel finite-dimensional, faithfulness-robust bounds for causal structure learning on large, random, and scale-free graphs, challenging prior assumptions about high-dimensional difficulties.
Findings
FNR concentrates around its mean at rate O(log d / sqrt d) for Erdős–Rényi graphs.
For Barabási–Albert graphs with degree exponent > 3, deviation width vanishes as dimension increases.
Simulation results confirm theoretical concentration and error reduction in large-scale settings.
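The concentration effect described in these findings can be illustrated with a small simulation. The sketch below is not the paper's experiment: the edge probability `c / d`, the fraction of intervened variables `frac`, and the recovery rule in `toy_fnr` (an edge counts as recovered only if one of its endpoints was intervened on) are all assumptions chosen to keep the example short, and every function name is hypothetical.

```python
import random
import statistics


def sample_er_dag(d, c, rng):
    """Sample a sparse Erdős–Rényi DAG on nodes 0..d-1.

    Nodes are taken in a fixed topological order; each forward pair (i, j)
    with i < j becomes an edge i -> j with probability c / d (an assumed
    sparse scaling, not taken from the paper).
    """
    p = min(1.0, c / d)
    return [(i, j) for i in range(d) for j in range(i + 1, d) if rng.random() < p]


def toy_fnr(edges, intervened):
    """Toy false-negative rate: fraction of edges with no intervened endpoint.

    This recovery rule is a stand-in assumption, not the paper's criterion.
    """
    if not edges:
        return 0.0
    missed = sum(1 for u, v in edges if u not in intervened and v not in intervened)
    return missed / len(edges)


def fnr_spread(d, c=2.0, frac=0.3, trials=200, seed=0):
    """Return (mean, standard deviation) of the toy FNR over random trials."""
    rng = random.Random(seed)
    k = max(1, int(frac * d))  # number of single-variable interventions
    values = []
    for _ in range(trials):
        edges = sample_er_dag(d, c, rng)
        intervened = set(rng.sample(range(d), k))  # random intervention targets
        values.append(toy_fnr(edges, intervened))
    return statistics.mean(values), statistics.pstdev(values)


if __name__ == "__main__":
    for d in (20, 80, 320):
        mean, std = fnr_spread(d, trials=60)
        print(f"d={d:4d}  mean FNR={mean:.3f}  std={std:.3f}")
```

Under these assumptions the mean of the toy FNR stays roughly constant while its spread shrinks as `d` grows, which is the qualitative behaviour the findings describe. The sketch says nothing about the exact rates proved in the paper.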
Abstract
We investigate theoretical guarantees for the false-negative rate (FNR), the fraction of true causal edges whose orientation is not recovered, under single-variable random interventions and an -interventional faithfulness assumption that accommodates latent confounding. For sparse Erdős–Rényi directed acyclic graphs, where the edge probability scales as , we show that the FNR concentrates around its mean at rate $O(\log d / \sqrt{d})$, implying that large deviations above the expected error become exponentially unlikely as dimensionality increases. This concentration ensures that derived upper bounds hold with high probability in large-scale settings. Extending the analysis to generalized Barabási–Albert graphs reveals an even stronger phenomenon: when the degree exponent exceeds 3, the deviation width scales as $O(d^{\beta -…
Peer Reviews
Decision: ICLR 2026 Poster
Causal structure discovery is a fundamental and challenging problem in machine learning and scientific inference. Any progress in understanding its limits and guarantees is valuable. Thus the broad topic of the paper is well motivated. The paper is primarily theoretical, providing detailed proofs in the appendix and clearly stated assumptions. They also give a very good description of related work. While the main results are theoretical, the authors also include empirical experiments illustrati
The main weakness I find is that it is not clear whether causal discovery on random graphs corresponds to realistic application domains. Most real-world causal systems are, I presume, more structured than randomly generated graphs. Without concrete scenarios where random-graph analysis informs practice, the practical utility of these results remains uncertain. Thus the paper would be strengthened by examples of settings where such asymptotic guarantees could guide real causal-discovery in prac
**Originality and significance:** To the best of my knowledge, this paper is the first to study deviation bounds for identifiability metrics in random causal graph models. Compared to just expectation results, these results are more informative and provide stronger guidance for downstream applications (e.g., the development of causal discovery algorithms which are targeted towards more easily identifiable cases). **Quality and clarity:** The theoretical results are strong under the relatively w
In my opinion, the main weakness of the paper (shared by similar papers in the area, and even related areas like average-case complexity theory) is a somewhat shaky relation between theory and practice. In the "Limitations" section, the authors acknowledge that the random graph models are somewhat realistic, and I would also prefer the paper to emphasize that the "random intervention" model may be overly pessimistic. However, for the sake of argument, assume that both the random graph and inte
1. Ambitious theoretical framing. The paper tackles a central challenge in causal structure discovery (learning DAGs in high-dimensional or heterogeneous settings) by proposing a finite-dimensional permutation formulation that encodes acyclicity by construction. The theoretical framing is ambitious and conceptually interesting, even if the practical payoff is not yet demonstrated.
2. Sound mathematical development. The formal derivations appear internally consistent, and the proofs provide a logic
1. Connection between theory and search. The paper presents strong theoretical results on identifiability and robustness for a permutation-based DAG formulation, but it remains unclear how these guarantees translate to the actual optimization procedure. The analysis establishes properties of the objective function and representation rather than of the search itself. Clarifying whether the theoretical results apply to the implemented algorithm, or only to its ideal global optimum, would strengthen
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Advanced Causal Inference Techniques · Functional Brain Connectivity Studies
