Bayesian causal discovery: Posterior concentration and optimal detection
Valentinian Lungu, Joni Shaska, Ioannis Kontoyiannis, Urbashi Mitra

TL;DR
This paper analyzes the rate at which Bayesian methods identify true causal structures in linear models, revealing a sharp dichotomy based on the maximality of the true DAG, with implications for overfitting and hypothesis testing.
Contribution
It establishes the convergence rates of the Bayesian posterior on DAGs, highlighting a critical difference between maximal and non-maximal graphs, and connects posterior behavior to optimal edge detection.
Findings
When the true DAG is maximal, its posterior probability converges to 1 exponentially fast in the sample size n.
When the true DAG is not maximal, its posterior probability converges at a rate no faster than 1/√n.
Theoretical results are supported by simulation experiments.
Abstract
We consider the problem of Bayesian causal discovery for the standard model of linear structural equations with equivariant Gaussian noise. A uniform prior is placed on the space of directed acyclic graphs (DAGs) over a fixed set of variables and, given the graph, independent Gaussian priors are placed on the associated linear coefficients of pairwise interactions. We show that the rate at which the posterior on model space concentrates on the true underlying DAG depends critically on its nature: If it is maximal, in the sense that adding any one new edge would violate acyclicity, then its posterior probability converges to 1 exponentially fast (almost surely) in the sample size n. Otherwise, it converges at a rate no faster than 1/√n. This sharp dichotomy is an instance of the important general phenomenon that avoiding overfitting is significantly harder than identifying all…
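The maximality condition in the abstract (adding any one new edge would violate acyclicity) can be checked directly from the definition. The following is a minimal illustrative sketch, not code from the paper: the function names `has_cycle` and `is_maximal_dag` and the edge-list representation are assumptions made here for illustration.

```python
from itertools import permutations


def has_cycle(nodes, edges):
    """Return True if the directed graph contains a cycle (Kahn's algorithm)."""
    indegree = {v: 0 for v in nodes}
    children = {v: [] for v in nodes}
    for u, v in edges:
        children[u].append(v)
        indegree[v] += 1
    # Repeatedly remove nodes with no remaining incoming edges.
    queue = [v for v in nodes if indegree[v] == 0]
    visited = 0
    while queue:
        u = queue.pop()
        visited += 1
        for v in children[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    # Any node never removed lies on, or downstream of, a cycle.
    return visited != len(nodes)


def is_maximal_dag(nodes, edges):
    """Return True if no new edge can be added without creating a cycle.

    Hypothetical helper: `edges` is an iterable of (parent, child) pairs.
    """
    edge_set = set(edges)
    if has_cycle(nodes, edge_set):
        raise ValueError("input graph is not a DAG")
    for u, v in permutations(nodes, 2):
        if (u, v) in edge_set or (v, u) in edge_set:
            continue  # the pair is already adjacent
        if not has_cycle(nodes, edge_set | {(u, v)}):
            return False  # u -> v could be added while staying acyclic
    return True


# The chain 0 -> 1 -> 2 is not maximal: the edge 0 -> 2 can still be added.
chain = [(0, 1), (1, 2)]
# Adding 0 -> 2 makes every pair adjacent, so no further edge is possible.
full = [(0, 1), (1, 2), (0, 2)]
```

Under this definition, a DAG in which some pair of variables is non-adjacent can always accept an edge between them in the direction of a topological order, so the check reduces to whether every pair of variables is connected by an edge.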
Taxonomy
Topics: Bayesian Modeling and Causal Inference
