Finding Optimal Bayesian Networks
David Maxwell Chickering, Christopher Meek

TL;DR
This paper extends the theoretical understanding of greedy Bayesian network search algorithms, showing they can identify inclusion-optimal models under realistic assumptions like the composition property, even with hidden variables and selection bias.
Contribution
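It relaxes the assumption in earlier work (Meek, 1997; Chickering, 2002) that the generative distribution is perfect with respect to a DAG over the observable variables. Requiring only the composition property, it shows that greedy search still identifies an inclusion-optimal model in the limit of large datasets.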
Findings
Guarantees that greedy search identifies an inclusion-optimal model, in the limit of large datasets, when the generative distribution satisfies the composition property.
Shows the composition property holds with various generative models including unobserved variables.
Extends optimality results to scenarios with selection bias.
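For reference, the composition property mentioned in the findings can be written as follows. This is the standard statement from the conditional-independence (graphoid) literature, given here as a reading aid; the notation ($X$ for a variable, $\mathbf{Y}$, $\mathbf{W}$, $\mathbf{Z}$ for sets of variables) is an assumption and may differ from the paper's own.

```latex
% Composition: independence from each part implies independence from the union.
(X \perp \mathbf{Y} \mid \mathbf{Z}) \;\wedge\; (X \perp \mathbf{W} \mid \mathbf{Z})
\;\Longrightarrow\;
X \perp (\mathbf{Y} \cup \mathbf{W}) \mid \mathbf{Z}

% Equivalent contrapositive form: dependence on a set implies
% dependence on at least one single member of that set.
X \not\perp \mathbf{Y} \mid \mathbf{Z}
\;\Longrightarrow\;
\exists\, Y \in \mathbf{Y} \;:\; X \not\perp Y \mid \mathbf{Z}
```

The contrapositive form suggests why the property matters for search that changes one edge at a time: any dependence the current model is missing can be detected by looking at a single variable.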
Abstract
In this paper, we derive optimality results for greedy Bayesian-network search algorithms that perform single-edge modifications at each step and use asymptotically consistent scoring criteria. Our results extend those of Meek (1997) and Chickering (2002), who demonstrate that in the limit of large datasets, if the generative distribution is perfect with respect to a DAG defined over the observable variables, such search algorithms will identify this optimal (i.e., generative) DAG model. We relax their assumption about the generative distribution, and assume only that this distribution satisfies the composition property over the observable variables, which is a more realistic assumption for real domains. Under this assumption, we guarantee that the search algorithms identify an inclusion-optimal model; that is, a model that (1) contains the generative distribution and (2) has…
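To make the class of algorithms in the abstract concrete, here is a minimal Python sketch of greedy search over DAGs using single-edge additions, deletions, and reversals. It is an illustration, not the authors' procedure: the BIC score is used as one example of an asymptotically consistent criterion, all variables are assumed binary, and every name (`local_bic`, `total_bic`, `is_acyclic`, `neighbours`, `greedy_search`) is hypothetical. The toy dataset is built from exact counts, so the run is deterministic; it does not demonstrate the large-sample guarantee itself.

```python
import itertools
import math
from collections import Counter


def local_bic(data, child, parents, arity=2):
    """BIC contribution of `child` given `parents` (every variable has `arity` states)."""
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    context = Counter()
    for (ctx, _), count in joint.items():
        context[ctx] += count
    loglik = sum(c * math.log(c / context[ctx]) for (ctx, _), c in joint.items())
    n_params = (arity - 1) * arity ** len(parents)
    return loglik - 0.5 * math.log(len(data)) * n_params


def total_bic(data, edges, n_vars):
    """Decomposable score: sum of local scores, one per variable."""
    return sum(
        local_bic(data, v, sorted(a for a, b in edges if b == v))
        for v in range(n_vars)
    )


def is_acyclic(edges, n_vars):
    """Repeatedly strip nodes with no incoming edge; a leftover means a cycle."""
    remaining = set(range(n_vars))
    while remaining:
        roots = {v for v in remaining
                 if not any(b == v and a in remaining for a, b in edges)}
        if not roots:
            return False
        remaining -= roots
    return True


def neighbours(edges, n_vars):
    """All graphs reachable by one edge addition, deletion, or reversal."""
    for a, b in itertools.permutations(range(n_vars), 2):
        if (a, b) in edges:
            yield edges - {(a, b)}                 # delete a -> b
            yield (edges - {(a, b)}) | {(b, a)}    # reverse a -> b
        elif (b, a) not in edges:
            yield edges | {(a, b)}                 # add a -> b


def greedy_search(data, n_vars):
    """Hill-climb from the empty graph until no single-edge change improves the score."""
    edges = frozenset()
    best = total_bic(data, edges, n_vars)
    while True:
        scored = [(total_bic(data, e, n_vars), e)
                  for e in neighbours(edges, n_vars) if is_acyclic(e, n_vars)]
        top_score, top_edges = max(scored, key=lambda pair: pair[0])
        if top_score <= best + 1e-9:   # tolerance guards against float noise
            return edges, best
        edges, best = top_edges, top_score


# Toy data: X0 and X1 agree 90% of the time; X2 is exactly independent of both.
counts = {(0, 0): 45, (0, 1): 5, (1, 0): 5, (1, 1): 45}
data = [(x0, x1, x2)
        for (x0, x1), c in counts.items()
        for x2 in (0, 1)
        for _ in range(c)]

edges, score = greedy_search(data, n_vars=3)
skeleton = {frozenset(e) for e in edges}
print(skeleton)
```

On this toy dataset the search should stop with a single edge between variables 0 and 1 and none touching variable 2. The direction of that edge is not meaningful, since both orientations receive the same BIC score, which is why the example reports the undirected skeleton.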
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Bayesian Methods and Mixture Models · Data Quality and Management
