DAGs with No Fears: A Closer Look at Continuous Optimization for Learning Bayesian Networks
Dennis Wei, Tian Gao, Yue Yu

TL;DR
This paper critically analyzes the continuous optimization approach for learning Bayesian networks, generalizes acyclicity conditions, and proposes a local search method that improves algorithm accuracy and efficiency.
Contribution
It generalizes algebraic acyclicity conditions, clarifies KKT conditions for the NOTEARS framework, and introduces a local search post-processing algorithm that enhances learning accuracy.
Findings
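In the one-parameter-per-edge setting, the KKT conditions for the original NOTEARS formulation cannot be satisfied except in a trivial case.
The KKT conditions derived for an equivalent reformulation are necessary, and they are also sufficient for local minimality when the score function is convex.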
The local search algorithm significantly improves structural Hamming distance across tested methods.
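For readers unfamiliar with the metric, structural Hamming distance (SHD) can be sketched as below. This is a minimal illustration, not the paper's evaluation code: the function name `shd` and the convention that a reversed edge counts as a single error are assumptions, and other SHD conventions exist.

```python
def shd(estimated, truth):
    """Count node pairs whose edge status differs between two directed graphs.

    Both arguments are d x d 0/1 adjacency matrices (lists of lists), where
    entry [i][j] == 1 means an edge i -> j. A missing, extra, or reversed
    edge each counts as one error (an assumed convention).
    """
    d = len(truth)
    distance = 0
    for i in range(d):
        for j in range(i + 1, d):
            est_pair = (estimated[i][j], estimated[j][i])
            true_pair = (truth[i][j], truth[j][i])
            if est_pair != true_pair:
                distance += 1
    return distance


# True graph: 0 -> 1 -> 2
truth = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]
# Estimate: 1 -> 0 (reversed), 1 -> 2 (correct), 0 -> 2 (extra)
estimated = [[0, 0, 1],
             [1, 0, 1],
             [0, 0, 0]]
print(shd(estimated, truth))
```

A lower SHD means the estimated graph is structurally closer to the true one, so an improvement in SHD corresponds to a decrease.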
Abstract
This paper re-examines a continuous optimization framework dubbed NOTEARS for learning Bayesian networks. We first generalize existing algebraic characterizations of acyclicity to a class of matrix polynomials. Next, focusing on a one-parameter-per-edge setting, it is shown that the Karush-Kuhn-Tucker (KKT) optimality conditions for the NOTEARS formulation cannot be satisfied except in a trivial case, which explains a behavior of the associated algorithm. We then derive the KKT conditions for an equivalent reformulation, show that they are indeed necessary, and relate them to explicit constraints that certain edges be absent from the graph. If the score function is convex, these KKT conditions are also sufficient for local minimality despite the non-convexity of the constraint. Informed by the KKT conditions, a local search post-processing algorithm is proposed and shown to…
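The algebraic acyclicity idea and the KKT observation in the abstract can be illustrated with a small sketch. It assumes one matrix-polynomial characterization known from prior work, h(W) = tr((I + (W∘W)/d)^d) − d, which is zero exactly when the weighted graph W is acyclic; the paper's general class is not reproduced here. The helper names (`mat_power`, `h_poly`, `grad_h_poly`) are hypothetical, and the code uses plain Python lists to stay dependency-free.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of lists."""
    d = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def mat_power(X, k):
    """Return X raised to the non-negative integer power k."""
    d = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for _ in range(k):
        result = matmul(result, X)
    return result


def _shifted(W):
    """Build M = I + (W∘W)/d, where ∘ is the elementwise product."""
    d = len(W)
    return [[(1.0 if i == j else 0.0) + W[i][j] ** 2 / d for j in range(d)]
            for i in range(d)]


def h_poly(W):
    """Acyclicity measure tr(M^d) - d; zero iff W has no directed cycle."""
    d = len(W)
    P = mat_power(_shifted(W), d)
    return sum(P[i][i] for i in range(d)) - d


def grad_h_poly(W):
    """Gradient of h_poly with respect to W: (M^(d-1))^T ∘ 2W."""
    d = len(W)
    P = mat_power(_shifted(W), d - 1)
    return [[P[j][i] * 2.0 * W[i][j] for j in range(d)] for i in range(d)]


# A DAG (0 -> 1 -> 2) and a directed 3-cycle (0 -> 1 -> 2 -> 0).
dag = [[0.0, 1.5, 0.0],
       [0.0, 0.0, -2.0],
       [0.0, 0.0, 0.0]]
cycle = [[0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0],
         [1.0, 0.0, 0.0]]

print(h_poly(dag))       # zero for an acyclic graph
print(h_poly(cycle))     # strictly positive when a cycle exists
print(grad_h_poly(dag))  # all zeros at a feasible (acyclic) point
```

The last line hints at why the KKT conditions for the original formulation are problematic: the constraint gradient vanishes at every feasible point, so the stationarity condition collapses to the score gradient being zero, which holds only in a trivial case.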
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Blind Source Separation Techniques · Domain Adaptation and Few-Shot Learning
