Constraint-Free Structure Learning with Smooth Acyclic Orientations
Riccardo Massidda, Francesco Landolfi, Martina Cinquini, Davide Bacciu

TL;DR
COSMO is a constraint-free, differentiable approach to acyclic structure learning that runs faster than constraint-based methods while keeping reconstruction accuracy competitive.
Contribution
The paper proposes a smooth orientation matrix, parameterized by a single priority vector, that yields acyclic solutions without evaluating an explicit acyclicity constraint at any step.
Findings
COSMO converges to acyclic solutions without explicit constraints.
It is competitive with existing methods in graph reconstruction accuracy, particularly on graphs with many variables.
COSMO is computationally faster due to its constraint-free formulation.
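To make the last finding concrete, the sketch below shows the kind of explicit acyclicity penalty that constraint-based differentiable methods evaluate during optimization. It uses the polynomial form tr((I + W∘W/d)^d) − d, which is an assumption chosen for illustration. The source does not say which penalties COSMO is compared against, and the function name is hypothetical.

```python
import numpy as np


def polynomial_acyclicity_penalty(W):
    """Illustrative penalty: tr((I + W∘W / d)^d) - d, zero only for DAGs."""
    d = W.shape[0]
    M = W * W  # element-wise square keeps every entry non-negative
    # Expanding (I + M/d)^d gives non-negative multiples of M^k for k = 1..d,
    # and tr(M^k) sums the weights of closed walks of length k. The trace
    # therefore equals d exactly when the graph has no cycles.
    # Each dense matrix product here costs on the order of d^3 operations.
    P = np.linalg.matrix_power(np.eye(d) + M / d, d)
    return np.trace(P) - d


dag = np.triu(np.ones((4, 4)), k=1)         # upper-triangular weights: a DAG
cycle = np.array([[0.0, 1.0], [1.0, 0.0]])  # two nodes pointing at each other
print(polynomial_acyclicity_penalty(dag), polynomial_acyclicity_penalty(cycle))
```

A constraint-free method never needs to call a function like this, which is where the claimed speed advantage comes from.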
Abstract
The structure learning problem consists of fitting data generated by a Directed Acyclic Graph (DAG) to correctly reconstruct its arcs. In this context, differentiable approaches constrain or regularize the optimization problem using a continuous relaxation of the acyclicity property. The computational cost of evaluating graph acyclicity is cubic in the number of nodes and significantly affects scalability. In this paper, we introduce COSMO, a constraint-free continuous optimization scheme for acyclic structure learning. At the core of our method, we define a differentiable approximation of an orientation matrix parameterized by a single priority vector. Unlike previous work, our parameterization fits a smooth orientation matrix and the resulting acyclic adjacency matrix without evaluating acyclicity at any step. Despite the absence of explicit constraints, we prove that COSMO…
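The parameterization described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that entry (u, v) of the smooth orientation matrix is a temperature-scaled sigmoid of the priority gap p[v] − p[u], that the adjacency matrix is the element-wise product of this matrix with free weights H, and that the diagonal is masked. All function names, the threshold, and the example values are hypothetical.

```python
import numpy as np


def smooth_orientation(p, t):
    """Assumed form: entry (u, v) is sigmoid((p[v] - p[u]) / t)."""
    gaps = p[None, :] - p[:, None]  # gaps[u, v] = p[v] - p[u]
    # Numerically stable sigmoid: sigmoid(x) = 0.5 * (1 + tanh(x / 2)).
    return 0.5 * (1.0 + np.tanh(gaps / (2.0 * t)))


def weighted_adjacency(H, p, t):
    """Element-wise product of free weights H with the smooth orientation."""
    W = H * smooth_orientation(p, t)
    np.fill_diagonal(W, 0.0)  # assumption: self-loops are masked out
    return W


def is_acyclic(adj):
    """Kahn-style check: repeatedly strip nodes that have no incoming arcs."""
    adj = np.asarray(adj, dtype=bool)
    remaining = list(range(adj.shape[0]))
    while remaining:
        sub = adj[np.ix_(remaining, remaining)]
        sources = [n for i, n in enumerate(remaining) if not sub[:, i].any()]
        if not sources:
            return False  # every remaining node has a parent, so a cycle exists
        remaining = [n for n in remaining if n not in sources]
    return True


rng = np.random.default_rng(0)
p = np.array([0.0, 2.0, 1.0, 3.0])    # hypothetical priority vector
H = rng.normal(size=(4, 4))           # hypothetical unconstrained weights
W = weighted_adjacency(H, p, t=0.01)  # low temperature: nearly a hard orientation
A = np.abs(W) > 1e-3                  # threshold tiny weights to read off arcs
print(is_acyclic(A))
```

The acyclicity check appears only to inspect the result. Building W takes a quadratic number of operations and never evaluates acyclicity, because at low temperature arcs survive only from lower-priority to higher-priority nodes.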
Peer Reviews
Decision: ICLR 2024 poster
- A primary strength of the method is its provision of an unconstrained continuous optimization approach for acyclic structure learning.
- The method introduces a novel, differentiable approximation known as the "smooth orientation matrix," which depends on a temperature parameter.
- The method outperforms certain constrained approaches in terms of speed due to the quadratic number of operations needed to reconstruct the DAG.
- In Figure 1, are you learning a directed graph or a partially directed graph with undirected edges to be identified later? Please formally define the priority vector in the notation section.
- While Figure 1 appears to be an illustration of how COSMO works, it's not very clearly explained. It would be helpful to provide a proof sketch or illustration with an example in the introduction to briefly explain how your algorithm works (given this example) and why it's an unconstrained optimization
- The paper proposes an improved DAG learning approach without constraints, which is novel.
- Empirical performance with a large number of variables is solid.
- Experiment: When the number of variables is small or medium, the accuracy seems to be worse than that of some existing approaches. In addition, some experimental results are missing and some elaboration is needed.
- Presentation could be improved to be more scientifically sound.
Details and questions:
- Figure 3: since you showed the results with 1e4 nodes, what is the accuracy performance of these methods with large d?
- page 13 in appendix, last line: how did the denominator become $(1 + e^{-\eps
The paper is overall clear and makes a fair contribution to structure learning, which is a significant problem. The paper is also of good quality, the authors presented a comprehensive set of experiments, showcasing the competitive performance of their method.
The main weakness of the paper is its limited novelty. The re-parametrization itself is not new: the idea of decomposing a weighted adjacency matrix into a real matrix and an orientation matrix already exists. Compared to NOCurl, the change is basically from ReLU to Sigmoid. However, I should give credit to the authors for making this arguably small change work well in practice.
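The reviewer's "ReLU to Sigmoid" remark can be illustrated with a simplified sketch. It assumes both orientation matrices are built from the same pairwise priority gaps p[v] − p[u] and differ only in the function applied to them. This is a reading of the review, not the exact formulation of either paper, and the function names and values are hypothetical.

```python
import numpy as np


def priority_gaps(p):
    """gaps[u, v] = p[v] - p[u] for a priority vector p."""
    return p[None, :] - p[:, None]


def relu_orientation(p):
    """ReLU variant: exact zeros against the ordering, gap-sized values along it."""
    return np.maximum(priority_gaps(p), 0.0)


def sigmoid_orientation(p, t):
    """Sigmoid variant with temperature t: values in (0, 1), near a step for small t."""
    # Stable sigmoid via tanh: sigmoid(x) = 0.5 * (1 + tanh(x / 2)).
    return 0.5 * (1.0 + np.tanh(priority_gaps(p) / (2.0 * t)))


p = np.array([0.0, 1.0, 3.0])        # hypothetical priorities
R = relu_orientation(p)
S = sigmoid_orientation(p, t=0.05)
print(np.round(R, 3))
print(np.round(S, 3))
```

In this sketch the ReLU entries grow with the size of the priority gap, while the sigmoid entries saturate near 1 for any positive gap once the temperature is small, so the orientation is decoupled from the edge magnitude carried by the real matrix.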
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Bioinformatics
