Constraint-Free Structure Learning with Smooth Acyclic Orientations

Riccardo Massidda; Francesco Landolfi; Martina Cinquini; Davide Bacciu

arXiv:2309.08406 · cs.LG · September 18, 2023

TL;DR

COSMO is a constraint-free, differentiable approach to acyclic structure learning that runs faster than constrained methods and maintains high accuracy without explicit acyclicity constraints.

Contribution

The paper proposes a novel smooth orientation matrix, parameterized by a single priority vector, that avoids explicit acyclicity constraints while still converging to acyclic solutions.

Findings

01

COSMO converges to acyclic solutions without explicit constraints.

02

It outperforms existing methods in graph reconstruction accuracy.

03

COSMO is computationally faster due to its constraint-free formulation.

Abstract

The structure learning problem consists of fitting data generated by a Directed Acyclic Graph (DAG) to correctly reconstruct its arcs. In this context, differentiable approaches constrain or regularize the optimization problem using a continuous relaxation of the acyclicity property. The computational cost of evaluating graph acyclicity is cubic in the number of nodes and significantly affects scalability. In this paper, we introduce COSMO, a constraint-free continuous optimization scheme for acyclic structure learning. At the core of our method, we define a differentiable approximation of an orientation matrix parameterized by a single priority vector. Unlike previous work, our parameterization fits a smooth orientation matrix and the resulting acyclic adjacency matrix without evaluating acyclicity at any step. Despite the absence of explicit constraints, we prove that COSMO…
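The parameterization described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the specific form `S[u][v] = sigmoid((p[v] - p[u]) / t)`, the zeroed diagonal, and the names `smooth_orientation`, `weighted_adjacency`, and `is_acyclic` are all assumptions made for this example. The acyclicity check is included only to inspect the result; the point of the method is that training never needs it.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def smooth_orientation(p, t):
    """Assumed form: S[u][v] = sigmoid((p[v] - p[u]) / t), diagonal set to zero."""
    d = len(p)
    return [
        [0.0 if u == v else sigmoid((p[v] - p[u]) / t) for v in range(d)]
        for u in range(d)
    ]


def weighted_adjacency(H, p, t):
    """Element-wise product of free weights H and the smooth orientation."""
    S = smooth_orientation(p, t)
    d = len(p)
    return [[H[u][v] * S[u][v] for v in range(d)] for u in range(d)]


def is_acyclic(W, tol=1e-6):
    """Diagnostic only: Kahn's algorithm on the graph of entries above tol."""
    d = len(W)
    indeg = [sum(1 for u in range(d) if abs(W[u][v]) > tol) for v in range(d)]
    queue = [v for v in range(d) if indeg[v] == 0]
    seen = 0
    while queue:
        u = queue.pop()
        seen += 1
        for v in range(d):
            if abs(W[u][v]) > tol:
                indeg[v] -= 1
                if indeg[v] == 0:
                    queue.append(v)
    return seen == d


# Toy priorities and dense weights (hypothetical values).
p = [0.3, -1.2, 2.0]
H = [[1.0] * 3 for _ in range(3)]
W_cold = weighted_adjacency(H, p, t=0.01)  # low temperature
W_hot = weighted_adjacency(H, p, t=10.0)   # high temperature
print(is_acyclic(W_cold), is_acyclic(W_hot))
```

In this toy run, the low-temperature matrix keeps only arcs from lower-priority to higher-priority nodes and thresholds to a DAG, while the high-temperature matrix still has sizeable entries in both directions. Building the matrix takes a quadratic number of operations in the number of nodes, in line with Reviewer 01's remark on reconstruction cost.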

Peer Reviews

Decision · ICLR 2024 poster

Reviewer 01 · Rating 5 · marginally below the acceptance threshold · Confidence 3

Strengths

- A primary strength of the method is its provision of an unconstrained continuous optimization approach for acyclic structure learning.
- The method introduces a novel, differentiable approximation known as the "smooth orientation matrix," which depends on a temperature parameter.
- The method outperforms certain constrained approaches in terms of speed due to the quadratic number of operations needed to reconstruct the DAG.

Weaknesses

- In Figure 1, are you learning a directed graph or a partially directed graph with undirected edges to be identified later? Please formally define the priority vector in the notation section.
- While Figure 1 appears to be an illustration of how COSMO works, it's not very clearly explained. It would be helpful to provide a proof sketch or illustration with an example in the introduction to briefly explain how your algorithm works (given this example) and why it's an unconstrained optimization

Reviewer 02 · Rating 6 · marginally above the acceptance threshold · Confidence 3

Strengths

- The paper proposes an improved DAG learning approach without constraints, which is novel.
- Empirical performance with a large number of variables is solid.

Weaknesses

- Experiment: When the number of variables is small or medium, the accuracy seems to be worse than some existing approaches. In addition, some experimental results are missing and some elaboration is needed.
- Presentation could be improved to be more scientifically sound.

Details and questions:

- Figure 3: since you showed the results with 1e4 nodes, what is the accuracy performance of these methods with large d?
- Page 13 in the appendix, last line: how did the denominator become $(1 + e^{-\eps

Reviewer 03 · Rating 6 · marginally above the acceptance threshold · Confidence 4

Strengths

The paper is overall clear and makes a fair contribution to structure learning, which is a significant problem. The paper is also of good quality, the authors presented a comprehensive set of experiments, showcasing the competitive performance of their method.

Weaknesses

The main weakness of the paper is its novelty. The re-parametrization itself is not new, that is, the idea of decomposing a weighted adjacency matrix into a real matrix and an orientation matrix. Compared to NOCurl, the change is basically from ReLU to Sigmoid. However, I should give credit to the authors for making this arguably small change work well in practice.
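The reviewer's ReLU-versus-Sigmoid remark can be made concrete with a small sketch. It rests on assumptions: both gates are applied to pairwise priority differences `p[v] - p[u]`, and the function names and the default temperature are hypothetical choices for this example rather than either paper's code.

```python
import math


def relu_gate(delta):
    """Hard gate: keeps positive priority differences, zeroes the rest."""
    return max(0.0, delta)


def sigmoid_gate(delta, t=0.1):
    """Soft gate with temperature t (hypothetical default), values in (0, 1)."""
    x = delta / t
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def orientation(p, gate):
    """Apply a gate to every pairwise priority difference, skipping the diagonal."""
    d = len(p)
    return [
        [0.0 if u == v else gate(p[v] - p[u]) for v in range(d)]
        for u in range(d)
    ]


p = [0.0, 1.0]
R = orientation(p, relu_gate)     # exact zero for the reverse arc
S = orientation(p, sigmoid_gate)  # small but nonzero reverse arc
print(R)
print(S)
```

Under these assumptions the ReLU gate yields an exact zero for the arc that goes against the priority order, and its output grows with the size of the difference. The sigmoid gate stays bounded, remains differentiable everywhere, and only approaches a hard orientation as the temperature falls.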

Code & Models

Repositories

gabriele-dominici/causalcgm
pytorch

Taxonomy

Topics: Advanced Graph Neural Networks · Machine Learning in Bioinformatics