# Integer Programming for Learning Directed Acyclic Graphs from Continuous Data

**Authors:** Hasan Manzour, Simge Küçükyavuz, Ali Shojaie

arXiv: 1904.10574 · 2019-04-25

## TL;DR

This paper introduces a new mixed-integer quadratic optimization model for learning optimal directed acyclic graphs from continuous data, improving computational efficiency and scalability over existing methods.

## Contribution

The paper proposes a layered network formulation that efficiently incorporates super-structures and regularizations, advancing the optimization approach for DAG learning.

## Key findings

- The LN formulation finds an optimal DAG faster than existing mathematical formulations, most clearly when the super-structure is sparse.
- It scales better than available algorithms that solve the same problem with only $\ell_1$ regularization.
- The model effectively incorporates super-structures to reduce candidate DAGs.
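
To make the effect of a super-structure concrete, the following is a minimal brute-force sketch, not the paper's MIQO model. It counts the DAGs that can be built from a set of candidate directed edges on three nodes, first with every edge allowed and then with a chain-shaped super-structure. The function names `has_cycle` and `count_dags` and the example super-structure are illustrative assumptions.

```python
from itertools import combinations

def has_cycle(n, edges):
    """Return True if the directed graph on n nodes contains a cycle.

    Uses Kahn's algorithm: repeatedly remove nodes with no incoming edges;
    a cycle exists exactly when some node can never be removed.
    """
    indeg = [0] * n
    for _, k in edges:
        indeg[k] += 1
    ready = [v for v in range(n) if indeg[v] == 0]
    removed = 0
    while ready:
        j = ready.pop()
        removed += 1
        for a, b in edges:
            if a == j:
                indeg[b] -= 1
                if indeg[b] == 0:
                    ready.append(b)
    return removed != n

def count_dags(n, candidate_edges):
    """Count acyclic subsets of the candidate directed edges (brute force)."""
    candidate_edges = list(candidate_edges)
    total = 0
    for r in range(len(candidate_edges) + 1):
        for subset in combinations(candidate_edges, r):
            if not has_cycle(n, subset):
                total += 1
    return total

# No super-structure: every ordered pair of distinct nodes is a candidate edge.
complete = [(j, k) for j in range(3) for k in range(3) if j != k]
# Hypothetical chain super-structure 0 - 1 - 2: only these directions are allowed.
chain = [(0, 1), (1, 0), (1, 2), (2, 1)]

n_all = count_dags(3, complete)
n_chain = count_dags(3, chain)
```

Enumeration like this is only feasible for a handful of nodes, which is why the paper turns to a mathematical programming formulation; the sketch just shows how restricting the candidate edges shrinks the search space.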

## Abstract

Learning directed acyclic graphs (DAGs) from data is a challenging task both in theory and in practice, because the number of possible DAGs scales superexponentially with the number of nodes. In this paper, we study the problem of learning an optimal DAG from continuous observational data. We cast this problem in the form of a mathematical programming model which can naturally incorporate a super-structure in order to reduce the set of possible candidate DAGs. We use the penalized negative log-likelihood score function with both $\ell_0$ and $\ell_1$ regularizations and propose a new mixed-integer quadratic optimization (MIQO) model, referred to as a layered network (LN) formulation. The LN formulation is a compact model, which enjoys as tight an optimal continuous relaxation value as the stronger but larger formulations under a mild condition. Computational results indicate that the proposed formulation outperforms existing mathematical formulations and scales better than available algorithms that can solve the same problem with only $\ell_1$ regularization. In particular, the LN formulation clearly outperforms existing methods in terms of computational time needed to find an optimal DAG in the presence of a sparse super-structure.
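
As a rough illustration of the kind of objective described above, the sketch below scores one candidate weighted adjacency matrix `B` with a least-squares loss plus $\ell_0$ and $\ell_1$ penalties, and checks that its support is acyclic. This is an assumption-laden stand-in: the exact form of the paper's penalized negative log-likelihood is not given in this summary, and the names `penalized_score`, `is_acyclic`, `lam0`, and `lam1` are hypothetical.

```python
def is_acyclic(support):
    """Check acyclicity of a graph given as a square 0/1 matrix.

    support[j][k] truthy means there is an edge j -> k. Uses Kahn's algorithm.
    """
    n = len(support)
    indeg = [sum(1 for j in range(n) if support[j][k]) for k in range(n)]
    ready = [k for k in range(n) if indeg[k] == 0]
    removed = 0
    while ready:
        j = ready.pop()
        removed += 1
        for k in range(n):
            if support[j][k]:
                indeg[k] -= 1
                if indeg[k] == 0:
                    ready.append(k)
    return removed == n

def penalized_score(X, B, lam0=0.0, lam1=0.0):
    """Least-squares loss of the linear model X ~ X B, plus l0 and l1 penalties.

    X is a list of samples (rows); B[j][k] is the weight of edge j -> k.
    Assumption: squared error stands in for the negative log-likelihood.
    """
    m = len(B)
    loss = 0.0
    for row in X:
        for k in range(m):
            fitted = sum(row[j] * B[j][k] for j in range(m))
            loss += (row[k] - fitted) ** 2
    n_edges = sum(1 for j in range(m) for k in range(m) if B[j][k] != 0)
    l1_norm = sum(abs(B[j][k]) for j in range(m) for k in range(m))
    return loss + lam0 * n_edges + lam1 * l1_norm

# Toy data: the second variable is exactly twice the first.
X = [[1.0, 2.0], [2.0, 4.0]]
B = [[0.0, 2.0], [0.0, 0.0]]  # single edge 0 -> 1 with weight 2
support = [[w != 0 for w in row] for row in B]

score = penalized_score(X, B, lam0=1.0, lam1=0.5)
feasible = is_acyclic(support)
```

An optimization model such as the LN formulation searches over all such `B` simultaneously, with binary variables encoding which edges are present; this sketch only evaluates a single candidate.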

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.10574/full.md

## Figures

56 figures with captions in the complete paper: https://tomesphere.com/paper/1904.10574/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/1904.10574/full.md

---
Source: https://tomesphere.com/paper/1904.10574