Optimal estimation of Gaussian DAG models
Ming Gao, Wai Ming Tai, Bryon Aragam

TL;DR
This paper determines the minimal number of samples needed to learn the structure of a Gaussian DAG, showing that under equal error variances, directed and undirected Gaussian models share the same optimal sample complexity.
Contribution
It establishes the minimax optimal sample complexity for Gaussian DAG structure learning in key settings, extending to general assumptions and subgaussian errors.
Findings
The optimal sample complexity for structure learning is $n\asymp q\log(d/q)$, where $q$ is the maximum number of parents and $d$ is the number of nodes.
Directed and undirected Gaussian models have the same optimal sample complexity under equal variance.
Results extend to broader identification assumptions and subgaussian errors.
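To give a feel for how the rate $q\log(d/q)$ scales, here is a minimal Python sketch. It is only an illustration: the function name `sample_complexity_rate` is hypothetical, and the value it returns ignores the unspecified constants hidden in $\asymp$, so it should not be read as an actual required sample size.

```python
import math


def sample_complexity_rate(d: int, q: int) -> float:
    """Return q * log(d / q), the rate stated in the findings.

    d: number of nodes, q: maximum number of parents.
    Constants hidden by the asymptotic notation are NOT included,
    so this is a scaling illustration, not a sample-size calculator.
    """
    if not (1 <= q <= d):
        raise ValueError("expected 1 <= q <= d")
    return q * math.log(d / q)


# Compare how the rate grows with the number of nodes d
# versus the maximum number of parents q.
for d, q in [(100, 2), (1000, 2), (1000, 10)]:
    print(f"d={d:5d}  q={q:3d}  rate={sample_complexity_rate(d, q):.2f}")
```

The rate grows only logarithmically in the number of nodes $d$ but roughly linearly in the sparsity level $q$, which is why sparse graphs over many variables can be learned from comparatively few samples.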
Abstract
We study the optimal sample complexity of learning a Gaussian directed acyclic graph (DAG) from observational data. Our main results establish the minimax optimal sample complexity for learning the structure of a linear Gaussian DAG model in two settings of interest: 1) Under equal variances without knowledge of the true ordering, and 2) For general linear models given knowledge of the ordering. In both cases the sample complexity is $n\asymp q\log(d/q)$, where $q$ is the maximum number of parents and $d$ is the number of nodes. We further make comparisons with the classical problem of learning (undirected) Gaussian graphical models, showing that under the equal variance assumption, these two problems share the same optimal sample complexity. In other words, at least for Gaussian models with equal error variances, learning a directed graphical model is statistically no more difficult…
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Statistical Methods and Inference · Geochemistry and Geologic Mapping
