Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates
Vincent Y. F. Tan, Animashree Anandkumar, Alan S. Willsky

TL;DR
This paper introduces a pruning-based algorithm for learning high-dimensional forest-structured graphical models, demonstrating its consistency and analyzing error rates in both fixed and high-dimensional settings.
Contribution
It proposes an algorithm that prunes the Chow-Liu tree by adaptive thresholding, proves that the algorithm is structurally consistent and risk consistent, and analyzes its error rates for both fixed-size and high-dimensional models.
Findings
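For a fixed model size, the error probability of structure learning decays faster than any polynomial in the number of samples.
Sufficient conditions on the number of samples n, the model size d, and the number of edges k are given for structural and risk consistency in the high-dimensional setting.
The independent model is the hardest and the tree model the easiest to learn with the proposed algorithm, in terms of error rates for structure learning.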
Abstract
The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both structurally consistent and risk consistent and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n,d,k) are given for the algorithm to satisfy structural and risk consistencies. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.
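The pruning idea in the abstract can be illustrated with a minimal sketch. It assumes plug-in (empirical) mutual information as the edge weight, Kruskal's algorithm for the maximum-weight spanning tree, and a default threshold of n^(-beta). The names `empirical_mi` and `learn_forest`, the parameter `beta`, and the default threshold rule are hypothetical choices made for this illustration; the abstract does not specify the threshold, so this is not presented as the authors' exact procedure.

```python
import math
from collections import Counter
from itertools import combinations


def empirical_mi(xs, ys):
    """Plug-in mutual information (in nats) of two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (a, b), c in joint.items():
        # (c/n) * log[(c/n) / ((px/n) * (py/n))] simplifies to the line below.
        mi += (c / n) * math.log(c * n / (px[a] * py[b]))
    return mi


def learn_forest(samples, threshold=None, beta=0.5):
    """Return the sorted edge list of a forest estimated from discrete samples.

    samples: list of equal-length tuples, one tuple per observation.
    threshold: edges with empirical MI below this value are pruned.
    beta: used only when threshold is None (illustrative assumption).
    """
    n, d = len(samples), len(samples[0])
    if threshold is None:
        # Assumption: a threshold that shrinks as n grows, here n ** (-beta).
        threshold = n ** (-beta)
    cols = list(zip(*samples))
    weights = {
        (i, j): empirical_mi(cols[i], cols[j])
        for i, j in combinations(range(d), 2)
    }

    # Step 1: Chow-Liu tree = maximum-weight spanning tree (Kruskal + union-find).
    parent = list(range(d))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j, w))

    # Step 2: prune tree edges whose empirical MI falls below the threshold.
    return sorted((i, j) for i, j, w in tree if w >= threshold)


if __name__ == "__main__":
    # Variable 1 copies variable 0; variable 2 is independent of both.
    data = [(a, a, c) for a in (0, 1) for c in (0, 1)] * 25
    print(learn_forest(data))
```

In the toy data, only the edge between variables 0 and 1 carries information, so the spurious edge that the spanning tree is forced to include is removed by the threshold. Passing an explicit `threshold` makes it easy to see how the estimated forest changes as the pruning level varies.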
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Bayesian Modeling and Causal Inference
