Optimal Rates for Learning Hidden Tree Structures
Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Anand D. Sarwate

TL;DR
This paper establishes the fundamental limits and guarantees for learning hidden tree-structured graphical models from noisy data, showing that the sample complexity depends on a key information threshold and that the Chow-Liu algorithm is optimal under these conditions.
Contribution
It introduces the information threshold as a critical quantity for sample complexity and proves the optimality of the Chow-Liu algorithm in noisy settings, including non-i.i.d. noise.
Findings
Sample complexity scales inversely with the square of the information threshold.
The Chow-Liu algorithm, run on noisy data, achieves the optimal rate with respect to the information threshold.
No algorithm can recover the hidden tree structure with probability greater than 1/2 when the number of samples falls below a constant multiple of the inverse squared information threshold.
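As a rough illustration of the procedure these findings refer to, the following is a minimal sketch of Chow-Liu structure estimation from noisy observations: estimate pairwise mutual information from samples, then take a maximum-weight spanning tree. It is not the authors' code. The function names, the plug-in mutual information estimator, and the toy setup (a binary Markov chain observed through independent bit flips) are all assumptions chosen for illustration.

```python
import numpy as np


def empirical_mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats from paired discrete samples."""
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    xi, yi = xi.ravel(), yi.ravel()
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)          # joint histogram of (x, y)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)    # marginal of x
    py = joint.sum(axis=0, keepdims=True)    # marginal of y
    mask = joint > 0                         # skip empty cells (0 log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))


def chow_liu_tree(samples):
    """Edges of a maximum-weight spanning tree over pairwise MI (Kruskal)."""
    _, p = samples.shape
    weights = [
        (empirical_mutual_information(samples[:, i], samples[:, j]), i, j)
        for i in range(p)
        for j in range(i + 1, p)
    ]
    weights.sort(reverse=True)               # heaviest edges first

    parent = list(range(p))                  # union-find forest

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    edges = set()
    for _, i, j in weights:
        ri, rj = find(i), find(j)
        if ri != rj:                         # adding (i, j) creates no cycle
            parent[ri] = rj
            edges.add((i, j))
    return edges


def simulate_noisy_chain(n, p=4, flip=0.1, noise=0.05, seed=0):
    """Hypothetical toy data: hidden binary chain X0 - X1 - ... observed
    through independent bit flips (an assumed noise model)."""
    rng = np.random.default_rng(seed)
    hidden = np.empty((n, p), dtype=int)
    hidden[:, 0] = rng.integers(0, 2, size=n)
    for k in range(1, p):
        hidden[:, k] = hidden[:, k - 1] ^ (rng.random(n) < flip).astype(int)
    return hidden ^ (rng.random((n, p)) < noise).astype(int)


if __name__ == "__main__":
    noisy = simulate_noisy_chain(n=20000)
    print(sorted(chow_liu_tree(noisy)))
```

In this toy setup the noise shrinks the gap between the mutual information of adjacent and non-adjacent pairs, so more samples are needed before the spanning tree matches the hidden chain. That gap is the intuition behind a threshold-type quantity; the sketch does not compute the paper's information threshold itself.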
Abstract
We provide high probability finite sample complexity guarantees for hidden non-parametric structure learning of tree-shaped graphical models, whose hidden and observable nodes are discrete random variables with either finite or countable alphabets. We study a fundamental quantity called the (noisy) information threshold, which arises naturally from the error analysis of the Chow-Liu algorithm and, as we discuss, provides explicit necessary and sufficient conditions on sample complexity, by effectively summarizing the difficulty of the tree-structure learning problem. Specifically, we show that the finite sample complexity of the Chow-Liu algorithm for ensuring exact structure recovery from noisy data is inversely proportional to the information threshold squared (provided it is positive), and scales almost logarithmically relative to the number of nodes over a given probability of…
Taxonomy
Topics: Machine Learning and Algorithms · Bayesian Modeling and Causal Inference · Machine Learning and Data Classification
