Optimal Rates for Learning Hidden Tree Structures

Konstantinos E. Nikolakakis; Dionysios S. Kalogerias; Anand D. Sarwate

arXiv:1909.09596 · stat.ML · April 1, 2021 · 1 cites

Open Access

TL;DR

This paper establishes the fundamental limits and guarantees for learning hidden tree-structured graphical models from noisy data, showing that the sample complexity depends on a key information threshold and that the Chow-Liu algorithm is optimal under these conditions.

Contribution

It introduces the information threshold as a critical quantity for sample complexity and proves the optimality of the Chow-Liu algorithm in noisy settings, including non-i.i.d. noise.

Findings

01 Sample complexity scales inversely with the square of the information threshold.

02 The Chow-Liu algorithm achieves optimal rates with respect to the information threshold.

03 Below a certain sample size, no algorithm can recover the structure with probability greater than 1/2.

Abstract

We provide high probability finite sample complexity guarantees for hidden non-parametric structure learning of tree-shaped graphical models, whose hidden and observable nodes are discrete random variables with either finite or countable alphabets. We study a fundamental quantity called the (noisy) information threshold, which arises naturally from the error analysis of the Chow-Liu algorithm and, as we discuss, provides explicit necessary and sufficient conditions on sample complexity, by effectively summarizing the difficulty of the tree-structure learning problem. Specifically, we show that the finite sample complexity of the Chow-Liu algorithm for ensuring exact structure recovery from noisy data is inversely proportional to the information threshold squared (provided it is positive), and scales almost logarithmically relative to the number of nodes over a given probability of…
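As a rough illustration of the procedure the abstract refers to, the following is a minimal sketch of the Chow-Liu algorithm applied to noisy observations: estimate pairwise mutual information from samples with a plug-in estimator, then take a maximum-weight spanning tree. The function names (`empirical_mutual_information`, `chow_liu_tree`, `simulate_noisy_chain`), the binary chain model, and the noise parameters are assumptions made for this example and are not taken from the paper.

```python
import itertools
import numpy as np


def empirical_mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats from two discrete sample vectors."""
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    xi, yi = xi.ravel(), yi.ravel()
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)          # empirical joint counts
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)    # marginal of X (column vector)
    py = joint.sum(axis=0, keepdims=True)    # marginal of Y (row vector)
    mask = joint > 0                         # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))


def chow_liu_tree(samples):
    """Edges of a maximum-weight spanning tree under empirical mutual information.

    `samples` has shape (n, p): n observations of p discrete variables.
    Uses Kruskal's algorithm with a small union-find structure.
    """
    _, p = samples.shape
    weights = [
        (empirical_mutual_information(samples[:, i], samples[:, j]), i, j)
        for i, j in itertools.combinations(range(p), 2)
    ]
    weights.sort(reverse=True)               # heaviest pairs first
    parent = list(range(p))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]    # path halving
            u = parent[u]
        return u

    edges = set()
    for _, i, j in weights:
        ri, rj = find(i), find(j)
        if ri != rj:                         # adding (i, j) creates no cycle
            parent[ri] = rj
            edges.add((i, j))
    return edges


def simulate_noisy_chain(n, p, flip, noise, rng):
    """Hypothetical test model: a hidden binary chain X0 - X1 - ... - X(p-1),
    each node observed through an independent binary symmetric channel."""
    hidden = np.empty((n, p), dtype=int)
    hidden[:, 0] = rng.integers(0, 2, size=n)
    for k in range(1, p):
        hidden[:, k] = hidden[:, k - 1] ^ (rng.random(n) < flip)
    return hidden ^ (rng.random((n, p)) < noise)


rng = np.random.default_rng(0)
noisy = simulate_noisy_chain(n=5000, p=4, flip=0.1, noise=0.05, rng=rng)
print(sorted(chow_liu_tree(noisy)))
```

The algorithm only ever sees the noisy matrix; the hidden chain is used solely to generate data. In this toy setting the gap between the mutual information of adjacent and non-adjacent pairs is large, so a few thousand samples are plenty. Shrinking that gap (for example by raising `noise`) is the kind of regime in which, per the abstract, the required sample size grows.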

Taxonomy

Topics: Machine Learning and Algorithms · Bayesian Modeling and Causal Inference · Machine Learning and Data Classification