Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu

Arnab Bhattacharyya; Sutanu Gayen; Eric Price; N. V. Vinodchandran

arXiv:2011.04144 · cs.DS · July 23, 2021 · 6 cites

TL;DR

This paper provides finite sample guarantees for the Chow-Liu algorithm to learn tree-structured graphical models, establishing near-optimal sample complexity bounds for both tree-structured and general distributions, and introduces a new conditional independence tester.

Contribution

It offers the first finite sample analysis of Chow-Liu for learning tree-structured models and develops a new conditional independence testing method addressing an open problem.

Findings

01

When the distribution is tree-structured, Chow-Liu with plug-in mutual information estimates outputs an $\varepsilon$-approximate tree from $\tilde{O}(|\Sigma|^3 n \varepsilon^{-1})$ i.i.d. samples.

02

For a general distribution, which may not be tree-structured, $\Omega(n^2 \varepsilon^{-2})$ samples are necessary to find an $\varepsilon$-approximate tree.

03

A new conditional independence tester can distinguish whether $I(X;Y \mid Z)$ is zero or at least $\varepsilon$ using $\tilde{O}(|\Sigma|^3 / \varepsilon)$ samples.
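
The quantity in finding 03, $I(X;Y \mid Z)$, can be estimated from samples by plugging empirical frequencies into its definition. The sketch below illustrates only that plug-in estimate. It is not the paper's tester, whose statistic and threshold are not given on this page. The function name `plugin_conditional_mi` and the sample format (a list of `(x, y, z)` tuples) are assumptions made for the example.

```python
import math
from collections import Counter


def plugin_conditional_mi(samples):
    """Plug-in (empirical) estimate of I(X;Y|Z) in nats.

    `samples` is a list of (x, y, z) tuples; this format is an assumption
    of the sketch, not something specified by the paper.
    """
    m = len(samples)
    n_xyz = Counter(samples)
    n_xz = Counter((x, z) for x, _, z in samples)
    n_yz = Counter((y, z) for _, y, z in samples)
    n_z = Counter(z for _, _, z in samples)

    total = 0.0
    for (x, y, z), c in n_xyz.items():
        # p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), with the
        # factors of 1/m cancelling inside the logarithm.
        total += (c / m) * math.log(c * n_z[z] / (n_xz[(x, z)] * n_yz[(y, z)]))
    return total
```

On samples where X and Y are independent given Z, the estimate is zero up to sampling noise. When Y is a copy of a uniform bit X, it equals $\ln 2$.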

Abstract

We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution $P$ on $\Sigma^n$ and a tree $T$ on $n$ nodes, we say $T$ is an $\varepsilon$-approximate tree for $P$ if there is a $T$-structured distribution $Q$ such that $D(P \,\|\, Q)$ is at most $\varepsilon$ more than the best possible tree-structured distribution for $P$. We show that if $P$ itself is tree-structured, then the Chow-Liu algorithm with the plug-in estimator for mutual information with $\tilde{O}(|\Sigma|^3 n \varepsilon^{-1})$ i.i.d. samples outputs an $\varepsilon$-approximate tree for $P$ with constant probability. In contrast, for a general $P$ (which may not be tree-structured), $\Omega(n^2 \varepsilon^{-2})$ samples are necessary to find an $\varepsilon$-approximate tree. Our upper bound…
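
The classical Chow-Liu procedure referred to in the abstract can be sketched as follows: estimate the mutual information of every pair of variables with the plug-in estimator, then take a maximum-weight spanning tree over those estimates. This is a minimal illustration of the textbook algorithm, not code from the paper. The names `plugin_mi` and `chow_liu_tree`, the sample format (a list of equal-length tuples), and the use of Kruskal's algorithm with union-find are assumptions of the sketch.

```python
import math
from collections import Counter
from itertools import combinations


def plugin_mi(samples, i, j):
    """Plug-in (empirical) mutual information between coordinates i and j, in nats."""
    m = len(samples)
    n_ij = Counter((s[i], s[j]) for s in samples)
    n_i = Counter(s[i] for s in samples)
    n_j = Counter(s[j] for s in samples)
    return sum(
        (c / m) * math.log(c * m / (n_i[a] * n_j[b]))
        for (a, b), c in n_ij.items()
    )


def chow_liu_tree(samples):
    """Return the edges of a maximum-weight spanning tree under plug-in MI."""
    n = len(samples[0])
    # All candidate edges, heaviest (largest estimated MI) first.
    edges = sorted(
        ((plugin_mi(samples, i, j), i, j) for i, j in combinations(range(n), 2)),
        reverse=True,
    )

    parent = list(range(n))  # union-find forest for Kruskal's algorithm

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding (i, j) does not create a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

For example, on samples from a chain $X_0 \to X_1 \to X_2$ in which each variable copies its predecessor with flip probability $1/4$, the returned edges are `(0, 1)` and `(1, 2)`. The pair `(0, 2)` has the smallest estimated mutual information and is left out.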

Taxonomy

Topics: Machine Learning and Algorithms · Bayesian Modeling and Causal Inference · Machine Learning and Data Classification