Efficient Sample-optimal Learning of Gaussian Tree Models via Sample-optimal Testing of Gaussian Mutual Information

Sutanu Gayen; Sanket Kale; Sayantan Sen

arXiv:2411.11516·cs.LG·November 19, 2024

TL;DR

This paper learns Gaussian tree models by testing, rather than estimating, mutual information; both the tester and the resulting structure-learning algorithm achieve near-optimal sample complexity.

Contribution

It develops a novel mutual information testing technique for Gaussian variables and applies it to efficiently learn Gaussian tree structures with near-optimal sample complexity.

Findings

01

Mutual information testing requires O(ε^{-1}) samples, which is near-optimal; in contrast, additive estimation of mutual information would require Ω(ε^{-2}) samples.

02

The structure-learning algorithm uses Õ(nε^{-1}) samples, near-optimal for Gaussian trees.

03

When the underlying Gaussian model is not known to be tree-structured, Θ̃(n^2 ε^{-2}) samples are necessary and sufficient to output an ε-approximate tree structure.
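
To make the structure-learning setting in the findings concrete, here is a minimal Python sketch of the classical Chow-Liu plug-in approach: estimate pairwise Gaussian mutual information from sample correlations, then take a maximum-weight spanning tree. This is an illustration under stated assumptions, not the paper's tester-based algorithm. The function names `gaussian_mi`, `chow_liu_tree`, and `sample_chain` are hypothetical, and the chain coefficient 0.8 is an arbitrary choice.

```python
import numpy as np

def gaussian_mi(x, y):
    """Plug-in estimate (in nats) of I(X;Y) for jointly Gaussian samples,
    using the identity I(X;Y) = -1/2 * ln(1 - rho^2)."""
    rho = np.corrcoef(x, y)[0, 1]
    rho2 = min(rho * rho, 1.0 - 1e-12)  # guard against log(0)
    return -0.5 * np.log1p(-rho2)

def chow_liu_tree(samples):
    """Maximum-weight spanning tree (Prim's algorithm) over pairwise
    plug-in mutual information. `samples` has shape (m, n)."""
    n = samples.shape[1]
    corr = np.corrcoef(samples, rowvar=False)
    rho2 = np.clip(corr ** 2, 0.0, 1.0 - 1e-12)
    mi = -0.5 * np.log1p(-rho2)
    in_tree = [0]
    edges = []
    while len(in_tree) < n:
        best = None
        for u in in_tree:
            for v in range(n):
                if v not in in_tree and (best is None or mi[u, v] > best[0]):
                    best = (mi[u, v], u, v)
        _, u, v = best
        edges.append((min(u, v), max(u, v)))
        in_tree.append(v)
    return sorted(edges)

def sample_chain(m, n, rng, a=0.8):
    """Hypothetical test data: a Gaussian chain X0 - X1 - ... - X(n-1)
    with unit variances and adjacent correlation `a`."""
    x = np.empty((m, n))
    x[:, 0] = rng.standard_normal(m)
    for i in range(1, n):
        x[:, i] = a * x[:, i - 1] + np.sqrt(1 - a * a) * rng.standard_normal(m)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(chow_liu_tree(sample_chain(5000, 4, rng)))
```

With a few thousand samples from the chain, the recovered edges should be the chain's own edges. The plug-in estimator is used here only because it is simple to state; the sample-complexity bounds quoted above concern the paper's tester, which this sketch does not implement.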

Abstract

Learning high-dimensional distributions is a significant challenge in machine learning and statistics. Classical research has mostly concentrated on asymptotic analysis of such data under suitable assumptions. While existing works [Bhattacharyya et al.: SICOMP 2023, Daskalakis et al.: STOC 2021, Choo et al.: ALT 2024] focus on discrete distributions, the current work addresses the tree structure learning problem for Gaussian distributions, providing efficient algorithms with solid theoretical guarantees. This is crucial as real-world distributions are often continuous and differ from the discrete scenarios studied in prior works. In this work, we design a conditional mutual information tester for Gaussian random variables that can test whether two Gaussian random variables are independent, or their conditional mutual information is at least $ε$, for some parameter…
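
For reference, the quantities in the testing problem have standard closed forms for jointly Gaussian variables. The block below states them for a scalar conditioning variable $Z$; the symbols $\rho_{XY}$ and $\rho_{XY \cdot Z}$ (correlation and partial correlation) are notation introduced here for illustration and may differ from the paper's.

```latex
\begin{align*}
I(X;Y) &= -\tfrac{1}{2}\,\ln\!\left(1-\rho_{XY}^{2}\right), \\
I(X;Y \mid Z) &= -\tfrac{1}{2}\,\ln\!\left(1-\rho_{XY\cdot Z}^{2}\right),
\qquad
\rho_{XY\cdot Z} = \frac{\rho_{XY}-\rho_{XZ}\,\rho_{YZ}}
                        {\sqrt{\left(1-\rho_{XZ}^{2}\right)\left(1-\rho_{YZ}^{2}\right)}}.
\end{align*}
```

In this notation, the tester must distinguish $I(X;Y \mid Z) = 0$ (equivalently $\rho_{XY\cdot Z} = 0$) from $I(X;Y \mid Z) \ge ε$.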

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification

Methods: Linear Regression