On Variational Bounds of Mutual Information

Ben Poole; Sherjil Ozair; Aaron van den Oord; Alexander A. Alemi; George Tucker

arXiv:1905.06922 · cs.LG · May 17, 2019 · 153 cites

TL;DR

This paper unifies and analyzes variational bounds of mutual information, revealing their limitations at high MI levels and introducing a continuum of bounds that better balance bias and variance for improved estimation.

Contribution

The authors unify existing variational bounds on mutual information in a single framework, show that existing lower bounds exhibit either high bias or high variance when MI is large, and propose a continuum of lower bounds that flexibly trades off bias and variance.

Findings

01

Existing bounds degrade with high MI, showing high bias or variance.

02

The new continuum bounds improve estimation accuracy in high-dimensional problems.

03

Empirical results demonstrate better performance in representation learning tasks.

Abstract

Estimating and optimizing Mutual Information (MI) is core to many problems in machine learning; however, bounding MI in high dimensions is challenging. To establish tractable and scalable objectives, recent work has turned to variational bounds parameterized by neural networks, but the relationships and tradeoffs between these bounds remain unclear. In this work, we unify these recent developments in a single framework. We find that the existing variational lower bounds degrade when the MI is large, exhibiting either high bias or high variance. To address this problem, we introduce a continuum of lower bounds that encompasses previous bounds and flexibly trades off bias and variance. On high-dimensional, controlled problems, we empirically characterize the bias and variance of the bounds and their gradients and demonstrate the effectiveness of our new bounds for estimation and…
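To make the "high bias when MI is large" behaviour concrete, here is a minimal NumPy sketch of one widely used contrastive lower bound, InfoNCE. The abstract above does not name this bound, so treat it as an illustrative choice rather than the paper's own code. The function name `infonce_lower_bound` and the `scores` matrix layout (entry `[i, j]` is a critic value for the pair `(x_i, y_j)`, with matched pairs on the diagonal) are assumptions made for this example.

```python
import numpy as np


def infonce_lower_bound(scores):
    """InfoNCE estimate (in nats) from a K x K matrix of critic scores.

    Assumed layout: scores[i, j] = f(x_i, y_j), where (x_i, y_i) are
    samples from the joint and off-diagonal pairs act as negatives.
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[0]
    # Numerically stable log-sum-exp over each row.
    row_max = scores.max(axis=1, keepdims=True)
    lse = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    # mean_i [ f(x_i, y_i) - log( (1/K) * sum_j exp f(x_i, y_j) ) ]
    return float(np.mean(np.diag(scores) - lse) + np.log(k))


# An uninformative critic (all scores equal) yields an estimate of 0.
flat = infonce_lower_bound(np.zeros((8, 8)))

# A critic that separates positives perfectly still cannot exceed log(K):
# here the estimate equals log(8), about 2.079 nats.
saturated = infonce_lower_bound(1000.0 * np.eye(8))
```

Each term `scores[i, i] - lse[i]` is at most zero, so the estimate can never exceed `log(K)` for batch size `K`. When the true MI is well above `log(K)`, the estimate saturates regardless of how good the critic is, which is one instance of the bias the abstract describes.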

Taxonomy

Topics: Face and Expression Recognition · Machine Learning and ELM · Advanced Neural Network Applications