Understanding the Limitations of Variational Mutual Information Estimators

Jiaming Song; Stefano Ermon

arXiv:1910.06222·cs.LG·March 25, 2020·32 cites

TL;DR

This paper analyzes the limitations of neural network-based variational mutual information estimators, revealing their potential for high variance and proposing a new estimator with better bias-variance trade-offs.

Contribution

It provides a theoretical analysis of the variance of existing estimators and introduces a new estimator designed to reduce that variance.

Findings

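01. Under some conditions, the variance of existing estimators such as MINE can grow exponentially with the true MI.

02. Existing estimators fail to satisfy basic self-consistency properties of MI, such as data processing and additivity under independence.

03. The proposed estimator shows improved bias-variance trade-offs on standard benchmark tasks.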
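
The first finding can be illustrated in a toy setting. The sketch below is not the paper's derivation; it assumes a bivariate Gaussian with correlation rho and standard-normal marginals, and it assumes the critic is the exact log density ratio. Under those assumptions the density ratio r(x, y) = p(x, y) / (p(x) p(y)), which is the quantity averaged when estimating the partition function, has variance 1 / (1 - rho^2) - 1 = exp(2 * MI) - 1 under the product of marginals. The function names are hypothetical.

```python
import math
import random


def gaussian_mi(rho):
    """Exact MI (in nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho ** 2)


def density_ratio(x, y, rho):
    """r(x, y) = p(x, y) / (p(x) p(y)) for standard-normal marginals."""
    s = 1.0 - rho ** 2
    quad = (rho ** 2 * x ** 2 - 2.0 * rho * x * y + rho ** 2 * y ** 2) / (2.0 * s)
    return math.exp(-quad) / math.sqrt(s)


def ratio_variance(rho):
    """Exact Var[r] under the product of marginals: 1 / (1 - rho^2) - 1."""
    return 1.0 / (1.0 - rho ** 2) - 1.0


def mc_mean_ratio(rho, n, seed=0):
    """Monte Carlo mean of r under independent standard-normal x and y."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += density_ratio(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rho)
    return total / n


if __name__ == "__main__":
    # The variance column grows like exp(2 * MI) as rho approaches 1.
    for rho in (0.5, 0.9, 0.99, 0.999):
        print(f"rho={rho}: MI={gaussian_mi(rho):.3f} nats, "
              f"Var[r]={ratio_variance(rho):.1f}")
```

In this toy case the mean of r stays at 1 while its variance grows exponentially in the MI, so a fixed batch size gives an increasingly noisy partition function estimate as the dependence between the variables strengthens.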

Abstract

Variational approaches based on neural networks are showing promise for estimating mutual information (MI) between high-dimensional variables. However, they can be difficult to use in practice due to poorly understood bias/variance tradeoffs. We theoretically show that, under some conditions, estimators such as MINE exhibit variance that could grow exponentially with the true amount of underlying MI. We also empirically demonstrate that existing estimators fail to satisfy basic self-consistency properties of MI, such as data processing and additivity under independence. Based on a unified perspective of variational approaches, we develop a new estimator that focuses on variance reduction. Empirical results on standard benchmark tasks demonstrate that our proposed estimator exhibits improved bias-variance trade-offs.
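
The abstract does not spell out the form of the new estimator. One way to trade bias for variance, assumed here purely for illustration, is to clip the critic's outputs on samples from the product of marginals to [-tau, tau] before estimating the log partition function in a Donsker-Varadhan style bound. The sketch below takes precomputed critic values as plain lists; the function names and the parameter `tau` are hypothetical, and the linked repository should be consulted for the authors' actual implementation.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def dv_estimate(t_joint, t_product):
    """Donsker-Varadhan style estimate: E_P[T] - log E_Q[exp(T)]."""
    return sum(t_joint) / len(t_joint) - log_mean_exp(t_product)


def clipped_estimate(t_joint, t_product, tau):
    """Same form, with critic values on product samples clipped to [-tau, tau]."""
    clipped = [min(max(t, -tau), tau) for t in t_product]
    return sum(t_joint) / len(t_joint) - log_mean_exp(clipped)


if __name__ == "__main__":
    # Hypothetical critic outputs: one outlier on the product samples
    # dominates the unclipped log partition estimate.
    t_joint = [2.0, 2.0]
    t_product = [0.0, 0.0, 0.0, 20.0]
    print("unclipped:", dv_estimate(t_joint, t_product))
    print("clipped, tau=5:", clipped_estimate(t_joint, t_product, tau=5.0))
```

With a large `tau` the clipped estimate coincides with the unclipped one; a smaller `tau` bounds the influence of any single sample, at the cost of added bias and of no longer guaranteeing a lower bound on MI.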

Code & Models

Repositories

ermongroup/smile-mi-estimator (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning