A Benchmark Suite for Evaluating Neural Mutual Information Estimators on Unstructured Datasets
Kyungeun Lee, Wonjong Rhee

TL;DR
This paper introduces a benchmark suite for evaluating neural mutual information (MI) estimators on complex, unstructured data such as images and text, addressing a limitation of prior evaluations, which relied on simple, analytically tractable datasets.
Contribution
It presents a benchmark suite for assessing neural MI estimators on real-world unstructured data, using same-class sampling for positive pairing and a binary symmetric channel trick to manipulate the true MI values.
Findings
Reveals estimator reliability issues on unstructured data
Proposes methods to control true MI in real datasets
Evaluates seven challenging scenarios for neural MI estimation
Abstract
Mutual Information (MI) is a fundamental metric for quantifying dependency between two random variables. When we can access only the samples, but not the underlying distribution functions, we can evaluate MI using sample-based estimators. Assessment of such MI estimators, however, has almost always relied on analytical datasets including Gaussian multivariates. Such datasets allow analytical calculations of the true MI values, but they are limited in that they do not reflect the complexities of real-world datasets. This study introduces a comprehensive benchmark suite for evaluating neural MI estimators on unstructured datasets, specifically focusing on images and texts. By leveraging same-class sampling for positive pairing and introducing a binary symmetric channel trick, we show that we can accurately manipulate true MI values of real-world datasets. Using the benchmark suite, we…
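As an illustration of why a binary symmetric channel (BSC) makes the true MI controllable, recall the textbook result that a uniform bit sent through a BSC with flip probability p shares 1 - H_b(p) bits of information with the channel output, where H_b is the binary entropy. The sketch below is a minimal example under that assumption only; it is not the paper's construction, and the function names (`binary_entropy`, `bsc_mutual_information`, `bsc_flip`) are hypothetical.

```python
import math
import random


def binary_entropy(p: float) -> float:
    """Binary entropy H_b(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_mutual_information(p: float, n_bits: int = 1) -> float:
    """True MI (in bits) between n_bits independent uniform bits and
    their outputs after a BSC with flip probability p."""
    return n_bits * (1.0 - binary_entropy(p))


def bsc_flip(bits, p, rng):
    """Pass a list of 0/1 bits through a BSC: flip each bit with probability p."""
    return [b ^ int(rng.random() < p) for b in bits]


# Sweeping p from 0 to 0.5 moves the true MI from n_bits down to 0.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(8)]
y = bsc_flip(x, 0.1, rng)
true_mi = bsc_mutual_information(0.1, n_bits=8)
```

Because the MI depends only on p and the number of bits, choosing p sets the ground-truth value that an estimator's output can be compared against.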
Taxonomy
Topics: Neural Networks and Applications
