A Benchmark Suite for Evaluating Neural Mutual Information Estimators on Unstructured Datasets

Kyungeun Lee; Wonjong Rhee

arXiv:2410.10924 · stat.ML · October 16, 2024

TL;DR

This paper introduces a benchmark suite for evaluating neural mutual information estimators on complex, unstructured datasets like images and texts, addressing the limitations of prior evaluations on simple, analytical datasets.

Contribution

The paper presents a benchmark suite for assessing neural MI estimators on real-world unstructured data. It uses same-class sampling for positive pairing and a binary symmetric channel trick to manipulate the true MI values.

Findings

01. Benchmark reveals estimator reliability issues on unstructured data
02. Proposes methods to control true MI in real datasets
03. Evaluates seven challenging scenarios for neural MI estimation

Abstract

Mutual Information (MI) is a fundamental metric for quantifying dependency between two random variables. When we can access only the samples, but not the underlying distribution functions, we can evaluate MI using sample-based estimators. Assessment of such MI estimators, however, has almost always relied on analytical datasets including Gaussian multivariates. Such datasets allow analytical calculations of the true MI values, but they are limited in that they do not reflect the complexities of real-world datasets. This study introduces a comprehensive benchmark suite for evaluating neural MI estimators on unstructured datasets, specifically focusing on images and texts. By leveraging same-class sampling for positive pairing and introducing a binary symmetric channel trick, we show that we can accurately manipulate true MI values of real-world datasets. Using the benchmark suite, we…
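The abstract refers to two settings where the true MI is known in closed form: Gaussian multivariates and a binary symmetric channel. The sketch below illustrates only the textbook formulas for those two cases. It does not reproduce the paper's benchmark, and how the authors apply the channel to images and texts is not described in this excerpt. The function names are hypothetical, and the Gaussian case assumes each coordinate pair is independent of the others and shares the same correlation `rho`.

```python
import math


def gaussian_mi_nats(rho: float, dim: int = 1) -> float:
    """MI in nats between X and Y when each of `dim` independent
    coordinate pairs (X_i, Y_i) is bivariate Gaussian with correlation rho.
    Assumed setup for illustration, not taken from the paper."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if dim < 1:
        raise ValueError("dim must be a positive integer")
    # Closed form: I(X; Y) = -(dim / 2) * ln(1 - rho^2)
    return -0.5 * dim * math.log(1.0 - rho ** 2)


def binary_entropy_bits(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) variable."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_mi_bits(flip_prob: float) -> float:
    """MI in bits between a uniform binary X and Y = X XOR N,
    where N ~ Bernoulli(flip_prob) is independent of X."""
    # Closed form for a uniform input: I(X; Y) = 1 - H_b(flip_prob)
    return 1.0 - binary_entropy_bits(flip_prob)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        print(f"rho={rho}: {gaussian_mi_nats(rho, dim=4):.4f} nats")
    for p in (0.0, 0.1, 0.5):
        print(f"flip_prob={p}: {bsc_mi_bits(p):.4f} bits")
```

In the channel case, a flip probability of 0 preserves the full 1 bit of information and a flip probability of 0.5 removes it entirely, so the flip probability acts as a dial on the true MI of the binary variable.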

Code & Models

Repositories

kyungeun-lee/mibenchmark
PyTorch · Official

Videos

A Benchmark Suite for Evaluating Neural Mutual Information Estimators on Unstructured Datasets · SlidesLive

Taxonomy

Topics: Neural Networks and Applications