MIST: Mutual Information Estimation Via Supervised Training
German Gritsai, Megan Richards, Maxime Méloux, Kyunghyun Cho, Maxime Peyrard

TL;DR
This paper introduces MIST, a neural network-based mutual information estimator trained on synthetic data, which outperforms classical methods in accuracy, speed, and flexibility, and can be integrated into larger learning systems.
Contribution
The authors present a fully data-driven, neural network-based MI estimator trained on a large synthetic meta-dataset with known ground-truth MI. A two-dimensional attention scheme handles variable sample sizes and dimensions, and a quantile regression loss provides uncertainty quantification.
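To make the idea of a synthetic training task with known ground-truth MI concrete, here is a minimal sketch. It assumes the simplest possible family, a bivariate Gaussian with correlation rho, whose MI has the closed form -0.5 * ln(1 - rho^2) nats. This family and the names gaussian_mi and sample_task are illustrative assumptions, not the distributions or code used by the authors.

```python
import math
import random


def gaussian_mi(rho: float) -> float:
    """Closed-form MI (in nats) of a bivariate Gaussian with correlation rho."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - rho ** 2)


def sample_task(rho: float, n: int, seed: int = 0):
    """Draw n paired samples (x, y) and return them with the true MI label.

    Hypothetical helper: one (sample, label) pair of the kind a supervised
    MI estimator could be trained on.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        xs.append(z1)
        # Mixing two independent normals gives corr(x, y) = rho.
        ys.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return xs, ys, gaussian_mi(rho)


if __name__ == "__main__":
    x, y, mi = sample_task(rho=0.8, n=500)
    print(f"{len(x)} samples, ground-truth MI = {mi:.4f} nats")
```

A supervised estimator would take (x, y) as input and be trained to predict the returned label. Varying rho, n, and the seed yields many such tasks.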
Findings
Outperforms classical MI estimators across various sample sizes and dimensions.
Provides well-calibrated, reliable confidence intervals faster than existing neural methods.
Enables flexible training for diverse data modalities using normalizing flows.
Abstract
We propose a fully data-driven approach to designing mutual information (MI) estimators. Since any MI estimator is a function of the observed sample from two random variables, we parameterize this function with a neural network (MIST) and train it end-to-end to predict MI values. Training is performed on a large meta-dataset of 625,000 synthetic joint distributions with known ground-truth MI. To handle variable sample sizes and dimensions, we employ a two-dimensional attention scheme ensuring permutation invariance across input samples. To quantify uncertainty, we optimize a quantile regression loss, enabling the estimator to approximate the sampling distribution of MI rather than return a single point estimate. This research program departs from prior work by taking a fully empirical route, trading universal theoretical guarantees for flexibility and efficiency. Empirically, the…
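The quantile regression loss mentioned in the abstract is commonly written as the pinball loss. The sketch below shows that standard form in plain Python. The function names and the choice of quantile levels are assumptions for illustration, and the exact loss used in the paper may differ in detail.

```python
def pinball_loss(y_true: float, y_pred: float, tau: float) -> float:
    """Pinball (quantile) loss for a single prediction at quantile level tau.

    Under-prediction is weighted by tau and over-prediction by (1 - tau),
    so the minimizer in expectation is the tau-quantile of y_true.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(y_true: float, preds_by_tau: dict) -> float:
    """Average the pinball loss over several quantile levels.

    preds_by_tau maps a quantile level (e.g. 0.05, 0.5, 0.95) to the
    predicted value at that level. Hypothetical helper.
    """
    losses = [pinball_loss(y_true, q_hat, tau) for tau, q_hat in preds_by_tau.items()]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # True MI of a task versus three predicted quantiles (made-up numbers).
    print(mean_pinball_loss(0.51, {0.05: 0.40, 0.5: 0.50, 0.95: 0.65}))
```

Predicting several quantile levels at once gives an interval, for example between the 0.05 and 0.95 predictions, instead of a single point estimate.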
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
