TL;DR
This paper introduces GMM-MI, a robust and efficient mutual information estimator based on Gaussian mixture models, to interpret deep learning models by quantifying information in latent representations.
Contribution
We propose GMM-MI, a novel mutual information estimator for both discrete and continuous data. It is computationally efficient, robust to the choice of hyperparameters, and reports the uncertainty on its estimate due to the finite sample size. We validate it on toy and real datasets.
Findings
On toy data with known ground-truth MI, GMM-MI gives accurate estimates when compared against established MI estimators.
It effectively measures disentanglement and relevance in deep representations.
GMM-MI enhances interpretability of deep learning models.
Abstract
We develop the use of mutual information (MI), a well-established metric in information theory, to interpret the inner workings of deep learning models. To accurately estimate MI from a finite number of samples, we present GMM-MI (pronounced Jimmie), an algorithm based on Gaussian mixture models that can be applied to both discrete and continuous settings. GMM-MI is computationally efficient, robust to the choice of hyperparameters and provides the uncertainty on the MI estimate due to the finite sample size. We extensively validate GMM-MI on toy data for which the ground truth MI is known, comparing its performance against established mutual information estimators. We then demonstrate the use of our MI estimator in the context of representation learning, working with synthetic data and physical datasets describing highly non-linear processes. We train deep learning models to…
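The core idea described in the abstract, fitting a Gaussian mixture model to joint samples and reading the MI off the fitted density, can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes scikit-learn's `GaussianMixture`, a fixed number of components chosen by hand, a Monte Carlo average to evaluate the MI integral, and a plain bootstrap as one generic way to obtain a finite-sample uncertainty. The function names `marginal_log_density`, `gmm_mi`, and `bootstrap_mi` are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def marginal_log_density(gmm, values, dim):
    """Log-density of one coordinate under the 1D marginal of a fitted 2D GMM.

    The marginal of a Gaussian mixture is itself a Gaussian mixture with the
    same weights and the corresponding entries of each mean and covariance.
    """
    means = gmm.means_[:, dim]                  # shape (K,)
    variances = gmm.covariances_[:, dim, dim]   # shape (K,)
    diff = values[:, None] - means[None, :]     # shape (N, K)
    log_comp = -0.5 * (np.log(2 * np.pi * variances) + diff**2 / variances)
    return np.logaddexp.reduce(np.log(gmm.weights_) + log_comp, axis=1)


def gmm_mi(xy, n_components=1, n_mc=50_000, seed=0):
    """Estimate I(X;Y) in nats from joint samples xy of shape (N, 2).

    Assumption: n_components is fixed by the caller; no model selection here.
    """
    gmm = GaussianMixture(
        n_components=n_components, covariance_type="full", random_state=seed
    ).fit(xy)
    # Monte Carlo estimate of E[log p(x,y) - log p(x) - log p(y)] under the fit.
    draws, _ = gmm.sample(n_mc)
    log_joint = gmm.score_samples(draws)
    log_px = marginal_log_density(gmm, draws[:, 0], 0)
    log_py = marginal_log_density(gmm, draws[:, 1], 1)
    return float(np.mean(log_joint - log_px - log_py))


def bootstrap_mi(xy, n_bootstrap=20, seed=0, **kwargs):
    """Mean and standard deviation of the MI estimate over bootstrap resamples."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_bootstrap):
        idx = rng.integers(0, len(xy), size=len(xy))  # resample with replacement
        estimates.append(gmm_mi(xy[idx], **kwargs))
    return float(np.mean(estimates)), float(np.std(estimates))


# Toy check: a bivariate Gaussian with correlation rho has a known MI,
# I(X;Y) = -0.5 * ln(1 - rho^2).
rho = 0.6
rng = np.random.default_rng(42)
data = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=2000)
true_mi = -0.5 * np.log(1 - rho**2)
mi_mean, mi_std = bootstrap_mi(data, n_bootstrap=20, n_components=1)
print(f"estimate: {mi_mean:.3f} +/- {mi_std:.3f} nats, ground truth: {true_mi:.3f}")
```

The toy check mirrors the validation strategy in the abstract: the data come from a distribution whose MI is known in closed form, so the estimate and its bootstrap spread can be compared directly against the ground truth. For non-Gaussian data, more than one component would be needed, and choosing that number is outside the scope of this sketch.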
