DiME: Maximizing Mutual Information by a Difference of Matrix-Based Entropies
Oscar Skean, Jhoan Keider Hoyos Osorio, Austin J. Brockmeier, Luis Gonzalo Sanchez Giraldo

TL;DR
This paper introduces DiME, an information-theoretic measure built from matrix-based entropies. It behaves like mutual information, can be estimated from data without explicit assumptions on the underlying distribution, and is useful for representation learning tasks.
Contribution
The paper proposes DiME, a new mutual information estimator using matrix-based entropies that naturally avoids trivial solutions and can be applied to various learning problems.
Findings
DiME effectively estimates mutual information on Gaussian data.
DiME penalizes trivial solutions in maximization tasks.
DiME is demonstrated on latent factor disentanglement and multiview learning.
Abstract
We introduce an information-theoretic quantity with similar properties to mutual information that can be estimated from data without making explicit assumptions on the underlying distribution. This quantity is based on a recently proposed matrix-based entropy that uses the eigenvalues of a normalized Gram matrix to compute an estimate of the eigenvalues of an uncentered covariance operator in a reproducing kernel Hilbert space. We show that a difference of matrix-based entropies (DiME) is well suited for problems involving the maximization of mutual information between random variables. While many methods for such tasks can lead to trivial solutions, DiME naturally penalizes such outcomes. We compare DiME to several baseline estimators of mutual information on a toy Gaussian dataset. We provide examples of use cases for DiME, such as latent factor disentanglement and a multiview…
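The abstract describes the entropy only in words: it is computed from the eigenvalues of a normalized Gram matrix. The sketch below illustrates that idea in NumPy. Several choices are assumptions not stated in the text above: the Gaussian kernel and its bandwidth `sigma`, the Rényi-type entropy of order `alpha`, the Hadamard-product form of the joint entropy, and in particular the `entropy_difference` function, which is one plausible way to form a difference of matrix-based entropies (joint entropy under randomly permuted pairings minus joint entropy of the observed pairing). It should be read as an illustration, not as the paper's exact definition of DiME.

```python
import numpy as np


def gaussian_gram(x, sigma=1.0):
    """Gaussian-kernel Gram matrix for samples in the rows of x (assumed kernel)."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def matrix_entropy(K, alpha=2.0):
    """Renyi-type entropy (order alpha != 1) of the trace-normalized Gram matrix."""
    A = K / np.trace(K)                 # eigenvalues of A sum to one
    eig = np.linalg.eigvalsh(A)         # A is symmetric positive semidefinite
    eig = np.clip(eig, 0.0, None)       # remove tiny negative round-off values
    return np.log2(np.sum(eig ** alpha)) / (1.0 - alpha)


def joint_entropy(Kx, Ky, alpha=2.0):
    """Joint entropy via the elementwise (Hadamard) product of Gram matrices."""
    return matrix_entropy(Kx * Ky, alpha)


def entropy_difference(Kx, Ky, alpha=2.0, n_perm=10, seed=None):
    """Hypothetical difference of entropies: shuffled pairing minus true pairing."""
    rng = np.random.default_rng(seed)
    n = Kx.shape[0]
    paired = joint_entropy(Kx, Ky, alpha)
    shuffled = []
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permuting rows and columns of Ky breaks the pairing between x and y.
        shuffled.append(joint_entropy(Kx, Ky[np.ix_(p, p)], alpha))
    return float(np.mean(shuffled) - paired)
```

Under these assumptions, strongly dependent samples should give a larger value than independent ones, since shuffling destroys structure only when there is structure to destroy. A constant representation produces an all-ones Gram matrix whose entropy is zero, which gives some intuition for how an entropy-based objective can penalize collapsed, trivial solutions.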
Taxonomy
Topics: Statistical Mechanics and Entropy · Neural Networks and Applications · Gaussian Processes and Bayesian Inference
