TL;DR
This paper introduces a new method to accurately estimate and normalize mutual information for multidimensional data, overcoming limitations of existing approaches and enabling better interpretation of correlations in complex datasets.
Contribution
A novel entropy-based normalization technique for mutual information that is invariant under variable transformations and compatible with k-nearest neighbor estimators.
Findings
Validated on toy models, which demonstrate the accuracy of the estimate
Applied to T4 lysozyme data, where it provides an effective measure of correlation
Provides a numerically efficient algorithm for normalized MI estimation
Abstract
While the linear Pearson correlation coefficient represents a well-established normalized measure to quantify the interrelation of two stochastic variables $X$ and $Y$, it fails for multidimensional variables such as Cartesian coordinates. Avoiding any assumption about the underlying data, the mutual information $I(X, Y)$ does account for multidimensional correlations. However, unlike the normalized Pearson correlation, it has no upper bound ($I \in [0, \infty)$), i.e., it is not clear whether a given value of $I$ corresponds to a low or a high correlation. Moreover, the mutual information (MI) involves the estimation of high-dimensional probability densities (e.g., six-dimensional for Cartesian coordinates), which requires a k-nearest neighbor algorithm, such as the estimator by Kraskov et al. [Phys. Rev. E 69, 066138 (2004)]. As existing methods to normalize the MI cannot be used in connection…
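To make the estimation problem concrete, the following is a minimal sketch of the first k-nearest neighbor estimator of Kraskov et al., applied to two three-dimensional variables (a six-dimensional joint space). It is a brute-force, standard-library illustration and not the paper's implementation: the function names, the sample size, the noise level, and the choice k = 4 are assumptions made for the example. It returns the unnormalized MI in nats; the normalization proposed in the paper is not reproduced here.

```python
import random

EULER_GAMMA = 0.5772156649015329


def digamma_table(n_max):
    """psi[m] for integers m = 1..n_max, using psi(1) = -gamma, psi(m+1) = psi(m) + 1/m."""
    psi = [0.0, -EULER_GAMMA]  # psi[0] is an unused placeholder
    for m in range(1, n_max):
        psi.append(psi[m] + 1.0 / m)
    return psi


def chebyshev(a, b):
    """Maximum-norm distance between two equally long coordinate tuples."""
    return max(abs(u - v) for u, v in zip(a, b))


def ksg_mutual_information(xs, ys, k=4):
    """Unnormalized MI (in nats) from the first estimator of Kraskov et al.

    xs, ys: lists of coordinate tuples with one entry per sample.
    Brute-force O(N^2) neighbor search, intended for small N only.
    """
    n = len(xs)
    psi = digamma_table(n)
    total = 0.0
    for i in range(n):
        dx = [chebyshev(xs[i], xs[j]) for j in range(n)]
        dy = [chebyshev(ys[i], ys[j]) for j in range(n)]
        # Distance to the k-th nearest neighbor in the joint space (max norm).
        joint = sorted(max(dx[j], dy[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # Count marginal neighbors strictly closer than that distance.
        nx = sum(1 for j in range(n) if j != i and dx[j] < eps)
        ny = sum(1 for j in range(n) if j != i and dy[j] < eps)
        total += psi[nx + 1] + psi[ny + 1]
    return psi[k] + psi[n] - total / n


def gaussian_vector(dim):
    return tuple(random.gauss(0.0, 1.0) for _ in range(dim))


random.seed(0)
n_samples, dim = 300, 3  # assumed values for this illustration
x = [gaussian_vector(dim) for _ in range(n_samples)]
# Dependent case: y is a noisy copy of x. Independent case: fresh samples.
y_dependent = [tuple(xi + 0.5 * random.gauss(0.0, 1.0) for xi in row) for row in x]
y_independent = [gaussian_vector(dim) for _ in range(n_samples)]

mi_dependent = ksg_mutual_information(x, y_dependent)
mi_independent = ksg_mutual_information(x, y_independent)
print("MI, dependent pair:  ", mi_dependent)
print("MI, independent pair:", mi_independent)
```

The dependent pair yields a clearly larger value than the independent pair, but the raw number has no fixed scale, which is the interpretability problem the abstract describes. For realistic sample sizes, the brute-force neighbor search would normally be replaced by a tree-based search.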
