Approximating mutual information of high-dimensional variables using learned representations

Gokul Gowri; Xiao-Kang Lun; Allon M. Klein; Peng Yin

arXiv:2409.02732 · q-bio.QM · March 6, 2025

TL;DR

This paper introduces latent MI (LMI) approximation, a method that exploits low-dimensional structure in high-dimensional data to estimate mutual information accurately at realistic sample sizes, where existing techniques fail.

Contribution

The authors develop LMI, a novel approach that applies nonparametric MI estimation to learned low-dimensional representations, enabling accurate MI approximation in high-dimensional data with low intrinsic dimensionality.
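The two-step idea described above (compress each variable, then run a nonparametric estimator on the compressed values) can be illustrated with a minimal sketch. This is not the authors' implementation: linear PCA stands in for the learned representation, a brute-force Kraskov-Stögbauer-Grassberger (KSG) k-nearest-neighbor estimator plays the role of the nonparametric estimator, and the synthetic data, the function names (`digamma_int`, `ksg_mi`, `pca_project`), and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def digamma_int(n):
    """Digamma at positive integers: psi(n) = -gamma + sum_{k=1}^{n-1} 1/k."""
    n = np.asarray(n, dtype=int)
    harmonic = np.concatenate([[0.0], np.cumsum(1.0 / np.arange(1, n.max() + 1))])
    return -np.euler_gamma + harmonic[n - 1]


def ksg_mi(x, y, k=3):
    """KSG (algorithm 1) MI estimate in nats, using brute-force distances.

    x and y have shape (n_samples, dim). Memory grows as n_samples**2,
    so this is only meant for small, low-dimensional inputs.
    """
    n = len(x)
    # Pairwise Chebyshev (max-norm) distances within each variable.
    dx = np.abs(x[:, None, :] - x[None, :, :]).max(axis=-1)
    dy = np.abs(y[:, None, :] - y[None, :, :]).max(axis=-1)
    # Exclude each point from its own neighborhood.
    np.fill_diagonal(dx, np.inf)
    np.fill_diagonal(dy, np.inf)
    # Distance to the k-th nearest neighbor in the joint space.
    eps = np.sort(np.maximum(dx, dy), axis=1)[:, k - 1]
    # Count marginal neighbors strictly closer than eps.
    nx = (dx < eps[:, None]).sum(axis=1)
    ny = (dy < eps[:, None]).sum(axis=1)
    return (digamma_int(k) + digamma_int(n)
            - np.mean(digamma_int(nx + 1) + digamma_int(ny + 1)))


def pca_project(data, dim):
    """Project centered data onto its top `dim` principal components."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T


# Synthetic example (an assumption, not data from the paper): two correlated
# 1-D Gaussian latents, each embedded noiselessly in 200 ambient dimensions.
rng = np.random.default_rng(0)
n, rho, ambient = 1000, 0.8, 200
latent = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)
x = latent[:, :1] @ rng.normal(size=(1, ambient))
y = latent[:, 1:] @ rng.normal(size=(1, ambient))

# The embedding is injective, so the true MI equals that of the latents.
true_mi = -0.5 * np.log(1 - rho**2)
estimate = ksg_mi(pca_project(x, 1), pca_project(y, 1), k=3)
print(f"true MI = {true_mi:.3f} nats, estimate = {estimate:.3f} nats")
```

PCA suffices here only because the toy embedding is linear and noiseless; the paper instead learns the representations with a neural model, which this sketch does not attempt to reproduce.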

Findings

01 LMI accurately estimates MI in variables with over 1,000 dimensions.

02 LMI outperforms existing techniques in high-dimensional MI estimation.

03 Application to biological data reveals insights into protein interactions and cell fate transitions.

Abstract

Mutual information (MI) is a general measure of statistical dependence with widespread application across the sciences. However, estimating MI between multi-dimensional variables is challenging because the number of samples necessary to converge to an accurate estimate scales unfavorably with dimensionality. In practice, existing techniques can reliably estimate MI in up to tens of dimensions, but fail in higher dimensions, where sufficient sample sizes are infeasible. Here, we explore the idea that underlying low-dimensional structure in high-dimensional data can be exploited to faithfully approximate MI in high-dimensional settings with realistic sample sizes. We develop a method that we call latent MI (LMI) approximation, which applies a nonparametric MI estimator to low-dimensional representations learned by a simple, theoretically-motivated model architecture. Using several…
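For reference, the standard definition of MI for continuous variables can be written alongside the data processing inequality, which is one reason estimating MI on representations is a sensible approximation: for deterministic maps f and g (hypothetical names for the two encoders, not notation from the paper), the MI between representations never exceeds the MI between the original variables.

```latex
I(X;Y) = \iint p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy,
\qquad
I\big(f(X);\,g(Y)\big) \le I(X;Y).
```

Equality holds when the representations retain all information relevant to the dependence, which is plausible when the data have low intrinsic dimensionality.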

Videos

Approximating mutual information of high-dimensional variables using learned representations · SlidesLive