Measuring dependence powerfully and equitably
Yakir A. Reshef, David N. Reshef, Hilary K. Finucane, Pardis C. Sabeti, Michael M. Mitzenmacher

TL;DR
This paper introduces MIC*, MICe, and TICe, new measures of dependence that improve upon existing methods in terms of equitability, bias-variance trade-offs, and independence testing power, with efficient computation and strong empirical performance.
Contribution
The paper develops and characterizes MIC*, proposes MICe as an efficient estimator of MIC*, and introduces TICe for independence testing, advancing dependence measurement techniques.
Findings
MICe outperforms MIC in bias-variance trade-offs.
MICe and TICe demonstrate good equitability and power in simulations.
The proposed methods are computationally efficient and theoretically justified.
Abstract
Given a high-dimensional data set we often wish to find the strongest relationships within it. A common strategy is to evaluate a measure of dependence on every variable pair and retain the highest-scoring pairs for follow-up. This strategy works well if the statistic used is equitable [Reshef et al. 2015a], i.e., if, for some measure of noise, it assigns similar scores to equally noisy relationships regardless of relationship type (e.g., linear, exponential, periodic). In this paper, we introduce and characterize a population measure of dependence called MIC*. We show three ways that MIC* can be viewed: as the population value of MIC, a highly equitable statistic from [Reshef et al. 2011], as a canonical "smoothing" of mutual information, and as the supremum of an infinite sequence defined in terms of optimal one-dimensional partitions of the marginals of the joint distribution.…
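The screening strategy described in the abstract (score every variable pair, keep the highest-scoring pairs) can be sketched as follows. This is a minimal illustration, not the paper's method: `grid_mi_score` is a hypothetical stand-in that computes plug-in mutual information on a fixed equal-frequency grid and divides by log(bins). It is not MIC*, MICe, or TICe, and it makes no claim to equitability. The function names, the default of 4 bins, and the toy data are all assumptions made for the example.

```python
import itertools
import math
from collections import Counter


def equal_frequency_bins(values, bins):
    """Assign each value a bin index so that bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = min(rank * bins // len(values), bins - 1)
    return labels


def grid_mi_score(x, y, bins=4):
    """Stand-in dependence score (NOT MIC*/MICe): plug-in mutual information
    on a bins-by-bins equal-frequency grid, divided by log(bins)."""
    n = len(x)
    bx = equal_frequency_bins(x, bins)
    by = equal_frequency_bins(y, bins)
    joint = Counter(zip(bx, by))
    px = Counter(bx)
    py = Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        mi += (count / n) * math.log(count * n / (px[i] * py[j]))
    return mi / math.log(bins)


def top_pairs(columns, score=grid_mi_score, k=3):
    """Score every pair of named columns and return the k highest-scoring
    pairs as (score, name_a, name_b) tuples, best first."""
    scored = [
        (score(columns[a], columns[b]), a, b)
        for a, b in itertools.combinations(sorted(columns), 2)
    ]
    scored.sort(reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy data (hypothetical): "y" is a noiseless linear function of "x",
    # while "z" is a scrambled permutation of "x".
    x = list(range(100))
    data = {
        "x": x,
        "y": [2 * v + 1 for v in x],
        "z": [(37 * v) % 100 for v in x],
    }
    for s, a, b in top_pairs(data, k=3):
        print(f"{a}-{b}: {s:.3f}")
```

Any function with the same `(x, y) -> float` signature can be passed as `score`, so the stand-in can be swapped for an actual equitable statistic without changing the screening loop.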
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Statistical Methods and Inference · Sensory Analysis and Statistical Methods
