Discovering general multidimensional associations
Ben Murrell, Daniel Murrell, Hugh Murrell

TL;DR
This paper introduces a new method for estimating the strength of multidimensional associations, which is more equitable, powerful, and scalable than existing measures like MIC, especially in higher dimensions and with covariates.
Contribution
The authors propose a generalized $R^2$ estimator for unknown relationships that outperforms MIC in power, convergence, and extends to multivariate and covariate-controlled scenarios.
Findings
The proposed estimator is more equitable than MIC.
It has higher power than MIC to detect associations.
It converges faster than MIC as the sample size grows.
Abstract
When two variables are related by a known function, the coefficient of determination (denoted $R^2$) measures the proportion of the total variance in the observations that is explained by that function. This quantifies the strength of the relationship between variables by describing what proportion of the variance is signal as opposed to noise. For linear relationships, this is equal to the square of the correlation coefficient. When the parametric form of the relationship is unknown, however, it is unclear how to estimate the proportion of explained variance equitably, that is, assigning similar values to equally noisy relationships. Here we demonstrate how to directly estimate a generalized $R^2$ when the form of the relationship is unknown, and we question the performance of the Maximal Information Coefficient (MIC), a recently proposed information-theoretic measure of dependence.…
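The definition of $R^2$ in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the authors' estimator: the helper names (`r_squared`, `pearson_r`, `ols_fit`), the simulated data, and the noise level are all assumptions chosen for the example. It checks that for a least-squares line $R^2$ equals the squared Pearson correlation, and that for a known nonlinear function $R^2$ can be high even when the linear correlation is near zero.

```python
import math
import random


def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def _centered_sums(x, y):
    """Return (Sxy, Sxx, Syy, mean_x, mean_y) for two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy, sxx, syy, mx, my


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    sxy, sxx, syy, _, _ = _centered_sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def ols_fit(x, y):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    sxy, sxx, _, mx, my = _centered_sums(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


rng = random.Random(0)  # fixed seed so the run is reproducible
x = [rng.uniform(0.0, 1.0) for _ in range(500)]

# Case 1: linear relationship with Gaussian noise (assumed sd = 0.3).
y_lin = [2.0 * xi + 1.0 + rng.gauss(0.0, 0.3) for xi in x]
slope, intercept = ols_fit(x, y_lin)
r2_lin = r_squared(y_lin, [slope * xi + intercept for xi in x])
r_lin = pearson_r(x, y_lin)

# Case 2: known nonlinear function with the same noise level.
def f(xi):
    return math.cos(2.0 * math.pi * xi)

y_cos = [f(xi) + rng.gauss(0.0, 0.3) for xi in x]
r2_known = r_squared(y_cos, [f(xi) for xi in x])
r_cos = pearson_r(x, y_cos)

print("linear:    R^2 =", r2_lin, " r^2 =", r_lin ** 2)
print("nonlinear: R^2 (known f) =", r2_known, " r^2 =", r_cos ** 2)
```

In the second case the squared linear correlation badly understates the proportion of explained variance. That gap is the situation the abstract describes: when the functional form is unknown, $R^2$ cannot be computed this way.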
Taxonomy
Topics: Data Mining Algorithms and Applications · Bayesian Modeling and Causal Inference · Data Management and Algorithms
