How to quantify direct correlations between variables
Shengjun Wu and Jeffery Wu

TL;DR
This paper reviews existing measures for quantifying direct correlations between variables, as distinct from indirect correlations, and proposes new ones, with applications to real datasets and a decision-making model.
Contribution
It introduces Jensen-Shannon-based regularized measures for direct correlation, addressing limitations of Kullback-Leibler-based measures and analyzing their properties.
Findings
Proposed measures are bounded in [0,1], avoiding KL divergence singularities.
Upper bounds of measures depend on alphabet size, aiding interpretation.
The measures are illustrated on real datasets and a toy model, with confidence intervals.
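The boundedness claim above can be checked numerically. The following is a minimal sketch, not code from the paper: the helper names `kl_divergence` and `js_divergence` are illustrative, and base-2 logarithms are assumed so that the Jensen-Shannon divergence tops out at 1. It compares two distributions with disjoint support, the case where the Kullback-Leibler divergence diverges.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; infinite when q is zero somewhere p is positive."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # the singularity the JS-based measures avoid
            total += pi * math.log2(pi / qi)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; always within [0, 1]."""
    # The mixture m is positive wherever p or q is, so both KL terms are finite.
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Two distributions on a binary alphabet with disjoint support.
p = [1.0, 0.0]
q = [0.0, 1.0]

kl_pq = kl_divergence(p, q)   # diverges
js_pq = js_divergence(p, q)   # reaches the upper bound of 1 bit
js_pp = js_divergence(p, p)   # identical distributions give 0
```

With natural logarithms the upper bound would be ln 2 rather than 1, so the choice of base matters when comparing values against the [0,1] range.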
Abstract
Analyzing correlation between variables is often both the tool and the goal of modern science. A crucial question is whether the correlation between two variables is a direct correlation or only an indirect correlation through a confounder. We review the existing measures of direct correlation and organize them into two families, each corresponding to a systematic construction: (i) removing the direct correlation from the original joint distribution and quantifying the resulting distributional shift, and (ii) intervening on one variable via do-calculus and quantifying how the distribution of the other variable responds. For every Kullback-Leibler-based measure in either family, we propose a Jensen-Shannon-based regularized analogue. Since the square root of the Jensen-Shannon divergence is a bounded metric, the regularized measures take values in [0,1] and are free of the…
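Construction (i) can be sketched on a small example. The code below is an illustration under stated assumptions, not the paper's definition: it assumes a single observed confounder Z and takes "removing the direct correlation" to mean replacing the joint p(x,y,z) with the conditionally independent surrogate q(x,y,z) = p(x,z) p(y,z) / p(z). Under that assumption the Kullback-Leibler shift KL(p || q) equals the conditional mutual information I(X;Y|Z), and the Jensen-Shannon shift is its bounded counterpart. All function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

def surrogate_without_direct_link(joint):
    """Return q(x,y,z) = p(x,z) p(y,z) / p(z) for a joint {(x, y, z): prob}."""
    p_z, p_xz, p_yz = defaultdict(float), defaultdict(float), defaultdict(float)
    for (x, y, z), prob in joint.items():
        p_z[z] += prob
        p_xz[(x, z)] += prob
        p_yz[(y, z)] += prob
    xs = {x for x, _, _ in joint}
    ys = {y for _, y, _ in joint}
    surrogate = {}
    for x, y, z in product(xs, ys, list(p_z)):
        if p_z[z] > 0:
            value = p_xz.get((x, z), 0.0) * p_yz.get((y, z), 0.0) / p_z[z]
            if value > 0:
                surrogate[(x, y, z)] = value
    return surrogate

def kl_bits(p, q):
    """KL(p || q) in bits for dict-valued distributions."""
    total = 0.0
    for key, pv in p.items():
        if pv > 0:
            qv = q.get(key, 0.0)
            if qv == 0:
                return math.inf
            total += pv * math.log2(pv / qv)
    return total

def js_bits(p, q):
    """Jensen-Shannon divergence in bits, bounded by 1."""
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in set(p) | set(q)}
    return 0.5 * kl_bits(p, m) + 0.5 * kl_bits(q, m)

# Case A: X and Y are both copies of Z, so their correlation is purely indirect.
confounded = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
# Case B: Y copies X directly, and Z is an independent fair bit.
direct = {(x, x, z): 0.25 for x in (0, 1) for z in (0, 1)}

kl_confounded = kl_bits(confounded, surrogate_without_direct_link(confounded))
js_confounded = js_bits(confounded, surrogate_without_direct_link(confounded))
kl_direct = kl_bits(direct, surrogate_without_direct_link(direct))
js_direct = js_bits(direct, surrogate_without_direct_link(direct))
```

In case A the surrogate coincides with the original joint, so both shifts vanish. In case B the Kullback-Leibler shift is 1 bit, matching I(X;Y|Z), while the Jensen-Shannon shift lies strictly between 0 and 1.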
