Bias correction for Chatterjee's graph-based correlation coefficient
Mona Azadkia, Leihao Chen, and Fang Han

TL;DR
This paper analyzes the bias in Chatterjee's graph-based correlation coefficient and proposes a bias correction method, enabling root-n consistent and asymptotically normal estimation in various settings.
Contribution
It provides a detailed bias analysis and introduces a correction procedure for Chatterjee's dependence measure, improving its statistical properties.
Findings
The bias term can be asymptotically negligible when the dimension is smaller than four
Bias correction achieves root-n consistency
Estimators are asymptotically normal
Abstract
Azadkia and Chatterjee (2021) recently introduced a simple nearest neighbor (NN) graph-based correlation coefficient that consistently detects both independence and functional dependence. Specifically, it approximates a measure of dependence that equals 0 if and only if the variables are independent, and 1 if and only if they are functionally dependent. However, this NN estimator includes a bias term that may vanish at a rate slower than root-n, preventing root-n consistency in general. In this article, we (i) analyze this bias term closely and show that it could become asymptotically negligible when the dimension is smaller than four; and (ii) propose a bias-correction procedure for more general settings. In both regimes, we obtain estimators (either the original or the bias-corrected version) that are root-n consistent and asymptotically normal.
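To make the NN graph-based coefficient concrete, here is a minimal Python sketch of the original (uncorrected) estimator. The formula is not stated on this page; it is recalled from Azadkia and Chatterjee (2021) and should be treated as an assumption: T_n = sum_i (n min(R_i, R_N(i)) - L_i^2) / sum_i L_i (n - L_i), where R_i counts the y_j <= y_i, L_i counts the y_j >= y_i, and N(i) is the index of the nearest neighbor of z_i. The function name `nn_dependence_coefficient` is hypothetical, nearest neighbors are found by brute force, and ties are broken by lowest index rather than at random. This sketch does not implement the paper's bias-correction procedure.

```python
import numpy as np

def nn_dependence_coefficient(y, z):
    """Sketch of the NN graph-based dependence coefficient of y on z.

    Assumed form (not given in this summary):
        T_n = sum_i (n * min(R_i, R_{N(i)}) - L_i^2) / sum_i L_i * (n - L_i)
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]  # treat a 1-d array as n points in dimension 1
    n = y.shape[0]

    # R_i = #{j : y_j <= y_i},  L_i = #{j : y_j >= y_i}
    R = (y[None, :] <= y[:, None]).sum(axis=1)
    L = (y[None, :] >= y[:, None]).sum(axis=1)

    # Brute-force nearest neighbor of each z_i among the other points (O(n^2)).
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)   # a point is not its own neighbor
    nn = d2.argmin(axis=1)         # ties resolved by lowest index (simplification)

    num = (n * np.minimum(R, R[nn]) - L.astype(float) ** 2).sum()
    den = (L * (n - L)).sum()
    return num / den

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=400)
    print("functional dependence:", nn_dependence_coefficient(z ** 2, z))
    print("independence:         ", nn_dependence_coefficient(rng.normal(size=400), z))
```

With a functional relationship such as y = z^2 the value should be close to 1, and with independent samples it should be close to 0, matching the population behavior described in the abstract.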
Taxonomy
Topics: Statistical Methods and Inference · Data Analysis with R · Advanced Statistical Modeling Techniques
