A Simple Bias Reduction for Chatterjee's Correlation
Christoph Dalitz, Juliane Arning, Steffen Goebbels

TL;DR
This paper introduces a simple normalization technique for Chatterjee's correlation coefficient to reduce bias in small samples, improving accuracy and confidence interval coverage.
Contribution
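The authors propose a new normalization for Chatterjee's correlation coefficient that reduces its bias in small samples and, for larger values of the underlying dependence measure, its mean squared error. They also improve bootstrap confidence interval coverage by replacing the usual n-out-of-n bootstrap with an m-out-of-n bootstrap without replacement.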
Findings
Bias reduction in small samples across various models
Decreased mean squared error for values of ξ greater than about 0.4
Improved bootstrap confidence interval coverage
Abstract
Chatterjee's rank correlation coefficient ξ_n is an empirical index for detecting functional dependencies between two variables X and Y. It is an estimator for a theoretical quantity ξ that is zero for independence and one if Y is a measurable function of X. Based on an equivalent characterization of sorted numbers, we derive an upper bound for ξ_n and suggest a simple normalization aimed at reducing its bias for small sample size n. In Monte Carlo simulations of various models, the normalization reduced the bias in all cases. The mean squared error was reduced, too, for values of ξ greater than about 0.4. Moreover, we observed that non-parametric confidence intervals for ξ based on bootstrapping ξ_n in the usual n-out-of-n way have a coverage probability close to zero. This is remedied by an m-out-of-n bootstrap without replacement in combination with…
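To make the small-sample issue concrete, here is a minimal sketch of the original (unnormalized) coefficient in its standard no-ties form, 1 - 3 Σ|r_{i+1} - r_i| / (n² - 1), where r_i are the ranks of the second variable after sorting the pairs by the first. The function name chatterjee_xi is hypothetical, the sketch assumes there are no ties in either variable, and it does not implement the normalization proposed in the paper.

```python
def chatterjee_xi(x, y):
    """Unnormalized Chatterjee coefficient; assumes no ties in x or y."""
    n = len(x)
    # Order the pairs by increasing x.
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    # Rank of each y value among all y values (1 = smallest).
    rank_of = {v: r for r, v in enumerate(sorted(y_sorted), start=1)}
    ranks = [rank_of[v] for v in y_sorted]
    # Sum of absolute rank differences between neighbours in x-order.
    total = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


# A strictly monotone functional relation with n = 10 points.
xs = list(range(10))
ys = [v ** 3 for v in xs]
print(chatterjee_xi(xs, ys))
```

For this strictly monotone example the rank differences are all one, so the formula gives 1 - 3(n - 1)/(n² - 1) = (n - 2)/(n + 1), which is 8/11 for n = 10. Even a perfect functional dependence therefore yields a value well below one in a small sample, which is the kind of downward bias a normalization is meant to counter.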
Taxonomy
Topics: Advanced Statistical Methods and Models
