A Nonparametric Test of Dependence Based on Ensemble of Decision Trees
Rami Mahdi

TL;DR
This paper introduces a robust, non-parametric dependence measure based on decision tree ensembles that effectively detects complex relationships between variables, even in noisy data, with high computational efficiency and interpretability.
Contribution
It proposes a novel permutation-like dependence coefficient using decision trees, offering a computationally efficient and interpretable alternative to existing methods.
Findings
High power in detecting complex relationships
Effective in noisy data scenarios
Invariant to monotonic transformations
Abstract
In this paper, a robust non-parametric measure of statistical dependence, or correlation, between two random variables is presented. The proposed coefficient is a permutation-like statistic that quantifies how much the observed sample $S_n = \{(X_i, Y_i),\ i = 1, \dots, n\}$ is discriminable from the permutated sample $\hat{S}_{nn} = \{(X_i, Y_j),\ i, j = 1, \dots, n\}$, where the two variables are independent. The extent of discriminability is determined using the predictions for the (interchangeable) leave-out sample from training an aggregate of decision trees to discriminate between the two samples without materializing the permutated sample. The proposed coefficient is computationally efficient, interpretable, invariant to monotonic transformations, and has a well-approximated distribution under independence. Empirical results show the proposed method to have a high power for detecting complex…
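The idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it swaps the decision-tree ensemble for random rectangular grids (each grid cell plays the role of a tree leaf) and swaps the leave-out predictions for a simple half/half split. The names `ranks`, `cell_fractions`, and `dependence_score`, and all parameter defaults, are hypothetical. It does show two points from the abstract: the permutated sample is never built, because the number of permutated pairs in a cell is the product of the cell's marginal counts, and working on ranks makes the score invariant to strictly increasing transformations.

```python
import bisect
import random


def ranks(values):
    """Replace each value by its rank (0..n-1); ties keep input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order):
        out[i] = rank
    return out


def cell_fractions(xs, ys, x_cuts, y_cuts):
    """Return (observed, permuted) fractions for each cell of a grid.

    The permuted fraction is the product of the marginal counts divided
    by n*n, so the n*n permuted pairs are never materialized.
    """
    n = len(xs)
    kx, ky = len(x_cuts) + 1, len(y_cuts) + 1
    joint = [[0] * ky for _ in range(kx)]
    mx, my = [0] * kx, [0] * ky
    for x, y in zip(xs, ys):
        a = bisect.bisect_right(x_cuts, x)
        b = bisect.bisect_right(y_cuts, y)
        joint[a][b] += 1
        mx[a] += 1
        my[b] += 1
    obs = [[joint[a][b] / n for b in range(ky)] for a in range(kx)]
    perm = [[mx[a] * my[b] / (n * n) for b in range(ky)] for a in range(kx)]
    return obs, perm


def dependence_score(x, y, n_partitions=200, n_cuts=3, seed=0):
    """Held-out discriminability of observed vs. permuted pairs.

    Assumption of this sketch (not the paper's estimator): each random
    grid "votes" per cell using one half of the data, and the votes are
    scored on the other half. Near 0 under independence, at most 1.
    """
    rng = random.Random(seed)
    n = len(x)
    rx, ry = ranks(x), ranks(y)  # rank transform gives monotone invariance
    total = 0.0
    for _ in range(n_partitions):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[: n // 2], idx[n // 2:]
        x_cuts = sorted(rng.uniform(0, n) for _ in range(n_cuts))
        y_cuts = sorted(rng.uniform(0, n) for _ in range(n_cuts))
        obs_tr, perm_tr = cell_fractions(
            [rx[i] for i in train], [ry[i] for i in train], x_cuts, y_cuts)
        obs_te, perm_te = cell_fractions(
            [rx[i] for i in test], [ry[i] for i in test], x_cuts, y_cuts)
        for a in range(n_cuts + 1):
            for b in range(n_cuts + 1):
                d_tr = obs_tr[a][b] - perm_tr[a][b]
                sign = (d_tr > 0) - (d_tr < 0)  # cell's vote: +1, 0, or -1
                total += 0.5 * sign * (obs_te[a][b] - perm_te[a][b])
    return total / n_partitions


# Illustrative data: a noisy, non-monotonic relationship vs. independence.
data_rng = random.Random(1)
x = [data_rng.uniform(-1, 1) for _ in range(400)]
y_dep = [v * v + data_rng.gauss(0, 0.05) for v in x]
y_ind = [data_rng.uniform(-1, 1) for _ in x]
print("dependent:  ", dependence_score(x, y_dep))
print("independent:", dependence_score(x, y_ind))
```

On this synthetic data the dependent pair should score clearly above the independent pair, whose score should sit close to zero. A real tree ensemble would choose splits adaptively rather than at random, which is presumably where much of the method's power on complex relationships comes from.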
Taxonomy
Topics: Statistical Methods and Inference · Financial Risk and Volatility Modeling · Rough Sets and Fuzzy Logic
