Asymptotic Theory and Phase Transitions for Variable Importance in Quantile Regression Forests
Tomoshige Nakamura, Hiroshi Shiraishi

TL;DR
This paper develops an asymptotic theory for variable importance in Quantile Regression Forests, revealing a phase transition phenomenon that affects the validity of statistical inference depending on subsampling rates.
Contribution
It introduces the first asymptotic normality results for QRF variable importance and uncovers a phase transition that impacts inference validity based on subsampling size.
Findings
Asymptotic normality of QRF estimator established
Phase transition phenomenon identified at subsampling rate $\beta=1/2$
Bias-variance trade-off affects inference validity
Abstract
Quantile Regression Forests (QRF) are widely used for non-parametric conditional quantile estimation, yet statistical inference for variable importance measures remains challenging due to the non-smoothness of the loss function and the complex bias-variance trade-off. In this paper, we develop an asymptotic theory for variable importance defined as the difference in pinball loss risks. We first establish the asymptotic normality of the QRF estimator by handling the non-differentiable pinball loss via Knight's identity. Second, we uncover a "phase transition" phenomenon governed by the subsampling rate $\beta$. We prove that in the bias-dominated regime, which corresponds to large subsample sizes typically favored in practice to maximize predictive accuracy, standard inference breaks down as the estimator converges to a deterministic bias…
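To make the two ingredients named in the abstract concrete, here is a minimal Python sketch of the pinball loss and a numerical check of Knight's identity in its standard form, $\rho_\tau(u-v)-\rho_\tau(u) = -v\,\psi_\tau(u) + \int_0^v (1\{u \le t\} - 1\{u < 0\})\,dt$ with $\psi_\tau(u) = \tau - 1\{u<0\}$. The function names, the indicator convention at $u=0$, and the sign convention of `importance_estimate` (reduced-model risk minus full-model risk) are assumptions made for illustration; the abstract does not state the paper's exact definitions, and this is not the authors' estimator.

```python
import numpy as np

def pinball_loss(u, tau):
    """Pinball (check) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def knight_rhs(u, v, tau):
    """Right-hand side of Knight's identity for rho_tau(u - v) - rho_tau(u).

    The integral of (1{u <= t} - 1{u < 0}) over t in [0, v] has the closed
    form max(v - u, 0) when u >= 0 and max(u - v, 0) when u < 0.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    psi = tau - (u < 0)  # subgradient-type score of the pinball loss
    remainder = np.where(u >= 0, np.maximum(v - u, 0.0), np.maximum(u - v, 0.0))
    return -v * psi + remainder

def importance_estimate(y, q_full, q_reduced, tau):
    """Hypothetical plug-in importance: empirical pinball risk of quantile
    predictions made without a variable minus the risk of predictions made
    with it (the sign convention is an assumption)."""
    y = np.asarray(y, dtype=float)
    risk_reduced = np.mean(pinball_loss(y - np.asarray(q_reduced, dtype=float), tau))
    risk_full = np.mean(pinball_loss(y - np.asarray(q_full, dtype=float), tau))
    return float(risk_reduced - risk_full)

# Numerical check of the identity on random inputs.
rng = np.random.default_rng(0)
u = rng.normal(size=1000)
v = rng.normal(size=1000)
tau = 0.3
lhs = pinball_loss(u - v, tau) - pinball_loss(u, tau)
identity_holds = bool(np.allclose(lhs, knight_rhs(u, v, tau)))
```

The identity splits the non-smooth loss difference into a term linear in $v$ and a non-negative remainder, which is the kind of decomposition that makes a non-differentiable loss tractable in asymptotic arguments.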
Taxonomy
Topics: Statistical Methods and Inference · Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference
