Decorrelated Variable Importance
Isabella Verdinelli, Larry Wasserman

TL;DR
The paper introduces a modified variable importance measure that accounts for covariate correlation, improving interpretability of black box models, and presents a semiparametric estimation approach for this new measure.
Contribution
It proposes a decorrelated version of LOCO for variable importance and develops a semiparametric estimation method for it.
Findings
The decorrelated importance measure mitigates the effect of correlation between covariates, which is what makes standard LOCO hard to interpret.
The new parameter is difficult to estimate nonparametrically, but it can be estimated using semiparametric models.
The resulting importance values are easier to interpret when covariates are correlated.
Abstract
Because of the widespread use of black box prediction methods such as random forests and neural nets, there is renewed interest in developing methods for quantifying variable importance as part of the broader goal of interpretable prediction. A popular approach is to define a variable importance parameter - known as LOCO (Leave Out COvariates) - based on dropping covariates from a regression model. This is essentially a nonparametric version of R-squared. This parameter is very general and can be estimated nonparametrically, but it can be hard to interpret because it is affected by correlation between covariates. We propose a method for mitigating the effect of correlation by defining a modified version of LOCO. This new parameter is difficult to estimate nonparametrically, but we show how to estimate it using semiparametric models.
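To make the abstract's point about correlation concrete, here is a minimal sketch of a LOCO-style estimate. It assumes a common form of the parameter, the increase in held-out squared error when covariate j is dropped, and uses ordinary least squares as a stand-in for the regression method. The function names (`fit_ols`, `loco`, `simulate`) and the simulated model are illustrative assumptions, not the paper's code, and the sketch does not implement the paper's decorrelated parameter.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns a prediction function."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef

def loco(X, y, j, seed=0):
    """LOCO-style importance of column j: held-out MSE without j minus MSE with j."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    half = len(y) // 2
    train, test = idx[:half], idx[half:]       # sample splitting
    X_drop = np.delete(X, j, axis=1)           # design matrix without covariate j
    full = fit_ols(X[train], y[train])
    reduced = fit_ols(X_drop[train], y[train])
    err_full = np.mean((y[test] - full(X[test])) ** 2)
    err_reduced = np.mean((y[test] - reduced(X_drop[test])) ** 2)
    return err_reduced - err_full

def simulate(rho, n=4000, seed=1):
    """Hypothetical model: Y = X1 + X2 + noise, with corr(X1, X2) = rho."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(np.zeros(2), cov, size=n)
    y = X[:, 0] + X[:, 1] + rng.normal(size=n)
    return X, y

# Same regression coefficient for X1 in both settings; only the correlation changes.
loco_indep = loco(*simulate(rho=0.0), j=0)
loco_corr = loco(*simulate(rho=0.9), j=0)
print(f"LOCO for X1, rho=0.0: {loco_indep:.3f}")
print(f"LOCO for X1, rho=0.9: {loco_corr:.3f}")
```

In this linear-Gaussian setup the population value of the quantity works out to 1 - rho^2, so the estimate for X1 shrinks as X2 becomes a better proxy for it, even though the coefficient on X1 is unchanged. This is the interpretability problem the abstract attributes to correlation between covariates.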
Taxonomy
Topics: Statistical Methods and Inference · Explainable Artificial Intelligence (XAI) · Gaussian Processes and Bayesian Inference
