Hierarchical Variable Importance with Statistical Control for Medical Data-Based Prediction
Joseph Paillard, Antoine Collas, Denis A. Engemann, Bertrand Thirion

TL;DR
This paper introduces Hierarchical-CPI, a new model-agnostic variable importance method for medical data that effectively handles correlated variables and provides family-wise error control, demonstrated on neuroimaging datasets.
Contribution
Hierarchical-CPI is a novel hierarchical approach for variable importance that improves power in correlated data and includes explicit error rate control.
Findings
Outperforms existing variable importance methods in benchmarks.
Identifies biologically plausible variables in neuroimaging datasets.
Effective in dementia classification and EEG analysis.
Abstract
Recent advances in machine learning have greatly expanded the repertoire of predictive methods for medical imaging. However, the interpretability of complex models remains a challenge, which limits their utility in medical applications. Recently, model-agnostic methods have been proposed to measure conditional variable importance and accommodate complex non-linear models. However, they often lack power when dealing with highly correlated data, a common problem in medical imaging. We introduce Hierarchical-CPI, a model-agnostic variable importance measure that frames the inference problem as the discovery of groups of variables that are jointly predictive of the outcome. By exploring subgroups along a hierarchical tree, it remains computationally tractable, yet also enjoys explicit family-wise error rate control. Moreover, we address the issue of vanishing conditional importance under…
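To make the idea of group-level conditional importance concrete, here is a minimal sketch of a conditional permutation importance score computed for a group of variables. It is not the authors' implementation and does not include the hierarchical tree search or the error-rate control described in the abstract. The function name `group_cpi`, the use of ordinary least squares for both the predictor and the conditional model, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def group_cpi(X_train, y_train, X_test, y_test, group, n_perm=50, seed=0):
    """Conditional permutation importance of the columns in `group`.

    Hypothetical helper for illustration only; least squares stands in
    for whatever predictive and conditional models one would really use.
    """
    rng = np.random.default_rng(seed)
    group = np.asarray(group)
    rest = np.setdiff1d(np.arange(X_train.shape[1]), group)

    def add_intercept(A):
        return np.column_stack([np.ones(len(A)), A])

    # Predictive model fitted on the training split.
    beta, *_ = np.linalg.lstsq(add_intercept(X_train), y_train, rcond=None)

    def predict(A):
        return add_intercept(A) @ beta

    # Conditional model: explain the group's columns from the other columns.
    gamma, *_ = np.linalg.lstsq(
        add_intercept(X_train[:, rest]), X_train[:, group], rcond=None
    )
    fitted = add_intercept(X_test[:, rest]) @ gamma
    resid = X_test[:, group] - fitted

    base_loss = np.mean((y_test - predict(X_test)) ** 2)
    gains = []
    for _ in range(n_perm):
        X_tilde = X_test.copy()
        # Shuffle only the part of the group not explained by the other
        # columns, keeping the rows of the residual matrix intact.
        X_tilde[:, group] = fitted + resid[rng.permutation(len(resid))]
        gains.append(np.mean((y_test - predict(X_tilde)) ** 2) - base_loss)
    return float(np.mean(gains))


# Synthetic example (assumed data): columns 0 and 1 are near-duplicates
# that both drive y; column 2 is pure noise.
rng = np.random.default_rng(0)
n = 400
z = rng.normal(size=n)
X = np.column_stack([
    z + 0.1 * rng.normal(size=n),
    z + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
])
y = X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=n)
X_tr, X_te, y_tr, y_te = X[:200], X[200:], y[:200], y[200:]

imp_single = group_cpi(X_tr, y_tr, X_te, y_te, [0])    # one correlated column
imp_group = group_cpi(X_tr, y_tr, X_te, y_te, [0, 1])  # both columns jointly
imp_noise = group_cpi(X_tr, y_tr, X_te, y_te, [2])     # irrelevant column
```

In this toy setting, conditioning column 0 on its near-duplicate leaves very little residual to permute, so its individual score stays small even though the variable is predictive. Scoring columns 0 and 1 as a group recovers their joint contribution, which is the kind of situation that motivates testing groups of correlated variables.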
