Conditional Feature Importance for Mixed Data
Kristin Blesch, David S. Watson, Marvin N. Wright

TL;DR
This paper introduces a new method for measuring conditional feature importance in mixed data that accounts for feature dependencies and improves interpretability in machine learning models.
Contribution
It combines the conditional predictive impact (CPI) framework with sequential knockoff sampling to enable accurate conditional FI measurement for mixed data with complex dependencies.
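The general idea of a CPI-style test can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only continuous features, and instead of sequential knockoffs it uses a simple stand-in (a linear fit of one feature on the other plus shuffled residuals) to draw a surrogate feature that keeps the dependence on the remaining feature. All function names (`fit_ols`, `conditional_surrogate`, `cpi`, `simulate`) and the simulated data are hypothetical and chosen for illustration. The test statistic is the mean per-sample increase in squared-error loss when a feature is replaced by its surrogate, with a one-sided p-value from a normal approximation to the paired t-test.

```python
import math
import random


def simulate(n, rng):
    """Toy data: x2 is correlated with x1, but only x1 drives y."""
    X, y = [], []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        x2 = 0.8 * x1 + 0.6 * rng.gauss(0, 1)
        X.append((x1, x2))
        y.append(2.0 * x1 + rng.gauss(0, 1))
    return X, y


def fit_ols(X, y):
    """Least squares for two features without intercept (2x2 normal equations)."""
    s11 = sum(a * a for a, _ in X)
    s12 = sum(a * b for a, b in X)
    s22 = sum(b * b for _, b in X)
    t1 = sum(a * yi for (a, _), yi in zip(X, y))
    t2 = sum(b * yi for (_, b), yi in zip(X, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


def predict(beta, row):
    return beta[0] * row[0] + beta[1] * row[1]


def conditional_surrogate(X_train, X_test, j, rng):
    """Stand-in for a knockoff sampler (assumption, NOT sequential knockoffs):
    replace feature j by its linear fit on the other feature plus a shuffled residual."""
    k = 1 - j
    slope = sum(r[j] * r[k] for r in X_train) / sum(r[k] ** 2 for r in X_train)
    residuals = [r[j] - slope * r[k] for r in X_test]
    rng.shuffle(residuals)
    surrogate = []
    for row, res in zip(X_test, residuals):
        new_row = list(row)
        new_row[j] = slope * row[k] + res
        surrogate.append(tuple(new_row))
    return surrogate


def cpi(beta, X_test, y_test, X_tilde):
    """Mean per-sample loss increase and one-sided p-value (normal approximation)."""
    deltas = [
        (yi - predict(beta, xt)) ** 2 - (yi - predict(beta, x)) ** 2
        for x, xt, yi in zip(X_test, X_tilde, y_test)
    ]
    n = len(deltas)
    mean = sum(deltas) / n
    var = sum((d - mean) ** 2 for d in deltas) / (n - 1)
    t_stat = mean / math.sqrt(var / n)
    p_value = 0.5 * math.erfc(t_stat / math.sqrt(2))  # P(Z > t_stat)
    return mean, p_value


rng = random.Random(0)
X_train, y_train = simulate(500, rng)
X_test, y_test = simulate(500, rng)
beta = fit_ols(X_train, y_train)

results = {}
for j, name in enumerate(("x1", "x2")):
    X_tilde = conditional_surrogate(X_train, X_test, j, rng)
    results[name] = cpi(beta, X_test, y_test, X_tilde)
    print(name, "CPI = %.3f, p = %.4g" % results[name])
```

In this toy setup, x2 is marginally associated with y through x1 but carries no additional information once x1 is known, so its estimated CPI should be close to zero while that of x1 is clearly positive. Handling categorical features and producing valid knockoffs for mixed data is exactly what the sequential knockoff sampler in the paper is for, and this sketch does not attempt it.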
Findings
Controls type I error effectively
Achieves high statistical power
Aligns with existing conditional FI measures
Abstract
Despite the popularity of feature importance (FI) measures in interpretable machine learning, the statistical adequacy of these methods is rarely discussed. From a statistical perspective, a major distinction is between analyzing a variable's importance before and after adjusting for covariates - i.e., between marginal and conditional measures. Our work draws attention to this rarely acknowledged, yet crucial distinction and showcases its implications. Further, we reveal that for testing conditional FI, only a few methods are available and practitioners have hitherto been severely restricted in method application due to mismatching data requirements. Most real-world data exhibits complex feature dependencies and incorporates both continuous and categorical data (mixed data). Both properties are oftentimes neglected by conditional FI measures. To fill this gap, we…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Statistical Methods and Inference
