TRIP: A Nonparametric Test to Diagnose Biased Feature Importance Scores
Aaron Foote, Danny Krizanc

TL;DR
TRIP is a statistical test designed to identify unreliable feature importance scores in machine learning models, especially when permutation methods are misleading due to feature dependencies or model extrapolation.
Contribution
This work introduces TRIP, a minimal-assumption test that detects unreliable permutation feature importance scores and can also be applied to high-dimensional data.
Findings
TRIP effectively detects unreliable importance scores in simulated data.
The test performs well in real-world applications to identify misleading feature importances.
TRIP can be integrated with existing methods to improve interpretability reliability.
Abstract
Along with accurate prediction, understanding the contribution of each feature to the making of the prediction, i.e., the importance of the feature, is a desirable and arguably necessary component of a machine learning model. For a complex model such as a random forest, such importances are not innate -- as they are, e.g., with linear regression. Efficient methods have been created to provide such capabilities, with one of the most popular among them being permutation feature importance due to its efficiency, model-agnostic nature, and perceived intuitiveness. However, permutation feature importance has been shown to be misleading in the presence of dependent features as a result of the creation of unrealistic observations when permuting the dependent features. In this work, we develop TRIP (Test for Reliable Interpretation via Permutation), a test requiring minimal assumptions that is…
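The failure mode described in the abstract can be illustrated with a minimal sketch. This is not the TRIP test itself. It only shows permutation feature importance and how shuffling one of two dependent features produces observations unlike any in the original data. The synthetic data, the least-squares stand-in model, and the helper name `permutation_importance` are all assumptions made for illustration.

```python
import numpy as np

def permutation_importance(predict, X, y, col, rng, n_repeats=10):
    """Mean increase in MSE after shuffling column `col` of X (hypothetical helper)."""
    base = np.mean((predict(X) - y) ** 2)
    increases = []
    for _ in range(n_repeats):
        Xp = X.copy()
        Xp[:, col] = rng.permutation(Xp[:, col])  # break the link between col and y
        increases.append(np.mean((predict(Xp) - y) ** 2) - base)
    return float(np.mean(increases))

rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)      # x2 is strongly dependent on x1
X = np.column_stack([x1, x2])
y = x1 + rng.normal(scale=0.1, size=n)  # only x1 drives the response

# Ordinary least squares as a simple stand-in for "the model" (assumption).
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda A: A @ coef

imp0 = permutation_importance(predict, X, y, col=0, rng=rng)

# In the original data x1 and x2 are always close; after permuting x1 they are not,
# so the model is evaluated on combinations it never saw during fitting.
gap_before = np.abs(X[:, 0] - X[:, 1]).max()
Xp = X.copy()
Xp[:, 0] = rng.permutation(Xp[:, 0])
gap_after = np.abs(Xp[:, 0] - Xp[:, 1]).max()

print(f"importance of x1: {imp0:.3f}")
print(f"max |x1 - x2| before: {gap_before:.2f}, after permuting x1: {gap_after:.2f}")
```

A linear model extrapolates predictably, so the scores here stay interpretable. With a flexible model such as a random forest, predictions on these off-distribution rows can be arbitrary, and that is the situation in which the abstract says permutation importances become misleading.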
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
