Comparing interpretability and explainability for feature selection

Jack Dunn; Luca Mingardi; Ying Daisy Zhuo

arXiv:2105.05328 · cs.LG · May 13, 2021 · 24 cites

TL;DR

This paper evaluates the effectiveness of different feature importance methods, including SHAP and native importance scores, in identifying relevant features across various machine learning models, highlighting the superior performance of interpretable models.

Contribution

It provides a comparative analysis showing that interpretable models outperform black-box models like XGBoost in feature selection accuracy.

Findings

01

XGBoost struggles to distinguish relevant from irrelevant features.

02

Interpretable models like CART and Optimal Trees perform better in feature selection.

03

For XGBoost, neither SHAP nor native importance scores reliably separate relevant from irrelevant features.

Abstract

A common approach for feature selection is to examine the variable importance scores for a machine learning model, as a way to understand which features are the most relevant for making predictions. Given the significance of feature selection, it is crucial for the calculated importance scores to reflect reality. Falsely overestimating the importance of irrelevant features can lead to false discoveries, while underestimating the importance of relevant features may lead us to discard important features, resulting in poor model performance. Additionally, black-box models like XGBoost provide state-of-the-art predictive performance, but cannot be easily understood by humans, and thus we rely on variable importance scores or methods for explainability like SHAP to offer insight into their behavior. In this paper, we investigate the performance of variable importance as a feature selection…
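
The two failure modes named in the abstract (false discoveries and discarded relevant features) can be made concrete with a small sketch. It assumes importance scores are already available as a name-to-score mapping and that the truly relevant features are known, as they would be in a synthetic experiment. The function names, the threshold rule, and the scores are hypothetical illustrations, not the paper's procedure or results.

```python
from typing import Dict, Set, Tuple


def select_features(importances: Dict[str, float], threshold: float) -> Set[str]:
    """Keep every feature whose importance score is at least `threshold`."""
    return {name for name, score in importances.items() if score >= threshold}


def selection_errors(selected: Set[str], relevant: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (false discoveries, missed features) against a known ground truth."""
    false_discoveries = selected - relevant  # irrelevant features that were kept
    missed = relevant - selected             # relevant features that were discarded
    return false_discoveries, missed


# Hypothetical scores for illustration only; these are not results from the paper.
importances = {"x1": 0.45, "x2": 0.30, "x3": 0.04, "noise1": 0.15, "noise2": 0.06}
relevant = {"x1", "x2", "x3"}  # assumed ground truth, as in a synthetic experiment

selected = select_features(importances, threshold=0.10)
false_discoveries, missed = selection_errors(selected, relevant)
print("selected:", sorted(selected))
print("false discoveries:", sorted(false_discoveries))
print("missed:", sorted(missed))
```

With these made-up scores, an overestimated noise feature is kept and an underestimated relevant feature is dropped, which is the kind of error a feature selection comparison would count. A fixed threshold is only one possible selection rule; keeping the top-k features by score would be an equally reasonable choice.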

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Imbalanced Data Classification Techniques

Methods: Feature Selection · Shapley Additive Explanations