A Simple and Effective Model-Based Variable Importance Measure

Brandon M. Greenwell; Bradley C. Boehmke; Andrew J. McCarthy

arXiv:1805.04755 · stat.ML · May 15, 2018 · 70 cites

TL;DR

This paper introduces a standardized, model-based method for assessing variable importance across various supervised learning algorithms, addressing the challenge of interpretability in complex models.

Contribution

The paper proposes a novel, unified approach to measure predictor importance that applies to diverse models, including those lacking built-in importance metrics.

Findings

01 Effective in simulated data scenarios

02 Applicable to real-world datasets

03 Provides consistent importance measures across algorithms

Abstract

In the era of "big data", it is becoming more of a challenge not only to build state-of-the-art predictive models but also to understand what is really going on in the data. For example, it is often of interest to know which, if any, of the predictors in a fitted model are relatively influential on the predicted outcome. Some modern algorithms, like random forests and gradient boosted decision trees, have a natural way of quantifying the importance or relative influence of each feature. Other algorithms, like naive Bayes classifiers and support vector machines, are not capable of doing so, and model-free approaches are generally used to measure each predictor's importance. In this paper, we propose a standardized, model-based approach to measuring predictor importance across the growing spectrum of supervised learning algorithms. Our proposed method is illustrated through…
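The abstract is truncated before it describes the measure itself. As an illustration only, one model-based construction scores each feature by how much its partial dependence curve varies: a flat curve suggests the fitted model barely responds to that feature. The sketch below assumes that construction; the function names (`partial_dependence`, `pd_importance`), the quantile grid, the toy data, and the least-squares model are all hypothetical choices made for this example, not details taken from the page.

```python
import numpy as np


def partial_dependence(predict, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    pd_values = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite one column, keep the others as observed
        pd_values.append(predict(X_mod).mean())
    return np.array(pd_values)


def pd_importance(predict, X, n_grid=20):
    """Score each feature by the sample standard deviation of its partial dependence curve."""
    scores = []
    for j in range(X.shape[1]):
        # Hypothetical grid choice: evenly spaced quantiles of the observed column.
        grid = np.quantile(X[:, j], np.linspace(0.0, 1.0, n_grid))
        scores.append(partial_dependence(predict, X, j, grid).std(ddof=1))
    return np.array(scores)


# Toy data (assumption): feature 0 matters most, feature 1 a little, feature 2 not at all.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Stand-in fitted model: ordinary least squares with an intercept.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)


def predict(X_new):
    return coef[0] + X_new @ coef[1:]


importance = pd_importance(predict, X)
print(importance)
```

Because `pd_importance` only needs a `predict` callable, the same scoring could in principle be applied to any fitted supervised model, which is the kind of cross-algorithm comparability the abstract asks for.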

Taxonomy

Topics: Data Analysis with R · Statistical Methods and Inference · Machine Learning and Data Classification