The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance

Jon Donnelly; Srikar Katta; Cynthia Rudin; Edward P. Browne

arXiv:2309.13775 · cs.LG · April 3, 2024

TL;DR

This paper introduces a new framework for variable importance that accounts for multiple equally valid models and ensures stability across data variations, improving reliability in high-stakes fields.

Contribution

The authors propose a flexible, stable variable importance method that considers all good models, with theoretical guarantees and successful applications to complex simulations and real-world data.

Findings

01. Recovers variable importance rankings in complex simulations

02. Accurately estimates true variable importance in data distributions

03. Demonstrates utility in real-world gene importance analysis

Abstract

Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset. However, for a given dataset, there may be many models that explain the target outcome equally well; without accounting for all possible explanations, different researchers may arrive at many conflicting yet equally valid conclusions given the same data. Additionally, even when accounting for all possible explanations for a given dataset, these insights may not generalize because not all good explanations are stable across reasonable data perturbations. We propose a new variable importance framework that quantifies the importance of a variable across the set of all good models and is stable across the data distribution. Our framework is extremely…
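The two ideas in the abstract, looking at every near-optimal model rather than one and checking stability under data perturbation, can be illustrated with a toy sketch. This is not the authors' implementation (the official repository listed under Code & Models is the reference). Everything here is an assumption made for illustration: the model class (one-feature rules on binary data), the tolerance `eps`, permutation importance as the metric, bootstrap resampling as the perturbation, and all function names.

```python
import random

def error(model, X, y):
    """Misclassification rate of a one-feature rule (feature index, flip bit)."""
    j, flip = model
    return sum((row[j] ^ flip) != t for row, t in zip(X, y)) / len(y)

def rashomon_set(X, y, eps):
    """All rules whose error is within eps of the best rule on this dataset."""
    models = [(j, flip) for j in range(len(X[0])) for flip in (0, 1)]
    losses = {m: error(m, X, y) for m in models}
    best = min(losses.values())
    return [m for m in models if losses[m] <= best + eps]

def permutation_importance(model, j, X, y, rng):
    """Increase in error after shuffling column j (one possible importance metric)."""
    col = [row[j] for row in X]
    rng.shuffle(col)
    X_perm = [row[:j] + [c] + row[j + 1:] for row, c in zip(X, col)]
    return error(model, X_perm, y) - error(model, X, y)

def importance_distribution(X, y, j, eps=0.05, n_boot=50, seed=0):
    """Weighted (importance, weight) pairs for variable j.

    Each bootstrap resample gets total weight 1/n_boot, split evenly
    across the near-optimal rules found on that resample.
    """
    rng = random.Random(seed)
    n = len(y)
    out = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        good = rashomon_set(Xb, yb, eps)
        w = 1.0 / (n_boot * len(good))
        out.extend((permutation_importance(m, j, Xb, yb, rng), w) for m in good)
    return out

def weighted_mean(dist):
    return sum(v * w for v, w in dist)

# Hypothetical toy data: x1 is a near-copy of x0, x2 is pure noise,
# and y is x0 with 10% label noise.
data_rng = random.Random(1)
X, y = [], []
for _ in range(200):
    x0 = data_rng.randint(0, 1)
    x1 = x0 ^ (data_rng.random() < 0.02)
    x2 = data_rng.randint(0, 1)
    X.append([x0, x1, x2])
    y.append(x0 ^ (data_rng.random() < 0.10))

dist_x0 = importance_distribution(X, y, j=0)
dist_x2 = importance_distribution(X, y, j=2)
print("mean importance of x0:", weighted_mean(dist_x0))
print("mean importance of x2:", weighted_mean(dist_x2))
```

Because `x1` nearly duplicates `x0`, a rule on either feature can be near-optimal, so a single fitted model could report `x0` as crucial or as irrelevant. Collecting importance over all near-optimal rules and over resamples exposes that spread instead of hiding it behind one model's answer.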

Code & Models

Repositories

jdonnelly36/rashomon_importance_distribution
Official

Videos

The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance · slideslive

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Metabolomics and Mass Spectrometry Studies