The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance
Jon Donnelly, Srikar Katta, Cynthia Rudin, Edward P. Browne

TL;DR
This paper introduces a new framework for variable importance that accounts for multiple equally valid models and ensures stability across data variations, improving reliability in high-stakes fields.
Contribution
The authors propose a flexible, stable variable importance method that considers all good models, with theoretical guarantees and successful applications to complex simulations and real-world data.
Findings
Recovers variable importance rankings in complex simulations
Accurately estimates each variable's true importance with respect to the underlying data distribution
Demonstrates utility in real-world gene importance analysis
Abstract
Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset. However, for a given dataset, there may be many models that explain the target outcome equally well; without accounting for all possible explanations, different researchers may arrive at many conflicting yet equally valid conclusions given the same data. Additionally, even when accounting for all possible explanations for a given dataset, these insights may not generalize because not all good explanations are stable across reasonable data perturbations. We propose a new variable importance framework that quantifies the importance of a variable across the set of all good models and is stable across the data distribution. Our framework is extremely…
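To make the idea in the abstract concrete, here is a minimal toy sketch. It is not the authors' algorithm or code. It assumes a tiny hypothetical model class (one-feature classifiers of the form "predict y = x[j]"), uses permutation importance as the importance measure, treats every model within a tolerance `epsilon` of the best loss as "good", and uses bootstrap resampling as a stand-in for data perturbations. All function names, the synthetic data, and the parameter values are illustrative assumptions.

```python
import random

def make_data(n, rng):
    """Toy data: y equals x0; x1 copies x0 (a redundant twin); x2 is noise."""
    rows = []
    for _ in range(n):
        x0 = rng.randint(0, 1)
        x2 = rng.randint(0, 1)
        rows.append(((x0, x0, x2), x0))
    return rows

def loss(model, rows):
    """0-1 loss of the one-feature model 'predict y = x[model]'."""
    return sum(x[model] != y for x, y in rows) / len(rows)

def good_models(rows, n_features, epsilon):
    """All candidate models whose loss is within epsilon of the best one."""
    losses = {j: loss(j, rows) for j in range(n_features)}
    best = min(losses.values())
    return [j for j, value in losses.items() if value <= best + epsilon]

def permutation_importance(model, feature, rows, rng):
    """Increase in the model's loss when column `feature` is shuffled."""
    column = [x[feature] for x, _ in rows]
    rng.shuffle(column)
    shuffled = [
        (x[:feature] + (v,) + x[feature + 1:], y)
        for (x, y), v in zip(rows, column)
    ]
    return loss(model, shuffled) - loss(model, rows)

def importance_distribution(rows, n_features, epsilon, n_boot, rng):
    """Pool importances over every good model on every bootstrap resample."""
    samples = {j: [] for j in range(n_features)}
    for _ in range(n_boot):
        boot = [rng.choice(rows) for _ in rows]  # resample with replacement
        for model in good_models(boot, n_features, epsilon):
            for j in range(n_features):
                samples[j].append(permutation_importance(model, j, boot, rng))
    return samples

rng = random.Random(0)
data = make_data(200, rng)
dist = importance_distribution(data, n_features=3, epsilon=0.05, n_boot=20, rng=rng)
for j, values in dist.items():
    mean = sum(values) / len(values)
    print(f"x{j}: mean={mean:.3f} min={min(values):.3f} max={max(values):.3f}")
```

In this toy setup the models built on x0 and x1 fit equally well, so an analyst who inspects only one of them would call one variable essential and the other irrelevant. Pooling over both good models and over the resamples instead gives each of x0 and x1 a spread of importance values that ranges from zero up to a clearly positive value, while the noise variable x2 stays at zero throughout.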
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Metabolomics and Mass Spectrometry Studies
