RoSHAP: A Distributional Framework and Robust Metric for Stable Feature Attribution
Lanxin Xiang, Liang Shi, Youhui Ye, Boyu Jiang, Dawei Zhou, Feng Guo

TL;DR
RoSHAP introduces a distributional framework and a robust metric for stable feature attribution, enhancing interpretability and reliability in machine learning models by accounting for stochastic variations.
Contribution
The paper proposes a distributional framework and the RoSHAP metric for stable feature attribution. An asymptotic Gaussian result reduces the computational cost of estimating attribution distributions, and the metric makes feature rankings more robust.
Findings
RoSHAP outperforms standard attribution measures in simulations and real data.
Models using RoSHAP-selected features maintain high predictive accuracy with fewer features.
Under mild regularity conditions, the aggregated feature attribution score is asymptotically Gaussian, which enables efficient estimation of its distribution.
Abstract
Feature attribution analysis is critical for interpreting machine learning models and supporting reliable data-driven decisions. However, feature attribution measures often exhibit stochastic variation: different train-test splits, random seeds, or model-fitting procedures can produce substantially different attribution values and feature rankings. This paper proposes a framework for incorporating the stochastic nature of feature attribution and a robust attribution metric, RoSHAP, for stable feature ranking based on the SHAP metric. The proposed framework models the distribution of feature attribution scores and estimates it through bootstrap resampling and kernel density estimation. We show that, under mild regularity conditions, the aggregated feature attribution score is asymptotically Gaussian, which greatly reduces the computational cost of distribution estimation. The RoSHAP…
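The distributional idea in the abstract (bootstrap resampling, then kernel density estimation or a Gaussian approximation of the aggregated score) can be illustrated with a minimal sketch. This is not the authors' implementation and does not compute the RoSHAP metric itself, whose definition is not given here. Everything specific below is an assumption made for illustration: the synthetic data, the ordinary least squares model, the use of the closed-form linear SHAP value phi_j(x) = w_j (x_j - mean_j) (valid for linear models when features are treated as independent), mean |SHAP| as the aggregated score, the normal-reference bandwidth, and the hypothetical helper names.

```python
import numpy as np


def linear_shap_importance(X_train, y_train, X_eval):
    """Mean |SHAP| per feature for an OLS fit.

    Assumption: uses the closed form phi_j(x) = w_j * (x_j - mean_j),
    which holds for linear models when features are treated as independent.
    """
    mu = X_train.mean(axis=0)
    # Centered design matrix plus an intercept column.
    A = np.column_stack([X_train - mu, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    w = coef[:-1]
    phi = w * (X_eval - mu)           # per-sample, per-feature SHAP values
    return np.abs(phi).mean(axis=0)   # aggregated score per feature


def bootstrap_attributions(X, y, n_boot=200, seed=0):
    """Refit on bootstrap resamples and collect the aggregated scores."""
    rng = np.random.default_rng(seed)
    n = len(X)
    scores = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        scores[b] = linear_shap_importance(X[idx], y[idx], X[idx])
    return scores


def gaussian_kde_1d(samples, grid):
    """Gaussian kernel density estimate with a normal-reference bandwidth."""
    n = len(samples)
    h = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))


# Hypothetical synthetic data: feature 0 strong, feature 1 weak, feature 2 noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=300)

scores = bootstrap_attributions(X, y)

# Gaussian summary of each feature's attribution distribution.
mean = scores.mean(axis=0)
std = scores.std(axis=0, ddof=1)

# Nonparametric estimate of the same distribution for feature 0.
grid = np.linspace(scores[:, 0].min() - 3 * std[0],
                   scores[:, 0].max() + 3 * std[0], 400)
density = gaussian_kde_1d(scores[:, 0], grid)

for j in range(X.shape[1]):
    print(f"feature {j}: mean={mean[j]:.3f}, std={std[j]:.3f}")
```

If the Gaussian approximation is adequate, the per-feature mean and standard deviation summarize the attribution distribution without evaluating a density on a grid, which is one way to read the computational saving the abstract mentions.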
