Balanced background and explanation data are needed in explaining deep learning models with SHAP: An empirical study on clinical decision making
Mingxuan Liu, Yilin Ning, Han Yuan, Marcus Eng Hock Ong, Nan Liu

TL;DR
This empirical study demonstrates that balancing background and explanation data in SHAP improves the reliability and interpretability of deep learning model explanations, especially in clinical decision-making contexts.
Contribution
The paper introduces a data balancing strategy for SHAP explanations that mitigates artifacts caused by data imbalance in clinical deep learning models.
Findings
Balancing data reduces explanation artifacts in beeswarm plots.
Balanced data improves variable importance ranking accuracy.
Balancing data enhances SHAP's ability to identify abnormal patient characteristics.
Abstract
Objective: Shapley additive explanations (SHAP) is a popular post-hoc technique for explaining black box models. While the impact of data imbalance on predictive models has been extensively studied, it remains largely unknown with respect to SHAP-based model explanations. This study sought to investigate the effects of data imbalance on SHAP explanations for deep learning models, and to propose a strategy to mitigate these effects. Materials and Methods: We propose to adjust class distributions in the background and explanation data in SHAP when explaining black box models. Our data balancing strategy is to compose background data and explanation data with an equal distribution of classes. To evaluate the effects of data adjustment on model explanation, we propose to use the beeswarm plot as a qualitative tool to identify "abnormal" explanation artifacts, and quantitatively test the…
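The balancing strategy described in the abstract (background and explanation data composed with an equal distribution of classes) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper name `balanced_indices`, the toy data, the sample sizes, and the choice to sample with replacement when a class is too small are all assumptions made here for the example.

```python
import numpy as np

def balanced_indices(y, n_per_class, seed=0):
    """Return shuffled row indices holding n_per_class samples of each class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    picked = []
    for label in np.unique(y):
        pool = np.flatnonzero(y == label)
        # Assumption: fall back to sampling with replacement
        # only when a class has fewer rows than requested.
        replace = pool.size < n_per_class
        picked.append(rng.choice(pool, size=n_per_class, replace=replace))
    idx = np.concatenate(picked)
    rng.shuffle(idx)
    return idx

# Toy imbalanced data set (made up): 950 negatives, 50 positives.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(1000, 5))
y = np.array([0] * 950 + [1] * 50)

# Draw class-balanced background and explanation sets.
# Different seeds give different draws; the two sets may still overlap.
background_idx = balanced_indices(y, n_per_class=40, seed=1)
explain_idx = balanced_indices(y, n_per_class=40, seed=2)
X_background, X_explain = X[background_idx], X[explain_idx]
```

With the `shap` package, `X_background` would typically be passed as the background data when constructing an explainer (for example `shap.DeepExplainer(model, X_background)`), and `X_explain` would be the data whose SHAP values are computed and plotted. Whether the two sets should be disjoint, and how large they should be, is not specified in the text above.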
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare
Methods: Test · Shapley Additive Explanations
