Challenges and Opportunities of Shapley values in a Clinical Context
Lucile Ter-Minassian, Sahra Ghalebikesabi, Karla Diaz-Ordaz, Chris Holmes

TL;DR
This paper discusses the challenges of using Shapley values for interpretability in clinical machine learning, emphasizing the importance of choosing appropriate reference distributions tailored to medical questions.
Contribution
It clarifies misconceptions about Shapley values in medicine and provides guidance on selecting suitable reference distributions for clinical applications.
Findings
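The interpretation of Shapley values depends strongly on the reference distribution used to model feature removal, and this dependence is often misunderstood.
Tailoring the reference distribution to the underlying clinical question improves clinical relevance.
The paper gives guidance on selecting reference distributions for specific medical use cases.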
Abstract
With the adoption of machine learning-based solutions in routine clinical practice, the need for reliable interpretability tools has become pressing. Shapley values provide local explanations, and the method has gained popularity in recent years. Here, we reveal current misconceptions about the "true to the data" or "true to the model" trade-off and demonstrate its importance in a clinical context. We show that the interpretation of Shapley values, which strongly depends on the choice of a reference distribution for modeling feature removal, is often misunderstood. We further advocate that for applications in medicine, the reference distribution should be tailored to the underlying clinical question. Finally, we advise on the right reference distributions for specific medical use cases.
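To make the dependence on the reference distribution concrete, the following is a minimal sketch, not the authors' code. It computes exact Shapley values for a toy model under two common choices: a marginal reference (features outside the coalition are drawn from the whole cohort, often described as "true to the model") and a conditional reference (they are drawn from patients who match on the coalition features, often described as "true to the data"). The cohort `X`, the function `model`, and the helper names are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np

# Hypothetical toy cohort: two correlated binary features per patient.
X = np.array([(0, 0)] * 4 + [(1, 1)] * 4 + [(0, 1)] + [(1, 0)])


def model(data):
    # Assumed model for illustration: the prediction uses feature 0 only.
    return data[:, 0].astype(float)


def v_marginal(S, x):
    # Marginal reference: fix the features in S to the patient's values and
    # average the prediction over the cohort's values for the other features.
    idx = np.array(S, dtype=int)
    Z = X.copy()
    Z[:, idx] = x[idx]
    return model(Z).mean()


def v_conditional(S, x):
    # Conditional reference: average the prediction over cohort members whose
    # features in S match the patient. Assumes at least one matching row.
    idx = np.array(S, dtype=int)
    mask = np.all(X[:, idx] == x[idx], axis=1)
    return model(X[mask]).mean()


def shapley(value_fn, x):
    # Exact Shapley values by enumerating every coalition (small d only).
    d = len(x)
    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for S in combinations(others, size):
                phi[j] += weight * (value_fn(S + (j,), x) - value_fn(S, x))
    return phi


x = np.array([1, 1])  # the patient being explained
phi_marginal = shapley(v_marginal, x)
phi_conditional = shapley(v_conditional, x)
print("marginal reference:   ", phi_marginal)
print("conditional reference:", phi_conditional)
```

Under the marginal reference, feature 1 receives zero attribution because the model never reads it. Under the conditional reference it receives a share (about 0.15 here, against about 0.35 for feature 0) because it is correlated with feature 0 in the cohort. Both attributions sum to the same gap between the patient's prediction and the cohort average, so which one is appropriate depends on the question being asked rather than on the arithmetic.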
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare · Bayesian Modeling and Causal Inference
