llmSHAP: A Principled Approach to LLM Explainability

Filip Naudot; Tobias Sundqvist; Timotheus Kampik

arXiv:2511.01311·cs.AI·November 4, 2025

Open Access

TL;DR

This paper investigates the application of Shapley value-based feature attribution to large language models, analyzing how their stochastic inference affects explainability guarantees and the trade-offs involved.

Contribution

It provides a principled analysis of when Shapley value principles hold in stochastic LLMs and explores the trade-offs between explainability, speed, and accuracy.

Findings

01

Shapley value principles may not always be guaranteed in stochastic LLMs.

02

Trade-offs exist between inference speed, attribution accuracy, and principle satisfaction.

03

Different implementation variants affect the reliability of Shapley-based explanations.
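
For reference, the findings above concern the Shapley value, which in its standard textbook form (the notation here is an assumption and may differ from the paper's) assigns feature i in a feature set N the weighted average of its marginal contributions under a value function v. The efficiency principle is shown as one example of a principle that relies on v returning the same value each time a coalition is evaluated:

```latex
% Standard Shapley value of feature i (textbook form; symbols are assumptions)
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(|N| - |S| - 1)!}{|N|!}
  \bigl( v(S \cup \{i\}) - v(S) \bigr)

% Efficiency: attributions sum to the value of the full feature set
% relative to the empty set
\sum_{i \in N} \phi_i(v) = v(N) - v(\emptyset)
```

The efficiency identity follows from terms v(S) cancelling across the sums. If v is stochastic and the same coalition S yields different values on different calls, that cancellation is no longer guaranteed.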

Abstract

Feature attribution methods help make machine learning-based inference explainable by determining how much one or several features have contributed to a model's output. A particularly popular attribution method is based on the Shapley value from cooperative game theory, a measure that guarantees the satisfaction of several desirable principles, assuming deterministic inference. We apply the Shapley value to feature attribution in large language model (LLM)-based decision support systems, where inference is, by design, stochastic (non-deterministic). We then demonstrate when we can and cannot guarantee Shapley value principle satisfaction across different implementation variants applied to LLM-based decision support, and analyze how the stochastic nature of LLMs affects these guarantees. We also highlight trade-offs between explainable inference speed, agreement with exact Shapley value…
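
To make the role of determinism concrete, here is a minimal sketch of exact Shapley value computation over a small feature set. It is not the authors' implementation: `make_noisy_value_fn` is a hypothetical stand-in for a stochastic LLM-based scoring call, and `cached` is one conceivable variant (assumed here for illustration) that fixes each coalition's value after its first evaluation.

```python
import itertools
import math
import random


def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating all coalitions (exponential cost)."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for k in range(n):
            # Weight of a coalition of size k that excludes feature f.
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                s = frozenset(subset)
                phi[f] += weight * (value_fn(s | {f}) - value_fn(s))
    return phi


def make_noisy_value_fn(seed):
    """Hypothetical stochastic value function standing in for an LLM call."""
    rng = random.Random(seed)

    def value_fn(coalition):
        # Toy score: coalition size plus fresh Gaussian noise on every call.
        return len(coalition) + rng.gauss(0.0, 0.5)

    return value_fn


def cached(value_fn):
    """Wrap a value function so each coalition is evaluated only once."""
    cache = {}

    def wrapper(coalition):
        key = frozenset(coalition)
        if key not in cache:
            cache[key] = value_fn(key)
        return cache[key]

    return wrapper


def efficiency_gap(features, value_fn):
    """Sum of attributions minus (v(N) - v(empty set)); zero if efficiency holds."""
    phi = shapley_values(features, value_fn)
    full, empty = frozenset(features), frozenset()
    return sum(phi.values()) - (value_fn(full) - value_fn(empty))


features = ["a", "b", "c"]
gap_cached = efficiency_gap(features, cached(make_noisy_value_fn(0)))
gap_raw = efficiency_gap(features, make_noisy_value_fn(0))
print("efficiency gap with caching:   ", gap_cached)
print("efficiency gap without caching:", gap_raw)
```

With caching, every coalition has a single fixed value, so the attributions sum to v(N) - v(∅) up to floating-point error. Without caching, each call draws new noise and the gap is generally nonzero. The cached variant needs at most 2^n evaluations for n features, which hints at the speed side of the trade-off the abstract mentions.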

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning