TokenShapley: Token Level Context Attribution with Shapley Value

Yingtai Xiao; Yuqing Zhu; Sirat Samyoun; Wanrong Zhang; Jiachen T. Wang; Jian Du

arXiv:2507.05261 · cs.CL · July 10, 2025

TL;DR

TokenShapley introduces a token-level attribution method for LLMs that combines Shapley values with KNN retrieval, enabling precise attribution of specific keywords in generated responses.

Contribution

The paper presents TokenShapley, a novel approach that enhances token-level attribution accuracy by integrating Shapley values with KNN-based retrieval in LLMs.

Findings

01 TokenShapley outperforms existing methods by 11-23% in attribution accuracy.

02 The method provides fine-grained attribution for specific keywords.

03 Extensive evaluations on four benchmarks validate its effectiveness.

Abstract

Large language models (LLMs) demonstrate strong capabilities in in-context learning, but verifying the correctness of their generated responses remains a challenge. Prior work has explored attribution at the sentence level, but these methods fall short when users seek attribution for specific keywords within the response, such as numbers, years, or names. To address this limitation, we propose TokenShapley, a novel token-level attribution method that combines Shapley value-based data attribution with KNN-based retrieval techniques inspired by recent advances in KNN-augmented LLMs. By leveraging a precomputed datastore for contextual retrieval and computing Shapley values to quantify token importance, TokenShapley provides a fine-grained data attribution approach. Extensive evaluations on four benchmarks show that TokenShapley outperforms state-of-the-art baselines in token-level…
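
The abstract names the two ingredients (Shapley values and KNN retrieval over a datastore) but does not spell out the computation, so the following is only a minimal sketch of a related, known technique: the closed-form Shapley value for an unweighted K-nearest-neighbor utility (Jia et al., 2019). It is not the authors' implementation. The function name `knn_shapley`, the `matches` flags (assumed here to mean "this datastore entry stores the same next token that the model generated"), and the distances are all hypothetical.

```python
from typing import List, Sequence


def knn_shapley(distances: Sequence[float], matches: Sequence[bool], k: int) -> List[float]:
    """Exact Shapley values for an unweighted KNN utility and one query.

    Assumed utility (not taken from the paper): for a subset S of datastore
    entries, v(S) = (1/k) * number of matching entries among the
    min(k, |S|) entries of S nearest to the query.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(distances)
    if n != len(matches):
        raise ValueError("distances and matches must have the same length")
    if n == 0:
        return []

    # Rank entries from nearest to farthest.
    order = sorted(range(n), key=lambda i: distances[i])
    values = [0.0] * n

    # Base case: the farthest entry. This equals match / n when n >= k;
    # max(n, k) also covers datastores smaller than k.
    farthest = order[-1]
    values[farthest] = float(matches[farthest]) / max(n, k)

    # Recursion from the second-farthest entry back to the nearest one.
    # `rank` is the 1-indexed position of the current entry in `order`.
    for rank in range(n - 1, 0, -1):
        cur, nxt = order[rank - 1], order[rank]
        delta = (float(matches[cur]) - float(matches[nxt])) / k
        values[cur] = values[nxt] + delta * min(k, rank) / rank
    return values


if __name__ == "__main__":
    # Three hypothetical datastore entries for a single generated token.
    print(knn_shapley(distances=[0.9, 0.1, 0.5], matches=[True, True, False], k=1))
```

Under these assumptions the values sum to the utility of the full datastore, so each entry's score can be read as its share of the KNN prediction. How TokenShapley aggregates such scores back to context tokens is not stated in the truncated abstract and is left out here.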

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Computational and Text Analysis Methods