Dynamic Top-k Estimation Consolidates Disagreement between Feature Attribution Methods
Jonathan Kamp, Lisa Beinborn, Antske Fokkens

TL;DR
This paper introduces a dynamic method for selecting the optimal number of tokens to display in feature attribution explanations, improving agreement with human judgments and across attribution methods.
Contribution
It proposes a novel dynamic approach that uses sequential properties of attribution scores to determine the optimal number of tokens, addressing sentence length bias and method disagreement.
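The summary does not say which sequential properties of the attribution scores determine k. As a minimal sketch, assume purely for illustration that k is the number of local maxima in a sentence's score sequence. The function names `count_local_maxima` and `dynamic_top_k` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def count_local_maxima(scores: Sequence[float]) -> int:
    """Count positions whose score is strictly greater than each existing neighbour.

    ASSUMPTION: local maxima stand in for the "sequential properties"
    mentioned in the text; the paper's actual criterion may differ.
    """
    n = len(scores)
    count = 0
    for i, s in enumerate(scores):
        left_ok = i == 0 or s > scores[i - 1]
        right_ok = i == n - 1 or s > scores[i + 1]
        if left_ok and right_ok:
            count += 1
    return count


def dynamic_top_k(scores: Sequence[float]) -> List[int]:
    """Return the indices of the k highest-scoring tokens, in sentence order.

    k is chosen per sentence from the score sequence itself, so it needs no
    method-specific threshold and can grow with sentence length.
    """
    if not scores:
        return []
    k = max(1, count_local_maxima(scores))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical attribution scores for a six-token sentence.
example_scores = [0.1, 0.8, 0.2, 0.05, 0.6, 0.3]
selected = dynamic_top_k(example_scores)
```

Because k is read off the scores of each sentence, the same function can be applied to the output of any attribution method, which is what makes such an approach method-agnostic.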
Findings
Perturbation-based methods and Vanilla Gradient show the highest agreement on most method–method and method–human metrics when k is fixed.
Dynamic k mainly improves agreement for Integrated Gradient and GradientXInput, which removes the advantage of the other methods.
Sequential properties of attribution scores are informative for consolidating attribution signals.
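To make the fixed-k comparison concrete, here is a minimal sketch of how agreement between two attribution methods on one sentence could be measured. It assumes Jaccard overlap of the top-k token sets as the agreement score; the summary does not name the metrics the paper uses, and `top_k_indices`, `jaccard`, and `fixed_k_agreement` are hypothetical names.

```python
from typing import Sequence, Set


def top_k_indices(scores: Sequence[float], k: int) -> Set[int]:
    """Indices of the k highest-scoring tokens (ties broken by position)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Size of the intersection over size of the union; 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def fixed_k_agreement(
    scores_a: Sequence[float], scores_b: Sequence[float], k: int
) -> float:
    """ASSUMPTION: agreement is the Jaccard overlap of the two top-k sets."""
    return jaccard(top_k_indices(scores_a, k), top_k_indices(scores_b, k))


# Hypothetical scores from two methods for the same four-token sentence.
method_a = [0.9, 0.1, 0.5, 0.2]
method_b = [0.8, 0.6, 0.1, 0.3]
agreement_at_2 = fixed_k_agreement(method_a, method_b, k=2)
agreement_at_4 = fixed_k_agreement(method_a, method_b, k=4)
```

With k equal to the sentence length the two sets are identical by construction, so a fixed k covers a different share of tokens in short and long sentences. This is one way to see the sentence length bias that a per-sentence k is meant to address.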
Abstract
Feature attribution scores are used for explaining the prediction of a text classifier to users by highlighting a k number of tokens. In this work, we propose a way to determine the number of optimal k tokens that should be displayed from sequential properties of the attribution scores. Our approach is dynamic across sentences, method-agnostic, and deals with sentence length bias. We compare agreement between multiple methods and humans on an NLI task, using fixed k and dynamic k. We find that perturbation-based methods and Vanilla Gradient exhibit highest agreement on most method–method and method–human agreement metrics with a static k. Their advantage over other methods disappears with dynamic ks which mainly improve Integrated Gradient and GradientXInput. To our knowledge, this is the first evidence that sequential properties of attribution scores are informative for consolidating…
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Advanced Graph Neural Networks
