Incorporating Attribution Importance for Improving Faithfulness Metrics
Zhixue Zhao, Nikolaos Aletras

TL;DR
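This paper introduces a soft erasure criterion for the faithfulness metrics used to evaluate feature attribution methods. Instead of removing or retaining whole tokens, it randomly masks parts of each token's vector representation in proportion to that token's attribution importance, so the metric reflects how important each token is rather than treating all selected tokens equally.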
Contribution
The paper proposes a soft erasure criterion for faithfulness metrics, yielding soft-sufficiency and soft-comprehensiveness for evaluating feature attribution methods in NLP tasks.
Findings
Soft-sufficiency and soft-comprehensiveness outperform hard metrics in selecting faithful explanations.
The method is effective across various NLP tasks and attribution methods.
Code is publicly available for reproducibility.
Abstract
Feature attribution methods (FAs) are popular approaches for providing insights into the model reasoning process of making predictions. The more faithful a FA is, the more accurately it reflects which parts of the input are more important for the prediction. Widely used faithfulness metrics, such as sufficiency and comprehensiveness, use a hard erasure criterion, i.e. entirely removing or retaining the top most important tokens ranked by a given FA and observing the changes in predictive likelihood. However, this hard criterion ignores the importance of each individual token, treating them all equally for computing sufficiency and comprehensiveness. In this paper, we propose a simple yet effective soft erasure criterion. Instead of entirely removing or retaining tokens from the input, we randomly mask parts of the token vector representations proportionately to their FA importance.…
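The contrast between the hard and soft criteria described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released code. The function names, the min-max normalization of attribution scores to [0, 1], and the use of an independent Bernoulli draw per embedding element are all assumptions made for illustration. The `keep_important` flag is likewise hypothetical: keeping elements with probability equal to the importance is meant to mirror a sufficiency-style setting, and keeping them with probability one minus the importance a comprehensiveness-style setting.

```python
import numpy as np


def normalize_importance(scores):
    """Min-max scale raw attribution scores to [0, 1] (assumed normalization)."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        # All tokens equally important: fall back to a neutral 0.5.
        return np.full_like(scores, 0.5)
    return (scores - scores.min()) / span


def hard_keep_mask(importance, k):
    """Hard criterion: keep the k top-ranked tokens whole, drop the rest entirely."""
    importance = np.asarray(importance, dtype=float)
    mask = np.zeros(importance.shape[0])
    top = np.argsort(-importance, kind="stable")[:k]
    mask[top] = 1.0
    return mask


def soft_erase(embeddings, importance, keep_important=True, rng=None):
    """Soft criterion (sketch): Bernoulli-mask each embedding element.

    embeddings: array of shape (num_tokens, dim)
    importance: array of shape (num_tokens,), values in [0, 1]
    keep_important=True  keeps an element with probability a_i
    keep_important=False keeps an element with probability 1 - a_i
    """
    rng = np.random.default_rng() if rng is None else rng
    embeddings = np.asarray(embeddings, dtype=float)
    a = np.asarray(importance, dtype=float)
    keep_prob = a if keep_important else 1.0 - a
    # One independent draw per element; rows share their token's probability.
    mask = rng.random(embeddings.shape) < keep_prob[:, None]
    return embeddings * mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(4, 8))                       # 4 tokens, 8-dim vectors
    a = normalize_importance([0.2, 1.5, -0.3, 0.9])     # hypothetical FA scores
    print("hard mask (k=2):", hard_keep_mask(a, k=2))
    print("fraction kept per token (soft):",
          (soft_erase(emb, a, rng=rng) != 0).mean(axis=1))
```

In a full evaluation, the masked representations would be passed through the model and the change in predictive likelihood compared with the unmasked input, as the abstract describes for the hard metrics. That step depends on the model and is omitted here.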
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
Methods: Feedback Alignment
