Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience
George Chrysostomou, Nikolaos Aletras

TL;DR
This paper introduces SaLoss, an auxiliary loss guiding BERT's attention to salient tokens identified by TextRank, significantly improving explanation faithfulness and downstream task performance.
Contribution
It proposes SaLoss, a novel auxiliary loss that aligns BERT's attention with pre-extracted salient information to enhance explanation faithfulness.
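As a rough illustration of what "aligning attention with salient information" could look like as an auxiliary loss, the sketch below adds a weighted divergence term to a task loss. The choice of KL divergence, the weight `lam`, and all function names are assumptions made for this example; the summary above does not specify them.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def joint_loss(task_loss, attention, salience, lam=1.0):
    """Task loss plus a penalty for attention that drifts from salience.

    ASSUMPTION: KL divergence and the weight `lam` are illustrative choices,
    not taken from the paper summary.
    """
    return task_loss + lam * kl_divergence(salience, attention)

# Hypothetical per-token salience scores and a uniform attention distribution.
salience = softmax([2.0, 0.5, 0.1])
uniform_attention = [1 / 3] * 3
penalised = joint_loss(0.7, uniform_attention, salience)
aligned = joint_loss(0.7, salience, salience)
```

When attention already matches the salience distribution the penalty vanishes, so `aligned` equals the task loss alone, while `penalised` is strictly larger.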
Findings
SaLoss improves explanation faithfulness across multiple datasets.
Models with SaLoss outperform vanilla BERT in downstream tasks.
Salient information guides attention to more informative tokens.
Abstract
Pretrained transformer-based models such as BERT have demonstrated state-of-the-art predictive performance when adapted to a range of natural language processing tasks. An open problem is how to improve the faithfulness of explanations (rationales) for the predictions of these models. In this paper, we hypothesize that salient information extracted a priori from the training data can complement the task-specific information learned by the model during fine-tuning on a downstream task. In this way, we aim to help BERT keep assigning importance to informative input tokens when making predictions by proposing SaLoss, an auxiliary loss function that guides the multi-head attention mechanism during training to stay close to salient information extracted a priori using TextRank. Experiments for explanation faithfulness across five datasets show that models trained with SaLoss…
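To make "salient information extracted a priori using TextRank" more concrete, here is a minimal TextRank-style sketch: it builds an undirected co-occurrence graph over tokens and runs PageRank-style updates to score each word. The window size, damping factor, iteration count, and the function name `textrank_salience` are assumptions for illustration, and the sketch scores a single token list rather than reproducing how the authors process their training data.

```python
from collections import defaultdict

def textrank_salience(tokens, window=2, damping=0.85, iterations=50):
    """Score words by PageRank over a co-occurrence graph (TextRank-style).

    ASSUMPTION: window, damping, and iterations are illustrative defaults.
    Returns a dict mapping each distinct word to a score; scores sum to 1.
    """
    if not tokens:
        return {}

    # Link words that appear within `window` positions of each other.
    neighbors = defaultdict(set)
    for i, tok in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != tok:
                neighbors[tok].add(tokens[j])
                neighbors[tokens[j]].add(tok)

    vocab = sorted(set(tokens))
    scores = {w: 1.0 / len(vocab) for w in vocab}

    # Each word receives an equal share of every neighbor's current score.
    for _ in range(iterations):
        updated = {}
        for w in vocab:
            rank = sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            updated[w] = (1 - damping) / len(vocab) + damping * rank
        scores = updated

    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}

# Hypothetical example sentence.
example = "the movie was great the acting was great".split()
salience = textrank_salience(example)
```

The resulting scores form a distribution over words, which is the kind of target an attention-guiding auxiliary loss could be compared against.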
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare
Methods: Attention Is All You Need · Linear Layer · Dropout · Softmax · Residual Connection · Layer Normalization · Dense Connections · Attention Dropout · WordPiece
