Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings

Wei Zhou; Heike Adel; Hendrik Schuff; Ngoc Thang Vu

arXiv:2403.05338 · cs.CL · March 11, 2024 · 1 citation

Open Access

TL;DR

This paper evaluates the quality of attribution scores extracted from prompt-based language models in low-resource settings. It finds that prompting yields more plausible explanations than fine-tuning and that Shapley Value Sampling outperforms attention and Integrated Gradients.

Contribution

It analyzes attribution scores from prompt-based models with respect to plausibility and faithfulness, adds training size as a further dimension of the analysis, and compares explanation methods in low-resource scenarios.

Findings

01

Prompt-based models yield more plausible explanations than fine-tuned models in low-resource settings.

02

Shapley Value Sampling outperforms attention and Integrated Gradients in faithfulness and plausibility.

03

Training size influences the quality of attribution scores.
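
To make the second finding more concrete, the following is a minimal sketch of how Shapley Value Sampling can estimate per-token attribution scores. It is an illustration under stated assumptions, not the paper's implementation: the function names, the `[MASK]` baseline token, the number of sampled permutations, and the toy scoring function standing in for a language model are all hypothetical.

```python
import random


def shapley_value_sampling(tokens, score_fn, num_samples=200,
                           baseline="[MASK]", seed=0):
    """Monte Carlo estimate of per-token Shapley values.

    score_fn maps a list of tokens to a scalar model score, for example
    the probability of the predicted class (assumption: higher = stronger).
    """
    rng = random.Random(seed)
    n = len(tokens)
    attributions = [0.0] * n
    for _ in range(num_samples):
        # Sample a random order in which tokens are revealed.
        order = list(range(n))
        rng.shuffle(order)
        # Start from an input in which every token is replaced by the baseline.
        current = [baseline] * n
        prev_score = score_fn(current)
        for idx in order:
            current[idx] = tokens[idx]
            new_score = score_fn(current)
            # Marginal contribution of this token given the tokens revealed so far.
            attributions[idx] += new_score - prev_score
            prev_score = new_score
    # Average the marginal contributions over all sampled permutations.
    return [a / num_samples for a in attributions]


def toy_score(tokens):
    """Hypothetical stand-in for a model: an additive sentiment score."""
    weights = {"great": 2.0, "not": -1.5}
    return sum(weights.get(t, 0.0) for t in tokens)


tokens = ["the", "movie", "was", "great"]
scores = shapley_value_sampling(tokens, toy_score)
```

Because the toy scorer is additive, the estimate assigns the whole score to "great". With a real model, `score_fn` would wrap a forward pass, and the cost grows with `num_samples` times the input length.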

Abstract

Attribution scores indicate the importance of different input parts and can, thus, explain model behaviour. Currently, prompt-based models are gaining popularity, i.a., due to their easier adaptability in low-resource settings. However, the quality of attribution scores extracted from prompt-based models has not been investigated yet. In this work, we address this topic by analyzing attribution scores extracted from prompt-based models w.r.t. plausibility and faithfulness and comparing them with attribution scores extracted from fine-tuned models and large language models. In contrast to previous work, we introduce training size as another dimension into the analysis. We find that using the prompting paradigm (with either encoder-based or decoder-based models) yields more plausible explanations than fine-tuning the models in low-resource settings and Shapley Value Sampling consistently…
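
The two evaluation criteria named in the abstract can be illustrated with a small sketch. Plausibility is commonly understood as agreement with human-annotated rationales, and faithfulness as how well the scores reflect what the model actually relies on. The metrics below (top-k token F1 and a comprehensiveness-style score drop) are assumptions chosen for illustration and are not necessarily the metrics used in the paper. All function names, the `[MASK]` baseline, and the toy scorer are hypothetical.

```python
def top_k_indices(attributions, k):
    """Indices of the k highest-scoring tokens."""
    ranked = sorted(range(len(attributions)),
                    key=lambda i: attributions[i], reverse=True)
    return ranked[:k]


def plausibility_f1(attributions, human_rationale, k):
    """Token-level F1 between the top-k attributed tokens and the
    token indices that a human annotator marked as the rationale."""
    predicted = set(top_k_indices(attributions, k))
    gold = set(human_rationale)
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def comprehensiveness(tokens, attributions, score_fn, k, baseline="[MASK]"):
    """Drop in model score after masking the top-k attributed tokens.
    A larger drop suggests the attributions point at tokens the model uses."""
    masked = list(tokens)
    for i in top_k_indices(attributions, k):
        masked[i] = baseline
    return score_fn(tokens) - score_fn(masked)


def toy_score(tokens):
    """Hypothetical stand-in for a model: an additive sentiment score."""
    weights = {"great": 2.0, "not": -1.5}
    return sum(weights.get(t, 0.0) for t in tokens)


tokens = ["the", "movie", "was", "great"]
attributions = [0.0, 0.1, 0.0, 2.0]   # hand-written example scores
human_rationale = [3]                  # annotator marked "great"

f1_at_1 = plausibility_f1(attributions, human_rationale, k=1)
drop_at_1 = comprehensiveness(tokens, attributions, toy_score, k=1)
```

In this toy case the top-ranked token matches the annotated rationale and masking it removes the entire score, so both measures are at their maximum. Real evaluations would average such values over a dataset.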

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques