Effective Attention Sheds Light On Interpretability

Kaiser Sun; Ana Marasović

arXiv:2105.08855 · cs.CL · May 20, 2021 · 1 cites

TL;DR

This paper compares standard attention with effective attention, the component of a transformer attention matrix that actually contributes to the model output, using BERT on a subset of the GLUE tasks. The two lead to different interpretations, and the authors recommend effective attention for studying model behavior.

Contribution

The paper analyzes how interpretations based on effective attention differ from those based on standard attention in BERT, and recommends effective attention as the interpretability tool because it is, by design, the part of attention that affects the output.

Findings

01  Interpretations drawn from effective attention differ from those drawn from standard attention.

02  Effective attention is less associated with features tied to language-modeling pretraining, such as the separator token.

03  Effective attention has more potential to illustrate the linguistic features the model captures for solving the end task.

Abstract

An attention matrix of a transformer self-attention sublayer can provably be decomposed into two components and only one of them (effective attention) contributes to the model output. This leads us to ask whether visualizing effective attention gives different conclusions than interpretation of standard attention. Using a subset of the GLUE tasks and BERT, we carry out an analysis to compare the two attention matrices, and show that their interpretations differ. Effective attention is less associated with the features related to the language modeling pretraining such as the separator token, and it has more potential to illustrate linguistic features captured by the model for solving the end-task. Given the found differences, we recommend using effective attention for studying a transformer's behavior since it is more pertinent to the model output by design.
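The abstract does not spell out the decomposition, so the following is only a minimal sketch under a stated assumption: that the non-contributing component of each attention row is the part lying in the left null space of the value matrix, and that effective attention is what remains after removing it. The function name `effective_attention`, the toy shapes, and the random inputs are all hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def effective_attention(attn: np.ndarray, values: np.ndarray) -> np.ndarray:
    """Project each attention row onto the column space of `values`.

    Assumption (not stated in the abstract): the discarded component is the
    part of each row that lies in the left null space of the value matrix,
    i.e. the part that is multiplied to zero by `values`.

    attn:   (n, n) row-stochastic attention matrix for one head
    values: (n, d_v) value vectors for the same head
    """
    # Orthogonal projector onto the column space of `values`: P = V V^+
    projector = values @ np.linalg.pinv(values)
    return attn @ projector


# Toy example: sequence length larger than the head dimension, so the
# left null space of `values` is non-trivial.
rng = np.random.default_rng(0)
n, d_v = 12, 4
scores = rng.normal(size=(n, n))
attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # softmax rows
values = rng.normal(size=(n, d_v))

eff = effective_attention(attn, values)
null_part = attn - eff  # component that the value matrix maps to zero

# Both matrices give the same head output; only their entries differ.
print("max output difference:", np.abs(attn @ values - eff @ values).max())
print("max entry difference: ", np.abs(attn - eff).max())
```

In this sketch the two matrices produce the same head output while their individual entries differ, which is why visualizing one or the other can suggest different interpretations. Note that the projected matrix is generally not row-stochastic and can contain negative entries.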

Code & Models

Repositories

KaiserWhoLearns/Effective-Attention-Interpretability
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Linear Warmup With Linear Decay · Attention Dropout · WordPiece · Weight Decay · Dropout