More Identifiable yet Equally Performant Transformers for Text Classification

Rishabh Bhardwaj; Navonil Majumder; Soujanya Poria; Eduard Hovy

arXiv:2106.01269 · cs.CL · June 3, 2021

TL;DR

This paper investigates the identifiability of attention weights in Transformers for text classification. It uncovers the hidden role of the key vectors, proposes an encoder variant whose attention weights are more identifiable, and validates the variant empirically.

Contribution

It provides a deeper theoretical analysis of attention weight identifiability, uncovers the role of key vectors, and introduces a new encoder variant for more interpretable attention.

Findings

01

Attention weights are more identifiable than previously thought.

02

The proposed encoder variant provides identifiable attention weights for inputs up to a desired length.

03

Empirical results validate the effectiveness of the new variant on text classification tasks.

Abstract

Interpretability is an important aspect of the trustworthiness of a model's predictions. Transformer's predictions are widely explained by the attention weights, i.e., a probability distribution generated at its self-attention unit (head). Current empirical studies provide shreds of evidence that attention weights are not explanations by proving that they are not unique. A recent study showed theoretical justifications to this observation by proving the non-identifiability of attention weights. For a given input to a head and its output, if the attention weights generated in it are unique, we call the weights identifiable. In this work, we provide deeper theoretical analysis and empirical observations on the identifiability of attention weights. Ignored in the previous works, we find the attention weights are more identifiable than we currently perceive by uncovering the hidden role of…
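The definition of identifiability above can be made concrete with a small numerical sketch. It is a simplified illustration and not the paper's analysis or its proposed encoder: it assumes a single head whose output is `A @ V`, with `A` the row-stochastic attention matrix and `V` the value matrix, and it ignores any further constraints the paper studies (such as the role of the key vectors). The names `alternative_attention`, `d_s` (sequence length), and `d_v` (value dimension) are hypothetical and chosen only for this example. When `d_s` is large relative to `d_v`, a second valid attention matrix can yield exactly the same head output, so the weights are not unique.

```python
import numpy as np

def alternative_attention(A, V, scale=0.5):
    """Return A_alt != A with A_alt @ V == A @ V and valid rows, or None if this construction finds none."""
    d_s = V.shape[0]
    # Append a column of ones to V: a perturbation P with P @ [V, 1] = 0
    # leaves the head output unchanged AND keeps every row sum equal to 1.
    M = np.hstack([V, np.ones((d_s, 1))])
    # Left null space of M from the SVD: columns of U beyond rank(M).
    U, S, _ = np.linalg.svd(M, full_matrices=True)
    rank = int(np.sum(S > 1e-10))
    null_basis = U[:, rank:]
    if null_basis.shape[1] == 0:
        return None  # no such perturbation exists for this V
    n = null_basis[:, 0]  # satisfies n @ M == 0
    # Add the same null direction to every row, scaled so all entries stay positive.
    step = scale * A.min() / np.abs(n).max()
    return A + step * n[None, :]

def softmax_rows(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Long input relative to the value dimension (assumed toy sizes).
d_s, d_v = 6, 3
V_long = rng.normal(size=(d_s, d_v))
A_long = softmax_rows(rng.normal(size=(d_s, d_s)))
A_alt = alternative_attention(A_long, V_long)
same_output = np.allclose(A_long @ V_long, A_alt @ V_long)
different_weights = not np.allclose(A_long, A_alt)

# Short input: the same construction finds no alternative weights.
V_short = rng.normal(size=(3, 3))
A_short = softmax_rows(rng.normal(size=(3, 3)))
short_alt = alternative_attention(A_short, V_short)
```

In the long-input case, `A_alt` differs from `A_long`, each of its rows is still a probability distribution, and the head output is unchanged. In the short-input case the stacked matrix has full row rank for generic random values, so the function returns `None`.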

Code & Models

Repositories

declare-lab/identifiable-transformers
PyTorch · Official
