Visualizing Attention in Transformer-Based Language Representation Models
Jesse Vig

TL;DR
This paper introduces an open-source visualization tool for multi-head self-attention in Transformer models, enabling detailed interpretation at multiple levels and demonstrating its use on BERT and GPT-2 for bias detection and pattern analysis.
Contribution
The paper presents an open-source visualization tool that extends earlier work by showing attention in Transformer models at three levels of granularity: the attention-head level, the model level, and the neuron level.
Findings
Effective visualization of attention at multiple levels
Identification of model bias and recurring patterns
Linking neurons to model behavior
Abstract
We present an open-source tool for visualizing multi-head self-attention in Transformer-based language representation models. The tool extends earlier work by visualizing attention at three levels of granularity: the attention-head level, the model level, and the neuron level. We describe how each of these views can help to interpret the model, and we demonstrate the tool on the BERT model and the OpenAI GPT-2 model. We also present three use cases for analyzing GPT-2: detecting model bias, identifying recurring patterns, and linking neurons to model behavior.
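The three levels of granularity named in the abstract can be illustrated with a small NumPy sketch. This is not the tool's code. It assumes standard scaled dot-product attention, uses random arrays in place of real query and key vectors, and omits the causal mask that GPT-2 applies. The function name `attention_views` and the array shapes are hypothetical, chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_views(q, k):
    """Illustrative only. q and k have shape (layers, heads, tokens, head_dim)."""
    head_dim = q.shape[-1]
    # Neuron level (assumed form): the element-wise product of each query
    # vector with each key vector, one value per hidden dimension.
    # Shape: (layers, heads, query_tokens, key_tokens, head_dim)
    neuron = q[..., :, None, :] * k[..., None, :, :]
    # Summing over the hidden dimension gives the dot-product scores.
    scores = neuron.sum(axis=-1) / np.sqrt(head_dim)
    # Head level: attention weights for one head, normalized over key tokens.
    head = softmax(scores, axis=-1)
    # Model level: the same array viewed as a (layers x heads) grid of
    # token-by-token attention maps.
    return neuron, head

# Random stand-ins for real query/key vectors (assumption, not model output).
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 3, 4, 8))  # 2 layers, 3 heads, 4 tokens, 8 dims
k = rng.normal(size=(2, 3, 4, 8))
neuron, head = attention_views(q, k)
print(head[0, 0].round(2))  # attention map for layer 0, head 0
```

Indexing `head[layer, h]` gives a single head's map, while iterating over both leading axes gives the model-wide grid. The `neuron` array shows which hidden dimensions contribute most to each query-key score.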
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)
Methods: Linear Layer · Cosine Annealing · Residual Connection · Attention Dropout · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning · Byte Pair Encoding · Dense Connections · Weight Decay
