Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models
Joseph F DeRose, Jiayao Wang, and Matthew Berger

TL;DR
This paper introduces Attention Flows, a visual analytics tool that helps researchers understand and compare attention mechanisms in language models before and after fine-tuning for NLP tasks.
Contribution
The paper presents a novel visualization approach for analyzing attention flow in Transformer models, offering insight into how attention mechanisms change during fine-tuning.
Findings
Attention mechanisms change significantly after fine-tuning.
Visualization reveals how attention focuses on task-relevant words.
Attention flows differ across models and tasks.
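One way to make "attention changes after fine-tuning" concrete is to score each attention head by how far its attention distributions move between the pre-trained and fine-tuned model. The sketch below is illustrative only and is not the comparison measure used by Attention Flows, which this summary does not describe. It assumes attention weights are already available as arrays of shape (layers, heads, tokens, tokens); the function names, the choice of Jensen-Shannon divergence, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (natural log) along the last axis."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log(p / m), axis=-1)
    kl_qm = np.sum(q * np.log(q / m), axis=-1)
    return 0.5 * (kl_pm + kl_qm)


def head_change(attn_pre, attn_ft):
    """Mean per-token divergence for every (layer, head) pair.

    Both inputs have shape (layers, heads, tokens, tokens), and each row
    over the last axis is one query token's attention distribution.
    Returns an array of shape (layers, heads).
    """
    if attn_pre.shape != attn_ft.shape:
        raise ValueError("attention tensors must have the same shape")
    return js_divergence(attn_pre, attn_ft).mean(axis=-1)


# Synthetic stand-in data (hypothetical): 2 layers, 3 heads, 5 tokens.
rng = np.random.default_rng(0)
n_layers, n_heads, n_tokens = 2, 3, 5
attn_pre = softmax(rng.normal(size=(n_layers, n_heads, n_tokens, n_tokens)))

# Pretend fine-tuning altered only the last head of the last layer.
attn_ft = attn_pre.copy()
attn_ft[1, 2] = softmax(rng.normal(size=(n_tokens, n_tokens)))

change = head_change(attn_pre, attn_ft)
most_changed = np.unravel_index(np.argmax(change), change.shape)
```

Here `change` holds one score per head, and `most_changed` is the (layer, head) index with the largest shift. With real models, the two input tensors would come from running the same sentence through both checkpoints and collecting their attention weights.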
Abstract
Advances in language modeling have led to the development of deep attention-based models that are performant across a wide variety of natural language processing (NLP) problems. These language models are typically pre-trained on large unlabeled text corpora and subsequently fine-tuned for specific tasks. Although considerable work has been devoted to understanding the attention mechanisms of pre-trained models, it is less understood how a model's attention mechanisms change when trained for a target NLP task. In this paper, we propose a visual analytics approach to understanding fine-tuning in attention-based language models. Our visualization, Attention Flows, is designed to support users in querying, tracing, and comparing attention within layers, across layers, and amongst attention heads in Transformer-based language models. To help users gain insight on how a…
