Interrogating the Explanatory Power of Attention in Neural Machine Translation

Pooya Moradi, Nishant Kambhatla, and Anoop Sarkar

arXiv:1910.00139·cs.CL·October 2, 2019

TL;DR

This paper evaluates whether attention mechanisms in neural machine translation (NMT) actually explain model decisions. It finds that they cannot reliably do so: counterfactual attention models still reproduce a substantial share of the original output words, particularly function words.

Contribution

The study introduces counterfactual attention models to test the explanatory power of attention in NMT, revealing its limitations.

Findings

01  Compared to a state-of-the-art attention model, counterfactual attention models preserve 68% of function words and 21% of content words on the German-English dataset.

02  Attention models alone are insufficient to reliably explain NMT decisions.

03  Experiments demonstrate the limited explanatory power of attention in NMT.
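
The preservation figures above can be read as per-class match rates between the original translation and the one produced under counterfactual attention. The following is a minimal sketch of such a rate, not the authors' evaluation code. The `FUNCTION_WORDS` set, the `preservation_rates` helper, and the position-by-position comparison are all assumptions made for illustration; the page does not say how the paper classifies or aligns words.

```python
# Hypothetical, tiny function-word list used only for this illustration.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "that", "on"}


def preservation_rates(original_tokens, counterfactual_tokens):
    """Fraction of function and content tokens that stay unchanged.

    Assumption: both sequences are compared position by position, so they
    are expected to have the same length (zip stops at the shorter one).
    """
    counts = {"function": [0, 0], "content": [0, 0]}  # [preserved, total]
    for orig, cf in zip(original_tokens, counterfactual_tokens):
        kind = "function" if orig.lower() in FUNCTION_WORDS else "content"
        counts[kind][1] += 1
        if orig == cf:
            counts[kind][0] += 1
    # Report 0.0 for a class that never occurs, to avoid division by zero.
    return {k: (p / t if t else 0.0) for k, (p, t) in counts.items()}


original = "the cat sat on the mat".split()
counterfactual = "the dog ran on the mat".split()
rates = preservation_rates(original, counterfactual)
print(rates)
```

In this toy pair, all three function words survive while only one of the three content words does, which mirrors the direction (though not the magnitude) of the reported gap.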

Abstract

Attention models have become a crucial component in neural machine translation (NMT). They are often implicitly or explicitly used to justify the model's decision in generating a specific token, but it has not yet been rigorously established to what extent attention is a reliable source of information in NMT. To evaluate the explanatory power of attention for NMT, we examine the possibility of yielding the same prediction but with counterfactual attention models that modify crucial aspects of the trained attention model. Using these counterfactual attention mechanisms, we assess the extent to which they still preserve the generation of function and content words in the translation process. Compared to a state-of-the-art attention model, our counterfactual attention models produce 68% of function words and 21% of content words in our German-English dataset. Our experiments demonstrate that…
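
The core test described in the abstract, replacing the trained attention weights with a counterfactual distribution and checking whether the same token is still predicted, can be sketched on a toy model. Everything below is an assumption for illustration: the two counterfactuals (`uniform_attention`, `swap_max_attention`), the linear `predict` step, and the toy numbers are hypothetical and are not taken from the paper or its repository, which only states that the counterfactual models "modify crucial aspects of the trained attention model".

```python
def context_vector(weights, encoder_states):
    """Attention-weighted sum of encoder states."""
    dim = len(encoder_states[0])
    return [sum(w * h[d] for w, h in zip(weights, encoder_states))
            for d in range(dim)]


def predict(weights, encoder_states, output_matrix):
    """Toy decoder step: score each token as a dot product with the context."""
    ctx = context_vector(weights, encoder_states)
    scores = [sum(r * c for r, c in zip(row, ctx)) for row in output_matrix]
    return max(range(len(scores)), key=scores.__getitem__)


def uniform_attention(weights):
    """Hypothetical counterfactual: spread attention evenly over the source."""
    return [1.0 / len(weights)] * len(weights)


def swap_max_attention(weights):
    """Hypothetical counterfactual: exchange the largest and smallest weights."""
    w = list(weights)
    i_max = max(range(len(w)), key=w.__getitem__)
    i_min = min(range(len(w)), key=w.__getitem__)
    w[i_max], w[i_min] = w[i_min], w[i_max]
    return w


def is_preserved(counterfactual, weights, encoder_states, output_matrix):
    """True if the counterfactual weights yield the same predicted token."""
    original = predict(weights, encoder_states, output_matrix)
    altered = predict(counterfactual(weights), encoder_states, output_matrix)
    return original == altered


# Toy inputs: three source positions, two-dimensional states, two target tokens.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.5]]
attention = [0.7, 0.1, 0.2]
output_matrix = [[1.0, 0.0], [0.0, 1.0]]

kept_uniform = is_preserved(uniform_attention, attention, encoder_states, output_matrix)
kept_swap = is_preserved(swap_max_attention, attention, encoder_states, output_matrix)
print(kept_uniform, kept_swap)
```

With these toy values, uniform attention leaves the prediction unchanged while the swap flips it. A token whose prediction survives such a change is one for which the attention weights offer little explanation, which is the intuition behind counting preserved words.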

Code & Models

Repositories

sfu-natlang/attention_explanation (PyTorch, official)

Datasets

Kylan12/Synthetic-AI-ML-Dataset (dataset, 42 dl)