Do Multilingual Neural Machine Translation Models Contain Language Pair Specific Attention Heads?
Zae Myung Kim, Laurent Besacier, Vassilina Nikoulina, Didier Schwab

TL;DR
This study investigates whether multilingual NMT models contain attention heads dedicated to particular language pairs. It finds that the most important heads are largely the same across language pairs and that roughly one-third of the less important heads can be removed without a significant loss in translation quality.
Contribution
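The paper presents a systematic analysis of the encoder self-attention and encoder-decoder attention heads in a many-to-one multilingual NMT model, showing that language-pair-specific heads are not prominent and that many heads can be pruned with minimal impact on translation quality.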
Findings
The most important attention heads are similar across language pairs.
Approximately one-third of the less important heads can be removed without a significant loss in translation quality.
Attention head importance correlates weakly with language pair specificity.
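The ranking-and-pruning idea behind these findings can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the importance scores, the toy language pairs, and the helper names `top_k`, `top_k_overlap`, and `heads_to_prune` are all hypothetical, and this summary does not say how head importance is actually estimated.

```python
def top_k(importance, k):
    """Return the k most important (layer, head) keys."""
    return set(sorted(importance, key=importance.get, reverse=True)[:k])


def top_k_overlap(imp_a, imp_b, k):
    """Jaccard overlap between the top-k heads of two language pairs."""
    a, b = top_k(imp_a, k), top_k(imp_b, k)
    return len(a & b) / len(a | b)


def heads_to_prune(importance, fraction=1 / 3):
    """Return the least important heads, covering `fraction` of all heads."""
    ranked = sorted(importance, key=importance.get)  # ascending importance
    n_prune = round(len(ranked) * fraction)
    return set(ranked[:n_prune])


# Hypothetical importance scores for a toy model with 2 layers x 3 heads,
# keyed by (layer, head). Real scores would come from measuring the effect
# of each head on translation quality.
fr_en = {(0, 0): 0.90, (0, 1): 0.10, (0, 2): 0.45,
         (1, 0): 0.05, (1, 1): 0.70, (1, 2): 0.30}
de_en = {(0, 0): 0.85, (0, 1): 0.20, (0, 2): 0.40,
         (1, 0): 0.15, (1, 1): 0.75, (1, 2): 0.10}

overlap = top_k_overlap(fr_en, de_en, k=2)  # how similar are the top heads?
pruned = heads_to_prune(fr_en)              # lowest-ranked third of heads
```

A high overlap between the top-k sets of two language pairs would correspond to the first finding, and `pruned` marks the heads one would mask out when testing the second.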
Abstract
Recent studies on the analysis of the multilingual representations focus on identifying whether there is an emergence of language-independent representations, or whether a multilingual model partitions its weights among different languages. While most of such work has been conducted in a "black-box" manner, this paper aims to analyze individual components of a multilingual neural machine translation (NMT) model. In particular, we look at the encoder self-attention and encoder-decoder attention heads (in a many-to-one NMT model) that are more specific to the translation of a certain language pair than others by (1) employing metrics that quantify some aspects of the attention weights such as "variance" or "confidence", and (2) systematically ranking the importance of attention heads with respect to translation quality. Experimental results show that surprisingly, the set of most important…
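To make the attention-weight metrics mentioned in the abstract more concrete, here is a small sketch of one plausible reading of "confidence" and "variance". The exact definitions used in the paper are not given in this summary, so both formulas are assumptions: confidence is taken as the average maximum attention weight per query position, and variance as the average spread of attended positions under each query's attention distribution. The function names and the toy matrices are hypothetical.

```python
def confidence(attn):
    """Mean of the largest attention weight in each row (assumed definition)."""
    return sum(max(row) for row in attn) / len(attn)


def positional_variance(attn):
    """Mean variance of the attended position per row (assumed definition).

    Each row is treated as a probability distribution over key positions.
    """
    total = 0.0
    for row in attn:
        mean_pos = sum(j * w for j, w in enumerate(row))
        total += sum(w * (j - mean_pos) ** 2 for j, w in enumerate(row))
    return total / len(attn)


# Toy attention matrices for a single head: rows are query positions,
# columns are key positions, and each row sums to 1.
sharp = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]
diffuse = [[1 / 3, 1 / 3, 1 / 3],
           [1 / 3, 1 / 3, 1 / 3]]

sharp_scores = (confidence(sharp), positional_variance(sharp))
diffuse_scores = (confidence(diffuse), positional_variance(diffuse))
```

Under these assumed definitions, a head that focuses on a single position scores high on confidence and low on variance, while a head that spreads its attention evenly does the opposite.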
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Explainable Artificial Intelligence (XAI)
