Fine-Grained Attention Mechanism for Neural Machine Translation
Heeyoul Choi, Kyunghyun Cho, Yoshua Bengio

TL;DR
This paper introduces a fine-grained attention mechanism for neural machine translation that assigns a separate attention score to each dimension of a context vector. It improves BLEU on En-De and En-Fi translation, and an alignment analysis shows how the mechanism exploits the internal structure of context vectors.
Contribution
It proposes a fine-grained (2D) attention mechanism that scores each dimension of a context vector separately, instead of assigning one scalar per source word as existing temporal attention variants do, and reports better BLEU scores with it.
Findings
Improved BLEU scores in En-De and En-Fi translation tasks.
Alignment analysis showing how fine-grained attention exploits the internal structure of context vectors.
Demonstrated the effectiveness of dimension-wise attention in NMT.
Abstract
Neural machine translation (NMT) has been a new paradigm in machine translation, and the attention mechanism has become the dominant approach with the state-of-the-art records in many language pairs. While there are variants of the attention mechanism, all of them use only temporal attention where one scalar value is assigned to one context vector corresponding to a source word. In this paper, we propose a fine-grained (or 2D) attention mechanism where each dimension of a context vector will receive a separate attention score. In experiments with the task of En-De and En-Fi translation, the fine-grained attention method improves the translation quality in terms of BLEU score. In addition, our alignment analysis reveals how the fine-grained attention mechanism exploits the internal structure of context vectors.
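The difference between scalar (temporal) attention and fine-grained attention can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the additive tanh scoring function, the parameter names (W_h, W_s, v, V), and the single-vector decoder state s are assumptions chosen for illustration. The only part taken from the abstract is the core idea that each dimension of a context vector receives its own attention score.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scalar_attention(H, s, W_h, W_s, v):
    """Temporal attention: one scalar weight per source position.

    H: (T, D) source annotations, s: (S,) decoder state (hypothetical),
    W_h: (D, A), W_s: (S, A), v: (A,) are assumed scoring parameters.
    """
    hidden = np.tanh(H @ W_h + s @ W_s)   # (T, A)
    scores = hidden @ v                   # (T,)  one score per position
    alpha = softmax(scores, axis=0)       # (T,)  sums to 1 over positions
    context = alpha @ H                   # (D,)
    return context, alpha

def fine_grained_attention(H, s, W_h, W_s, V):
    """Fine-grained (2D) attention: one weight per position AND dimension.

    V: (A, D) replaces the vector v, so the scorer emits D scores
    per source position instead of one (an assumed parameterization).
    """
    hidden = np.tanh(H @ W_h + s @ W_s)   # (T, A)
    scores = hidden @ V                   # (T, D) one score per position and dimension
    alpha = softmax(scores, axis=0)       # (T, D) each column sums to 1 over positions
    context = (alpha * H).sum(axis=0)     # (D,)  dimension-wise weighted sum
    return context, alpha
```

In this sketch the softmax is still taken over source positions, but independently for every dimension, so different dimensions of the context can attend to different source words. If every column of V equals v, the fine-grained version reduces to the scalar one.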
