Leveraging Neural Machine Translation for Word Alignment
Vilém Zouhar, Daria Pylypenko

TL;DR
This paper explores extracting word alignments from neural machine translation models by comparing attention-based methods and probability-based scores, and proposes a combined approach using a feed-forward network for improved accuracy.
Contribution
It introduces a novel method that aggregates multiple sources of alignment scores from NMT models to enhance word-alignment quality.
Findings
Combined alignment extractors outperform individual methods.
Probability-based scores can effectively infer word alignments.
Aggregation improves alignment accuracy over single-source methods.
Abstract
The most common tools for word-alignment rely on a large amount of parallel sentences, which are then usually processed according to one of the IBM model algorithms. The training data is, however, the same as for machine translation (MT) systems, especially for neural MT (NMT), which itself is able to produce word-alignments using the trained attention heads. This is convenient because word-alignment is theoretically a viable byproduct of any attention-based NMT, which is also able to provide decoder scores for a translated sentence pair. We summarize different approaches on how word-alignment can be extracted from alignment scores and then explore ways in which scores can be extracted from NMT, focusing on inferring the word-alignment scores based on output sentence and token probabilities. We compare this to the extraction of alignment scores from attention. We conclude with…
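To make "extracting word alignment from alignment scores" concrete, here is a minimal sketch of two common extraction rules applied to a source-by-target score matrix: a per-target-token argmax and a global threshold. The function names, the toy matrix, and the threshold value are illustrative assumptions, not taken from the paper; the scores could come from attention weights or from probability-based estimates.

```python
from typing import List, Set, Tuple

# An alignment is a set of (source_index, target_index) links.
Alignment = Set[Tuple[int, int]]


def align_argmax(scores: List[List[float]]) -> Alignment:
    """Link each target token (column) to its highest-scoring source token (row)."""
    if not scores or not scores[0]:
        return set()
    n_src, n_tgt = len(scores), len(scores[0])
    return {
        (max(range(n_src), key=lambda i: scores[i][j]), j)
        for j in range(n_tgt)
    }


def align_threshold(scores: List[List[float]], threshold: float) -> Alignment:
    """Keep every (source, target) pair whose score reaches the threshold."""
    return {
        (i, j)
        for i, row in enumerate(scores)
        for j, s in enumerate(row)
        if s >= threshold
    }


# Hypothetical 3x3 score matrix: rows are source tokens, columns are target tokens.
toy_scores = [
    [0.80, 0.10, 0.05],
    [0.15, 0.30, 0.60],
    [0.05, 0.60, 0.35],
]

print(sorted(align_argmax(toy_scores)))
print(sorted(align_threshold(toy_scores, 0.3)))
```

The argmax rule yields exactly one link per target token, while the threshold rule can produce several links, or none, per token. Which trade-off is preferable depends on the language pair and on how the scores are scaled.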
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
