Merging External Bilingual Pairs into Neural Machine Translation
Tao Wang, Shaohui Kuang, Deyi Xiong, António Branco

TL;DR
This paper introduces three methods, based on preprocessing the training data, for incorporating pre-specified bilingual pairs into neural machine translation. They improve overall translation quality and raise the share of correctly translated pre-specified phrases from an 85% baseline to over 99%.
Contribution
The paper proposes novel data preprocessing techniques to embed pre-specified translations into NMT models without modifying decoding algorithms or attention mechanisms.
Findings
Over 99% of pre-specified phrases are correctly translated.
Substantial improvement in overall translation quality.
Effective integration of external bilingual pairs into NMT training.
Abstract
As neural machine translation (NMT) is not easily amenable to explicit correction of errors, incorporating pre-specified translations into NMT is widely regarded as a non-trivial challenge. In this paper, we propose and explore three methods to endow NMT with pre-specified bilingual pairs. Instead of, for instance, modifying the beam search algorithm during decoding or making complex modifications to the attention mechanism (the mainstream approaches to tackling this challenge), we experiment with appropriately pre-processing the training data to add information about pre-specified translations. Extra embeddings are also used to distinguish pre-specified tokens from the other tokens. Extensive experimentation and analysis indicate that over 99% of the pre-specified phrases are successfully translated (given an 85% baseline) and that there is also a substantive improvement in…
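To make the preprocessing idea concrete, here is a minimal sketch of one way a source sentence could be annotated with pre-specified translations before training. The abstract does not specify the annotation format, so the marker tokens (`<start>`, `<mid>`, `<end>`), the function name `annotate_source`, and the indicator-id scheme are all hypothetical illustrations, not the authors' implementation. The indicator ids stand in for the kind of signal an extra embedding table could consume to distinguish pre-specified tokens from other tokens.

```python
from typing import Dict, List, Tuple

# Hypothetical marker tokens; the abstract does not name the actual format.
START, MID, END = "<start>", "<mid>", "<end>"

Phrase = Tuple[str, ...]


def annotate_source(
    tokens: List[str], pairs: Dict[Phrase, Phrase]
) -> Tuple[List[str], List[int]]:
    """Inline pre-specified translations into a tokenized source sentence.

    Returns the annotated tokens plus one indicator id per token
    (an assumed scheme): 0 = ordinary token or marker,
    1 = source side of a pair, 2 = pre-specified translation.
    """
    out_tokens: List[str] = []
    out_ids: List[int] = []
    # Skip empty keys and try longer phrases first so overlaps resolve
    # deterministically.
    phrases = sorted((p for p in pairs if p), key=len, reverse=True)
    i = 0
    while i < len(tokens):
        match = next(
            (p for p in phrases if tuple(tokens[i : i + len(p)]) == p), None
        )
        if match is None:
            # Ordinary token: copy through unchanged.
            out_tokens.append(tokens[i])
            out_ids.append(0)
            i += 1
            continue
        target = pairs[match]
        # Keep the source phrase and append its required translation.
        out_tokens += [START, *match, MID, *target, END]
        out_ids += [0] + [1] * len(match) + [0] + [2] * len(target) + [0]
        i += len(match)
    return out_tokens, out_ids


if __name__ == "__main__":
    sentence = ["the", "neural", "network", "works"]
    glossary = {("neural", "network"): ("neuronales", "Netz")}
    annotated, ids = annotate_source(sentence, glossary)
    print(" ".join(annotated))
    print(ids)
```

Under these assumptions, the annotated sentences would replace the original source side of the parallel corpus, so the model can learn to copy the marked translation without any change to beam search or attention.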
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
