Improving Retrieval Augmented Neural Machine Translation by Controlling Source and Fuzzy-Match Interactions
Cuong Hoang, Devendra Sachan, Prashant Mathur, Brian Thompson, Marcello Federico

TL;DR
This paper introduces a new architecture for retrieval-augmented neural machine translation that effectively controls interactions between source sentences and fuzzy matches, leading to consistent BLEU score improvements across multiple language pairs and domains.
Contribution
A novel architecture for controlling source and fuzzy match interactions in retrieval-augmented translation, enhancing zero-shot domain adaptation performance.
Findings
Consistent BLEU improvements across language pairs and domains
Effective control of source and fuzzy match interactions
Superior performance over prior architectures
Abstract
We explore zero-shot adaptation, where a general-domain model has access to customer or domain specific parallel data at inference time, but not during training. We build on the idea of Retrieval Augmented Translation (RAT) where top-k in-domain fuzzy matches are found for the source sentence, and target-language translations of those fuzzy-matched sentences are provided to the translation model at inference time. We propose a novel architecture to control interactions between a source sentence and the top-k fuzzy target-language matches, and compare it to architectures from prior work. We conduct experiments in two language pairs (En-De and En-Fr) by training models on WMT data and testing them with five and seven multi-domain datasets, respectively. Our approach consistently outperforms the alternative architectures, improving BLEU across language pair, domain, and number k of fuzzy…
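The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes a small in-memory translation memory of (source, target) pairs and uses token-level similarity from Python's `difflib` as a stand-in fuzzy-match score; the abstract does not specify the retrieval method. The function names, the `<sep>` separator, and the sample sentences are hypothetical. Joining the source with the retrieved targets in one plain string is only a simple way to show what the model receives at inference time; it is not the architecture the paper proposes for controlling source and fuzzy-match interactions.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def fuzzy_score(a: str, b: str) -> float:
    """Token-level similarity in [0, 1]; an assumed stand-in for a fuzzy-match score."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def retrieve_top_k(source: str, memory: List[Tuple[str, str]], k: int = 2) -> List[str]:
    """Return the target sides of the k memory entries whose source side best matches `source`."""
    scored = [(fuzzy_score(source, mem_src), mem_tgt) for mem_src, mem_tgt in memory]
    # Highest similarity first; Python's sort is stable, so ties keep memory order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [mem_tgt for _, mem_tgt in scored[:k]]


def build_input(source: str, matches: List[str], sep: str = " <sep> ") -> str:
    """Join the source sentence and the retrieved target-language matches (hypothetical format)."""
    return sep.join([source] + matches)


if __name__ == "__main__":
    # Hypothetical in-domain translation memory (English source, German target).
    memory = [
        ("the printer is offline", "der Drucker ist offline"),
        ("restart the printer now", "starten Sie den Drucker jetzt neu"),
        ("the weather is nice", "das Wetter ist schön"),
    ]
    source = "the printer is offline now"
    matches = retrieve_top_k(source, memory, k=2)
    print(build_input(source, matches))
```

Because the memory is consulted only at inference time, swapping in a different domain's parallel data changes the retrieved matches without retraining the model, which is the zero-shot setting the abstract describes.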
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
