Transductive Data-Selection Algorithms for Fine-Tuning Neural Machine Translation
Alberto Poncelas, Gideon Maillette de Buy Wenniger, Andy Way

TL;DR
This paper introduces transductive data selection algorithms that leverage test set information to adapt neural machine translation models, improving translation quality by selecting relevant training data at test time.
Contribution
It proposes novel transductive data selection algorithms for fine-tuning NMT models using test set information, enhancing domain adaptation during inference.
Findings
Improved translation performance with test set-based data selection.
Effective adaptation with small data subsets.
Outperforms generic and domain-adapted models.
Abstract
Machine Translation models are trained to translate a variety of documents from one language into another. However, models specifically trained for the particular characteristics of the documents tend to perform better. Fine-tuning is a technique for adapting an NMT model to some domain. In this work, we use this technique to adapt the model to a given test set. In particular, we use transductive data selection algorithms, which take advantage of the information in the test set to retrieve sentences from a larger parallel set. In cases where the model is available at translation time (when the test set is provided), it can be adapted with a small subset of data, thereby achieving better performance than a generic model or a domain-adapted model.
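To make the idea of transductive selection concrete, the following is a minimal sketch of how sentences might be retrieved from a parallel pool using only the source side of the test set. It is an illustration under stated assumptions, not the authors' algorithm: the n-gram overlap score, the length normalisation, the `decay` factor, and the names `ngrams` and `select_for_test_set` are all hypothetical choices made for this example. The fine-tuning step itself is not shown.

```python
from collections import Counter


def ngrams(tokens, max_n=3):
    """Return all n-grams of order 1..max_n as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def select_for_test_set(test_sentences, parallel_pool, k, max_n=3, decay=0.5):
    """Greedily pick up to k (source, target) pairs whose source side shares
    n-grams with the test set. Hypothetical scoring, for illustration only."""
    # Features are the n-grams observed in the test set (source language).
    test_feats = set()
    for sentence in test_sentences:
        test_feats.update(ngrams(sentence.lower().split(), max_n))

    # Pre-compute, for each pool sentence, its length and matching features.
    pool_info = []
    for src, _tgt in parallel_pool:
        toks = src.lower().split()
        pool_info.append((len(toks), set(ngrams(toks, max_n)) & test_feats))

    seen = Counter()  # how often each test feature is already covered
    remaining = list(range(len(parallel_pool)))
    selected = []

    def score(idx):
        length, feats = pool_info[idx]
        if length == 0:
            return 0.0
        # Assumption: features already covered count less (decay), and the
        # score is length-normalised so long sentences are not favoured.
        return sum(decay ** seen[f] for f in feats) / length

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        if score(best) == 0.0:
            break  # nothing left that overlaps with the test set
        selected.append(parallel_pool[best])
        for f in pool_info[best][1]:
            seen[f] += 1
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    test_set = ["the patient received a dose"]
    pool = [
        ("the patient received treatment", "tgt-1"),
        ("stock markets fell sharply", "tgt-2"),
        ("a dose was given", "tgt-3"),
    ]
    for pair in select_for_test_set(test_set, pool, k=2):
        print(pair)
```

In a setup like the one the abstract describes, the returned pairs would form the small subset used to fine-tune the generic model before translating the test set. The decay term is one simple way to encourage coverage of different test-set n-grams instead of repeatedly selecting sentences that match the same ones.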
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques
