Optimizing example selection for retrieval-augmented machine translation with translation memories
Maxime Bouthors, Josep Crego, François Yvon

TL;DR
This paper enhances retrieval-augmented machine translation by developing new algorithms based on submodular functions to optimize example selection, leading to improved translation quality.
Contribution
It introduces novel algorithms for selecting translation memory examples that maximize source coverage using submodular optimization techniques.
Findings
Improved translation performance with optimized example selection.
Effective algorithms for maximizing source coverage.
Demonstrated benefits of submodular optimization in MT.
Abstract
Retrieval-augmented machine translation leverages examples from a translation memory by retrieving similar instances. These examples are used to condition the predictions of a neural decoder. We aim to improve the upstream retrieval step and consider a fixed downstream edit-based model: the multi-Levenshtein Transformer. The task consists of finding a set of examples that maximizes the overall coverage of the source sentence. To this end, we rely on the theory of submodular functions and explore new algorithms to optimize this coverage. We evaluate the resulting performance gains for the machine translation task.
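To make the coverage objective concrete, here is a minimal sketch of greedy selection for a set-coverage function, the textbook approach to monotone submodular maximization. It is not the paper's algorithm. It assumes coverage is simply the number of distinct source tokens that appear in at least one selected example, and the names `greedy_select`, `source`, `candidates`, and `k` are hypothetical.

```python
from typing import List, Optional, Set


def greedy_select(source: List[str], candidates: List[List[str]], k: int) -> List[int]:
    """Pick up to k candidate indices that greedily maximize source-token coverage.

    Assumption: coverage = number of distinct source tokens found in at least
    one selected candidate. This toy objective is monotone and submodular.
    """
    src: Set[str] = set(source)
    covered: Set[str] = set()
    chosen: List[int] = []
    for _ in range(k):
        best_idx: Optional[int] = None
        best_gain = 0
        for i, cand in enumerate(candidates):
            if i in chosen:
                continue
            # Marginal gain: source tokens this candidate would newly cover.
            gain = len((set(cand) & src) - covered)
            if gain > best_gain:  # strict: ties keep the earliest candidate
                best_idx, best_gain = i, gain
        if best_idx is None:
            break  # no remaining candidate adds coverage
        chosen.append(best_idx)
        covered |= set(candidates[best_idx]) & src
    return chosen


source = "the cat sat on the mat".split()
candidates = [
    "the cat slept".split(),
    "a dog sat on the rug".split(),
    "the mat".split(),
    "cat sat".split(),
]
print(greedy_select(source, candidates, k=2))
```

Each step picks the candidate with the largest marginal gain, so an example that only repeats already-covered tokens is never chosen, even if it is highly similar to the source on its own. A realistic setup would likely score n-grams or weighted spans rather than bare tokens.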
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
Methods: Attention Is All You Need · Sparse Evolutionary Training · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout
